Finding Plotholes in AI-Assisted Code: Missing or Incomplete Code, TODOs, and 'XXX Fix Later'

Identify AI-assisted code artefacts that require fixing.


tl;dr

I am trying out new ways of testing AI-assisted code generation. LLMs will introduce all kinds of strange elements into code, or simply refuse to implement parts of it. One way to detect this is to use another LLM to identify any TODOs, incomplete or unimplemented features, and other weird artefacts left behind by those lazy LLMs. I do this with a tool I’ve written called Plothole.

Tech stack:

  • Golang CLI
  • Mistral’s Devstral model
  • NVIDIA 3090
  • Ollama

Analysing text with LLMs

I’m fascinated by the idea of using AI to analyse text and identify issues such as security vulnerabilities and bugs in code. I’ve written several tools for this purpose and run into the same problems each time: wrangling regular expressions, extracting JSON from LLM output, and so on. Perhaps I have some kind of mental block, as I can’t think of anything else to do with AI.

Having said that, I suppose that if you write enough variations of the same tool, you’ll eventually find something useful.

Incomplete or missing code

Over the last few months, I have been writing a lot of AI-assisted code. One issue I’ve found with LLMs is that they produce incomplete code: they leave out sections and add ‘fix later’ comments, which creates serious bugs. These aren’t obvious unless you read every line of code, which I don’t think most people coding with AI assistance will do. Some will, but not everyone.

I run a number of tests on AI-assisted code, such as looking for duplicate and dead code — i.e. static code checks. Sometimes, AI will leave duplicate code or code that is never executed. In addition to that, I want to look for incomplete code, TODOs and ‘XXX fix later’ comments. These should not be present in production code.
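As a baseline, the obvious markers can be caught with a plain regex scan before involving an LLM at all. This is a hypothetical sketch of that baseline, not Plothole’s actual implementation (which uses an LLM precisely because regexes miss the subtler cases):

```go
package main

import (
	"bufio"
	"fmt"
	"regexp"
	"strings"
)

// todoMarker matches the comment markers that should never ship in production.
var todoMarker = regexp.MustCompile(`(?i)\b(TODO|FIXME|XXX|HACK)\b`)

// findMarkers returns the 1-based line numbers of lines containing a marker.
func findMarkers(src string) []int {
	var hits []int
	scanner := bufio.NewScanner(strings.NewReader(src))
	line := 0
	for scanner.Scan() {
		line++
		if todoMarker.MatchString(scanner.Text()) {
			hits = append(hits, line)
		}
	}
	return hits
}

func main() {
	src := `package demo

// TODO: validate the API key properly
func check(key string) bool {
	return key != "" // XXX fix later
}`
	fmt.Println(findMarkers(src)) // prints [3 5]
}
```

The limitation is obvious: this only finds code that announces its own incompleteness. Code that is silently missing a branch, or stubbed out with a misleading comment, needs something that understands the code, which is where the LLM comes in.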

That’s where Plothole comes in.

Mistral’s Devstral

Recently, I also found a solution to a problem that didn’t actually exist: I wanted to run more models at home on my NVIDIA 3090, which should be capable of running some great open-source models. One of these is Mistral’s Devstral, which has been specifically trained to write code. But what could I use it for?

I decided to combine my desire to run models like Devstral with my need to catch incomplete code and AI comments, so I wrote a tool called ‘Plothole’. It finds incomplete code and comments such as ‘TODO’ or ‘XXX fix later’ that AI-assisted coding leaves behind and that you might not catch with other tools.

So Plothole uses Devstral via a simple Ollama interface.

Plothole

Here’s an example of running it on one of my projects.

$ make plothole-check 
~/bin/plothole check ./cmd ./internal ./pkg ./tests --recursive --severity=high --fail-on-findings
Checking: [./cmd ./internal ./pkg ./tests]
Checking file: cmd/cli/main.go [CACHED]
Checking file: cmd/server/main.go [CACHED]
Checking file: internal/assets/embed_test.go [CACHED]
Checking file: internal/assets/static/js/app.js [CACHED]
Checking file: internal/assets/static/js/dashboard.js [CACHED]
Checking file: internal/assets/static/js/login.js [CACHED]
Checking file: internal/assets/static/js/profile.js [CACHED]
Checking file: internal/assets/static/js/users.js [CACHED]
Checking file: internal/assets/static.go [CACHED]
Checking file: internal/cli/commands/root.go [CACHED]
Checking file: internal/cli/root.go [CACHED]
Checking file: internal/server/api/handler.go [CACHED]
Checking file: internal/server/api/handler_test.go [CACHED]
Checking file: internal/server/api/response.go [CACHED]
Checking file: internal/server/api/router.go [CACHED]
Checking file: internal/server/auth/auth.go [CACHED]
Checking file: internal/server/auth/auth_test.go [CACHED]
Checking file: internal/server/middleware/auth.go [CACHED]
Checking file: internal/server/server.go [CACHED]
Checking file: internal/server/store/store.go [CACHED]
Checking file: internal/server/store/store_test.go [CACHED]
Checking file: internal/server/web/embed_test.go [CACHED]
Checking file: internal/server/web/handlers.go [CACHED]
Checking file: pkg/api/types.go [CACHED]
Checking file: pkg/config/config.go [CACHED]
Checking file: pkg/config/config_test.go [CACHED]
Checking file: pkg/logger/logger.go [CACHED]
Checking file: pkg/logger/logger_test.go [CACHED]
Checking file: pkg/version/version.go [CACHED]
Checking file: pkg/version/version_test.go [CACHED]
Plothole Analysis Results
==================================================
Files analyzed: 30
Files with issues: 1
Total findings: 1
Cache hits: 30, misses: 0 (100.0% hit rate)
  [HIGH] High: 1
  [MED]  Medium: 0
  [LOW]  Low: 0

File: internal/server/middleware/auth.go [CACHED]
   [HIGH]:30 Incomplete API key validation implementation
      Suggestion: Implement proper API key validation instead of just checking for non-empty values

make: *** [Makefile:216: plothole-check] Error 1

As you can see, it found a high-severity issue: the API key validation is incomplete.

This is what that code looks like:

			// Check API key authentication if allowed
			if !authenticated && config.AllowAPIKey {
				apiKey := r.Header.Get("Authorization")
				if apiKey != "" {
					// Remove "Bearer " prefix if present
					if len(apiKey) > 7 && apiKey[:7] == "Bearer " {
						apiKey = apiKey[7:]
					}
					
					// For debugging: ensure we have a non-empty API key
					if len(apiKey) > 0 {
						// Validate API key with store
						user, err := am.store.GetUserByAPIKey(apiKey)
						if err == nil && user != nil {
							authenticated = true
							if role, ok := user["role"].(string); ok {
								isAdmin = (role == "admin")
							}
						}
					}
				}
			}

Interestingly, the comment doesn’t match the code.

The comment says:

// For debugging: ensure we have a non-empty API key

But the code does:

if len(apiKey) > 0 {
    // Validate API key with store
    user, err := am.store.GetUserByAPIKey(apiKey)
    // ... actual authentication logic
}

This is not debugging code, though. It’s core authentication logic. The comment is misleading; it suggests this was perhaps originally added as a debug check and then evolved into the actual implementation.

This is a real-life example of a kind of code-comment bug… I guess? Very interesting.

Conclusion

In the end, Claude Code gave Plothole a glowing review when I was trying to understand why a fake function I added wasn’t flagged by deadcode analysis.

Even staticcheck doesn’t flag exported functions. The plothole scanner would be more likely to catch this kind of issue since it does semantic analysis of code patterns and can identify functions that appear to be placeholders or incomplete implementations (based on the “fix later” comment). This is why having plothole analysis in the pipeline is valuable - it catches issues that traditional static analysis tools miss. - Claude Code

I have a lot more testing to do with Plothole, but I think it might turn out to be useful. I will certainly run it on every test of my AI-assisted code, and I hope it finds more than just documentation bugs.

Using Devstral means this can be inexpensive to run. Companies could easily run this on their codebases with little cost.

I’m going to test Plothole on my own for a while, then release it as open source.