What AI Coding Assistants Are — And Aren’t — Good For
By now, everybody in the tech industry has tried an AI coding assistant, be it GitHub Copilot or just plain-old ChatGPT. Let me start by acknowledging that it’s truly impressive that you can feed these tools a statement of preconditions and postconditions and working (or at least working-ish) code comes out the other end. These tools will certainly change the way that most programmers do their jobs in the future.
A typical input prompt to an AI coding assistant might go something like, “Write me a function in Java that takes an Excel file from the local file system, converts it to CSV, and posts it to the S3 bucket that I specify.” With a statement like that, you can usually get quite good results, particularly given that the various parts (if not the entirety) of that requirements statement have already been written and posted to the Internet thousands of times.
This is not just a party trick. Of course, some people have jumped all the way from there to the conclusion that large numbers of programming jobs are in near-term jeopardy from AI code generators. One problem I see with this conclusion is that
AI coding assistants are addressing the least interesting parts of modern application development.
Coding assistants do a fantastic job of automating the less novel, well-precedented, boilerplate functions. And they often do it in ways that avoid mistakes typically made by less experienced programmers, particularly around error-handling and the use of language idioms.
With the possible exception of error-handling, those aren’t the biggest problems I see in code reviews.
Take the shiny new code I asked my AI coding assistant to generate above. It is a “function”, or at best a “script”. Usually what we’re working on are “applications”. So we need to place this code somewhere in context. That’s where the essence of modern software engineering lies, and where most code review issues come from. For example, when doing a code review for a non-trivial new function, we’re often questioning:
What object, package, file, or folder does this function belong in?
Does it follow the naming and stylistic conventions of the rest of the code base?
Does it make best use of the libraries and frameworks already in place in the project?
Does it have proper unit and/or integration tests in the context of this application?
Is it even written to be testable in the first place?
In light of this, and depending on how carefully the programmer vets the AI-generated code before sending it for review,
it’s possible that AI-generated code may lead to more review-phase churn than hand-written code would have.
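To make the testability question a little more concrete, here is a minimal sketch in Java of how the Excel-to-CSV-to-S3 function might be reshaped so it can be tested inside an application. Everything named here (CsvExportSketch, ObjectStore, RecordingStore, export) is hypothetical and invented for illustration. It is not what any particular assistant generates, and it is not a real library API. The sketch assumes the spreadsheet rows have already been read into lists of strings, so the Excel parsing is left out, and the S3 call is hidden behind a small interface that a real implementation would have to supply.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.StringJoiner;

final class CsvExportSketch {

    /** Hypothetical seam for storage; a real implementation might wrap an S3 client. */
    interface ObjectStore {
        void put(String bucket, String key, String body);
    }

    /** Pure conversion step: no file system or network access, so it is easy to unit test. */
    static String toCsv(List<List<String>> rows) {
        StringBuilder out = new StringBuilder();
        for (List<String> row : rows) {
            StringJoiner line = new StringJoiner(",");
            for (String cell : row) {
                line.add(escape(cell));
            }
            out.append(line.toString()).append("\n");
        }
        return out.toString();
    }

    /** Quote a cell when it contains a comma, a double quote, or a line break. */
    static String escape(String cell) {
        boolean needsQuoting = cell.contains(",") || cell.contains("\"")
                || cell.contains("\n") || cell.contains("\r");
        if (!needsQuoting) {
            return cell;
        }
        return "\"" + cell.replace("\"", "\"\"") + "\"";
    }

    /** Orchestration step: the store is passed in, so a test can substitute a fake. */
    static void export(List<List<String>> rows, ObjectStore store, String bucket, String key) {
        store.put(bucket, key, toCsv(rows));
    }

    /** In-memory fake that records calls instead of talking to real storage. */
    static final class RecordingStore implements ObjectStore {
        final List<String> calls = new ArrayList<>();

        @Override
        public void put(String bucket, String key, String body) {
            calls.add(bucket + "/" + key + "\n" + body);
        }
    }

    public static void main(String[] args) {
        // Usage example with made-up data and a made-up bucket name.
        RecordingStore store = new RecordingStore();
        export(List.of(List.of("name", "note"), List.of("Ada", "likes tea, black")),
               store, "example-bucket", "report.csv");
        System.out.println(store.calls.get(0));
    }
}
```

None of this is hard to write. But deciding where the seam goes, what the interface is called, and which package it lives in are exactly the kinds of decisions that depend on the surrounding application, and a one-shot prompt knows nothing about them.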
One objection to this analysis is that I could have asked for something much higher-level, say, a fully formed MERN-stack application that meets certain business requirements. AI coding assistants aren’t there yet, although I’m sure they’ll improve and maybe we can revisit this in a few years. But in the meantime, I’d suggest that generating more of the application this way just generates more of the same problems, and at new and riskier levels of architectural and design abstraction.
This isn’t the first time programmers were supposed to get orders-of-magnitude productivity improvements. In the 80s and 90s, there was a focus on the architecture and design layers with CASE and UML-based code generators.
I learned on these things, and later taught them to students myself. And the main reason CASE and UML never really caught on at scale as code generation tools was that, although the tools were quite good at creating application structure (and keeping design and code in sync), you were left with an imposing number of empty functions to fill in. And once you started filling them in, “round-tripping” back to the tools to make modifications and do maintenance was a very fragile process.
So if CASE/UML is good at generating application structure and AI code assistants are good at generating working functions, maybe there’s an opportunity to combine the two to help achieve the original productivity vision. Let me know if you’re aware of folks already working on this!