In the past, writing software was better than doing things manually: you build the software once, and then it works for you forever.
Nowadays, it is better to write prompts that write software than to write software directly. Prompts that encapsulate your coding style and the architecture of your codebase will write software for you forever.
Often, when I prompt an LLM to write code for me, I afterwards read through the commit it created and think: "Hmm, OK, this part is not very good. I might have been able to write a better prompt to get a better output. But whatever, I'll just fix it manually." Or: "I'll just ask the LLM to refactor the last output."
This way, progress is slower than it could be, because I end up addressing the same issues again and again.
That led me to an approach to coding that I am currently trying on some files in my codebase:
No code edits are allowed at all.
Neither manually nor via AI. These files are created in full via a single prompt. The prompt is the first half of the file. And the output of the LLM (the implementation of what the prompt describes) is the second half of the file.
The only way to change the code is to change the prompt and let an LLM code the whole file from scratch again.
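To make this concrete, here is a minimal sketch of what such a file could look like. The add() task, the marker comment, and the layout details are hypothetical stand-ins, not one of my actual files:

```js
/*
PROMPT (the only part of this file that may change):

Write a module that exports a function add(a, b)
returning the sum of two numbers. Throw a TypeError
if either argument is not a number.
*/

// ---- LLM output below, never edited by hand ----

export function add(a, b) {
  if (typeof a !== "number" ||
      typeof b !== "number") {
    throw new TypeError("add() expects two numbers");
  }
  return a + b;
}
```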
This forces me to focus on what matters most: Writing and refining a set of prompts that build high-quality software in line with my preferences.
It also has the nice effect that the prompt that created the code is always documented. So I can always try it again and see what newer models make of it.
Each prompt has a "Rules for good code" section. And since I typically copy that from one of my older prompts, this leads to evolutionary improvements in the "Rules" section.
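For illustration, such a section might start like this. Only the first two rules are taken from the example discussed below; the rest is a placeholder:

```text
Rules for good code:
- Put the exported function first.
- Keep code lines below 60 characters in length.
- … (further rules, copied and refined from one
  prompt to the next)
```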
I am testing it with different models. Here is a recent example using Grok:
You can try the result here: StringDiff
In this example, the code still leaves some things to be desired.
First of all, Grok simply violated two of the given rules: it did not put the exported function first, and it did not keep code lines below 60 characters in length.
And there is quite a bit of redundancy in the code, along with some questionable design choices. For example, the string templating and HTML escaping happen inside the backtrack function, and in a messy way: multiple escapeHtml() calls are scattered across the function, and the same kind of templating is done in two different lines. It would be better if the backtrack function only returned an array of typed text snippets, and a separate render function handled turning that array into HTML, as sketched below.
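Here is a sketch of that split. The snippet shape, the class names, and the stubbed backtrack output are my assumptions, not the actual Grok code:

```js
function escapeHtml(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

// backtrack would return typed snippets instead of
// HTML. Stubbed here with the result of diffing
// "Hello world" against "Hello there":
function backtrack() {
  return [
    { type: "same", text: "Hello " },
    { type: "removed", text: "world" },
    { type: "added", text: "there" },
  ];
}

// All templating and escaping lives in one place.
function render(snippets) {
  return snippets
    .map(
      (s) =>
        `<span class="${s.type}">` +
        escapeHtml(s.text) +
        "</span>"
    )
    .join("");
}

console.log(render(backtrack()));
```

This way, escapeHtml() is called in exactly one place, and the diff data could be reused for outputs other than HTML.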
It will be interesting to see how this develops and which of these happens first:
- Future versions of my "Rules for good code" prompt will successfully address such architectural aspects of coding.
- LLMs will become so good that no guidelines in the prompts are needed anymore.
- Coding Nirvana: Readability doesn't matter at all anymore. LLMs become so good at handling code that we can just trust them to build it without a human ever having to read it.