Programming with AI
1. Introduction
I recently discussed with some of my senior colleagues how we use Copilot and ChatGPT for programming. We talked about our experiences, how and when these tools help, and what to expect from them in the future.
In this article, I will briefly describe how I imagine the future of programming with AI. This is not about what AI will do in programming in the future. I have no idea about that, and depending on my mood, I look forward to it either in amazement or in fear. This article is more about how we, programmers, will do our work in the future.
2. The past
To predict the future, we have to understand the past. If you know only the current state, you cannot reliably extrapolate. Extrapolation needs at least two points and some knowledge of how fast things can change (the maximum and minimum of the derivative function). So, here we go, looking a bit at the past, focusing on the aspects that I feel are most important for predicting how we will work with AI in programming.
2.1. Machine Code
When computers were first introduced, we programmed them in machine code. For most of you, this sentence should read, "your father/mother programmed them in machine code". I had the luck to program a Polish clone of the PDP-11 in machine code.
To create a program, we used assembly language. We wrote it on a sheet of squared paper (and then we typed it in). No.
Note: As I write this article, Copilot is switched on, suggesting how each sentence should end. I accept the suggestion in about 10% of the cases. Copilot suggested the part between ( and ) in the last sentence. It’s funny that even Copilot cannot imagine a system where we did not have an assembler.
We wrote the assembly on the left side of the paper and the machine code on the right. We used printed code tables to look up the machine codes, and we had to calculate the addresses. After that, we entered the code. This was also a cumbersome process.
There were switches on the front panel of the computer. There were 11 switches for the address (as far as I remember) and eight switches for the data. A switch flipped up meant a bit with a value of 1, and a switch flipped down meant a bit with a value of 0. We set the address and the desired value, and then we had to push a button to write the value into the memory. The memory consisted of ferrite rings that kept their value even after the power was switched off.
2.2. Assembly
It was a relief when we got the assembler. By then, I was working on a different machine with a different processor. We looked at the machine code that the assembler generated only a few times. The mapping between the assembly and the machine code was strictly one-to-one.
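To make "one-to-one" concrete, here is a minimal sketch of a toy assembler. The three mnemonics and their opcode values are invented for illustration and do not correspond to the PDP-11 or any real processor. The lookup map plays the role of the printed code table: each source line always produces the same opcode followed by its operand, with no room for the tool to choose.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Toy assembler for an invented machine: every source line becomes exactly
// one opcode followed by one operand. The mnemonics and opcode values are
// hypothetical, not those of any real processor.
final class ToyAssembler {

    // The "printed code table": mnemonic -> opcode.
    private static final Map<String, Integer> OPCODES = Map.of(
            "LOAD", 0x01,
            "ADD", 0x02,
            "STORE", 0x03);

    // Translates each line of the form "MNEMONIC operand" into two values.
    static List<Integer> assemble(List<String> lines) {
        List<Integer> machineCode = new ArrayList<>();
        for (String line : lines) {
            String[] parts = line.trim().split("\\s+");
            Integer opcode = OPCODES.get(parts[0]);
            if (opcode == null || parts.length != 2) {
                throw new IllegalArgumentException("Cannot assemble: " + line);
            }
            machineCode.add(opcode);
            machineCode.add(Integer.parseInt(parts[1]));
        }
        return machineCode;
    }

    public static void main(String[] args) {
        // Prints [1, 10, 2, 11, 3, 12]
        System.out.println(assemble(List.of("LOAD 10", "ADD 11", "STORE 12")));
    }
}
```

Because the translation is a pure table lookup, reading the output tells you nothing the source did not already say, which is why there was rarely a reason to inspect it.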
The next step was higher-level languages, like C. I wrote a lot of C code as a hobby before I started my second professional career as a programmer at 40.
2.3. Close to the Metal
The mapping from C to machine code is not one-to-one. There is room for optimization, and different compiler versions may create different code. Still, the functionality of the generated code is very much guaranteed. You do not need to look at the machine code to understand what the program does. I can recall that I only did it a few times.
On one of those occasions, we found a bug in the Sun C compiler (1987, while I was on a summer program at TU Delft). The other time, it was my mistake, and I had to modify my C code. The compiler knew better than I did what the C construct I wrote meant. I do not recall the specifics.
We do not need to look at the generated code; we write on a high level and debug on a high level.
2.4. High Level
As we advance in time, we have Java. Java uses a two-level compilation. It compiles the Java code to byte code, which the Java Virtual Machine interprets and, using JIT technology, compiles to machine code at run time. I looked at the generated byte code only once, to learn the intricacies of the type conversion rules of the ternary operator, and I never looked at the generated machine code. The first case could have been avoided by reading the language spec, but who reads manuals?
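As one illustration of such rules (not necessarily the case investigated above), the conditional operator applies binary numeric promotion when its two branches have different numeric types. The sketch below uses made-up class and method names; it shows that an `Integer` selected by the ternary operator comes out as a `Double`, while the equivalent `if` statement keeps it an `Integer`.

```java
// Illustrates binary numeric promotion in the conditional (ternary) operator.
// The class and method names are made up for this example.
final class TernaryPromotionDemo {

    // Both operands are convertible to numeric types, so the whole expression
    // has type double, even when the Integer branch is selected.
    static Object pick(boolean first) {
        return first ? Integer.valueOf(1) : Double.valueOf(2.0);
    }

    // With if/else no common type is computed, so the Integer stays an Integer.
    static Object pickWithIf(boolean first) {
        if (first) {
            return Integer.valueOf(1);
        }
        return Double.valueOf(2.0);
    }

    public static void main(String[] args) {
        System.out.println(pick(true));        // prints 1.0
        System.out.println(pickWithIf(true));  // prints 1
    }
}
```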
The same is true here: we step to higher levels of abstraction and do not need to look at the generated code.
2.5. DSL and Generated Code
Even as we advance towards higher levels, we can have Domain Specific Languages (DSLs). A DSL is either
- interpreted,
- translated into high-level code, or
- translated into byte code or machine code.
The third case is rare because generating low-level code requires much work and is rarely worth the effort. Generating high-level code is more common. As an example, we can take the fluent API generator of Java::Geci. It reads a regular-expression-like definition of the fluent API, creates a finite state machine from it, and generates the Java code containing all the interfaces and classes that implement the fluent API. The Java compiler then compiles the generated code, and the JVM executes the resulting byte code.
Should we look at the generated code? Usually not. I did so a lot because I wrote the generator and had to debug it, but that is an exception. The generated code should perform as the definition says.
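The idea of turning a state machine into interfaces can be sketched by hand. The example below is not output of Java::Geci; it is a hand-written illustration for a hypothetical definition `to+ subject build`, with all names invented. Each interface stands for one state, and each method is a transition that returns the next state, so the compiler rejects calls made in the wrong order.

```java
import java.util.ArrayList;
import java.util.List;

// Hand-written sketch of the kind of code a fluent API generator could emit
// for the hypothetical definition "to+ subject build".
// Each interface is one state of the state machine; its methods are the transitions.
final class MailFluent {

    interface Start {
        AfterTo to(String address);
    }

    interface AfterTo {
        AfterTo to(String address);        // "to" may repeat
        AfterSubject subject(String text); // then exactly one "subject"
    }

    interface AfterSubject {
        String build();
    }

    // One class implements every state; the interfaces restrict what the caller sees.
    private static final class Builder implements Start, AfterTo, AfterSubject {
        private final List<String> recipients = new ArrayList<>();
        private String subject;

        @Override
        public AfterTo to(String address) {
            recipients.add(address);
            return this;
        }

        @Override
        public AfterSubject subject(String text) {
            subject = text;
            return this;
        }

        @Override
        public String build() {
            return String.join(",", recipients) + ": " + subject;
        }
    }

    static Start mail() {
        return new Builder();
    }

    public static void main(String[] args) {
        // Prints a@example.com,b@example.com: Hi
        System.out.println(mail().to("a@example.com").to("b@example.com").subject("Hi").build());
        // mail().subject("Hi") would not compile: Start has no subject() method.
    }
}
```

Writing these interfaces by hand gets tedious as the grammar grows, which is the reason to generate them from a short definition instead.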
3. The Present and the Future
The next step is AI languages. This is where we are now, and it is only just starting. We use AI to write code based on some natural language description. The code is generated, and we have to look at it.
This is different from any earlier step in the evolution of programming languages. The reason is that the language the AI interprets is not precisely defined the way Java, C, or any DSL is. It can be ambiguous. It is a human language, usually English. Or something resembling English when non-native speakers like me write it.
3.1. Syntax-free
This is the advantage of AI programming. I do not need to remember the actual syntax. I can program in a language that I rarely use and whose exact syntax I have forgotten. I vaguely remember it, but it is not in my muscle memory.
3.2. Library-free
AI can also help me with my usual programming tasks, the kind of thing that other people have written many times before. It has those solutions in its memory, and it can help me.
Conventional programming languages offer this too, but with a limited scope. There are language constructs for the usual data structures and algorithms. There are libraries for the usual tasks.
The problem is that you have to remember that a library exists in order to use it. Sometimes, writing a few lines is easier than finding the library and the function that does the job. It is the same philosophy as the Unix command line versus VMS. (You may not know VMS. It was the OS of the VAX and Alpha machines from DEC.) If you needed to do something in VMS, there was a command for it. In Unix, you had simpler commands, but you could combine them.
With AI programming, you can write down what you want using natural language, and the AI will find the code fragments in its memory that fit best and adapt them.
3.3. AI Language
Today, AI is generating and helping to write the code. In the future, we will tell the AI what to do, and it will execute it for us. We may not need to care about the data structures it uses to store the data or the algorithms it applies to manage them.
Today, we think of databases when we talk about structured data. That is because databases are the tools to support the limited functionality a computer can manage. Before computers, we just told the accountant to calculate last year's profit, balance sheet, or whatnot, and they did. The data was on paper, and the managers did not care how it was organized. It was expensive because accountants are expensive. The intelligence they applied, extracting data from the different paper-based documents, was their strong point; calculation was just a mechanical task.
Computers came, and they were strong at doing the calculations. They were weak at extracting data from the documents. The solution was to organize the data into databases. It needed more processing on the input, but it was still cheaper than having accountants do the calculations.
With AI, computers can do calculations and extract data from documents. If it can be done cheaply, there is no longer any reason to keep the data in a structured way. The data can be structured on the fly when we need it for a calculation. The advantage is that we can do any calculation, and we may not face the issue that the data structure is unsuitable for the calculation we need. We just tell the AI program using natural language.
Is there a new patient coming to the practice? Just tell the program all the data, and it will remember like an assistant with unlimited memory who never forgets. Do you want to know when a patient last visited? Just ask the program. You do not need to care how the artificial simulated neurons store the information.
It certainly will use more computing power and energy than a well-tuned database, but on the other hand, it will have higher flexibility, and the development cost will be significantly lower.
This is when we will simply talk to computers, and they will help us with everything. I am not shy about predicting this future because it will come when I am no longer around. But what should we expect in the near future?
3.4. The near future
Now, AI tools are interactive. We write some comments or code, the AI generates the code for us, and that is the end of the story. From that point on, our "source code" is the generated code.
You can feel the contradiction in the previous sentence. It is as if we wrote the code in Java once, compiled it to byte code, and then maintained the byte code. We do not do that.
Source code is what we write. Generated code is never source code.
I expect meta-programming tools for various existing languages to extend them. You insert some meta-code (presumably into comments) into your application, and the tool will generate the code for you. However, the generated code is generated and not the source. You do not touch it. If you need to maintain the application, modify the comment, and the tool will generate the code again. It will be similar to what Java::Geci is doing.
You insert some comments into your code, and the code generator inserts the generated code into the editor-fold block following the comment. Java::Geci currently does not have an AI-based code generator, or at least I do not know about any. It is an open-source framework for code generators; anyone could write a code generator utilizing AI tools.
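The workflow described above can be sketched as a small tool. The marker syntax (`// GENERATE:` followed by a bare `//<editor-fold>` block) is a simplification invented for this example and is not the actual format Java::Geci uses. The generator is passed in as a function; here it is a stub, and the assumption is that an AI-backed implementation could be plugged in at the same place.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Sketch of comment-driven regeneration. The "// GENERATE:" marker and the
// bare "//<editor-fold>" lines are hypothetical simplifications.
final class FoldRegenerator {

    // Replaces the content of every fold that directly follows a GENERATE
    // comment with whatever the generator returns for that comment's text.
    static String regenerate(String source, Function<String, String> generator) {
        List<String> out = new ArrayList<>();
        String prompt = null;       // text of the most recent GENERATE comment
        boolean insideFold = false; // true while skipping old generated lines
        for (String line : source.split("\n", -1)) {
            String trimmed = line.trim();
            if (insideFold) {
                if (trimmed.equals("//</editor-fold>")) {
                    insideFold = false;
                    out.add(line);
                }
                continue; // old generated code is dropped
            }
            out.add(line);
            if (trimmed.startsWith("// GENERATE:")) {
                prompt = trimmed.substring("// GENERATE:".length()).trim();
            } else if (trimmed.equals("//<editor-fold>") && prompt != null) {
                out.add(generator.apply(prompt));
                insideFold = true;
                prompt = null;
            }
        }
        return String.join("\n", out);
    }

    public static void main(String[] args) {
        String source = String.join("\n",
                "class Person {",
                "    // GENERATE: getter for the name field",
                "    //<editor-fold>",
                "    // stale generated code that will be replaced",
                "    //</editor-fold>",
                "}");
        // Stub standing in for an AI-backed generator.
        Function<String, String> stub = prompt -> "    // code generated for: " + prompt;
        System.out.println(regenerate(source, stub));
    }
}
```

The point of the design is that only the comment is edited by hand. Running the tool again after changing the comment throws away the old fold content, so the generated code never becomes the source.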
Later, languages will include this possibility from the start. These languages will be some kind of hybrid solution. There will be some code described in human language, probably the business logic, and some technical parts written more like a conventional programming language. It is similar to how we apply DSLs today, with the difference that the DSL will be AI-processed.
As time goes forward, the AI part will grow, and the conventional programming part will shrink to the point where it disappears from the application code. However, it will remain in the frameworks and AI tools, just like today’s machine code and assembly. Nobody codes in assembly anymore. But wait, there are still people who do. Those who write the code generators.
And those who, 200 years from now, will still maintain the IBM mainframe assembly and COBOL programs.
4. Conclusion and Takeaway
I usually write a conclusion and a takeaway at the end of the article. So I do it now. That is all, folks.