Using AI for coding

Overview

In this unit, we discuss using LLM AI tools to help write code.

Goals

Know how and why to use current LLM AI tools to help with coding.
Be familiar with several approaches to LLM AI-assisted coding.

Reading

Introduction

If you write code, there are (at least) two major parts to the effort. First, you have to figure out what exactly you want to accomplish with your code. Second, you need to write a bunch of commands in the programming language of your choice to get what you are hoping to accomplish.

The first part is generally the more intellectually challenging one. AI can help somewhat with this step, but it is still largely driven by the expert modeler/analyst (that would be you). The second part is generally less hard, but it can be very tedious and time-consuming, especially if you are new to coding or if you need to write a lot of code. AI tools are getting very good at assisting with the actual writing of code.

We are already at a point where most coders have AI tools running that help with writing code. Sometimes this covers only small parts of a project, but increasingly AI tools can write complex pieces of software rather well. This can make coding much more efficient and can potentially also lead to better code. However, it is still important that a human understands, directs, and evaluates what the AI produces.

Good prompting

To get good results from the AI, it is important that you be as specific as you can with your prompt.

Try this prompt with one of the LLM AI tools:

Write R code that generates a scatterplot and a violin plot.

Your result might or might not be close to what you had in mind. If you do not provide many details, the AI has to decide for itself what to do. Sometimes its choices match what you wanted, but often they do not.

Now try this prompt:

Write R code that generates a dataset of 100 individuals with ages from 18 to 49, BMI values from 15 to 40 and smoking status as yes or no. Assume that age and BMI are uncorrelated. Assume that smokers have a somewhat lower BMI. Then use the patchwork R package to generate a panel of ggplot2 plots. The first panel should show a violin plot with BMI on the y-axis and smoking status on the x-axis. The second panel should show a scatterplot with age on the y-axis and BMI on the x-axis. Add thorough documentation to your code.

Quite likely, the second prompt will lead to code that is very close to what you had in mind, since you specified a lot of details.
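To give a sense of what the detailed prompt is asking for, here is a minimal sketch of the kind of code it might lead to. This is not actual AI output, and several choices are my own assumptions that the prompt leaves open: the random seed, the share of smokers (30%), the BMI means (25 for smokers, 27 for non-smokers), and the standard deviation (4). It assumes the ggplot2 and patchwork packages are installed.

```r
# Sketch of code the detailed prompt might produce (illustrative only).
library(ggplot2)
library(patchwork)

set.seed(123)  # assumed seed, only so the example is reproducible
n <- 100       # number of individuals, as requested in the prompt

# Age: whole years from 18 to 49, drawn independently of everything else,
# so age and BMI are uncorrelated by construction
age <- sample(18:49, size = n, replace = TRUE)

# Smoking status: yes/no, with an assumed 30% share of smokers
smoker <- sample(c("yes", "no"), size = n, replace = TRUE, prob = c(0.3, 0.7))

# BMI: normal draws with an assumed mean 2 units lower for smokers,
# then clamped to the requested 15 to 40 range
bmi_mean <- ifelse(smoker == "yes", 25, 27)
bmi <- pmin(pmax(rnorm(n, mean = bmi_mean, sd = 4), 15), 40)

dat <- data.frame(
  age = age,
  bmi = bmi,
  smoker = factor(smoker, levels = c("no", "yes"))
)

# Panel 1: violin plot, BMI on the y-axis, smoking status on the x-axis
p1 <- ggplot(dat, aes(x = smoker, y = bmi)) +
  geom_violin() +
  labs(x = "Smoking status", y = "BMI")

# Panel 2: scatterplot, age on the y-axis, BMI on the x-axis
p2 <- ggplot(dat, aes(x = bmi, y = age)) +
  geom_point() +
  labs(x = "BMI", y = "Age (years)")

# patchwork places the two panels side by side
combined <- p1 + p2
print(combined)
```

Comparing a sketch like this with what the AI returns is a quick way to see which details the prompt pinned down and which ones the AI had to decide on its own.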

Good prompts can be quite long, but they can also save you time because you avoid going back and forth between your requests and what the AI produces. Also, detailed prompts force you to think carefully through what you want the code to do and how you want things structured.

You can of course always iterate and ask for changes. But the more you can specify up front, the better.

It can be useful to write down AI prompts so you remember what you asked for. That also helps somewhat with reproducibility. You could for instance stick AI prompts at the top of R/Quarto files, or place them into a separate file.
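One possible way to keep a prompt at the top of an R script is a comment header like the sketch below. The layout and the field names (Tool, Date, Manual changes) are only a suggestion, not a standard, and the angle-bracket placeholders are meant to be filled in by you.

```r
# ------------------------------------------------------------------
# AI prompt used to draft this script (hypothetical header layout)
# Tool:   <name and version of the AI tool you used>
# Date:   <date you ran the prompt>
# Prompt:
#   Write R code that generates a scatterplot and a violin plot.
# Manual changes after generation:
#   <short notes on anything you edited by hand>
# ------------------------------------------------------------------
```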

You may also have noticed that writing the second prompt required knowing more about the programming language. For instance, I had to know that there is a package called ggplot2 and one called patchwork.

The importance of prompts also applies when you are trying to update or fix code. The more detail you give the AI about what went wrong, for instance the exact error message, the more likely it is that you get a useful response.

You will find that the more you know about a topic in broad terms, the more useful those AI tools become. This means that you still have to learn some coding (or whatever the topic is) and understand it well enough at a big-picture level to direct the AI and judge its output. But you don’t necessarily need to be an expert.

Think of yourself as a composer. A composer needs to know enough about the various instruments of an orchestra to write music for each of them, but does not need to be able to play each instrument. Similarly, you need to know enough about coding, or whatever topic you are working on, to compose prompts for the AI and evaluate what it produces, but you don’t necessarily need to be an expert coder.

Iterating

Often, you might not get exactly what you want from the AI with your first prompt. Quite likely, you realize that you weren’t specific enough, or that you really wanted something slightly different but didn’t properly specify it in the prompt. Often, the code might also not quite work. The AI might have just made up a package or function that doesn’t exist, or otherwise introduced mistakes.

While it would be nice to get a great product on the first try, the process is so fast that it doesn’t matter much. Just try again. You can either update your prompt and feed it to the AI again, or tell it what changes you want made.

Ask for explanations and documentation

Whether you use AI to write new code or have it look at existing code, always insist that it provide detailed explanations and documentation. If you do this consistently, you will likely get first versions of new or updated code that are already fairly understandable.

You can of course keep asking for more details. For instance, if there’s a code snippet you found online, or something you wrote a while ago and no longer understand, chances are the AI can help you make sense of it. Just feed it the code and ask for explanations.

Similarly, if you have a bug in your code, you can ask the AI to fix it, but you should also ask it to explain what the problem was and how it fixed it. If it’s a silly problem like using the wrong object for an operation, the explanation might not be too insightful. But if it’s a more complex problem, the explanation can be very helpful. This way you can learn from the process and avoid similar problems in the future.

Switching AI tools

As with all tasks, it can happen that a certain AI tool just gets stuck while writing or updating/fixing code and keeps making the same mistake. Sometimes even good prompting and nudging won’t get it unstuck. In that case, it might be worth trying a different tool. That could be a different model from the same company, or a different AI provider altogether. Different models have different strengths and weaknesses, and switching can sometimes get you unstuck.

Work modular

While AI tools keep getting better at doing big projects in one go, it is still often useful to break tasks down into smaller pieces. Not only can you provide more guidance for each piece, but it is also easier to evaluate smaller pieces of code. For instance, if you want to write code that does something fairly complex, you can first ask the AI to write code for smaller sub-tasks, then combine those into a bigger function. This way, if something goes wrong, you can more easily identify where the problem is. It also makes it easier to ask for explanations and documentation for smaller pieces of code.
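As a rough illustration of this modular approach, the sketch below splits a small analysis into two sub-task functions and one function that combines them. The function names (simulate_data, summarize_column, run_analysis) and the simulated variables are hypothetical and chosen only for this example. It uses base R only.

```r
# Sub-task 1: simulate a small dataset (could come from one AI request)
simulate_data <- function(n = 100) {
  data.frame(
    age = sample(18:49, size = n, replace = TRUE),
    bmi = runif(n, min = 15, max = 40)
  )
}

# Sub-task 2: summarize one numeric vector (could come from a second request)
summarize_column <- function(x) {
  c(mean = mean(x), sd = sd(x), min = min(x), max = max(x))
}

# Combine the pieces into one bigger function
run_analysis <- function(n = 100) {
  dat <- simulate_data(n)
  list(data = dat, bmi_summary = summarize_column(dat$bmi))
}

# Each piece can be checked on its own before running the whole thing
set.seed(42)  # assumed seed, only for reproducibility
result <- run_analysis(n = 50)
print(result$bmi_summary)
```

If the summary looks wrong, you can call summarize_column() on a vector whose answer you already know, which narrows down where the problem is much faster than debugging one long script.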

Manual intervention

The overall goal is to get working code as quickly as possible, not to have things completely AI generated. Therefore, if you reach a stage where you realize it’s faster if you just write or fix the code yourself, go ahead and do so.

AI is perfectly fine with this. You can just take the code you have, make your manual changes, and then feed the updated code back to the AI and ask it to continue from there. Just make sure the AI is aware that things have changed. Sometimes, when you use AI locally to work on a project, it might read files at some stage, and not re-load/re-read them later when you have made manual changes. In that case, you might have to inform the AI about the changes you made and tell it to re-read the files.

Summary

LLM AI tools are very helpful for writing, editing, and explaining code. The more you know what you want, and the more specific you can be (which requires some level of subject matter expertise), the better your results. You rarely get exactly what you want on the first try, but iterating is easy and fast.

This list of tips will help you be as efficient as possible:

Be as detailed and specific as possible.
Iterate, using either AI-only iterations or a mix of manual and AI iterations.
Try different AI engines or settings or prompt phrasing.
Ask the AI to add plenty of comments to the code that explain what each part of the code does.
Break down big tasks into smaller tasks, ask the AI to solve the smaller tasks, then put it together.

Further Resources

The Posit Blog regularly has articles about using AI tools for coding, many of them related to R.

Practice

Write a detailed prompt for a small coding task and compare it with a vague version.
Ask the AI to add comments to the code it generates and check whether they are accurate.
Find some code online that you don’t understand and ask the AI to explain it to you.