
LLMs are good coders, useless writers

My writer friends say Large Language Models (LLMs) like ChatGPT and Bard are overhyped and useless. Software developer friends say they’re a valuable tool, so much so that some pay out-of-pocket for ChatGPT Plus. They’re both correct: the writing LLMs spew is pointless at best and pernicious at worst, yet coding with them has become an exciting part of my job as a data analyst.

Here I share a few concrete examples where they’ve shined for me at work and ruminate on why they’re good at coding but of limited use in writing. Compared to the general public, computer programmers are much more convinced of the potential of so-called Generative AI models. Perhaps these examples will help explain that difference.

Example 1: Finding a typo in my code

I was getting a generic error message from running this command, one whose Google results were not helpful. My prompt to Bard:

Bard told me I had a “significant issue”:

Yep! So trivial, but I wasn’t seeing it. It also suggested a styling change and, conveniently, gave me back the fixed code so that I could copy-paste it instead of correcting my typos. Here the LLM was able to work with my unique situation when StackOverflow and web searches were not helping. I like that the LLM can audit my code.

Example 2: Writing a SQL query

Today I started writing a query to check an assumption about my data. I could see that in translating my thoughts directly to code, I was getting long-winded, already on my third CTE (common table expression). There had to be a simpler way. I described my problem to Bard and it delivered.

My prompt:

Bard replied:

It was accompanied by a good explanation, but I didn’t read it – this was clearly correct. And it was much shorter than my partial solution.

I’ve learned that even in cases like this where I can arrive at a good answer myself, it’s faster to evaluate and modify an LLM’s code than it is to generate it myself. This is not the case with writing prose: prompting Bard to draft an email and editing the result takes more time and yields a worse result than just writing it.

Example 3: Working in an unfamiliar tech stack

This one stunned me. In layman’s terms, I have a program that occasionally and silently stops working. I’d like to be notified of the outage before it causes problems in related systems.

I asked:

Here is the complete response. It’s quite thorough; scroll past if the details bore you.

Watching this response spool out upon my screen felt magical. It took me instantly from scratching my head to 80% of the way. I ended up using most of this – here’s my final version of the monitoring script if you wish to compare. The main differences are that I used a dedicated mail service (per its suggestion) and eliminated the recursive call at the end.

I asked Bard for help setting up the dedicated mail service it suggested. It struck out, but that only cost me a few minutes. Then I found a good answer on StackOverflow. StackOverflow remains of critical importance in this era of LLMs: its biggest advantage is that upvoted answers carry a seal of merit. LLM output is always caveat emptor.

This problem was in the sweet spot for LLM assistance: this is a new area for me and I would have struggled to come up with any piece of this myself. Each part – fetching the container health value, determining the syntax and logic of a bash script, and adding it to cron – would have taken multiple searches and trial-and-error. However, I know enough to tweak and implement the LLM’s solution. One minute spent prompting the LLM saved me several hours, most likely, and produced a simple solution.
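To give a flavor of the logic involved, here is a rough Python sketch of the same check-and-alert shape (my actual script is bash; the container name and the alert step below are placeholders, not my real setup):

```python
import subprocess

CONTAINER = "my-app"  # placeholder; not the real container name

def container_health(name: str) -> str:
    """Ask Docker for the container's health status (e.g. 'healthy')."""
    try:
        result = subprocess.run(
            ["docker", "inspect", "--format", "{{.State.Health.Status}}", name],
            capture_output=True, text=True,
        )
    except FileNotFoundError:
        return "unknown"  # docker isn't installed on this machine
    if result.returncode != 0:
        return "unknown"  # container missing or inspect failed
    return result.stdout.strip()

def should_alert(status: str) -> bool:
    """Anything other than an explicitly healthy container warrants an alert."""
    return status != "healthy"

if __name__ == "__main__":
    status = container_health(CONTAINER)
    if should_alert(status):
        # Placeholder: the real script hands off to a dedicated mail service here.
        print(f"ALERT: container {CONTAINER!r} reported status {status!r}")
```

A cron entry such as `*/5 * * * * python3 /path/to/monitor.py` would then run the check every five minutes.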

Helpful coder, pointless writer

I’ve been pondering why LLMs have such promise for coding and not for writing (I also write fiction). It comes down to the purposes of writing vs. coding and how those align, or don’t, with the nature of an LLM. When it comes to writing, the mismatch is so fundamental that even with great improvements, I see few legitimate uses for LLMs as writers. (Anti-social uses already exist, see below).

Regarding the nature of an LLM, my friend George suggested I think of LLMs as an ultra-powerful autocomplete function trained on the entire internet. They produce results close to the average of their training data, are unoriginal, and can’t assess whether something is true.

Why we code vs. why we write

The goal of programming is to get a computer to do a task. Coding is a means to that end. Ideally I would deploy existing software and avoid writing code entirely. But in practice, something needs customizing or creating. These tasks have some wrinkle(s) that make them unique but are overall quite similar to problems that thousands of others have solved.

This is a perfect fit for LLMs. How many SQL queries like the one in the response above was Bard trained on? Millions? So it can autocomplete that perfectly. When it comes to instructions for a computer, I want unoriginality.

There are no facts to get wrong in coding, and I can validate empirically whether the solution does what it claims. (I build up from small pieces, as they’re easier to verify and the LLM is more likely to get them right.)

Contrast this with writing, whose main purpose is to convey information. Such information falls into two categories.

Some of this information is known only to me: that I think we should cancel this afternoon’s meeting, or a funny remark from my kindergartner that I wish to share. Other information is known to the world, and thus to the LLM. For instance, that the song “Sleigh Ride” was written by composer Leroy Anderson.

The first case is simple: the LLM is no help here. I would have to feed all of the novel information into the prompt and at that point I’ve written the message myself. Better to just do it in my voice from the beginning. (There was no way to make use of an LLM in writing this post).

In the case of known information, the LLM was trained on it, and thus “knows” it – which means it’s probably on the web. To convey that information, I’d simply link to the best of those articles or webpages. I need a search engine, not an LLM. Having the LLM “write” this information simply introduces blabber and potential inaccuracies on top of the same underlying article.

The only use cases at present where an LLM’s “writing” is useful are those where the point is not to convey information. These are bad-faith applications that exploit the reader’s expectations of the interaction. All feel yucky and involve deceit to various extents. Here are a few examples I’ve encountered of people making use of LLM text generation:

  • An acquaintance at the University of Michigan told me their colleague was using ChatGPT to pad a grant application. They had a couple of pages of content, but such concise bullet points were not the right format for submission. They used the LLM to fluff out the document until it was the customary length and structure.
  • Similarly, a high school teacher told me he was asked to write a letter of recommendation on short notice. The student gave him a “brag sheet”; the teacher plugged it into ChatGPT and edited the result. He said this saved him time and produced an adequate result.
  • A friend who owns a small business has someone regularly write posts on his website. The exact content doesn’t matter, he just does it so that his website stays competitive in search engine results. When he saw ChatGPT he said “great! This thing can crank those out for us.”
  • The “SEO Heist” by Jake Ward. He stole a competitor’s website traffic by using an LLM to write his own versions of thousands of their webpages. The case illustrates one of the few ways in which LLM writing is currently “useful,” in the sense of making money, while making the world worse.

The phenomenon of generating meaningless articles and passing them off as authoritative has been underway for a while, as content mills churned out pages of dubious merit to attract pageviews. This fall, I found myself searching for how to get gum off of a leather car seat. I read several articles and didn’t trust that any of it was factual. So I employed the trick of adding “site:reddit.com” to my search, restricting the results to Reddit, where I can read something written by a human.

Finding answers on the web has gotten harder and it’s about to get much worse thanks to LLMs.

So there you have it: LLMs are incredible tools for coding! But their text generation is “useful” only where it violates the societal norm that we communicate in order to inform each other. Interesting times.

Further reading on LLMs

I found these articles insightful. They’re all coding-oriented; if folks have encountered good takes on LLMs and writing, please share.

  • A Coder Considers the Waning Days of the Craft (The New Yorker) – sentimental musings on topics covered in this post. Coding as we know it might change, but “hacking is forever”.
  • The Three Types of AI-Assisted Programmers (StackOverflow Blog) – more sober about LLM limitations than I expected given the platform. The takeaway: “If a firehose of mid-quality code could fit into your process somewhere, and you’re wary of the pitfalls, AI can be a great tool to have in your team’s toolbelt.”
  • ChatGPT x Superset (The Preset Blog) – for technical readers. The part of the article I linked to has a nice exploration of what ChatGPT is good at and not, in the context of Apache Superset and data visualization. Reading this pushed me to use LLMs more for writing (and sometimes interpreting) SQL.

3 replies on “LLMs are good coders, useless writers”

Thanks, Sam! I get asked about this stuff all the time — and your post gives me a helpful framework for thinking/talking about it . . .

How does the following statement:

SELECT t1.vin, t1.snapshot_date, t1.locale
FROM EV_VINs_Presence t1
LEFT JOIN EV_VINs_Presence t2
ON t1.vin = t2.vin
AND t1.snapshot_date = t2.snapshot_date
AND t2.locale = 'State'
WHERE t1.locale = 'Ann Arbor'
AND t2.vin IS NULL;

match your request, especially given the condition that “t2.vin IS NULL”?
In fact, your request specifically states that everything should match except locale. How is this clearly correct to you? Can you share the explanation that was given?

This seems like a blatant error to me, but maybe I’m wrong, would you mind elaborating?

Bard lost my query history, maybe when it rebranded to Gemini – only my most recent query is there. So its explanation is lost. It’s certainly possible that I’m wrong, but here’s my explanation:

This takes the Ann Arbor VINs, in a table called t1, and left-joins the State VINs (t2) by VIN and snapshot_date.

Without the final line, that’s all it does: match on VIN and snapshot_date, filtering each of t1 and t2 accordingly to make them the Ann Arbor and State tables. Where there’s a match, t1.vin and t2.vin will always be identical. If a corresponding State record doesn’t exist, there won’t be a match, and the left join will add nothing, so all of the t2.* fields for that record will be NULL. The last line applies that filter. I believe it would be the same if it said t2.snapshot_date IS NULL or t2.locale IS NULL.
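For the skeptical, the anti-join pattern is easy to verify with a toy table. A minimal sketch using Python’s built-in sqlite3 (table and column names follow the query above; the VINs and dates are invented purely for illustration):

```python
import sqlite3

# In-memory database with the table from the query above; rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EV_VINs_Presence (vin TEXT, snapshot_date TEXT, locale TEXT)"
)
conn.executemany(
    "INSERT INTO EV_VINs_Presence VALUES (?, ?, ?)",
    [
        ("VIN1", "2024-01-01", "Ann Arbor"),
        ("VIN1", "2024-01-01", "State"),      # matching State row -> excluded
        ("VIN2", "2024-01-01", "Ann Arbor"),  # no State row -> returned
    ],
)

# Left-join State rows onto Ann Arbor rows; where no match exists,
# every t2 column is NULL, so "t2.vin IS NULL" keeps only the misses.
rows = conn.execute("""
    SELECT t1.vin, t1.snapshot_date, t1.locale
    FROM EV_VINs_Presence t1
    LEFT JOIN EV_VINs_Presence t2
      ON t1.vin = t2.vin
     AND t1.snapshot_date = t2.snapshot_date
     AND t2.locale = 'State'
    WHERE t1.locale = 'Ann Arbor'
      AND t2.vin IS NULL
""").fetchall()
print(rows)  # only the VIN2 row remains
```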
