chatgpt as lossy compression

2024.04.30
My favorite sci-fi author Ted Chiang wrote ChatGPT is a blurry jpeg of the web:
Imagine what it would look like if ChatGPT were a lossless algorithm. If that were the case, it would always answer questions by providing a verbatim quote from a relevant Web page. We would probably regard the software as only a slight improvement over a conventional search engine, and be less impressed by it. The fact that ChatGPT rephrases material from the Web instead of quoting it word for word makes it seem like a student expressing ideas in her own words, rather than simply regurgitating what she's read; it creates the illusion that ChatGPT understands the material. In human students, rote memorization isn't an indicator of genuine learning, so ChatGPT's inability to produce exact quotes from Web pages is precisely what makes us think that it has learned something. When we're dealing with sequences of words, lossy compression looks smarter than lossless compression.
Admittedly this was written last year but I think he underestimates the usefulness of ChatGPT in applying knowledge to a particular case at hand:
This analogy makes even more sense when we remember that a common technique used by lossy compression algorithms is interpolation--that is, estimating what's missing by looking at what's on either side of the gap. When an image program is displaying a photo and has to reconstruct a pixel that was lost during the compression process, it looks at the nearby pixels and calculates the average. This is what ChatGPT does when it's prompted to describe, say, losing a sock in the dryer using the style of the Declaration of Independence: it is taking two points in "lexical space" and generating the text that would occupy the location between them. ("When in the Course of human events, it becomes necessary for one to separate his garments from their mates, in order to maintain the cleanliness and order thereof. . . .") ChatGPT is so good at this form of interpolation that people find it entertaining: they've discovered a "blur" tool for paragraphs instead of photos, and are having a blast playing with it.
I'm willing to grant that asking ChatGPT to apply its embedded gleaned knowledge to a particular problem is basically that kind of of interpolation, but in practice it is far more useful than making entertaining mashups. In my case, especially for technical tasks - to quote David Winer
ChatGPT is like having a programming partner you can try ideas out on, or ask for alternative approaches, and they're always there, and not too busy to help out. They know everything you don't know and need to know, and rarely hallucinate (you have to check the work, same as with a human btw). It's remarkable how much it is like having an ideal human programming partner. It's the kind of helper I aspire to be.

Jessica Valenti goes hard into the damage Republicans know their actions against a woman's right to choose will be and it's fundamental unspoken premise - "enforcing a worldview that says it's women's job to be pregnant, and to stay pregnant to matter what the cost or consequence."


(I'm not saying there aren't some counterpoints but in it seems like a good rule of thumb)
I really have trouble squaring "Google is poor" (has to conduct layoffs, offshore and outsource jobs, remove staplers) with record revenue and profits, CEO pay so high, stock buybacks and now even dividends. I can't connect the dots -- not charitably.

People are worried about AI becoming a paperclip maximizer. But the thing is, wall street already is a paperclip maximizer.