Just wanted to address that because it's kinda sorta overstating things a bit, which a pedant might take issue with, but it ain't really wrong.
Back in the day, I went through a number of Mazdas. A few 323s ('87 sedan, '87 wagon, '89 sedan), and a 626 ('90 sedan). To me they were all pretty much the same car. Sure, they were different colors, the bodies varied, feature arrangements changed between models and years, but they still had the same essentials like an internal combustion engine, brake pedals in the same location, a radio, etc.
Similar deal with AI models. There are definitely changes made between versions even within a family, so saying "exactly the same" is perhaps not entirely accurate, but it's not entirely wrong. There is variation in the training data and parameters, maybe some tweaks to architecture, possibly training methodology refinements and post-training fine-tuning, but there's no revolutionary leap going on here, just progressive evolution, mostly at the margins.
So Claude Sonnet 4.6 is, from where I sit, the same as Claude Sonnet 4.5. From an output perspective, I haven't seen any dramatic departures in quality or performance, so it's like I just traded in last year's model for the latest. At any rate, nothing that suggests the update made the damned things intelligent or self-aware.
I have trained, deployed, and evaluated my own custom models. One of the fascinating things is just how much variation there can be even when you do this multiple times under the same conditions. Shouldn't be surprising, really, given how non-deterministic these things are, but still, the experience was quite illuminating.
For instance, I was training models to have more domain-specific knowledge so I could see how I might reduce the need to retrieve additional context from a knowledge base, which could reduce latency and token costs while ideally improving output quality. Along the way I discovered that two models trained on the exact same dataset with the exact same parameters did not behave in the exact same way after training. In fact, even the same exact model deployed in different compute environments did not behave in the exact same way.
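One commonly cited contributor to this kind of drift is that floating-point addition is not associative, so the same numbers added in a different order can give a slightly different result. The sketch below is only an illustration of that general effect, not a reconstruction of the training runs described above; the function names and the three sample values are made up for the example, and the idea that summation order explains those particular runs is an assumption.

```python
# Illustration only: the same values summed in two different orders.
# In real training, a parallel reduction on different hardware may
# effectively reorder additions like this (assumption, not measured here).

def sum_left_to_right(values):
    """Accumulate from the first element to the last."""
    total = 0.0
    for v in values:
        total += v
    return total


def sum_right_to_left(values):
    """Accumulate from the last element to the first."""
    total = 0.0
    for v in reversed(values):
        total += v
    return total


values = [0.1, 0.2, 0.3]  # hypothetical stand-ins for gradient terms
a = sum_left_to_right(values)
b = sum_right_to_left(values)

print(a)       # 0.6000000000000001
print(b)       # 0.6
print(a == b)  # False
```

A difference in the last digit looks like nothing, but repeated over many millions of updates it is one plausible way that otherwise identical runs can end up with slightly different weights.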
Given that vendor models are inherently black boxes, it makes sense to differentiate between them based on behavior, rather than their architecture or how they were trained. And all of them still exhibit the same characteristics, such as hallucinations and sycophancy, with the same type of interactions (text in, text out). No paradigm shifts, just maybe more recent data encoded in their neural net layers.
So for all intents and purposes, today's Claude is exactly the same as it was a few years back. While doing the same thing over and over can lead to different probabilistic results, it's still your father's Oldsmobile. And that motherfucker wasn't conscious and didn't have selfhood, neither.
Selah.
Update: I was suddenly reminded of an old Eddie Murphy routine that's surprisingly germane 44 years later.
