Tuesday, November 25, 2025

Hello, World!

My first computer was an Apple ][+ (Bell & Howell edition), which my parents bought for me from The Computer Store (not kidding) in Toledo, Ohio, on February 14, 1980.  Quoth Mom: "Happy Valentine's Day, Happy Easter, Happy Birthday, and Merry Christmas!"  Rightly so, as the price tag was about $3k (almost $12k today).  Upgrades and games came out of my own pocket from that day forward (gifts notwithstanding).

I used that primitive thing - barely a step up from stone knives and bearskins - through the end of my sophomore year of college (May 1989), so it had a good, long working life.  I held onto it until 2017, when I was forced to send it to a lovely farm in upstate New York (along with many other beloved possessions, but let's not get into that now).

My first programs were written in Applesoft BASIC, although I also dabbled with assembly language.  All the homework for my AP Pascal class in high school was written on that venerable machine.  At Colby, I expanded my programming universe with C on the campus VAX (running Ultrix), and after I graduated, I somehow convinced the administration to pay me more than 6 bucks an hour (50% higher than minimum wage!) to develop an information kiosk application in HyperCard.  When the Web came on the scene, I even picked up a bit of JavaScript (which led me to consider a job offer from an old college chum at Time Warner in NYC).

I was a shitty programmer.  I chose frivolous names for variables and functions, never commented my code, and was always surprised when I couldn't figure out what the fuck I'd done while debugging.  Today, I'd be hard-pressed to write a "Hello, World!" program, so I leave such worldly matters to less flaky developers.

However, since idle hands are the Devil's workshop, and I have no teaching engagements until next Thursday, I decided to muck around with coding to create a crappy chatbot as a simple proof of concept.  My intent is to build a demo/lab for non-technical learners who are new to Generative AI (which might be a manifestation of the Evil One himself), as well as to dust off the synapses by doing a bit of quasi-professional development.

With all the talk of vibe coding, and my current state of ignorance and apathy, I figured it would be appropriate to use the fancy tools available to help me bootstrap, so I turned to my company's own Nova model (you can, too!).  It was a fascinatingly frustrating endeavor, but I finally got things working after several hours, fitfully spread over the last couple of days.

It took a few iterations of prompt engineering to get close enough to start tweaking the output code.  And even then, the model was sometimes rather useless when I encountered errors.  I knocked down most of the problems, but remained stuck on one issue: generating responses and passing them to the browser.

I eventually broke down and found sample code in actual documentation, just like in the days of yore.  While I was able to execute that manually on my EC2 instance, I was still having trouble integrating it with my little web page, so I ultimately had to do a side-by-side comparison of what I had gotten from the AI and the human-generated script.  But these old eyeballs and my misfiring brain spotted the cause pretty quickly, and I ended up getting things running soon thereafter so I could converse with Claude Sonnet 4.5 (another popular LLM supported in Amazon Bedrock) about the experience¹.
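
For the morbidly curious, the shape of what finally worked is roughly this - a minimal sketch, not my actual (ugly) code.  The route, region, and model ID are stand-ins, and I'm glossing over error handling:

    import json
    import boto3
    from flask import Flask, request, jsonify

    app = Flask(__name__)
    bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")  # region is a stand-in

    MODEL_ID = "anthropic.claude-sonnet-4-5-20250929-v1:0"  # placeholder; check the Bedrock console

    @app.route("/chat", methods=["POST"])
    def chat():
        # Relay the browser's message to Claude via Bedrock's invoke_model API.
        body = json.dumps({
            "anthropic_version": "bedrock-2023-05-31",
            "max_tokens": 1024,
            "messages": [{"role": "user", "content": request.json["message"]}],
        })
        raw = bedrock.invoke_model(modelId=MODEL_ID, body=body)
        result = json.loads(raw["body"].read())
        # The line that tripped me up: Anthropic models return a list of
        # content blocks, not a flat "outputText" field (more on that below).
        reply = "".join(block["text"] for block in result["content"])
        return jsonify({"reply": reply})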

Here's an excerpt of our exchange, starting with the bot's response to my asking why Nova missed the mark (slightly edited to remove extraneous details):

  • Claude 
I appreciate you [sic] sharing that frustration! It's a great question that touches on some fundamental limitations we all share. Here's why LLMs (myself included) get things wrong: 

1. We're pattern matchers, not understanding machines - We're trained on vast amounts of text and learn statistical patterns, but we don't truly "understand" code execution, API behavior, or real-world constraints. We might generate syntactically correct code that looks right based on patterns we've seen, but miss subtle implementation details. 

2. Training data has a cutoff - Our knowledge freezes at a certain point. API specifications change, libraries update, and new best practices emerge. Code that worked during training might be outdated now. 

3. We hallucinate with confidence - This is perhaps our most dangerous trait - we can generate plausible-sounding but completely incorrect information without any internal "uncertainty signal." We don't actually test the code we write. 

4. Context about YOUR specific setup is limited - When Nova Pro gave you code, it likely didn't know all the specifics about authentication methods, response formats, or error handling in your particular environment. 

5. We optimize for coherence, not correctness - We're essentially very sophisticated autocomplete systems. We generate what seems like it should come next, not necessarily what actually works.

The irony isn't lost on me that I'm an LLM explaining LLM failures! What specific issue did the code have? I'm curious what the gap was between the suggested approach and what actually worked.
  • NTodd
Nova told me this would return your response to the web page: result.get('outputText', 'No output received')

While a code example that I found used this: "".join([output["text"] for output in result["content"]])

The former kept returning "No output received" while the latter finally let me see your responses.  Do you think the original proposal would work with some tweaks, or was that a hallucination?
  • Claude 
The original proposal was likely a hallucination or outdated information. Here's why... 

The Key Difference: outputText  - This key doesn't exist in the standard AWS Bedrock response structure for newer models (like Claude). 

It might have been:
- Confused with a different AWS service
- Based on outdated API documentation
- A generic placeholder that doesn't match actual response format…
Could It Be Fixed? Not really with just tweaks - the structure is fundamentally different… 
The working code you found correctly navigates the actual API response structure. Nova's suggestion seems to have been based on incorrect assumptions about the response format rather than actual Bedrock API documentation.
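
Since that one line is the whole punchline, here's the difference in miniature - a runnable toy built on the abridged response shape that Anthropic models return through invoke_model (the "Hello from Claude!" text is obviously made up):

    import json

    # Abridged shape of what an Anthropic model actually returns via
    # invoke_model on Bedrock: a "content" list of typed blocks.
    result = json.loads("""
    {
      "content": [
        {"type": "text", "text": "Hello from Claude!"}
      ]
    }
    """)

    # Nova's version: falls through to the default every time, because
    # there is no top-level "outputText" key in this structure.
    print(result.get("outputText", "No output received"))

    # The documented version: join the text of each content block.
    print("".join(block["text"] for block in result["content"]))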

Well, there ya have it.  Anyway, it was a useful exercise, and now I've got all the ingredients I need for one of my patented, half-baked in-class romps through How Shit Works.  I am invincible!

In conclusion: "Learn to code," they said.


¹ - I shan't link to it, because it is really ugly, and I want to keep working on enhancements.

PS - Quick update on the footnote.  I've now made a couple of different purpose-built versions: one that can quickly provide the address ranges needed to create security rules for certain services; another that has access to the writings of St Thomas Aquinas in Latin and English, so it can expound on things like charitas, or what the Angelic Doctor might say about AI.  Still ugly.
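
If you want to play along at home, one way to pull such address ranges is AWS's published ip-ranges.json.  A minimal sketch of that lookup (just the underlying idea, not the bot itself, which dresses this up in chat):

    import json
    import urllib.request

    IP_RANGES_URL = "https://ip-ranges.amazonaws.com/ip-ranges.json"

    def prefixes_for(service: str, region: str) -> list[str]:
        """Return the public CIDR blocks AWS publishes for a service/region."""
        with urllib.request.urlopen(IP_RANGES_URL) as resp:
            data = json.load(resp)
        return sorted(
            p["ip_prefix"]
            for p in data["prefixes"]
            if p["service"] == service and p["region"] == region
        )

    # E.g., ranges you might allow-list in a security rule:
    print("\n".join(prefixes_for("EC2", "us-east-1")))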
