Prompt-Caching | mikePietsch.com

30.8 Constitutional AI and Claude's Safety Philosophy

Right, let’s talk about the elephant in the server room: safety. You’re probably thinking, “Great, another lecture about how my AI might go rogue.” But stick with me. This isn’t about shackling creativity; it’s about building a system that’s robust, reliable, and doesn’t accidentally suggest you add bleach to your pasta sauce for extra flavor (yes, that’s a real thing people have gotten from less careful models). Anthropic’s approach, Constitutional AI, is genuinely clever. Instead of just trying to patch bad behavior after the fact with a mountain of filters (a losing battle), they baked the principles directly into Claude’s training. Think of it less like a stern parent and more like a personal constitution—a core set of rules and principles the model uses to govern its own responses. It’s a system of self-critique and revision.

30.7 The Files API and Batch Processing

Right, let’s talk about moving beyond one-off chat completions. The Files API and batch processing are where you stop asking Claude to solve a single riddle and start putting it on a factory line to solve a thousand. It’s the difference between a scalpel and a conveyor belt of scalpel-wielding robots. The core idea is brutally simple: you upload your data, you tell Claude what to do with each piece of it, and you get back a neatly packaged JSONL file with all the results. No more babysitting individual API calls.

30.6 Vision: Analyzing Images and Documents

Right, let’s talk about getting Claude to open its eyes. The Vision API isn’t just about slapping an image into a prompt and hoping for the best. It’s about giving Claude a new pair of glasses and teaching it how to read the fine print. The magic here is that you can toss almost any common image or document format (JPEG, PNG, PDF, DOCX, you name it) at the model and it will not only see the pixels but understand the content. This is where we move from a fancy chatbot to a genuine analysis engine.

30.5 Prompt Caching: Reducing Latency and Cost for Long Contexts

Right, let’s talk about one of the most powerful features you probably didn’t know you needed: prompt caching. Think of it like this. You’re sending a massive 100k context prompt to Claude, maybe a huge document for analysis followed by a set of instructions. You do this repeatedly in a loop, or across different user sessions. Every single time, you’re paying to process that entire document and you’re waiting for every single token to be processed. It’s like photocopying a book every time you want to ask a question about a specific page. It’s absurdly wasteful. Prompt caching is the three-hole punch and binder that fixes this.

30.4 Extended Thinking: Eliciting Deep Reasoning

Alright, let’s get into the real magic: making Claude think. Not just the surface-level, pattern-matching stuff, but the deep, chain-of-thought reasoning that makes this model feel so different. This isn’t about tricking the API; it’s about structuring your conversation to properly utilize the architecture it was built on. The Power of the Scratchpad The single most effective technique for eliciting deep reasoning is to explicitly give Claude a workspace. Think of it like this: you’re asking a brilliant colleague a hard question. They don’t just stare into the middle distance and then blurt out a perfect answer. They grab a whiteboard, jot down facts, reason step by step, cross things out, and then present their conclusion.

30.3 Tool Use: Defining Tools and Handling Tool Results

Right, let’s get our hands dirty with Claude’s Tool Use. This isn’t just about making an API call; it’s about teaching Claude to be your highly capable, slightly pedantic, software intern. The core idea is simple: you define functions (tools) that Claude can call, and it returns the results to you. But the devil, as always, is in the details. Defining Your Tools: The Schema is the Contract First, you need to tell Claude what tools are available. You do this by passing a list of tool definitions in the tools parameter. Each tool is a JSON schema that describes a function. This schema is your contract with Claude. Be excruciatingly specific here. Vague contracts lead to confused AIs.

30.2 Messages API: System Prompts, Human and Assistant Turns

Alright, let’s talk about the Messages API. This is the core of how you actually talk to Claude. Forget the old, simplistic “single prompt” model. This is a conversation, and getting that right is 90% of the battle. The API is built around a messages array, where each object represents a turn in the conversation. It feels natural because it is natural—it’s how we communicate. The Three Roles: System, Human, Assistant Every message in the array must have a role property. This isn’t just bureaucratic labeling; it tells Claude who is speaking and, more importantly, how to interpret that text. There are three roles, and they are not created equal.

30.1 Claude Model Family: Haiku, Sonnet, and Opus

Alright, let’s talk about the main event: the Claude model family. You’ve got three options—Haiku, Sonnet, and Opus—and your job is to know which one to pick for the task at hand. This isn’t just about picking the “smartest” one; it’s about picking the right tool. Using Opus to summarize a tweet is like using a particle accelerator to crack a walnut. Impressive? Sure. A grotesque waste of resources? Absolutely.