Sandro Turriate

Coder, cook, explorer

I analyzed 1,000 of my prompts. Here's what I learned about talking to AI.

Jun 3, 2026

I built a coding assistant called yap. Every message I send gets stored in a local SQLite database — a debugging feature that turned into a mirror. I recently read through all 1,020 of my substantive prompts (after stripping out the 56 yes/go/commit it rubber stamps). I learned a few things.

The dataset

1,020 prompts across two months. Daily averages ranged from 10 to 4,229 characters. One day: 138 messages at 118 characters each — rapid-fire debugging. Another day: 7 messages at 3,313 characters each — careful feature specs. Same person, same tool, very different modes.

The approval loop

My most common message type is under 20 characters: yes, go, do it, B, sure, do it.

These fall into two buckets:

Decisions — the model presents options A, B, C. I pick one. That's collaboration, not waste. Can't frontload a decision you haven't made yet.

Ceremony — the model asks "should I apply this fix?" and the answer was always going to be yes. I found 56 of these. Each one is a full round-trip where the model stops, waits, then re-reads the entire conversation from scratch to continue. All that, for zero new information.

The antidote is exit criteria. Instead of "fix the bug" → plan → "yes" → fix → "commit it," just say: "Fix the bug. Commit when tests pass." Three turns become one.

How I waste words

Burying the lede. Here's an actual message I sent:

I just built the new binary and I'm seeing the same leftover artifacts of session text when restoring a session. Do you need to ask me questions about the bug, because you wrote code, but you didn't fix it. do you need another screenshot?

I'm thinking out loud here, the way you would with a coworker. Totally natural! But the model doesn't need the preamble — it needs the signal:

Built at HEAD. Session restore still shows ghost lines at rows 5-8. Previous fix didn't work. Screenshot attached.

Half the words, zero ambiguity. Being direct isn't being rude — it's being kind to a system that will earnestly try to make sense of every word you give it.

Fake questions. When you've already decided, a question mark invites a lecture instead of a fix. Compare:

Do you think there should be 7 hunks? I see 2 distinct hunks in the new if/else block, but that feels like 1 hunk to me, not 2.

I already knew what I wanted. The direct version:

Adjacent add/delete lines should be a single hunk, not two. Merge them.

Save questions for when you genuinely want the model's opinion.

Scope creep mid-message. Two bugs, one message, and the second becomes a footnote:

The status line gets messed up on resize [long reproduction] ALSO, if I type a multiline input, when I press the arrow key, it should not change my text to the last text in history…

The second bug gets a shallow fix or gets dropped entirely when the context window fills up. One bug, one message — unless you fan them out (keep reading).

The prompt optimizer

yap has a built-in optimizer that rewrites my input before it hits the model:

What I wroteWhat it sent
"the cards view has quick jump dots on the right side, when I hover over them if the text is long the tooltips fills the full screen and even overflows to the left, making it invisible. it should have some width, and probably truncate after 6-8 lines.""Fix tooltips on quick jump dots: add max-width constraint, truncate after 6-8 lines."

I built it as a safety net to catch my bad habits. The goal is to beat it — to internalize the rewrite before hitting enter.

What to actually do about it

Lead with the verb. "Fix X" not "I'm seeing X and I think maybe Y could be..." State expected behavior next: "Cursor should be at column 0 after resize. Currently at column 40."

Batch intent, not approvals. "Fix the bug. If tests pass, commit." No need to make the AI stop and ask for permission it doesn't need.

One bug, one message. Or better yet — fan them out in parallel.

Parallelize with subagents. This is the big one. When you list three bugs in one message, the model tends to lose focus by bug #3. The context fills with diffs from #1, the fix for #2 bleeds into unrelated files, and #3 gets the tired version of the model's attention.

Most multi-agent coding tools support some form of worker spawning now — Claude Code has background agents, Cursor has parallel edits, and Pi launches more Pi processes. In yap, the main chat can spawn workers but I also built /each specifically for quick parallel bugfixes:

/each
- fix the flicker on resize
- arrow keys navigate wrong in multiline input
- email addresses render as footnotes in tables
a yap slash command for parallel agents

yap spawns a worker per bullet, each in its own git worktree. They run in parallel, can't step on each other, and auto-merge when done. The nice thing: each worker gets a clean context. Worker #3 isn't wading through 50 tool calls from #1 and #2. Full attention, one problem.

More round-trips, but targeted ones. This sounds contradictory — didn't I just say to cut unnecessary round-trips? But the goal isn't fewer messages. It's fewer wasted messages. Instead of one long conversation that slowly drifts off course, break work into focused exchanges: one bug, one task, clear success criteria. The model doesn't lose the plot because there's no subplot.

And here's why more conversations isn't more expensive: every time you send a message, you're actually sending the entire conversation — system prompt, all previous messages, and your new one. The model is stateless; it re-reads everything from scratch each turn.

Prompt caching changes the math. Think of it like a book with a bookmark. Without caching, the model re-reads from page one every turn. With caching, the provider says "pages 1–200 haven't changed" and picks up from the bookmark. Cached tokens cost ~90% less on Anthropic, with similar discounts on OpenAI.

So when you start a fresh conversation for bug #2, the system prompt and tool definitions — identical across sessions — hit cache right away. You only pay full price for the new bug report itself. Short, focused conversations play right into how caching works.

What surprised me

Going in, I was worried my prompts had gotten lazier. Early on I wrote multi-paragraph specs with exact file paths and line numbers. These days I sometimes fire off a single sentence and trust the model to figure it out. Had I crossed the line from "concise" into "vague"?

Turns out, no. Those early multi-paragraph specs often buried the request in background context. The later terse messages tend to be direct commands with clear scope. Shorter isn't lazier — it's more practiced.

What hasn't fully changed is my tendency toward indirectness — questions when I mean commands, narration when I should lead with verbs. I'm a naturally indirect communicator, and that's probably not going away entirely. But I am getting better. Months of writing to a model that rewards precision has started to retrain my instincts — I catch myself mid-question now and rewrite it as a statement before hitting enter. Slowly but surely, I'm putting the prompt optimizer out of a job.

Here's the funny part: my most effective prompts are the frustrated ones. When the AI breaks something, I stop editorializing and get precise. Exact symptoms, reproduction steps, expected behavior. Frustration strips away my tendency toward indirectness and forces me to communicate with precision.

The goal is to write with that same clarity all the time — just without needing the frustration to get there.