There is one prompt in the history of this project that I think about more than any other. It was not the first prompt. It was not the cleverest. It was the one that set the direction for everything that followed, and it was, in retrospect, significantly wrong.
The prompt
Early in the project, I needed to describe the weather system. FairwayPlan's itineraries are weather-aware: the solver uses weather data to schedule rounds, generate clothing recommendations, and decide whether a day should be a rest day. The weather system needs to work across three time horizons, from live forecasts to five-year averages to seasonal fallbacks, and the data flows into multiple downstream systems.
My prompt described it as a single service with a caching layer. I sketched out the API surface, the data format, the cache TTLs. It was clear and specific, which is usually a good thing. The agent built exactly what I asked for. A single weather endpoint that tried to do all three tiers in one function, with conditional logic branching on how far in the future the requested date was.
It worked. Technically. And then it became the source of nearly every weather-related bug for the next two weeks.
The function was too long. The caching logic for live forecasts (3-hour TTL) conflicted with the caching logic for archive data (24-hour TTL) because they shared a key structure that did not account for the tier. The fallback path, when neither the forecast API nor the archive API responded, was bolted on as an afterthought because my prompt had not mentioned it. The clothing recommendation logic sat at the bottom of the same function, reading from variables that were set differently depending on which branch had executed above.
Every fix created a new edge case. Every edge case required understanding the entire function to resolve. The code was a faithful execution of a brief that had not thought hard enough about separation of concerns.
The downstream cost of a vague brief
The interesting thing about agents is that they do what you say. This sounds obvious. It is also the primary way they cause problems. A human developer, given the same brief, might have said "this should probably be three separate functions with a common data contract." They would have pushed back on the structure, asked clarifying questions, introduced their own opinions about code organisation.
Early in a project, the agent does not do this. It has no context to push back from. It does not know your codebase yet. It does not know your patterns. It builds what you describe, and if what you describe is a monolith function with three responsibilities, that is what you get.
I rewrote the weather system a week later. Three separate lookup functions, a clean fallback chain, dedicated cache keys per tier, and the clothing logic pulled out entirely. The rewrite took a single session. The original took a single session too. The difference was a week of debugging in between that would not have happened if the initial brief had been better.
The lesson is one I keep relearning: the most expensive prompt is the one at the start. It is also the one you spend the least time on, because you are eager to begin and the agent is ready to build and the combination of those two things creates a momentum that feels productive but is actually just fast.
A thirty-minute brief that properly separates concerns would have saved days of rework.
When the agent pushes back
Something changes as the project matures. The CLAUDE.md file grows. The agent has seen your patterns, read your architecture notes, absorbed your conventions. And at a certain point, it starts having opinions.
The first time Claude pushed back on an approach I had described, I was surprised. I had outlined a plan to add a caching layer to the course pricing lookup. The agent responded with something along the lines of: this would work, but the pricing data changes seasonally, and the cache invalidation logic would need to account for the date-dependent pricing tiers in course_pricing. Had I considered whether the complexity of the cache was worth it for data that is queried infrequently?
It was right. The cache would have been more complex than the query it was trying to avoid. I dropped the idea.
This happens regularly now. I describe an approach. The agent flags a concern. Not always correctly. Sometimes its pushback is based on a misunderstanding of the system, or it is being overly cautious about something that does not matter. But often enough, it catches something real. A constraint I forgot. An interaction with an existing system that my plan did not account for. A simpler way to achieve the same result.
The skill of knowing when to listen
The tricky part is that the agent pushes back with the same confident, measured tone whether it is right or wrong. There is no signal in the delivery. You cannot tell from the way it phrases its concern whether the concern is deep or superficial.
Over time, I have developed a rough filter. If the agent's pushback references specific files, specific data structures, specific interactions between components, I take it seriously. It has read the code. It knows something concrete. If the pushback is general, "this could introduce complexity" or "consider whether this scales," I weigh it less. That is the kind of caution that sounds wise but does not contain information.
The other signal is whether the agent offers an alternative. A pushback that says "this might not work" is less useful than one that says "this might not work, and here is a different approach that avoids the problem." The first is a flag. The second is a thought-through alternative that I can evaluate on its own merits.
I override the agent regularly. Sometimes I am right and the override saves time. Sometimes I am wrong and I end up circling back to what the agent suggested in the first place. Both outcomes are fine. The point is not to always listen. The point is to treat the pushback as information, run it through your own understanding of the system, and make a deliberate decision rather than a reflexive one.
The brief is the bottleneck
If I could go back and change one thing about how I started this project, it would be the time allocation. I spent minutes on the initial prompts and hours on the code that followed. That ratio should have been reversed. A thirty-minute brief that properly separates concerns, specifies data contracts, and anticipates the integration points would have saved days of rework.
And later, when the agent has enough context to push back, the right move is to pause and actually consider what it is saying. The agent has read every file in the project. It holds the entire codebase in its context window. When it flags a concern, it might be seeing a connection you forgot. Or it might be generating generic caution. The skill is telling the difference, and you only develop that skill by paying attention to the pushbacks and tracking which ones turn out to matter.
The weather system works well now. Three tiers, clean boundaries, reliable fallbacks. It got there eventually. The scenic route just cost more than it needed to.
When the agent flags a concern, it might be seeing a connection you forgot. The skill is telling the difference between genuine insight and generic caution.