The Skills That Don't Come from Coding
Top survival skills for the Software Factory
The Skill Gap
I'm probably stating the obvious here: teams without training see lower productivity gains from AI tools than trained ones. The gains can even be negative if the first three weeks of the project are spent running 10 agents ralph-looping inline specifications in parallel and creating a mess of conflicting plans and code.
But training for what, exactly? Let's take a look at this new skill set case-by-case.
Code writing
Yes, the art. The thing we used to be proud of. We knew the syntax by heart, all the APIs and patterns, all the nice principles and practices like DRY, SOLID and so forth, and sometimes even lived by them (at least when reviewing PRs). What's ahead?
- Traditional
- Express intent through implementation. You think it, you type it, you debug it.
- AI-Governed
- Define intent through specifications. The AI types it. You still debug it.
- Training
- Progressive refinement exercises
Specification writing
Most developers have always hated this part, both the writing and the reading. Agile gave us a brief (?) respite, you know, working software over comprehensive documentation and so forth. That's not going to cut it anymore, although who actually writes the specifications is another question.
So, the move from user stories to detailed specs goes like this:
- Traditional
- Brief user stories. "As a user, I want to filter the list." Ship it.
- AI-Governed
- Precise specs with edge cases, performance bounds, and enough context for an agent that has the memory of a goldfish.
- Training
- Spec reviews, spec-to-implementation prediction, examples, trials
An important addition for the specifications themselves is a 'task template' for the AI to process them. Whether it's baked into a 'User Story 2.0 for AI' or kept as an AI documentation file is not important, but having a good list of things that need to be refined based on the specification, like cross-cutting concerns and links to supplemental material such as images and wireframes, helps a lot. The good news is that you can use AI to help you with this.
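As an illustration, such a task template could be a short markdown checklist baked into the agent's instructions. The headings below are just one possible shape, not a standard:

```markdown
## Task: <title>

- **Spec link:** <user story / PRD section>
- **Supplemental material:** <wireframes, images, sample data>
- **Cross-cutting concerns:** which of auth, logging, i18n, error handling apply?
- **Out of scope / DON'Ts:** <explicit non-goals>
- **Definition of done:** tests pass, lint clean, docs updated
- **Open questions:** <anything the agent must ask before coding>
```

The point is not the exact fields but that every task forces the same refinement questions before any code is written.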
Reviewing AI Output
Reviewing whatever the AI throws at you from the depths of the software factory is hard simply because of the insane volume it generates.
It's much easier if you manage to constrain and structure the outputs, such as code, tests and planning documents, coherently with project-wide practices like naming and associations. You'll still need to babysit your agents a lot to make sure they actually follow these practices.
- Traditional
- Infer the author's intent from their implementation choices. Ask them in Slack if confused.
- AI-Governed
- Verify against spec. Detect AI pattern deviations. You can't ask the AI why (or be prepared it might 'lie' to you).
- Training
- Pattern deviation detection, planted error exercises, how to automate review checklists
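The last item, automating review checklists, can start very small. Here's a minimal sketch of such a check; the PascalCase naming rule and the file-size budget are hypothetical project conventions, not anything prescribed by the book:

```python
import re
from pathlib import Path

# Hypothetical project rules: React component files are PascalCase .tsx,
# and no single file should exceed a size that makes human review impractical.
COMPONENT_PATTERN = re.compile(r"^[A-Z][A-Za-z0-9]*\.tsx$")
MAX_LINES = 400

def check_changed_files(paths):
    """Return (path, problem) tuples for files that break the checklist rules."""
    problems = []
    for p in map(Path, paths):
        if p.suffix == ".tsx" and not COMPONENT_PATTERN.match(p.name):
            problems.append((str(p), "component file name is not PascalCase"))
        if p.exists() and sum(1 for _ in p.open()) > MAX_LINES:
            problems.append((str(p), f"file exceeds {MAX_LINES} lines"))
    return problems
```

Run against the changed files of a PR, this kind of script catches the boring deviations mechanically, leaving the human reviewer to focus on whether the change matches the spec.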
Architecture Judgment
Finally, the architecture: how your software is constructed from layers (onions), modules and so forth, and how it will be compiled, deployed and hosted. In the agile era much of this was emergent, i.e. we started small, expanded and refactored. That can still be done, but you need to specify things in much greater detail, for example how the code should be laid out. And again, erect lots of checks and guardrails to actually enforce these practices.
- Traditional
- Emerges through upfront design and iterative refinement over sprints.
- AI-Governed
- Needs to be defined clearly in text and constantly maintained there.
- Training
- Architecture review katas with real plan artifacts; generate something and see if it follows your general architecture.
The dominant training model today is tool training: how to use Cursor, how to prompt Claude Code, how to configure Copilot. The trend is now shifting towards tuning those tools and making use of agents. Tool expertise matters, but it's the easy part.
The hard part, i.e. taking the game to the 'next level', is teaching people to work differently and constantly improve. For many it might require developing a different relationship to the work itself.
Pick the right battles and Antti's Golden Rule of AI
Before we dive into the specifics of the skills listed above, there's a foundational instinct you need to develop: knowing what kind of problem to throw at an AI in the first place. LLMs are remarkably good at generating structured output: code, tools, scripts, transformations. They are remarkably bad, or at least rather unreliable, at being the tool themselves.
Examples of these DOs and DON'Ts below:
- Don't
- Load a 10,000-row CSV into an LLM and ask "what's the average salary?". It's just an expensive, slow and unreliable calculator.
- Do
- Paste a few sample lines and ask the AI to generate a script that computes the answer. Now you have a tool you can run, verify, and reuse.
- Don't
- Ask AI to shift a heading 50 pixels to the left. Open your browser dev tools and do it in three seconds.
- Do
- Ask AI when you need to restructure your entire CSS grid system and want to understand the implications.
- Don't
- Ask AI to keep track of what you've done across sessions. It will forget, hallucinate, or lose context.
- Do
- Ask AI to build you a CLI tool that tracks what you've done. Now you have something deterministic that actually works.
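To make the CSV example above concrete, here's roughly the kind of throwaway script an agent might produce from a few sample lines. The column name `salary` is an assumption taken from the example, not a real dataset:

```python
import csv
import statistics

def average_salary(path, column="salary"):
    """Compute the average of a numeric column in a CSV file."""
    with open(path, newline="") as f:
        values = [float(row[column]) for row in csv.DictReader(f) if row[column]]
    return statistics.mean(values)
```

Unlike an LLM eyeballing 10,000 rows, this runs in milliseconds, gives the same answer every time, and can be re-run when the data changes. That's the whole point of asking for the tool instead of the answer.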
Luckily, many AI tools already resort to writing Python scripts (for everything) when you feed them too much data or ask them to perform things like mathematical calculations. Sometimes, but not always. Roughly speaking, remember that LLMs are not data processors, though they will happily pretend to be. By nature, and because of their internal system prompts, they lack the human laziness of refusing to do something too laborious or difficult. This tendency towards over-confidence and never admitting error is often missed by humans, who mistake the answer they got for an actually correct one.
All in all, this instinctive method of asking AI to create the tools rather than solving the problem directly is perhaps the single most important habit to develop early for any usage of AI beyond simplistic text processing.
So Antti's Golden Rule of Using AI shall be:
Use AI to build tools, not to be the tool. Anything you can do with CLI should still be done with CLI.
Thanks to one of my brilliant colleagues for this CLI metaphor. I'm gonna use it often.
Next let's head back to our Skill-writing class.
Writing Specifications for AI
The principle from Chapter 2 applies here with a twist: the quality of your output is still bounded by the quality of your input, but now the consumer of that input is a machine that doesn't really think, is mostly unaware of its own limitations, and pretends a lot. Chapter 13 explores the economics of specification quality in detail, but in short, the skill of writing good specifications is the skill of knowing what the AI needs to know, and how to say it clearly enough that there's no room for creative interpretation.
A short 101 could be like this:
- Don't
- Create a todo app
- Do
- Plan a todo app
or
- Don't
- Add a filter to my list
- Do
- Add a from-to date filter to my TasksList.tsx. Use the similar date filter as in the other grids. Limit the maximum span to one week. Use a calendar picker with manual text input. If the filter returns zero results, show a message "No tasks found for the selected date range".
To complicate things further, your prompt (or a PRD markdown file you pass as context) is not the only source of information. In the example above, for instance, the standard behaviour of date filters should live in your ui-patterns.md or similar, which should be included automatically in your agentic UI developer's context.
Unfortunately, specification writing is not a skill most developers have been trained in. It's a lost art for many. Agile's emphasis on user stories and acceptance criteria produced a generation of developers comfortable with brief, conversational descriptions of intent. You know, "As a user, I want to filter the list so I can find relevant items" is an adequate user story. You didn't need to say which list, and the developer would probably ask for more details when needed.
The brief and good-looking stories are (sometimes) disastrous specifications for AI Agents. The agent needs to know: which list? which filter criteria? which UI component? what happens when the filter returns zero results? what's the performance expectation for 10,000 items?
Learn how to turn agents into interrogators
One of the most remarkable production boosts for generative AI is the ability to summarize and analyze text. This applies to reviewing and improving specifications as well.
So, if you're dreading being turned from a coder into a full-time spec writer, I have good news for you. Use the AI to interrogate you for the facts required to narrow the scope, and to find the holes and conflicts in your story. This kind of iterative, top-down approach to building the instructions is highly recommended, and part of, for instance, the GSD process I've referred to in many places in this book.
Here's a sample interview by the GSD planner agent from when I was building another (nowhere near as interesting as the roguelike) tool called ctxl, short for 'Context Lightning'. As I was iterating a plan for the next generation of the tool, the agent asked me a lot of questions about the details of what I really wanted.
So, whether you create your own to match your project or use an existing one, I recommend having a dedicated planning or research agent ask you these questions before you're (oh well) allowed to code anything. Given the correct recipe, LLMs are very good at finding edge cases and holes in your story, like good detectives or prosecutors.
Teaching specification writing requires exercises in precision, not just process knowledge. It's an art of its own, and you need to adapt it to this new way of working, too. The new 'User Stories 2.0' specs need to be tailored for the AI, but still be readable by various stakeholders.
I never sat down to write a complete spec up front for my game. Each feature started as a rough idea, and the spec emerged through rounds of dialogue with the research and planning agents. They would point out things I hadn't considered, I'd refine the scope, and by the end we had something precise enough to execute. The specification was a byproduct of the conversation, not a document I authored.
So a good training for this could be practical:
- Hand out relatively easy tasks
- Have the team write specifications for them in different formats, for instance: 1) a classic user story with acceptance criteria, or 2) a detailed user story with many more edge cases, a detailed definition of done, and DON'Ts and WHAT IFs iterated and refined through dialogue with the AI.
- Arrange a demo where the agent creates the implementation and tests from these two versions of the spec.
- Compare the results.
This will make the difference in specification quality tangible.
There's no "magic standard" for specifications, but there are definitely bad ones. I've found that having the AI produce most of the behavioural stuff as Gherkin, and the user-facing stuff as a separate design plan, helps a lot to cut down the inevitable back-and-forths with the agents.
What we actually learned
When we introduced governed AI development to a real project team, the ramp-up was slower than I expected. To be fair, much of that was the new domain and tech stack, not the AI workflow itself. Getting the workflow stable enough to trust also took its own sweet time. And there was a lot of variety among people: some jumped on board quickly, others had more trouble adjusting.
People found very different ways to cope. Some were very careful, trying to read everything the AI produced and manually approving every single tool call. Thorough, yes. Sustainable, not so much. You can imagine how that scales when the agent wants to make 40 file edits. Others went the opposite way: less reading, more trusting, full speed ahead. We ended up with totally wrong features being built because nobody had checked whether the agent actually understood the requirement or just confidently produced something that looked right.
We ran several walkthroughs, a dedicated training day, and handed out plenty of material to read. Looking back, I think the best way to learn this stuff is just to use it and become an AI whisperer through practice rather than lectures. We started with real project work instead of made-up exercises, which got us moving fast but meant people hit the hard lessons on production code. In hindsight, some controlled exercises first might have been a gentler start.
The biggest lesson: more hands-on, 1-on-1 pairing would have helped the people who struggled. A training day gives you the concepts. Sitting next to someone while they wrestle with a stubborn agent gives you the instincts.
Review techniques
Reviewing AI-generated specifications, code, test cases and data models is a whole other ballgame from the old code-review rubber-stamping.
When reviewing code written by a living human being, you can usually infer the author's intent from their implementation choices: why they used this pattern, what edge case they were handling, what assumption they were making. And if needed, you can comment and ask, and chances are you're already familiar with the domain, architecture and practices of the codebase.
When reviewing mostly machine-generated stuff, you might not have that context. And the amount of material is often just too much for anybody to actually review.
AI-generated commit messages and PR descriptions are often very good, and they can be a great help to understand the intent behind the code, which is a crucial part of the review process. So make sure to read them carefully.
This changes what you're looking for. In the old world you could be semi-confident that the developer hadn't added too many extra features or interpreted the requirements entirely incorrectly. They might even have tested their changes with a critical eye before committing.
Don't trust the formatting
With AI, those convincing, nicely organized, good-looking documents with lots of checkboxes and colors, and the beautifully indented and formatted code, might just be slop.
You've probably seen those summary messages: "All you ever wanted has now been delivered and is ready for production. All code compiles and test coverage is 100%. Great success." And then you actually run the build, and the first thing you notice is an insane number of linter errors; the 100% test coverage was fitted to the code, not to the original requirements, and even those tests don't pass.
When you scratch the surface, the result you got is not doing even remotely what it was supposed to do.
Always be ready to start over.
It's pointless to argue with an AI, and insults won't work either. If your agent's internal task list has 12 steps, you're at phase 7, you find yourself in an infinite loop of the same errors, and the context usage is in the red, it's time to stop and start over in a new session. You were suffering from context overload, rot, pollution or whatever, or you didn't have a clear enough intent to begin with. Rethink, check where things started to deteriorate, and start a new session with a clearer plan and instructions.
In the end, not much was lost, and reassuring yourself that the AI will get it if you just keep insisting 'I already told you you're executing the tool in the wrong directory' won't necessarily help.
All in all my take-home top list of skills to learn is this:
- Practice
- a lot, and learn when the AI is lying to you and when it's not. This is a skill that can be developed.
- Don't trust
- the formatting, the structure, the colors, the checkboxes. They are just a facade. Test content against the original requirements.
- Use the intermediate results
- as a guide to understanding intent. It might be a PLAN.md, the commits made, the 'Tasks' list of your agent or whatever. Did it really make sense?
- Too much to read = you tried to do too much in a single step.
- Break down the work into smaller chunks.
- Use a separate review agent
- and pick a different model than the one that generated the code or document to get a real second opinion.
- Add a new session
- Press the +. The 15-30 minutes you wasted arguing with AI are gone forever.