Knowledge Priming: Teaching AI Assistants Your Codebase Like You'd Onboard a Human

Knowledge Priming: Teaching AI Assistants Your Codebase Like You'd Onboard a Human

I’ve watched developers get frustrated with AI coding assistants. They generate code, find it doesn’t fit, regenerate with corrections, repeat until they either give up or spend more time fixing the output than writing it themselves. The pattern is so predictable I started calling it the Frustration Loop.

The weird thing is, these same developers would never dream of asking a new hire to contribute code on Day 1 without any onboarding. We know humans need context. We walk them through conventions, show them examples of good code, explain architectural choices. Only then do we expect aligned contributions.

So why do we skip this step with AI?

The Problem With Generic Defaults

When you ask an AI assistant to “create a UserService that handles authentication” without any context, it does exactly what an uninformed developer would do: it falls back on prior experience. Except AI’s prior experience isn’t one company’s codebase. It’s millions of repositories, tutorials, and Stack Overflow answers blended into an average.

You get class-based code when your team uses functional patterns. You get Express when you standardized on Fastify. You get outdated API calls because the training data predates the current version. The code is syntactically correct. It might even pass basic tests. But it’s completely wrong for your project.

This isn’t an artificial intelligence capability problem. It’s a context problem.

The Three Layers of AI Knowledge

I find it helpful to think about AI knowledge as a hierarchy. At the bottom is training data: everything the model learned during pre-training. Billions of tokens of generic patterns from across the internet. Then comes the system prompt: those instructions that frame how the model should behave in general. At the top, with the highest priority, is the context window: the actual conversation and any documents you include in your prompt.

The mechanistic reason this matters is attention. Transformer models have what amounts to a finite budget of attention. Every token in the context window competes for influence over the output. When you fill that window with generic training patterns, the model draws on the average of everything it’s seen. When you fill it with specific, high-signal project context, those tokens attract more attention weight and steer generation toward the patterns that actually matter.

This is manual RAG. You’re retrieving the right information and augmenting the generation process. The difference is you’re doing the retrieval yourself instead of relying on an automated system.

What Knowledge Priming Actually Looks Like

Knowledge priming is just onboarding for AI. You create a document that gives context before asking for code generation. Not comprehensive documentation. Not a brain dump. A curated cheat sheet.

I’ve settled on seven sections that mirror what I’d walk through with a human colleague:

System Overview: The big picture. What kind of application is this? What are the major components? How do they interact? This is the equivalent of showing someone the architecture diagram on their first day.

Technology Stack: The specific versions matter more than you’d think. APIs change between major versions. If you’re on React 18 and AI defaults to React 17 patterns, you’re going to have a bad time. List the actual versions you’re using.

Curated Learning Resources: Every team has trusted sources that shaped how they think. The official docs they actually read, but also the blog posts that influenced their architecture, the tutorials that explained things clearly. Five to ten sources that genuinely shaped how the team works. This is powerful because it primes AI with the same mental models your team already has.

Project Structure: Where things live matters. File placement conventions, module organization, how services talk to each other. This prevents AI from inventing its own structure.

Naming Conventions: Consistency matters more than personal preference. How you name files, functions, variables, types. It’s boring but essential.

Code Examples: Show, don’t just tell. Include two or three examples of what you consider good code from the actual codebase. This is the pattern to follow.

Anti-Patterns: Tell AI what NOT to do. The mistakes your team learned the hard way. This is surprisingly effective at preventing common issues.

Treating Context As Infrastructure

Here’s where it gets interesting. The most powerful approach isn’t manually pasting context at the start of each session. That’s a habit that fades. Instead, treat priming as infrastructure.

Store the priming document in .github/copilot-instructions.md or .cursorrules or wherever your AI tool looks for project-specific context. Version it alongside the code. Review changes to it like you’d review code. Make it apply automatically to every session.

This transforms priming from a personal productivity hack into team infrastructure. The difference between something that might happen if you remember and something that always happens because it’s built into the workflow.

Just as onboarding materials for new hires are maintained as organizational assets, not improvised each time, priming documents should be treated as first-class artifacts.

The Failure Modes I’ve Observed

One mistake is making the priming document too comprehensive. You don’t need to document everything. You need to document the minimum context required to generate aligned code. AI can always ask follow-up questions. Start focused, expand only when patterns emerge showing what’s missing.

The other risk is staleness. Documentation rots. Every team has a graveyard of outdated wikis. A stale priming doc is worse than none because it teaches AI outdated patterns. But a priming doc that lives in the repo, reviewed like code, stays current by design. When someone changes a convention, they update the priming doc in the same PR.

A Real Example

Here’s a condensed priming document from a project I worked on. Notice it’s under 50 lines. That’s the target:

# Project Context for AI

## System Overview
Node.js microservice API for user management. Fastify web framework, PostgreSQL database, functional programming style.

## Stack
- Node.js 18.x
- Fastify 4.x
- PostgreSQL 14
- TypeScript 5.x

## Key Resources
- Fastify docs: https://www.fastify.io/docs/latest/
- Our team's functional patterns guide (internal wiki)

## Structure
/src/routes - HTTP endpoints
/src/services - Business logic (pure functions)
/src/db - Database queries
/src/types - TypeScript definitions

## Naming
- Files: kebab-case (user-service.ts)
- Functions: camelCase (getUserById)
- Types: PascalCase (UserProfile)

## Code Example
See /src/services/user-service.ts for the pattern. Services are pure functions that receive dependencies as parameters.

## Anti-Patterns
- No class-based services
- No direct database calls from routes
- No response mutations

Under 50 lines. Focused, specific, actionable.

When Does This Actually Pay Off?

I suspect the payoff is greatest for non-trivial work that spans multiple sessions or involves team coordination. For a quick utility function, manual correction might be faster than maintaining context infrastructure. But for anything that requires understanding how pieces fit together, priming seems to compound.

The foundation matters. Design-first conversations are more productive when AI already understands the architecture. Custom commands work better when AI knows the conventions. The investment in priming makes everything else work better.

Maybe the real shift isn’t in the technique itself but in how we think about AI assistance: less like a search engine that returns code snippets, more like a new team member who needs onboarding before they can contribute effectively.

Read Next