# Semantic Memory and User Modeling
Build the agent's accumulated model of the user: preferences, expertise, context, goals, and constraints.
Semantic memory is the agent's model of the user. Not events ("user did X on date Y") but generalizations ("user prefers concise output," "user is intermediate in Python," "user lives in Vancouver").
Semantic memory shapes every interaction. The agent that knows the user prefers brief responses gives brief responses by default. The agent that knows the user is in finance interprets ambiguous questions in finance context. Done well, the agent feels like it knows the user.
Done badly, semantic memory is wrong, stale, or invasive. Wrong: the agent confidently misinterprets requests because it has the wrong model. Stale: the user has moved on but the agent hasn't. Invasive: the agent surfaces information the user didn't expect to be remembered.
## What's in Semantic Memory
Categories of user knowledge:
### Preferences
How the user wants things done.
- "Prefers concise responses."
- "Likes code examples in TypeScript over Python."
- "Wants links cited inline rather than at the end."
- "Communicates formally."
These are explicit or inferred preferences that affect output style.
### Expertise
What the user knows.
- "Senior backend engineer; expert in Go and Python."
- "Beginner in machine learning."
- "Has 10 years of legal experience."
- "Familiar with this product's API."
Expertise affects what level to pitch responses at.
### Context
The user's broader situation.
- "Works at Acme Corp; team of 8 engineers."
- "Lives in Vancouver; PT timezone."
- "Building a SaaS for restaurants."
- "Currently focused on the migration project."
Context affects how to interpret requests.
### Goals
What the user is trying to accomplish.
- "Wants to ship the auth migration this quarter."
- "Trying to learn Rust."
- "Researching options for a database upgrade."
Goals shape how to prioritize among possible responses.
### Constraints
What the user can't do.
- "Cannot use AWS due to corporate policy."
- "Has limited time today (15 minutes available)."
- "Working alone; no team to delegate to."
Constraints rule out otherwise-useful options.
## Updating Semantic Memory
Semantic memory updates throughout interactions.
### Explicit User Statements
The user states a preference or fact:
- "I prefer brief responses."
- "I'm new to this language."
- "I work at Acme."
Direct ingest. These statements get distilled into semantic facts.
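A minimal sketch of the distillation step, assuming a generic `complete(prompt)` LLM helper (a placeholder, not any specific client) and a model that replies in JSON:

```python
import json

def complete(prompt: str) -> str:
    raise NotImplementedError("plug in your LLM client here")

def distill_statement(statement: str) -> dict:
    """Turn an explicit user statement into a stored semantic fact."""
    prompt = (
        "Rewrite this user statement as a third-person semantic fact.\n"
        f'Statement: "{statement}"\n'
        'Reply as JSON: {"fact": "...", "category": '
        '"preference|expertise|context|goal|constraint"}'
    )
    fact = json.loads(complete(prompt))
    fact["source"] = "explicit user statement"
    fact["confidence"] = 0.9  # explicit statements start high-confidence
    return fact
```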
### Inference from Episodes
The user repeatedly demonstrates a pattern. Multiple episodes form the basis for a generalization.
- User has asked clarifying questions in three different sessions → "user values clarity over speed."
- User has used the same code style in five projects → "user prefers functional patterns."
- User has rejected verbose responses repeatedly → "user prefers brief responses."
Inference happens periodically. After every N episodes, an LLM analyzes patterns; new semantic facts emerge.
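One way to run that periodic pass, reusing the `complete` placeholder above. The interval, prompt wording, and 0.5 starting confidence are all assumptions to tune:

```python
INFERENCE_INTERVAL = 20  # the "N" from the text; an assumed value

def maybe_infer_facts(episodes: list[str], known_facts: list[str]) -> list[dict]:
    """After every N episodes, ask the LLM for new generalizations."""
    if not episodes or len(episodes) % INFERENCE_INTERVAL != 0:
        return []
    prompt = (
        "Recent episode summaries:\n"
        + "\n".join(f"- {e}" for e in episodes[-INFERENCE_INTERVAL:])
        + "\n\nKnown facts:\n"
        + "\n".join(f"- {f}" for f in known_facts)
        + "\n\nList NEW generalizations about the user supported by at "
        "least two episodes, one per line. Reply 'none' if there are none."
    )
    reply = complete(prompt)
    if reply.strip().lower() == "none":
        return []
    return [
        {
            "fact": line.lstrip("- ").strip(),
            "source": f"inferred from {INFERENCE_INTERVAL} episodes",
            "confidence": 0.5,  # inferred facts start as hypotheses
        }
        for line in reply.splitlines()
        if line.strip()
    ]
```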
### Correction
The user corrects an existing fact:
- "Actually, I'm in Calgary now, not Vancouver."
- "I've moved past beginner; I'd say intermediate."
- "I'm not at Acme anymore; I'm at BigCorp."
The system updates. Old facts are superseded; new ones become active.
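A sketch of supersession rather than overwrite, so the history stays auditable. The fact shape (`id`, `status`, timestamps) is an assumed schema:

```python
from datetime import datetime, timezone

def apply_correction(facts: list[dict], old_fact_id: str, new_text: str) -> None:
    """Mark the old fact superseded and add the corrected one as active."""
    now = datetime.now(timezone.utc).isoformat()
    for f in facts:
        if f["id"] == old_fact_id:
            f["status"] = "superseded"
            f["superseded_at"] = now
    facts.append({
        "id": f"fact-{len(facts) + 1}",
        "fact": new_text,
        "status": "active",
        "source": "user correction",  # corrections carry high confidence
        "confidence": 0.95,
        "created_at": now,
    })
```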
## Storing Semantic Memory
### Structured Profile
A typed object with named fields:
```json
{
  "user_id": "u-123",
  "expertise": {
    "python": "expert",
    "javascript": "intermediate",
    "machine_learning": "beginner"
  },
  "preferences": {
    "response_length": "brief",
    "code_style": "functional",
    "tone": "professional"
  },
  "context": {
    "timezone": "America/Vancouver",
    "company": "Acme Corp",
    "current_project": "auth migration"
  }
}
```
Pros: easy to query and reason about.
Cons: rigid; new categories require schema changes.
### Unstructured Facts
A list of natural-language facts, embedded for retrieval:
- "User is a senior backend engineer with 10 years of experience."
- "User strongly prefers concise responses without lengthy explanations."
- "User lives in Vancouver, Pacific Time."
- "User is currently working on an auth migration."
Pros: flexible.
Cons: facts can contradict; harder to reason structurally.
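A minimal in-memory version of the embedded fact list. `embed` stands in for whatever embedding model you use, and the brute-force cosine scan assumes a small fact set:

```python
import numpy as np

def embed(text: str) -> np.ndarray:
    raise NotImplementedError("plug in your embedding model here")

class FactStore:
    """Natural-language facts with vectors for similarity search."""

    def __init__(self) -> None:
        self.facts: list[str] = []
        self.vectors: list[np.ndarray] = []

    def add(self, fact: str) -> None:
        self.facts.append(fact)
        self.vectors.append(embed(fact))

    def search(self, query: str, k: int = 3) -> list[str]:
        q = embed(query)
        scores = [
            float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))
            for v in self.vectors
        ]
        ranked = sorted(range(len(scores)), key=scores.__getitem__, reverse=True)
        return [self.facts[i] for i in ranked[:k]]
```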
### Hybrid
Structured for fields the system needs to reason about; unstructured for nuanced descriptions.
Most production systems do this. The structured layer captures the most-queried information; the unstructured layer captures the rest.
## Surfacing Semantic Memory
How does semantic memory affect the agent's responses?
### In the System Prompt
Inject relevant facts into the system prompt at the start of each session:
```
You are an assistant for User u-123. Some context:
- They are a senior backend engineer.
- They prefer concise responses.
- They are working on an auth migration project.
- They use TypeScript and Go.
```
The agent's responses are shaped by this context throughout the session.
For long-running sessions, the context may need refreshing or rotating; the system prompt has a token budget.
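A sketch of prompt assembly under that budget. The four-characters-per-token estimate is a crude heuristic, and sorting by confidence is one possible cut order:

```python
def build_system_prompt(user_id: str, facts: list[dict],
                        budget_tokens: int = 300) -> str:
    """Render the highest-confidence facts until the budget runs out."""
    header = f"You are an assistant for User {user_id}. Some context:\n"
    lines: list[str] = []
    used = len(header) // 4  # ~4 chars per token, a rough estimate
    for f in sorted(facts, key=lambda f: f["confidence"], reverse=True):
        cost = len(f["fact"]) // 4 + 2
        if used + cost > budget_tokens:
            break  # the lowest-confidence facts are the ones dropped
        lines.append(f"- {f['fact']}")
        used += cost
    return header + "\n".join(lines)
```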
### Retrieved Per Turn
Semantic facts are stored separately and retrieved when relevant. For a query about timezone, retrieve the user's timezone fact.
More token-efficient; only relevant facts enter the prompt.
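Wiring that into a single turn, reusing the `FactStore` and `complete` placeholders from the earlier sketches:

```python
def answer_turn(store: FactStore, user_message: str) -> str:
    """Inject only the facts relevant to this message."""
    relevant = store.search(user_message, k=3)
    context = "\n".join(f"- {fact}" for fact in relevant)
    prompt = (
        f"Relevant user facts:\n{context}\n\n"
        f"User message: {user_message}\n"
        "Respond, taking the facts into account."
    )
    return complete(prompt)
```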
### Both
Some facts (preferences, expertise) live in the system prompt because they're always relevant. Others (specific project details) get retrieved on demand.
## Avoiding Wrong Models
The biggest risk: a wrong semantic fact contaminates every response.
The agent thinks the user is a beginner when the user is actually an expert; every response comes out patronizing.
Mitigations:
### Provenance
Each semantic fact has a source. The agent knows whether a fact came from explicit user statement (high confidence) or inference from a single episode (low confidence).
```json
{
  "fact": "User is a beginner in Python.",
  "source": "inferred from 1 conversation",
  "confidence": 0.4
}
```
The agent treats low-confidence facts as hypotheses, not certainties.
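One way to make that concrete when rendering facts into the prompt. The 0.7 cutoff and the hedging wording are assumptions:

```python
def render_fact(fact: dict) -> str:
    """Phrase low-confidence facts as hypotheses, not certainties."""
    if fact["confidence"] >= 0.7:
        return f"- {fact['fact']}"
    return f"- Unconfirmed, verify before relying on it: {fact['fact']}"
```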
### Revisability
Make it easy for the user to correct facts: "I notice I have you down as a Python beginner. Is that right?"
Or expose a profile page where the user can edit facts directly.
### Triangulation
Don't act on a fact based on a single piece of evidence. Wait for corroboration before treating it as established.
The first time a user says "I prefer brief," store it as a candidate. The second time they confirm it (explicitly, or by accepting brief responses without complaint), promote it to a fact.
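A sketch of that candidate-to-fact promotion, assuming a simple dict keyed by fact text and a two-observation threshold:

```python
def observe(store: dict[str, dict], fact_text: str, evidence: str) -> None:
    """First observation creates a candidate; a second promotes it."""
    entry = store.setdefault(fact_text, {"status": "candidate", "evidence": []})
    if evidence not in entry["evidence"]:
        entry["evidence"].append(evidence)
    if entry["status"] == "candidate" and len(entry["evidence"]) >= 2:
        entry["status"] = "established"

# Usage: two independent observations establish the fact.
profile: dict[str, dict] = {}
observe(profile, "User prefers brief responses.", "session-12: said so")
observe(profile, "User prefers brief responses.", "session-14: accepted brief reply")
assert profile["User prefers brief responses."]["status"] == "established"
```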
### Decay
Facts that are not reinforced decay in confidence over time. If the user said something three months ago and hasn't repeated it since, the fact is more likely to be stale.
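A sketch of exponential decay; the 90-day half-life is an assumed tuning parameter:

```python
import math
from datetime import datetime, timezone

HALF_LIFE_DAYS = 90  # assumed; tune per deployment

def current_confidence(base: float, last_reinforced: datetime) -> float:
    """Confidence halves every HALF_LIFE_DAYS without reinforcement."""
    age_days = (datetime.now(timezone.utc) - last_reinforced).days
    return base * math.exp(-math.log(2) * age_days / HALF_LIFE_DAYS)
```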
## Privacy
Semantic memory is intimate. The agent has a model of the user. The user has the right to:
- See the model.
- Correct the model.
- Delete parts of the model.
- Export the model.
- Know what's being stored.
Build the inspection and edit UI from day one. Users who can see their semantic profile trust the system; users who can't grow wary of it.
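The user-facing surface can be small. A minimal sketch with illustrative names rather than a prescribed API:

```python
import json

class ProfileAPI:
    """Minimal user-facing operations over stored semantic facts."""

    def __init__(self, facts: list[dict]) -> None:
        self.facts = facts

    def view(self) -> list[str]:
        # "See the model" and "know what's being stored."
        return [f"{f['fact']} (source: {f['source']})" for f in self.facts]

    def correct(self, fact_id: str, new_text: str) -> None:
        fact = next(f for f in self.facts if f["id"] == fact_id)
        fact["fact"] = new_text
        fact["source"] = "user correction"

    def delete(self, fact_id: str) -> None:
        # Actually remove: privacy deletion should not leave tombstones.
        self.facts[:] = [f for f in self.facts if f["id"] != fact_id]

    def export(self) -> str:
        return json.dumps(self.facts, indent=2, default=str)
```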
Be extra careful with sensitive categories (health, religion, finances): either don't store them at all, or store them only with explicit consent.
## Cross-Context Boundaries
Semantic memory should respect contexts. The agent's knowledge of the user in their work context might be different from their personal context.
If the agent has access to both, it should distinguish:
- Work facts: company, role, projects, work-related preferences.
- Personal facts: hobbies, family, personal preferences.
In a work-context conversation, the agent surfaces work facts but not personal ones (unless the user asks).
This is a design choice. Some users want everything blended; others want separation. Support both behind a configuration switch, as in the sketch below.
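A sketch of that filter; `context_tag` and the `blend_contexts` switch are assumed names for the configurable behavior described above:

```python
def facts_for_context(facts: list[dict], active_context: str,
                      blend_contexts: bool = False) -> list[dict]:
    """Surface only facts tagged for the active context unless blending."""
    if blend_contexts:
        return facts  # some users want everything available everywhere
    return [
        f for f in facts
        # Untagged facts are treated as shared across contexts (an assumption).
        if f.get("context_tag", "shared") in (active_context, "shared")
    ]
```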
## Bootstrapping a New User
When a new user starts, the semantic memory is empty. The agent should:
- Ask a few onboarding questions to populate basics (expertise level, preferences).
- Fall back on safe defaults (medium-length responses, professional tone, no assumptions).
- Update aggressively as the user reveals more.
Don't make the user fill out a profile form. The semantic memory builds through conversation; onboarding is a few questions, not an ordeal.
## Anti-Patterns
- No semantic memory, only episodic. The agent has no generalizations, only events. Every session feels like meeting the user fresh.
- Semantic memory built only from explicit statements. The agent misses what's implicit. Inference fills the gap.
- No confidence scoring. A wrong fact is treated as gospel. Provenance and confidence catch this.
- Permanent facts with no decay. The user has changed; the model hasn't. Decay confidence over time.
- Memory invisible to the user. The user can't correct it, and trust erodes. Expose the model.
- No context separation. Work and personal blend, and context-inappropriate facts surface. Tag and respect contexts.
- Rigid schema only. New patterns require code changes. A hybrid structure with unstructured facts is more flexible.
## Related Skills
- Designing Episodic Memory for Agents
- Short-Term vs Long-Term Agent Memory
- Vector-Backed Agent Memory with RAG