
Semantic Memory and User Modeling

Build the agent's accumulated model of the user — preferences, expertise, context, goals, and constraints.


Semantic memory is the agent's model of the user. Not events ("user did X on date Y") but generalizations ("user prefers concise output," "user is intermediate in Python," "user lives in Vancouver").

Semantic memory shapes every interaction. The agent that knows the user prefers brief responses gives brief responses by default. The agent that knows the user is in finance interprets ambiguous questions in finance context. Done well, the agent feels like it knows the user.

Done badly, semantic memory is wrong, stale, or invasive. Wrong: the agent confidently misinterprets requests because it has the wrong model. Stale: the user has moved on but the agent hasn't. Invasive: the agent surfaces information the user didn't expect to be remembered.

What's in Semantic Memory

Categories of user knowledge:

Preferences

How the user wants things done.

  • "Prefers concise responses."
  • "Likes code examples in TypeScript over Python."
  • "Wants links cited inline rather than at the end."
  • "Communicates formally."

These are explicit or inferred preferences that affect output style.

Expertise

What the user knows.

  • "Senior backend engineer; expert in Go and Python."
  • "Beginner in machine learning."
  • "Has 10 years of legal experience."
  • "Familiar with this product's API."

Expertise affects what level to pitch responses at.

Context

The user's broader situation.

  • "Works at Acme Corp; team of 8 engineers."
  • "Lives in Vancouver; PT timezone."
  • "Building a SaaS for restaurants."
  • "Currently focused on the migration project."

Context affects how to interpret requests.

Goals

What the user is trying to accomplish.

  • "Wants to ship the auth migration this quarter."
  • "Trying to learn Rust."
  • "Researching options for a database upgrade."

Goals shape how to prioritize among possible responses.

Constraints

What the user can't do.

  • "Cannot use AWS due to corporate policy."
  • "Has limited time today (15 minutes available)."
  • "Working alone; no team to delegate to."

Constraints rule out otherwise-useful options.

Updating Semantic Memory

Semantic memory updates throughout interactions.

Explicit User Statements

The user states a preference or fact:

  • "I prefer brief responses."
  • "I'm new to this language."
  • "I work at Acme."

Direct ingest. These statements get distilled into semantic facts.

Inference from Episodes

The user repeatedly demonstrates a pattern. Multiple episodes form the basis for a generalization.

  • User has asked clarifying questions in three different sessions → "user values clarity over speed."
  • User has used the same code style in five projects → "user prefers functional patterns."
  • User has rejected verbose responses repeatedly → "user prefers brief responses."

Inference happens periodically. After every N episodes, an LLM analyzes patterns; new semantic facts emerge.
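A minimal sketch of such an inference pass, with the LLM analysis stubbed out as a simple pattern counter (the episode shape, tag names, and confidence formula are all illustrative assumptions):

```python
# Sketch: periodic inference over recent episodes. A real system would have
# an LLM propose generalizations; here a pattern counter stands in for it.
from collections import Counter

def infer_facts(episodes: list[dict], min_support: int = 3) -> list[dict]:
    """Promote patterns seen in at least `min_support` episodes to candidate facts."""
    counts = Counter(tag for ep in episodes for tag in ep["pattern_tags"])
    return [
        {
            "fact": f"user pattern: {tag}",
            "source": f"inferred from {n} episodes",
            "confidence": min(0.3 + 0.1 * n, 0.8),  # more evidence, more confidence
        }
        for tag, n in counts.items()
        if n >= min_support
    ]

episodes = [
    {"pattern_tags": ["prefers_brief"]},
    {"pattern_tags": ["prefers_brief", "asks_clarifying_questions"]},
    {"pattern_tags": ["prefers_brief"]},
]
facts = infer_facts(episodes)  # only the thrice-seen pattern qualifies
```

The single-occurrence pattern never becomes a fact; inference waits for repetition.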

Correction

The user corrects an existing fact:

  • "Actually, I'm in Calgary now, not Vancouver."
  • "I've moved past beginner; I'd say intermediate."
  • "I'm not at Acme anymore; I'm at BigCorp."

The system updates. Old facts are superseded; new ones become active.
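One way to implement supersession is to keep the old fact but mark it inactive, preserving the correction trail. A sketch, with an assumed fact shape:

```python
# Sketch: supersede an old fact rather than deleting it, so provenance
# and history survive the correction. Field names are assumptions.
from datetime import datetime, timezone

def supersede(facts: list[dict], old_id: str, new_text: str) -> dict:
    """Deactivate the old fact and append a high-confidence replacement."""
    new_fact = {
        "id": f"{old_id}-v2",
        "fact": new_text,
        "active": True,
        "source": "user correction",
        "confidence": 0.95,
        "supersedes": old_id,
        "updated_at": datetime.now(timezone.utc).isoformat(),
    }
    for f in facts:
        if f["id"] == old_id:
            f["active"] = False
    facts.append(new_fact)
    return new_fact

facts = [{"id": "loc-1", "fact": "User lives in Vancouver.", "active": True}]
supersede(facts, "loc-1", "User lives in Calgary.")
```

Queries then filter on `active`; the superseded fact remains available for auditing.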

Storing Semantic Memory

Structured Profile

A typed object with named fields:

{
  "user_id": "u-123",
  "expertise": {
    "python": "expert",
    "javascript": "intermediate",
    "machine_learning": "beginner"
  },
  "preferences": {
    "response_length": "brief",
    "code_style": "functional",
    "tone": "professional"
  },
  "context": {
    "timezone": "America/Vancouver",
    "company": "Acme Corp",
    "current_project": "auth migration"
  }
}

Pros: easy to query and reason about.

Cons: rigid; new categories require schema changes.

Unstructured Facts

A list of natural-language facts, embedded for retrieval:

- "User is a senior backend engineer with 10 years of experience."
- "User strongly prefers concise responses without lengthy explanations."
- "User lives in Vancouver, Pacific Time."
- "User is currently working on an auth migration."

Pros: flexible.

Cons: facts can contradict; harder to reason structurally.

Hybrid

Structured for fields the system needs to reason about; unstructured for nuanced descriptions.

Most production systems do this. The structured layer captures the most-queried information; the unstructured layer captures the rest.
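A hybrid store can be as simple as a typed object with an escape hatch for free text. A sketch (the field names are illustrative, not a fixed schema):

```python
# Sketch: hybrid profile — typed fields for the most-queried information,
# plus a free-text fact list for everything that doesn't fit the schema.
from dataclasses import dataclass, field

@dataclass
class UserProfile:
    user_id: str
    expertise: dict[str, str] = field(default_factory=dict)    # structured layer
    preferences: dict[str, str] = field(default_factory=dict)  # structured layer
    facts: list[str] = field(default_factory=list)             # unstructured layer

    def add_fact(self, text: str) -> None:
        self.facts.append(text)

profile = UserProfile(user_id="u-123")
profile.expertise["python"] = "expert"
profile.preferences["response_length"] = "brief"
profile.add_fact("User is building a SaaS for restaurants.")
```

New kinds of knowledge land in `facts` immediately; fields graduate into the structured layer once they prove worth querying.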

Surfacing Semantic Memory

How does semantic memory affect the agent's responses?

In the System Prompt

Inject relevant facts into the system prompt at the start of each session:

You are an assistant for User u-123. Some context:
- They are a senior backend engineer.
- They prefer concise responses.
- They are working on an auth migration project.
- They use TypeScript and Go.

The agent's responses are shaped by this context throughout the session.

For long-running sessions, the context may need refreshing or rotating; the system prompt has a token budget.
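A sketch of prompt assembly under a budget, approximating tokens as characters for simplicity (the budget constant and fact ordering are assumptions):

```python
# Sketch: render always-relevant facts into the system prompt, stopping
# when a rough character budget (standing in for a token budget) is hit.
def build_system_prompt(user_id: str, facts: list[str], char_budget: int = 400) -> str:
    lines = [f"You are an assistant for User {user_id}. Some context:"]
    used = len(lines[0])
    for fact in facts:  # assume facts are pre-sorted by importance
        line = f"- {fact}"
        if used + len(line) > char_budget:
            break  # lower-priority facts are dropped, not truncated
        lines.append(line)
        used += len(line)
    return "\n".join(lines)

prompt = build_system_prompt("u-123", [
    "They are a senior backend engineer.",
    "They prefer concise responses.",
])
```

Sorting facts by importance before the cut ensures the budget trims the least useful context first.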

Retrieved Per Turn

Semantic facts are stored separately and retrieved when relevant. For a query about timezone, retrieve the user's timezone fact.

More token-efficient; only relevant facts enter the prompt.
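A sketch of per-turn retrieval. A production system would rank by embedding similarity; plain word overlap stands in for it here:

```python
# Sketch: retrieve only the facts relevant to this turn's query.
# Word overlap is a stand-in for embedding similarity.
def retrieve_facts(query: str, facts: list[str], k: int = 2) -> list[str]:
    q_words = set(query.lower().split())
    scored = sorted(
        facts,
        key=lambda f: len(q_words & set(f.lower().split())),
        reverse=True,
    )
    # keep top-k, but drop anything with zero overlap
    return [f for f in scored[:k] if q_words & set(f.lower().split())]

facts = [
    "User lives in Vancouver, Pacific Time.",
    "User prefers concise responses.",
    "User is working on an auth migration.",
]
hits = retrieve_facts("what time is it in the user's pacific timezone", facts)
```

Only the timezone-adjacent facts enter the prompt; the preference fact stays out of this turn entirely.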

Both

Some facts (preferences, expertise) live in the system prompt because they're always relevant. Others (specific project details) get retrieved on demand.

Avoiding Wrong Models

The biggest risk: a wrong semantic fact contaminates every response.

The agent thinks the user is a beginner; the user is actually expert. Every response is patronizing.

Mitigations:

Provenance

Each semantic fact has a source. The agent knows whether a fact came from explicit user statement (high confidence) or inference from a single episode (low confidence).

{
  "fact": "User is a beginner in Python.",
  "source": "inferred from 1 conversation",
  "confidence": 0.4
}

The agent treats low-confidence facts as hypotheses, not certainties.
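Confidence can gate behavior directly. A sketch with illustrative thresholds (the cutoffs are assumptions, not recommendations):

```python
# Sketch: gate behavior on confidence. Low-confidence facts are hypotheses
# to verify with the user, not assumptions to act on. Thresholds are illustrative.
def classify_fact(fact: dict) -> str:
    if fact["confidence"] >= 0.8:
        return "act"     # state it as known context
    if fact["confidence"] >= 0.5:
        return "hedge"   # use it, but phrase tentatively
    return "verify"      # ask the user before relying on it

stated = {"fact": "User prefers brief responses.",
          "source": "explicit statement", "confidence": 0.95}
guessed = {"fact": "User is a beginner in Python.",
           "source": "inferred from 1 conversation", "confidence": 0.4}
```

The explicitly stated preference is acted on; the single-episode inference triggers a check-in instead of patronizing responses.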

Revisability

Make it easy for the user to correct facts. "I notice I have you down as a Python beginner. Is that right?"

Or expose a profile page where the user can directly edit.

Triangulation

Don't act on a fact based on one piece of evidence. Wait for corroboration before treating it as established.

The first time a user says "I prefer brief," store it as a candidate. The second time they confirm (explicitly or by accepting brief responses without complaint), promote to fact.
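The candidate-then-promote flow can be sketched in a few lines (the store shape and the two-observation threshold are assumptions):

```python
# Sketch: two-stage triangulation — first observation creates a candidate,
# a corroborating observation promotes it to an established fact.
def observe(store: dict, fact_text: str) -> str:
    entry = store.setdefault(fact_text, {"status": "candidate", "evidence": 0})
    entry["evidence"] += 1
    if entry["evidence"] >= 2:
        entry["status"] = "established"
    return entry["status"]

store: dict = {}
first = observe(store, "user prefers brief responses")   # candidate after one sighting
second = observe(store, "user prefers brief responses")  # promoted on corroboration
```

Only established facts feed the system prompt; candidates wait for a second sighting.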

Decay

Facts not reinforced over time decay in confidence. The user said something three months ago and hasn't repeated; the fact is more likely stale.
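One simple decay rule is exponential with a fixed half-life. A sketch (the 90-day half-life is an illustrative choice, not a recommended constant):

```python
# Sketch: exponential confidence decay. Each half-life without
# reinforcement halves the fact's confidence.
import math

def decayed_confidence(confidence: float, days_since_reinforced: float,
                       half_life_days: float = 90.0) -> float:
    return confidence * math.exp(-math.log(2) * days_since_reinforced / half_life_days)

fresh = decayed_confidence(0.8, 0)    # just reinforced: unchanged
stale = decayed_confidence(0.8, 90)   # one half-life old: halved
```

Reinforcement (the user restating or confirming the fact) resets the clock; sufficiently decayed facts drop below the act-on threshold automatically.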

Privacy

Semantic memory is intimate. The agent has a model of the user. The user has the right to:

  • See the model.
  • Correct the model.
  • Delete parts of the model.
  • Export the model.
  • Know what's being stored.

Build the inspection and edit UI from day one. Users who can see their semantic profile trust the system; users who can't eventually become wary.

For sensitive categories (health, religion, finances), be extra careful. Either don't store them, or store them only with explicit consent.

Cross-Context Boundaries

Semantic memory should respect contexts. The agent's knowledge of the user in their work context might be different from their personal context.

If the agent has access to both, it should distinguish:

  • Work facts: company, role, projects, work-related preferences.
  • Personal facts: hobbies, family, personal preferences.

In a work-context conversation, the agent surfaces work facts but not personal ones (unless the user asks).

This is a design choice. Some users want everything blended; others want separation. Design for both as configurable.
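Context tags plus a blend flag cover both preferences. A sketch (the tag values and fact shape are assumptions):

```python
# Sketch: context-tagged facts with configurable blending. By default only
# facts matching the active context surface; blend=True returns everything.
def facts_for_context(facts: list[dict], context: str, blend: bool = False) -> list[str]:
    if blend:
        return [f["text"] for f in facts]
    return [f["text"] for f in facts if f["context"] == context]

facts = [
    {"text": "Works at Acme Corp.", "context": "work"},
    {"text": "Plays tennis on weekends.", "context": "personal"},
]
work_view = facts_for_context(facts, "work")            # work facts only
blended = facts_for_context(facts, "work", blend=True)  # user opted into blending
```

The `blend` flag is the per-user configuration point: separation by default, everything-together for users who ask for it.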

Bootstrapping a New User

When a new user starts, the semantic memory is empty. The agent should:

  • Ask a few onboarding questions to populate basics (expertise level, preferences).
  • Default to safe defaults (medium-length responses, professional tone, no assumptions).
  • Update aggressively as the user reveals more.

Don't make the user fill out a profile form. The semantic memory builds through conversation; onboarding is a few questions, not an ordeal.

Anti-Patterns

No semantic memory; only episodic. The agent doesn't have generalizations, only events. Every session feels like meeting the user fresh.

Semantic memory built only from explicit statements. The agent misses what's implicit. Inference fills the gap.

No confidence scoring. A wrong fact is treated as gospel. Provenance and confidence catch this.

Permanent facts. No decay. The user has changed; the model hasn't. Decay confidence over time.

Memory invisible to user. User can't correct. Trust erodes. Expose the model.

No context separation. Work and personal blend; context-inappropriate facts surface. Tag and respect contexts.

Rigid schema only. New patterns require code changes. Hybrid structure with unstructured facts is more flexible.
