
Neo4j

Build with Neo4j for graph data and relationship-heavy queries. Use this skill when


Neo4j Integration

You are a graph database specialist who integrates Neo4j into projects. Neo4j is a native graph database that stores data as nodes and relationships. It excels at traversing connections: queries like "friends of friends" or "shortest path between A and B", which would require expensive recursive JOINs in SQL, typically run in milliseconds in Neo4j.

Core Philosophy

Relationships are first-class citizens

In a relational database, relationships are implied by foreign keys and JOIN tables and are resolved at query time. In Neo4j, each relationship is stored as its own record that points directly at the nodes it connects. Traversing a relationship is a pointer hop, not an index lookup or table scan, so the cost of a traversal depends mainly on how much of the graph the query touches rather than on the total size of the dataset.
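As a rough illustration of that difference, the toy model below answers the same question two ways: by filtering a join-style edge table, and by following neighbour lists kept on each node. It is an in-memory sketch with made-up names (`friendsViaScan`, `buildAdjacency`, `friendsOfFriends`), not a description of Neo4j's storage engine.

```typescript
// Toy model only: contrasts a join-table scan with following stored adjacency.
type Edge = { from: string; to: string };

// Relational style: every lookup filters the whole edge table.
function friendsViaScan(edges: Edge[], person: string): string[] {
  return edges.filter(e => e.from === person).map(e => e.to);
}

// Graph style: each node keeps direct references to its neighbours.
function buildAdjacency(edges: Edge[]): Map<string, string[]> {
  const adjacency = new Map<string, string[]>();
  for (const { from, to } of edges) {
    const neighbours = adjacency.get(from) ?? [];
    neighbours.push(to);
    adjacency.set(from, neighbours);
  }
  return adjacency;
}

// Two hops: the work done depends on the neighbourhood visited,
// not on how many edges exist elsewhere in the graph.
function friendsOfFriends(adjacency: Map<string, string[]>, person: string): string[] {
  const result = new Set<string>();
  for (const friend of adjacency.get(person) ?? []) {
    for (const foaf of adjacency.get(friend) ?? []) {
      if (foaf !== person) result.add(foaf);
    }
  }
  return Array.from(result);
}

const knows: Edge[] = [
  { from: 'Alice', to: 'Bob' },
  { from: 'Bob', to: 'Carol' },
  { from: 'Bob', to: 'Alice' },
  { from: 'Dave', to: 'Erin' },
];
const adjacency = buildAdjacency(knows);
console.log(friendsViaScan(knows, 'Alice'));       // [ 'Bob' ]
console.log(friendsOfFriends(adjacency, 'Alice')); // [ 'Carol' ]
```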

Whiteboard-friendly modeling

Your Neo4j data model should look like the diagram you'd draw on a whiteboard. Nodes are things (Person, Product, Company). Relationships are verbs between them (KNOWS, PURCHASED, WORKS_AT). Properties are attributes on both.

Cypher is pattern matching

Cypher, Neo4j's query language, uses ASCII-art patterns to describe graph shapes. (a)-[:KNOWS]->(b) matches all "a knows b" patterns. Think of queries as describing the shape of the answer you want.

Setup

Install (Docker)

docker run -d --name neo4j \
  -p 7474:7474 \
  -p 7687:7687 \
  -e NEO4J_AUTH=neo4j/password123 \
  neo4j:5

Install client (Node.js)

npm install neo4j-driver

import neo4j from 'neo4j-driver';

const driver = neo4j.driver(
  process.env.NEO4J_URI ?? 'bolt://localhost:7687',
  neo4j.auth.basic('neo4j', process.env.NEO4J_PASSWORD ?? 'password123')
);

// Always close the driver on shutdown
process.on('SIGTERM', () => driver.close());
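The setup above falls back to a literal password when `NEO4J_PASSWORD` is unset, which is convenient locally but easy to ship by accident. One possible alternative is a small loader that fails fast instead. This is only a sketch: `loadNeo4jConfig` and the `NEO4J_USER` variable are hypothetical names, while `NEO4J_URI` and `NEO4J_PASSWORD` come from the snippet above.

```typescript
// Hypothetical helper: reads connection settings and refuses to guess a password.
interface Neo4jConfig {
  uri: string;
  user: string;
  password: string;
}

function loadNeo4jConfig(env: Record<string, string | undefined>): Neo4jConfig {
  const password = env.NEO4J_PASSWORD;
  if (!password) {
    throw new Error('NEO4J_PASSWORD is not set');
  }
  return {
    uri: env.NEO4J_URI ?? 'bolt://localhost:7687',
    user: env.NEO4J_USER ?? 'neo4j', // NEO4J_USER is an assumed variable name
    password,
  };
}

// Usage with the driver (not executed here):
//   const { uri, user, password } = loadNeo4jConfig(process.env);
//   const driver = neo4j.driver(uri, neo4j.auth.basic(user, password));
const config = loadNeo4jConfig({ NEO4J_PASSWORD: 'example-only' });
console.log(config.uri); // bolt://localhost:7687
```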

Core Patterns

Create nodes and relationships

// Create people
CREATE (alice:Person {name: 'Alice', age: 30})
CREATE (bob:Person {name: 'Bob', age: 28})
CREATE (neo4j:Company {name: 'Neo4j', founded: 2007})

// Create relationships
CREATE (alice)-[:KNOWS {since: 2020}]->(bob)
CREATE (alice)-[:WORKS_AT {role: 'Engineer'}]->(neo4j)

Query patterns

// Find friends of Alice
MATCH (alice:Person {name: 'Alice'})-[:KNOWS]->(friend)
RETURN friend.name, friend.age;

// Friends of friends (2 hops)
MATCH (alice:Person {name: 'Alice'})-[:KNOWS*2]->(foaf)
WHERE foaf <> alice
RETURN DISTINCT foaf.name;

// Shortest path between two people
MATCH path = shortestPath(
  (a:Person {name: 'Alice'})-[:KNOWS*..10]-(b:Person {name: 'Zara'})
)
RETURN path, length(path);
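As far as I know, Cypher does not accept a query parameter for the hop bound of a variable-length pattern such as `[:KNOWS*..10]`, so a bound chosen by application code has to be written into the query text. The sketch below validates the bound before doing so, while still passing the names as parameters. `boundedPattern`, `shortestPathQuery`, `MAX_ALLOWED_HOPS`, and the `$from`/`$to` parameter names are all hypothetical.

```typescript
// Hypothetical helper: builds a bounded variable-length pattern from a validated hop count.
const MAX_ALLOWED_HOPS = 10; // assumed ceiling; tune for your graph

function boundedPattern(relType: string, maxHops: number): string {
  // Only allow plain upper-case relationship type names, since this is spliced into query text.
  if (!/^[A-Z_][A-Z0-9_]*$/.test(relType)) {
    throw new Error(`Unexpected relationship type: ${relType}`);
  }
  if (!Number.isInteger(maxHops) || maxHops < 1 || maxHops > MAX_ALLOWED_HOPS) {
    throw new Error(`maxHops must be an integer between 1 and ${MAX_ALLOWED_HOPS}`);
  }
  return `[:${relType}*..${maxHops}]`;
}

// Same shape as the shortest-path query above, with names supplied as parameters.
function shortestPathQuery(maxHops: number): string {
  return `MATCH path = shortestPath(
  (a:Person {name: $from})-${boundedPattern('KNOWS', maxHops)}-(b:Person {name: $to})
)
RETURN path, length(path)`;
}

console.log(boundedPattern('KNOWS', 5)); // [:KNOWS*..5]
```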

Node.js driver usage

async function getFriends(name: string) {
  const session = driver.session({ database: 'neo4j' });
  try {
    const result = await session.executeRead(tx =>
      tx.run(
        `MATCH (p:Person {name: $name})-[:KNOWS]->(friend)
         RETURN friend.name AS name, friend.age AS age`,
        { name }
      )
    );

    return result.records.map(record => ({
      name: record.get('name') as string,
      // Convert the driver's Integer object to a plain JS number
      age: neo4j.integer.toNumber(record.get('age')),
    }));
  } finally {
    await session.close();
  }
}

Write transactions

async function addFriendship(person1: string, person2: string) {
  const session = driver.session({ database: 'neo4j' });
  try {
    await session.executeWrite(tx =>
      tx.run(
        `MATCH (a:Person {name: $person1})
         MATCH (b:Person {name: $person2})
         MERGE (a)-[r:KNOWS]->(b)
         ON CREATE SET r.since = date()`,
        { person1, person2 }
      )
    );
  } finally {
    await session.close();
  }
}
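The open/try/finally/close shape repeated in the two functions above can be factored into a wrapper. The following is a minimal sketch rather than part of the driver's API: `withSession`, `SessionLike`, and `DriverLike` are hypothetical names, and the interfaces are deliberately tiny so the example does not depend on the driver package. The real `driver` and its sessions are assumed to satisfy them structurally.

```typescript
// Minimal structural types so the sketch compiles without the driver installed.
// The real driver's Driver and Session objects are assumed to satisfy them.
interface SessionLike {
  close(): Promise<void>;
}
interface DriverLike<S extends SessionLike> {
  session(config?: { database?: string }): S;
}

// Opens a session, runs the callback, and always closes the session afterwards,
// whether the callback resolves or throws.
async function withSession<S extends SessionLike, T>(
  driver: DriverLike<S>,
  work: (session: S) => Promise<T>,
  database = 'neo4j'
): Promise<T> {
  const session = driver.session({ database });
  try {
    return await work(session);
  } finally {
    await session.close();
  }
}

// Possible usage with the real driver (not executed here):
//   const result = await withSession(driver, s =>
//     s.executeRead(tx => tx.run('MATCH (p:Person) RETURN p.name AS name'))
//   );
```

Keeping the session lifetime in one place means individual query functions cannot forget the `finally` block.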

MERGE for upserts

// Create if not exists, update if exists
MERGE (p:Person {email: 'alice@example.com'})
ON CREATE SET p.name = 'Alice', p.created_at = datetime()
ON MATCH SET p.last_seen = datetime()
RETURN p;

Recommendation query (collaborative filtering)

// Recommend products: "people who bought X also bought Y"
MATCH (user:Person {id: $userId})-[:PURCHASED]->(product)<-[:PURCHASED]-(other)
MATCH (other)-[:PURCHASED]->(rec)
WHERE NOT (user)-[:PURCHASED]->(rec)
RETURN rec.name, count(other) AS score
ORDER BY score DESC
LIMIT 10;
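To make the scoring explicit, here is an in-memory sketch of what the query above computes, under the assumption that each user has at most one PURCHASED relationship per product. Every shared product between the user and another buyer yields one matched row, and each row adds one point to every product that other buyer has and the user lacks. The `recommend` function and the sample data are made up for illustration.

```typescript
// In-memory sketch of the scoring the query performs; data and names are made up.
type Purchases = Map<string, Set<string>>; // user id -> purchased product names

function recommend(
  purchases: Purchases,
  userId: string,
  limit = 10
): Array<{ name: string; score: number }> {
  const mine = purchases.get(userId) ?? new Set<string>();
  const scores = new Map<string, number>();

  purchases.forEach((theirs, otherId) => {
    if (otherId === userId) return;
    // One matched row per product the two users share...
    let shared = 0;
    theirs.forEach(product => { if (mine.has(product)) shared++; });
    if (shared === 0) return;
    // ...and each row counts once towards every product the user has not bought.
    theirs.forEach(rec => {
      if (!mine.has(rec)) scores.set(rec, (scores.get(rec) ?? 0) + shared);
    });
  });

  return Array.from(scores, ([name, score]) => ({ name, score }))
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

const purchases: Purchases = new Map([
  ['alice', new Set(['Laptop', 'Mouse'])],
  ['bob', new Set(['Laptop', 'Mouse', 'Keyboard'])],
  ['carol', new Set(['Laptop', 'Monitor', 'Keyboard'])],
  ['dave', new Set(['Desk'])],
]);
console.log(recommend(purchases, 'alice'));
// [ { name: 'Keyboard', score: 3 }, { name: 'Monitor', score: 1 } ]
```

Because `count(other)` counts rows rather than distinct people, a buyer who shares two products with the user contributes twice; `count(DISTINCT other)` would weight every buyer equally instead.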

Full-text search index

// Create a full-text index
CREATE FULLTEXT INDEX personSearch FOR (p:Person) ON EACH [p.name, p.bio];

// Search with it
CALL db.index.fulltext.queryNodes('personSearch', 'software engineer')
YIELD node, score
RETURN node.name, score
ORDER BY score DESC
LIMIT 10;

Indexes for performance

// Unique constraint (also creates an index)
CREATE CONSTRAINT person_email IF NOT EXISTS
FOR (p:Person) REQUIRE p.email IS UNIQUE;

// Range index for lookups
CREATE INDEX person_name IF NOT EXISTS
FOR (p:Person) ON (p.name);

// Composite index
CREATE INDEX event_type_date IF NOT EXISTS
FOR (e:Event) ON (e.type, e.date);

Best Practices

  • Always use parameters in Cypher queries ($name, not string concatenation). This prevents injection and enables query plan caching.
  • Use MERGE instead of CREATE when you want upsert semantics. CREATE always creates new nodes, even duplicates.
  • Always close sessions after use. Use try/finally or a wrapper to ensure cleanup. Sessions hold resources on the server.
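  • Use executeRead for reads and executeWrite for writes. In clustered deployments reached through a routing URI (neo4j://), this lets the driver send reads to followers and writes to the leader. Both functions also retry the transaction automatically on transient errors.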
  • Create indexes on properties you filter by. Without an index, MATCH scans all nodes of that label. Add constraints for unique properties.
  • Keep relationships directional but query flexibly. Store (a)-[:KNOWS]->(b) but query with (a)-[:KNOWS]-(b) (no arrow) when direction doesn't matter.
  • Batch large writes. Use UNWIND to process arrays of data in a single query instead of running thousands of individual CREATE statements:
    UNWIND $people AS person
    CREATE (p:Person) SET p = person
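For very large inputs, the UNWIND approach is often combined with client-side batching so that no single transaction grows too big. The sketch below only prepares the batches. `chunk`, `UNWIND_CREATE`, and the sample data are hypothetical, and the batch size is an assumption to tune for your data.

```typescript
// Splits a large array into fixed-size batches, one per UNWIND query.
function chunk<T>(items: T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new Error('size must be a positive integer');
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

const UNWIND_CREATE = `UNWIND $people AS person
CREATE (p:Person) SET p = person`;

// Hypothetical input; in practice this might be thousands of rows.
const people = [{ name: 'Alice' }, { name: 'Bob' }, { name: 'Carol' }];

// Each entry pairs the query text with the parameters for one batch.
const statements = chunk(people, 2).map(batch => ({
  query: UNWIND_CREATE,
  params: { people: batch },
}));
console.log(statements.length); // 2
```

Each prepared statement would then be run with `tx.run(query, params)` inside `session.executeWrite`, one batch per transaction.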
    

Common Pitfalls

  • Treating Neo4j like a relational database. Don't model everything as properties on one node type. If "category" is something you traverse or share, make it a node with a relationship, not a string property.
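  • Cartesian products in MATCH. Disconnected patterns produce a cross-product of their results, whether they share one MATCH clause or sit in separate ones. Connect the patterns through a shared variable, or make sure each one matches only a few rows (for example, an indexed lookup of a single node).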
  • Unbounded variable-length paths. [:KNOWS*] with no upper bound can traverse the entire graph. Always set a limit: [:KNOWS*..5].
  • Forgetting indexes. Without indexes, every MATCH by property scans all nodes with that label. Queries go from milliseconds to seconds as data grows.
  • Using Neo4j for tabular analytics. Neo4j is not for "give me all users sorted by signup date." Use it for relationship-centric queries. Pair it with a relational or analytical database for tabular workloads.
  • Integer handling in the driver. Neo4j stores 64-bit integers, but a JavaScript number is only exact up to 2^53 − 1 (Number.MAX_SAFE_INTEGER). By default the driver returns neo4j.Integer objects: call .toNumber() when the value is known to be in the safe range, or .toString() for large IDs.
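The integer pitfall can be handled with a small conversion helper. The sketch below works on anything that exposes a decimal `toString()`, which the driver's Integer objects do according to the note above. It returns a number when the value is exactly representable and keeps the string otherwise. `toNumberOrString` and `IntegerLike` are hypothetical names.

```typescript
// Anything with a decimal toString(), such as the driver's Integer objects.
interface IntegerLike {
  toString(): string;
}

// Returns a JS number when the value is within the safe integer range,
// otherwise keeps the exact decimal string.
function toNumberOrString(value: IntegerLike): number | string {
  const text = value.toString();
  const asNumber = Number(text);
  return Number.isSafeInteger(asNumber) ? asNumber : text;
}

console.log(toNumberOrString({ toString: () => '42' }));               // 42
console.log(toNumberOrString({ toString: () => '9007199254740993' })); // '9007199254740993' (kept as a string)
```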

Anti-Patterns

Using the service without understanding its pricing model. Cloud services bill differently — per request, per GB, per seat. Deploying without modeling expected costs leads to surprise invoices.

Hardcoding configuration instead of using environment variables. API keys, endpoints, and feature flags change between environments. Hardcoded values break deployments and leak secrets.

Ignoring the service's rate limits and quotas. Every external API has throughput limits. Failing to implement backoff, queuing, or caching results in dropped requests under load.

Treating the service as always available. External services go down. Without circuit breakers, fallbacks, or graceful degradation, a third-party outage becomes your outage.

Coupling your architecture to a single provider's API. Building directly against provider-specific interfaces makes migration painful. Wrap external services in thin adapter layers.
