Neo4j
Build with Neo4j for graph data and relationship-heavy queries. Use this skill when
You are a graph database specialist who integrates Neo4j into projects. Neo4j is a native graph database that stores data as nodes and relationships. It excels at traversing connections — queries like "friends of friends" or "shortest path between A and B" that would require expensive recursive JOINs in SQL run in milliseconds in ## Key Points - **Always use parameters** in Cypher queries (`$name`, not string concatenation). - **Use MERGE instead of CREATE** when you want upsert semantics. CREATE always - **Always close sessions** after use. Use try/finally or a wrapper to ensure cleanup. - **Use `executeRead` for reads and `executeWrite` for writes.** This enables routing - **Create indexes on properties you filter by.** Without an index, MATCH scans all - **Keep relationships directional but query flexibly.** Store `(a)-[:KNOWS]->(b)` but - **Batch large writes.** Use UNWIND to process arrays of data in a single query - **Treating Neo4j like a relational database.** Don't model everything as properties - **Cartesian products in MATCH.** Multiple disconnected patterns in one MATCH clause - **Unbounded variable-length paths.** `[:KNOWS*]` with no upper bound can traverse - **Forgetting indexes.** Without indexes, every MATCH by property scans all nodes - **Using Neo4j for tabular analytics.** Neo4j is not for "give me all users sorted ## Quick Example ```bash docker run -d --name neo4j \ -p 7474:7474 \ -p 7687:7687 \ -e NEO4J_AUTH=neo4j/password123 \ neo4j:5 ``` ```bash npm install neo4j-driver ```
skilldb get database-services-skills/Neo4jFull skill: 236 linesNeo4j Integration
You are a graph database specialist who integrates Neo4j into projects. Neo4j is a native graph database that stores data as nodes and relationships. It excels at traversing connections — queries like "friends of friends" or "shortest path between A and B" that would require expensive recursive JOINs in SQL run in milliseconds in Neo4j.
Core Philosophy
Relationships are first-class citizens
In a relational database, relationships live in JOIN tables and are computed at query time. In Neo4j, relationships are stored directly on disk alongside the nodes they connect. Traversing a relationship is a pointer hop, not a table scan. This is why graph queries are fast regardless of dataset size.
Whiteboard-friendly modeling
Your Neo4j data model should look like the diagram you'd draw on a whiteboard. Nodes are things (Person, Product, Company). Relationships are verbs between them (KNOWS, PURCHASED, WORKS_AT). Properties are attributes on both.
Cypher is pattern matching
Cypher, Neo4j's query language, uses ASCII-art patterns to describe graph shapes.
(a)-[:KNOWS]->(b) matches all "a knows b" patterns. Think of queries as
describing the shape of the answer you want.
Setup
Install (Docker)
docker run -d --name neo4j \
-p 7474:7474 \
-p 7687:7687 \
-e NEO4J_AUTH=neo4j/password123 \
neo4j:5
Install client (Node.js)
npm install neo4j-driver
import neo4j from 'neo4j-driver';
const driver = neo4j.driver(
process.env.NEO4J_URI ?? 'bolt://localhost:7687',
neo4j.auth.basic('neo4j', process.env.NEO4J_PASSWORD ?? 'password123')
);
// Always close the driver on shutdown
process.on('SIGTERM', () => driver.close());
Core Patterns
Create nodes and relationships
// Create people
CREATE (alice:Person {name: 'Alice', age: 30})
CREATE (bob:Person {name: 'Bob', age: 28})
CREATE (neo4j:Company {name: 'Neo4j', founded: 2007})
// Create relationships
CREATE (alice)-[:KNOWS {since: 2020}]->(bob)
CREATE (alice)-[:WORKS_AT {role: 'Engineer'}]->(neo4j)
Query patterns
// Find friends of Alice
MATCH (alice:Person {name: 'Alice'})-[:KNOWS]->(friend)
RETURN friend.name, friend.age;
// Friends of friends (2 hops)
MATCH (alice:Person {name: 'Alice'})-[:KNOWS*2]->(foaf)
WHERE foaf <> alice
RETURN DISTINCT foaf.name;
// Shortest path between two people
MATCH path = shortestPath(
(a:Person {name: 'Alice'})-[:KNOWS*..10]-(b:Person {name: 'Zara'})
)
RETURN path, length(path);
Node.js driver usage
async function getFriends(name: string) {
const session = driver.session({ database: 'neo4j' });
try {
const result = await session.executeRead(tx =>
tx.run(
`MATCH (p:Person {name: $name})-[:KNOWS]->(friend)
RETURN friend.name AS name, friend.age AS age`,
{ name }
)
);
return result.records.map(record => ({
name: record.get('name') as string,
age: (record.get('age') as neo4j.Integer).toNumber(),
}));
} finally {
await session.close();
}
}
Write transactions
async function addFriendship(person1: string, person2: string) {
const session = driver.session({ database: 'neo4j' });
try {
await session.executeWrite(tx =>
tx.run(
`MATCH (a:Person {name: $person1})
MATCH (b:Person {name: $person2})
MERGE (a)-[:KNOWS {since: date()}]->(b)`,
{ person1, person2 }
)
);
} finally {
await session.close();
}
}
MERGE for upserts
// Create if not exists, update if exists
MERGE (p:Person {email: 'alice@example.com'})
ON CREATE SET p.name = 'Alice', p.created_at = datetime()
ON MATCH SET p.last_seen = datetime()
RETURN p;
Recommendation query (collaborative filtering)
// Recommend products: "people who bought X also bought Y"
MATCH (user:Person {id: $userId})-[:PURCHASED]->(product)<-[:PURCHASED]-(other)
MATCH (other)-[:PURCHASED]->(rec)
WHERE NOT (user)-[:PURCHASED]->(rec)
RETURN rec.name, count(other) AS score
ORDER BY score DESC
LIMIT 10;
Full-text search index
// Create a full-text index
CREATE FULLTEXT INDEX personSearch FOR (p:Person) ON EACH [p.name, p.bio];
// Search with it
CALL db.index.fulltext.queryNodes('personSearch', 'software engineer')
YIELD node, score
RETURN node.name, score
ORDER BY score DESC
LIMIT 10;
Indexes for performance
// Unique constraint (also creates an index)
CREATE CONSTRAINT person_email IF NOT EXISTS
FOR (p:Person) REQUIRE p.email IS UNIQUE;
// Range index for lookups
CREATE INDEX person_name IF NOT EXISTS
FOR (p:Person) ON (p.name);
// Composite index
CREATE INDEX event_type_date IF NOT EXISTS
FOR (e:Event) ON (e.type, e.date);
Best Practices
- Always use parameters in Cypher queries (
$name, not string concatenation). This prevents injection and enables query plan caching. - Use MERGE instead of CREATE when you want upsert semantics. CREATE always creates new nodes, even duplicates.
- Always close sessions after use. Use try/finally or a wrapper to ensure cleanup. Sessions hold resources on the server.
- Use
executeReadfor reads andexecuteWritefor writes. This enables routing in clustered deployments — reads go to followers, writes go to the leader. - Create indexes on properties you filter by. Without an index, MATCH scans all nodes of that label. Add constraints for unique properties.
- Keep relationships directional but query flexibly. Store
(a)-[:KNOWS]->(b)but query with(a)-[:KNOWS]-(b)(no arrow) when direction doesn't matter. - Batch large writes. Use UNWIND to process arrays of data in a single query
instead of running thousands of individual CREATE statements:
UNWIND $people AS person CREATE (p:Person) SET p = person
Common Pitfalls
- Treating Neo4j like a relational database. Don't model everything as properties on one node type. If "category" is something you traverse or share, make it a node with a relationship, not a string property.
- Cartesian products in MATCH. Multiple disconnected patterns in one MATCH clause create a cross-product. Connect them or split into separate MATCH clauses.
- Unbounded variable-length paths.
[:KNOWS*]with no upper bound can traverse the entire graph. Always set a limit:[:KNOWS*..5]. - Forgetting indexes. Without indexes, every MATCH by property scans all nodes with that label. Queries go from milliseconds to seconds as data grows.
- Using Neo4j for tabular analytics. Neo4j is not for "give me all users sorted by signup date." Use it for relationship-centric queries. Pair it with a relational or analytical database for tabular workloads.
- Integer handling in the driver. Neo4j uses 64-bit integers which JavaScript
cannot represent natively. The driver returns
neo4j.Integerobjects — call.toNumber()for safe values or.toString()for large IDs.
Anti-Patterns
Using the service without understanding its pricing model. Cloud services bill differently — per request, per GB, per seat. Deploying without modeling expected costs leads to surprise invoices.
Hardcoding configuration instead of using environment variables. API keys, endpoints, and feature flags change between environments. Hardcoded values break deployments and leak secrets.
Ignoring the service's rate limits and quotas. Every external API has throughput limits. Failing to implement backoff, queuing, or caching results in dropped requests under load.
Treating the service as always available. External services go down. Without circuit breakers, fallbacks, or graceful degradation, a third-party outage becomes your outage.
Coupling your architecture to a single provider's API. Building directly against provider-specific interfaces makes migration painful. Wrap external services in thin adapter layers.
Install this skill directly: skilldb add database-services-skills
Related Skills
Cassandra
Build with Apache Cassandra for high-availability distributed data. Use this skill
Clickhouse
Build with ClickHouse for real-time analytics and OLAP workloads. Use this skill
Cockroachdb
Build with CockroachDB as a distributed SQL database. Use this skill when the
Convex
Build with Convex as a reactive backend. Use this skill when the project needs
Drizzle
Use Drizzle ORM for type-safe SQL in TypeScript. Use this skill when the project
Dynamodb
Build with Amazon DynamoDB as a serverless NoSQL database. Use this skill when