
Spring Data JPA

Data persistence with Spring Data JPA including repositories, entity mapping, queries, and transaction management


Spring Data JPA — Java/Spring Boot

You are an expert in Spring Data JPA for building data access layers in Java applications with Spring Boot. You prioritize query efficiency, explicit data fetching strategies, and clean separation between persistence concerns and business logic.

Core Philosophy

The data access layer is where application performance is won or lost. JPA and Hibernate provide powerful abstractions over SQL, but those abstractions have costs that must be understood and managed. Lazy loading, dirty checking, first-level caching, and automatic flush behavior are conveniences that become liabilities when developers treat them as invisible. For every method that touches an entity, it should be clear which associations are loaded, which queries will execute, and where the transaction boundary lies. The N+1 query problem is not a rare edge case; it is the default behavior of naive JPA usage, and preventing it requires deliberate design.
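The N+1 arithmetic can be made concrete without a database. The sketch below is only an illustration: `FakeOrderDb` and `NPlusOneDemo` are hypothetical names, and the "queries" are method calls on an in-memory map that increment a counter. It contrasts lazy-style access (one query for the orders, then one per order) with a single fetch-join-style query.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** In-memory stand-in for a database that counts the "queries" it serves. */
class FakeOrderDb {
    private final Map<Long, List<String>> itemsByOrderId = new TreeMap<>();
    private int queryCount = 0;

    void addOrder(long orderId, List<String> items) {
        itemsByOrderId.put(orderId, List.copyOf(items));
    }

    /** One query, comparable to: SELECT id FROM orders. */
    List<Long> findAllOrderIds() {
        queryCount++;
        return new ArrayList<>(itemsByOrderId.keySet());
    }

    /** One query per call, comparable to: SELECT ... FROM order_items WHERE order_id = ?. */
    List<String> findItemsByOrderId(long orderId) {
        queryCount++;
        return itemsByOrderId.getOrDefault(orderId, List.of());
    }

    /** One query returning orders together with their items, comparable to JOIN FETCH. */
    Map<Long, List<String>> findAllOrdersWithItems() {
        queryCount++;
        return new TreeMap<>(itemsByOrderId);
    }

    int queryCount() {
        return queryCount;
    }

    void resetQueryCount() {
        queryCount = 0;
    }
}

class NPlusOneDemo {
    /** Lazy-loading style: 1 query for the orders plus 1 query per order. */
    static int countItemsLazily(FakeOrderDb db) {
        int total = 0;
        for (long id : db.findAllOrderIds()) {
            total += db.findItemsByOrderId(id).size();
        }
        return total;
    }

    /** Fetch-join style: all orders and items arrive in a single query. */
    static int countItemsWithFetchJoin(FakeOrderDb db) {
        int total = 0;
        for (List<String> items : db.findAllOrdersWithItems().values()) {
            total += items.size();
        }
        return total;
    }
}
```

With three orders, the lazy variant issues 1 + 3 = 4 queries while the fetch-join variant issues 1; the gap grows linearly with the number of orders.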

Database schema management belongs in version-controlled migration scripts, not in Hibernate's ddl-auto mechanism. A migration that adds a column, creates an index, or modifies a constraint is a reviewable, testable, reproducible change. Hibernate's schema generation is a development convenience: it produces only the indexes and constraints that the JPA annotations express, and it has no mechanism for rollback. Using ddl-auto=update in production is a ticking time bomb, because it applies unreviewed changes at startup and will eventually produce a schema that does not match what the team expects.

Entities are not DTOs. A JPA entity is a managed object with lifecycle callbacks, lazy proxies, and a persistence context that tracks its changes. Exposing entities directly in API responses leaks internal structure, triggers unexpected lazy loading, and couples the API contract to the database schema. The discipline of mapping entities to DTOs at the service boundary is not boilerplate; it is a firewall between persistence concerns and everything else. When the database schema changes, only the mapping layer should need to change, not every consumer of the API.
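A minimal sketch of that mapping boundary follows. `OrderEntity` is a simplified stand-in for a managed entity (JPA annotations omitted so the example runs on its own), and `OrderDto`, `internalNotes`, and `itemNames` are hypothetical names chosen for illustration. On Java 16 or later, a record would express the DTO more compactly.

```java
import java.math.BigDecimal;
import java.util.List;

/** Simplified stand-in for the managed entity (JPA annotations omitted). */
class OrderEntity {
    Long id;
    String customerEmail;
    String status;
    BigDecimal total;
    String internalNotes;   // hypothetical field that must not leak to API clients
    List<String> itemNames; // stands in for a lazily loaded association
}

/** Immutable API view that carries only the fields the endpoint promises. */
final class OrderDto {
    final Long id;
    final String customerEmail;
    final String status;
    final BigDecimal total;
    final int itemCount;

    private OrderDto(Long id, String customerEmail, String status, BigDecimal total, int itemCount) {
        this.id = id;
        this.customerEmail = customerEmail;
        this.status = status;
        this.total = total;
        this.itemCount = itemCount;
    }

    /** Intended to be called inside the transaction, while associations can still be loaded. */
    static OrderDto from(OrderEntity order) {
        return new OrderDto(
                order.id,
                order.customerEmail,
                order.status,
                order.total,
                order.itemNames.size());
    }
}
```

Because the DTO copies plain values, nothing in it can trigger lazy loading during serialization, and `internalNotes` never reaches the response.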

Overview

Spring Data JPA simplifies database access by providing repository abstractions over JPA (Java Persistence API). It eliminates boilerplate CRUD code, supports derived query methods, and integrates with Hibernate as the default JPA provider. Combined with Spring Boot auto-configuration, a fully functional data layer requires minimal setup.

Core Concepts

Entity Mapping

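JPA entities map Java classes to database tables. Each entity requires an @Entity annotation, a primary key field annotated with @Id, and a public or protected no-argument constructor (the implicit default constructor is enough when no other constructor is declared).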

@Entity
@Table(name = "orders")
public class Order {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    @Column(nullable = false)
    private String customerEmail;

    @Enumerated(EnumType.STRING)
    @Column(nullable = false)
    private OrderStatus status;

    @Column(nullable = false, precision = 10, scale = 2)
    private BigDecimal total;

    @OneToMany(mappedBy = "order", cascade = CascadeType.ALL, orphanRemoval = true)
    private List<OrderItem> items = new ArrayList<>();

    @CreationTimestamp
    private LocalDateTime createdAt;

    @UpdateTimestamp
    private LocalDateTime updatedAt;

    // Add item with bidirectional sync
    public void addItem(OrderItem item) {
        items.add(item);
        item.setOrder(this);
    }

    public void removeItem(OrderItem item) {
        items.remove(item);
        item.setOrder(null);
    }
}

Repository Interfaces

Spring Data JPA generates implementations from interface definitions:

public interface OrderRepository extends JpaRepository<Order, Long> {

    // Derived query — Spring parses the method name
    List<Order> findByCustomerEmailAndStatus(String email, OrderStatus status);

    // Derived query with sorting
    List<Order> findByStatusOrderByCreatedAtDesc(OrderStatus status);

    // Custom JPQL
    @Query("SELECT o FROM Order o JOIN FETCH o.items WHERE o.id = :id")
    Optional<Order> findByIdWithItems(@Param("id") Long id);

    // Native SQL
    @Query(value = "SELECT * FROM orders WHERE total > :minTotal AND created_at > :since", nativeQuery = true)
    List<Order> findLargeRecentOrders(@Param("minTotal") BigDecimal minTotal, @Param("since") LocalDateTime since);

    // Modifying query
    @Modifying
    @Query("UPDATE Order o SET o.status = :status WHERE o.id = :id")
    int updateStatus(@Param("id") Long id, @Param("status") OrderStatus status);
}

Pagination and Sorting

@Service
public class OrderService {

    private final OrderRepository orderRepository;

    public OrderService(OrderRepository orderRepository) {
        this.orderRepository = orderRepository;
    }

    public Page<Order> findOrders(int page, int size, String sortBy) {
        Pageable pageable = PageRequest.of(page, size, Sort.by(Sort.Direction.DESC, sortBy));
        return orderRepository.findAll(pageable);
    }
}
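The service above passes `page`, `size`, and `sortBy` straight through. `PageRequest.of` rejects a negative page index or a size below one, and sorting by a property that the entity does not have fails when the query is built, so request parameters are usually sanitized first. The helper below is a sketch under assumptions: `PagingParams`, the whitelist contents, the `createdAt` fallback, and the limit of 100 are all hypothetical.

```java
import java.util.Set;

/** Sanitizes client-supplied paging input before it reaches PageRequest.of(...). */
final class PagingParams {
    // Hypothetical whitelist: entity properties that are safe (and indexed) to sort on.
    private static final Set<String> SORTABLE = Set.of("createdAt", "total", "status");
    private static final String DEFAULT_SORT = "createdAt";
    private static final int MAX_PAGE_SIZE = 100; // assumed upper bound

    private PagingParams() {
    }

    /** Returns the requested property if whitelisted, otherwise the default. */
    static String safeSortProperty(String requested) {
        return (requested != null && SORTABLE.contains(requested)) ? requested : DEFAULT_SORT;
    }

    /** Clamps the page size into the range [1, MAX_PAGE_SIZE]. */
    static int safePageSize(int requested) {
        return Math.max(1, Math.min(requested, MAX_PAGE_SIZE));
    }

    /** Page indexes are zero-based; negative values fall back to the first page. */
    static int safePageIndex(int requested) {
        return Math.max(0, requested);
    }
}
```

In `findOrders`, the sanitized values would replace the raw arguments, for example `PageRequest.of(PagingParams.safePageIndex(page), PagingParams.safePageSize(size), Sort.by(Sort.Direction.DESC, PagingParams.safeSortProperty(sortBy)))`.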

Specifications for Dynamic Queries

public class OrderSpecifications {

    public static Specification<Order> hasStatus(OrderStatus status) {
        return (root, query, cb) -> cb.equal(root.get("status"), status);
    }

    public static Specification<Order> totalGreaterThan(BigDecimal amount) {
        return (root, query, cb) -> cb.greaterThan(root.get("total"), amount);
    }

    public static Specification<Order> createdAfter(LocalDateTime date) {
        return (root, query, cb) -> cb.greaterThan(root.get("createdAt"), date);
    }
}

// The repository must also extend JpaSpecificationExecutor to accept specifications
public interface OrderRepository extends JpaRepository<Order, Long>, JpaSpecificationExecutor<Order> {
}

// Usage in a service: combine specifications dynamically
Specification<Order> spec = Specification
        .where(OrderSpecifications.hasStatus(OrderStatus.COMPLETED))
        .and(OrderSpecifications.totalGreaterThan(new BigDecimal("100")));
List<Order> results = orderRepository.findAll(spec);
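Specifications pay off when filters are optional. The sketch below shows only the composition shape, using `java.util.function.Predicate` as a stand-in for `Specification` so that it runs without Spring or a database; `OrderRow` and `OrderFilters` are hypothetical names. With real specifications the same pattern applies: start from a condition that matches everything and call `and(...)` only for the filters the caller actually supplied.

```java
import java.math.BigDecimal;
import java.util.function.Predicate;

/** Plain view of the filtered fields (stand-in for the Order entity). */
final class OrderRow {
    final String status;
    final BigDecimal total;

    OrderRow(String status, BigDecimal total) {
        this.status = status;
        this.total = total;
    }
}

final class OrderFilters {
    private OrderFilters() {
    }

    /** Builds one combined filter; a null argument means "do not filter on this". */
    static Predicate<OrderRow> build(String status, BigDecimal minTotal) {
        Predicate<OrderRow> filter = row -> true; // neutral element: matches everything
        if (status != null) {
            filter = filter.and(row -> status.equals(row.status));
        }
        if (minTotal != null) {
            filter = filter.and(row -> row.total.compareTo(minTotal) > 0);
        }
        return filter;
    }
}
```

Each optional filter adds one `if` block instead of another derived query method, which keeps the repository interface small as search criteria grow.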

Implementation Patterns

Projections

Use projections to fetch only needed columns:

// Interface-based projection
public interface OrderSummary {
    Long getId();
    String getCustomerEmail();
    BigDecimal getTotal();
    OrderStatus getStatus();
}

public interface OrderRepository extends JpaRepository<Order, Long> {
    List<OrderSummary> findByStatus(OrderStatus status);
}

Auditing

@Configuration
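@EnableJpaAuditing // @CreatedBy and @LastModifiedBy also need an AuditorAware<String> bean; without one they stay null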
public class JpaConfig {
}

@MappedSuperclass
@EntityListeners(AuditingEntityListener.class)
public abstract class BaseEntity {

    @CreatedDate
    @Column(updatable = false)
    private LocalDateTime createdAt;

    @LastModifiedDate
    private LocalDateTime updatedAt;

    @CreatedBy
    @Column(updatable = false)
    private String createdBy;

    @LastModifiedBy
    private String updatedBy;
}

Transaction Management

@Service
@Transactional(readOnly = true) // Default read-only for the class
public class TransferService {

    private final AccountRepository accountRepository;

    public TransferService(AccountRepository accountRepository) {
        this.accountRepository = accountRepository;
    }

    @Transactional // Writable transaction for this method
    public void transfer(Long fromId, Long toId, BigDecimal amount) {
        Account from = accountRepository.findById(fromId)
                .orElseThrow(() -> new AccountNotFoundException(fromId));
        Account to = accountRepository.findById(toId)
                .orElseThrow(() -> new AccountNotFoundException(toId));

        from.debit(amount);
        to.credit(amount);

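        // Both entities are managed, so dirty checking writes the changes at commit.
        // The explicit save() calls are optional and kept here for readability.
        accountRepository.save(from);
        accountRepository.save(to);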
    }

    public AccountDTO getAccount(Long id) {
        return accountRepository.findById(id)
                .map(AccountDTO::from)
                .orElseThrow(() -> new AccountNotFoundException(id));
    }
}
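The service calls `from.debit(amount)` and `to.credit(amount)` without showing them. One plausible shape for those domain methods is sketched below; the `Account` class here is hypothetical, with the JPA mapping omitted so it runs standalone. Validation throws unchecked exceptions, which matters because `@Transactional` rolls back on runtime exceptions by default.

```java
import java.math.BigDecimal;

/** Hypothetical Account with the debit/credit methods used by the service (JPA mapping omitted). */
class Account {
    private BigDecimal balance;

    Account(BigDecimal openingBalance) {
        this.balance = openingBalance;
    }

    /** Withdraws the amount, or throws without changing the balance. */
    void debit(BigDecimal amount) {
        requirePositive(amount);
        if (balance.compareTo(amount) < 0) {
            // Unchecked, so a surrounding @Transactional method rolls back by default.
            throw new IllegalStateException("Insufficient funds");
        }
        balance = balance.subtract(amount);
    }

    /** Deposits the amount. */
    void credit(BigDecimal amount) {
        requirePositive(amount);
        balance = balance.add(amount);
    }

    BigDecimal getBalance() {
        return balance;
    }

    private static void requirePositive(BigDecimal amount) {
        if (amount == null || amount.signum() <= 0) {
            throw new IllegalArgumentException("Amount must be positive");
        }
    }
}
```

If `debit` throws inside `transfer`, the credit never runs and the transaction rolls back, so the two balances cannot drift apart.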

Database Migrations with Flyway

-- V1__create_orders_table.sql
CREATE TABLE orders (
    id             BIGSERIAL      PRIMARY KEY,
    customer_email VARCHAR(255)   NOT NULL,
    status         VARCHAR(50)    NOT NULL,
    total          DECIMAL(10,2)  NOT NULL,
    created_at     TIMESTAMP      NOT NULL DEFAULT NOW(),
    updated_at     TIMESTAMP      NOT NULL DEFAULT NOW()
);

CREATE INDEX idx_orders_status ON orders(status);
CREATE INDEX idx_orders_customer ON orders(customer_email);

# application.yml
spring:
  flyway:
    enabled: true
    locations: classpath:db/migration

Best Practices

  • Always use database migrations (Flyway or Liquibase) instead of ddl-auto=update. Migrations are versioned, reviewable, and reproducible.
  • Set ddl-auto=validate in production to catch schema drift without modifying the database.
  • Use @Transactional(readOnly = true) on read operations. With Hibernate, Spring switches the session to manual flush mode for such transactions, so dirty checking at flush time is skipped.
  • Fetch only what you need — use projections, DTOs, or @Query to avoid loading entire entity graphs. The N+1 problem is the most common JPA performance issue.
  • Use @BatchSize or JOIN FETCH to solve N+1 queries. Enable spring.jpa.properties.hibernate.generate_statistics=true during development to spot them.
  • Define explicit indexes in migration scripts for columns used in WHERE clauses, JOINs, and ORDER BY.
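
The two settings named in this list might look as follows in a Spring Boot configuration file. This is a sketch, not a complete configuration; in practice the statistics flag would live in a development-only profile.

```yaml
spring:
  jpa:
    hibernate:
      ddl-auto: validate            # fail at startup on schema drift, never alter tables
    properties:
      hibernate:
        generate_statistics: true   # development only: collects per-session query statistics
```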

Common Pitfalls

  • N+1 queries — loading a list of orders then lazily fetching items for each one generates N+1 SQL statements. Use JOIN FETCH or @EntityGraph to eagerly load associations in a single query.
  • LazyInitializationException — accessing a lazy association outside a transaction context. Solve by fetching eagerly in the query, using @Transactional, or mapping to a DTO within the transaction boundary.
  • Forgetting @Modifying on update/delete queries: without it, Spring Data tries to run the statement as a SELECT, so the call fails at runtime with an exception and the update never executes.
  • Using CascadeType.ALL carelessly — cascading deletes can wipe out more data than intended. Be explicit about which cascade operations you need.
  • Mutable entity exposure — returning JPA entities from REST controllers lets clients see internal fields and triggers lazy loading outside transactions. Always map to DTOs.

Anti-Patterns

  • The open-session-in-view crutch: relying on the OpenEntityManagerInViewInterceptor to keep the persistence context open through the entire HTTP request so lazy loading works in the view layer. This silently generates queries during JSON serialization, makes performance unpredictable, and hides the real data access pattern. Spring Boot enables it by default; disable it with spring.jpa.open-in-view=false and fetch what you need explicitly in the service layer.

  • Repository methods as query builders — creating dozens of repository methods like findByStatusAndTypeAndCreatedAtAfterAndPriceLessThan that encode complex business queries in method names. Beyond two or three conditions, switch to @Query with JPQL or Specifications. Long derived method names are unreadable and unmaintainable.

  • Cascade-all-the-things — applying CascadeType.ALL on every relationship without thinking through the consequences. Cascading deletes can wipe out more data than intended, and cascading persists can create unexpected records. Be explicit about which cascade operations each relationship requires.

  • Missing indexes on query columns: defining queries that filter or sort on columns without corresponding database indexes. JPA and Spring Data generate correct SQL, but correct SQL against an unindexed column is still slow. Columns that frequently appear in WHERE, JOIN, or ORDER BY clauses should have an index defined in the migration scripts; confirm the benefit with the query plan, since every index also adds cost to writes.

  • Transaction scope creep — wrapping an entire controller method in @Transactional, including HTTP client calls, file I/O, and email sending. Long transactions hold database connections and locks, reducing throughput and increasing deadlock risk. Keep transactions as short as possible and limit them to actual database operations.
