DocRaptor

"DocRaptor: HTML-to-PDF API, Prince XML engine, CSS print styles, headers/footers, page breaks, async documents"

DocRaptor Document Generation

Core Philosophy

DocRaptor is a hosted API powered by the Prince XML rendering engine, which offers the most complete CSS print specification support available. Choose DocRaptor when you need advanced print features like running headers, footnotes, cross-references, named pages, and CSS paged media that no browser engine handles well. It offloads rendering to an external service, eliminating the need to manage headless Chrome instances. The trade-off is network latency and a per-document cost, so it fits best for high-fidelity documents where layout precision justifies the expense.

Setup

// package.json dependencies
// "axios" (required: the client below calls the HTTP API directly)
// "docraptor": "^3.0.0"  (optional alternative to raw HTTP)

import axios, { AxiosInstance } from "axios";

interface DocRaptorConfig {
  apiKey: string;
  baseUrl?: string;
  timeout?: number;
}

class DocRaptorClient {
  private http: AxiosInstance;

  constructor(private config: DocRaptorConfig) {
    this.http = axios.create({
      // Dedicated API host; override via config.baseUrl if your account uses another.
      baseURL: config.baseUrl ?? "https://api.docraptor.com",
      timeout: config.timeout ?? 120_000,
      auth: {
        username: config.apiKey,
        password: "", // DocRaptor uses API key as username, no password
      },
    });
  }

  async createDocument(options: DocumentOptions): Promise<Buffer> {
    const response = await this.http.post(
      "/docs",
      {
        doc: {
          document_type: options.type ?? "pdf",
          document_content: options.html,
          document_url: options.url,
          name: options.name ?? "document.pdf",
          test: options.test ?? false,
          prince_options: {
            media: "print",
            ...options.princeOptions,
          },
          javascript: options.enableJavascript ?? false,
        },
      },
      { responseType: "arraybuffer" }
    );

    return Buffer.from(response.data);
  }

  async createAsyncDocument(options: DocumentOptions): Promise<string> {
    const response = await this.http.post("/async_docs", {
      doc: {
        document_type: options.type ?? "pdf",
        document_content: options.html,
        document_url: options.url,
        name: options.name ?? "document.pdf",
        test: options.test ?? false,
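        prince_options: {
          media: "print",
          ...options.princeOptions,
        },
        // Keep parity with createDocument so enableJavascript is not silently ignored
        javascript: options.enableJavascript ?? false,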
      },
    });

    return response.data.status_id;
  }

  async getAsyncStatus(
    statusId: string
  ): Promise<{ status: string; download_url?: string }> {
    const response = await this.http.get(
      `/status/${statusId}`
    );
    return response.data;
  }

  async waitForDocument(
    statusId: string,
    pollIntervalMs: number = 2000,
    maxWaitMs: number = 120_000
  ): Promise<Buffer> {
    const start = Date.now();

    while (Date.now() - start < maxWaitMs) {
      const status = await this.getAsyncStatus(statusId);

      if (status.status === "completed" && status.download_url) {
        const response = await this.http.get(status.download_url, {
          responseType: "arraybuffer",
        });
        return Buffer.from(response.data);
      }

      if (status.status === "failed") {
        throw new Error(`Document generation failed: ${statusId}`);
      }

      await new Promise((r) => setTimeout(r, pollIntervalMs));
    }

    throw new Error(`Timed out waiting for document: ${statusId}`);
  }
}

interface DocumentOptions {
  html?: string;
  url?: string;
  name?: string;
  type?: "pdf" | "xls" | "xlsx";
  test?: boolean;
  enableJavascript?: boolean;
  princeOptions?: Record<string, unknown>;
}
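
The client above forwards both `document_content` and `document_url`, so a caller can accidentally supply neither or both. One way to guard against that is a small check before the request is built. This is a minimal sketch: `resolveDocumentSource`, `SourceOptions`, and `DocumentSource` are hypothetical names introduced here for illustration and are not part of any DocRaptor library.

```typescript
// Hypothetical helper (illustration only): ensures a request carries exactly
// one content source before it is sent.
interface SourceOptions {
  html?: string;
  url?: string;
}

type DocumentSource =
  | { document_content: string }
  | { document_url: string };

function resolveDocumentSource(options: SourceOptions): DocumentSource {
  const hasHtml = typeof options.html === "string" && options.html.length > 0;
  const hasUrl = typeof options.url === "string" && options.url.length > 0;

  if (hasHtml === hasUrl) {
    // Either both sources were provided or neither was.
    throw new Error("Provide exactly one of `html` or `url`.");
  }

  return hasHtml
    ? { document_content: options.html as string }
    : { document_url: options.url as string };
}
```

The returned object could be spread into the `doc` payload in place of the separate `document_content` and `document_url` fields.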

Key Techniques

CSS Print Styles for Prince

DocRaptor's power comes from Prince XML's CSS support. Structure your HTML with print-specific styles:

function buildInvoiceHtml(data: InvoiceData): string {
  return `<!DOCTYPE html>
<html>
<head>
<style>
  @page {
    size: A4;
    margin: 20mm 15mm 25mm 15mm;

    @top-left {
      content: "Invoice #${data.number}";
      font-size: 9pt;
      color: #6b7280;
    }

    @top-right {
      content: "${data.companyName}";
      font-size: 9pt;
      color: #6b7280;
    }

    @bottom-center {
      content: "Page " counter(page) " of " counter(pages);
      font-size: 8pt;
      color: #9ca3af;
    }
  }

  /* First page has different margins for letterhead */
  @page :first {
    margin-top: 40mm;

    @top-left { content: none; }
    @top-right { content: none; }
  }

  body {
    font-family: "Helvetica Neue", Helvetica, sans-serif;
    font-size: 10pt;
    line-height: 1.5;
    color: #1f2937;
  }

  table {
    width: 100%;
    border-collapse: collapse;
    page-break-inside: auto;
  }

  tr {
    page-break-inside: avoid;
    page-break-after: auto;
  }

  thead {
    display: table-header-group; /* Repeat header on each page */
  }

  .section { page-break-inside: avoid; }
  .page-break { page-break-before: always; }
  .keep-together { page-break-inside: avoid; }
</style>
</head>
<body>
  ${buildInvoiceBody(data)}
</body>
</html>`;
}
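
The template above interpolates `data.number` and `data.companyName` straight into CSS `content` strings, and it calls a `buildInvoiceBody` function that is not shown. The sketch below illustrates one way to escape values for both contexts. The `InvoiceData` shape, the `InvoiceLine` type, and the `buildInvoiceBody` implementation are assumptions made for this example; the real data model may differ.

```typescript
// Assumed shapes for illustration only; the original does not define them.
interface InvoiceLine {
  description: string;
  quantity: number;
  unitPrice: number;
}

interface InvoiceData {
  number: string;
  companyName: string;
  lines: InvoiceLine[];
}

// Escape text placed in HTML element content or attribute values.
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Escape text placed inside a double-quoted CSS string such as content: "...".
// "<" is escaped too so a value can never close the surrounding <style> element.
function escapeCssString(text: string): string {
  return text
    .replace(/\\/g, "\\\\")
    .replace(/"/g, '\\"')
    .replace(/</g, "\\3c ")
    .replace(/\r?\n/g, "\\a ");
}

// Hypothetical body builder: one table row per invoice line.
function buildInvoiceBody(data: InvoiceData): string {
  const rows = data.lines
    .map(
      (line) => `
      <tr>
        <td>${escapeHtml(line.description)}</td>
        <td>${line.quantity}</td>
        <td>${(line.quantity * line.unitPrice).toFixed(2)}</td>
      </tr>`
    )
    .join("");

  return `
  <h1>Invoice #${escapeHtml(data.number)}</h1>
  <table>
    <thead><tr><th>Description</th><th>Qty</th><th>Amount</th></tr></thead>
    <tbody>${rows}</tbody>
  </table>`;
}
```

With these helpers, the header rule could be written as `content: "${escapeCssString(data.companyName)}"`, so a company name containing a double quote does not break the stylesheet.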

Synchronous PDF Generation

async function generateInvoice(
  client: DocRaptorClient,
  data: InvoiceData
): Promise<Buffer> {
  const html = buildInvoiceHtml(data);

  return client.createDocument({
    html,
    name: `invoice-${data.number}.pdf`,
    test: process.env.NODE_ENV !== "production",
  });
}

// Express handler
import { Request, Response } from "express";

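async function downloadInvoice(req: Request, res: Response): Promise<void> {
  try {
    const client = new DocRaptorClient({ apiKey: process.env.DOCRAPTOR_API_KEY! });
    const data = await fetchInvoiceData(req.params.id);
    const pdf = await generateInvoice(client, data);

    res.setHeader("Content-Type", "application/pdf");
    res.setHeader("Content-Disposition", `attachment; filename="invoice-${data.number}.pdf"`);
    res.send(pdf);
  } catch (err) {
    // Express 4 does not catch rejections from async handlers, so respond here
    console.error("Invoice PDF generation failed", err);
    res.status(502).send("Could not generate invoice PDF");
  }
}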
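
Because the synchronous request uses `responseType: "arraybuffer"`, any error body also arrives as raw bytes rather than text. The sketch below decodes such a body into a readable message. It assumes the service returns a textual error body, which this example does not verify, and `describeRenderError` is a hypothetical helper name.

```typescript
// Hypothetical helper: turns a raw error body into a log-friendly message.
// Assumes the body, if present, is UTF-8 text.
function describeRenderError(
  status: number,
  body: ArrayBuffer | Uint8Array | undefined
): string {
  const bytes =
    body === undefined
      ? new Uint8Array(0)
      : body instanceof Uint8Array
        ? body
        : new Uint8Array(body);

  const text = new TextDecoder("utf-8").decode(bytes).trim();

  // Cap the detail so an unexpectedly large body does not flood the logs.
  const detail = text.length > 0 ? text.slice(0, 500) : "(empty response body)";

  return `DocRaptor request failed with HTTP ${status}: ${detail}`;
}
```

In a `catch` block around `createDocument`, the status and data of the Axios error response could be passed to this function before rethrowing.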

Async Generation for Large Documents

async function generateLargeReport(
  client: DocRaptorClient,
  reportHtml: string
): Promise<Buffer> {
  const statusId = await client.createAsyncDocument({
    html: reportHtml,
    name: "annual-report.pdf",
    test: process.env.NODE_ENV !== "production", // free, watermarked output outside production
  });

  // Poll until complete
  return client.waitForDocument(statusId, 3000, 300_000);
}
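
Polling at a fixed interval works, but long-running documents may be better served by a delay that grows between checks. The sketch below shows one such loop with exponential backoff. It takes the status lookup as a function so it can be exercised without network access; `pollForDownloadUrl`, `PollOptions`, and the default timings are assumptions for illustration, and only the `"completed"` and `"failed"` states used elsewhere in this document are interpreted.

```typescript
interface AsyncStatus {
  status: string;
  download_url?: string;
}

interface PollOptions {
  initialDelayMs?: number;
  maxDelayMs?: number;
  maxWaitMs?: number;
  // Injectable for tests; defaults to a real timer.
  sleep?: (ms: number) => Promise<void>;
}

async function pollForDownloadUrl(
  getStatus: () => Promise<AsyncStatus>,
  options: PollOptions = {}
): Promise<string> {
  const {
    initialDelayMs = 1000,
    maxDelayMs = 10_000,
    maxWaitMs = 300_000,
    sleep = (ms: number) =>
      new Promise<void>((resolve) => setTimeout(resolve, ms)),
  } = options;

  let delay = initialDelayMs;
  let waited = 0; // Sum of sleep durations, not wall-clock time.

  while (waited <= maxWaitMs) {
    const current = await getStatus();

    if (current.status === "completed" && current.download_url) {
      return current.download_url;
    }
    if (current.status === "failed") {
      throw new Error("Document generation failed");
    }

    await sleep(delay);
    waited += delay;
    delay = Math.min(delay * 2, maxDelayMs); // Double the delay up to the cap.
  }

  throw new Error(`Timed out after waiting ${waited} ms`);
}
```

A caller could pass `() => client.getAsyncStatus(statusId)` as the lookup and then download the returned URL.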

Page Breaks and Named Pages

function buildContractHtml(sections: ContractSection[]): string {
  const css = `
    @page cover { margin: 0; }
    @page content { margin: 25mm 20mm; }
    @page signature {
      margin: 25mm 20mm 40mm 20mm;
      @bottom-center { content: none; }
    }

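    .cover { page: cover; }
    .content-section { page: content; }
    .signature-page { page: signature; page-break-before: always; }
    /* Applied to the content sections below; must be defined in this stylesheet */
    .keep-together { page-break-inside: avoid; }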
  `;

  const body = sections
    .map(
      (s) => `
    <div class="content-section keep-together">
      <h2>${s.title}</h2>
      <div>${s.body}</div>
    </div>`
    )
    .join("\n");

  return `<!DOCTYPE html>
<html><head><style>${css}</style></head>
<body>
  <div class="cover">
    <h1>Service Agreement</h1>
  </div>
  ${body}
  <div class="signature-page">
    <h2>Signatures</h2>
    <div class="signature-block">
      <div>____________________________</div>
      <p>Authorized Signatory</p>
    </div>
  </div>
</body></html>`;
}

interface ContractSection {
  title: string;
  body: string;
}

Watermarks and Background Images

function addWatermark(html: string, watermarkText: string): string {
  const watermarkCss = `
    @page {
      @prince-overlay {
        content: "${watermarkText}";
        font-size: 80pt;
        color: rgba(200, 200, 200, 0.3);
        transform: rotate(-45deg);
        text-align: center;
        vertical-align: middle;
      }
    }
  `;

  return html.replace("</style>", `${watermarkCss}</style>`);
}
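
The `addWatermark` function above silently does nothing when the HTML has no `</style>` tag, and a watermark text containing a double quote would break the CSS string. The variant below is a sketch that addresses both points by inserting its own `<style>` element before `</head>`. The name `addWatermarkSafely` is hypothetical, and the overlay rule simply reuses the `@prince-overlay` declarations from the original.

```typescript
// Hypothetical variant of addWatermark: escapes the text and fails loudly
// when there is no <head> to attach the stylesheet to.
function addWatermarkSafely(html: string, watermarkText: string): string {
  // Escape characters that would end the CSS string or the <style> element.
  const safeText = watermarkText
    .replace(/\\/g, "\\\\")
    .replace(/"/g, '\\"')
    .replace(/</g, "\\3c ");

  const styleBlock = `<style>
  @page {
    @prince-overlay {
      content: "${safeText}";
      font-size: 80pt;
      color: rgba(200, 200, 200, 0.3);
      transform: rotate(-45deg);
      text-align: center;
      vertical-align: middle;
    }
  }
</style>`;

  const headClose = /<\/head>/i;
  if (!headClose.test(html)) {
    throw new Error("Cannot add watermark: document has no </head> tag");
  }

  // A replacer function avoids "$" sequences in the text being treated
  // as special replacement patterns by String.prototype.replace.
  return html.replace(headClose, () => `${styleBlock}</head>`);
}
```

Throwing on a missing `</head>` is a design choice for this sketch; returning the HTML unchanged would hide the problem until someone notices the watermark is absent.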

Best Practices

  • Use test: true during development to avoid consuming paid API calls; test documents include a watermark but are otherwise identical.
  • Use the async API for documents that take more than a few seconds, typically anything over 20 pages or with heavy images.
  • Leverage CSS paged media (@page, @top-left, counter(page)) for headers, footers, and page numbers rather than manually inserting them in HTML.
  • Use page-break-inside: avoid on sections that must not split across pages, such as signature blocks and table rows.
  • Use thead { display: table-header-group; } so table headers repeat on every page.
  • Send complete self-contained HTML with inline styles or embedded CSS; external stylesheet fetches add latency and can fail.
  • Set prince_options.media to "print" so print-specific CSS rules apply.
  • Cache generated PDFs by content hash; regenerating identical documents wastes both time and API quota.
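
The caching advice above can be sketched as a key derived from the HTML and render options, plus a lookup that renders only on a miss. This is a minimal in-memory illustration: `contentKey` and `PdfCache` are hypothetical names, and a real deployment would more likely persist PDFs in object storage or a database.

```typescript
import { createHash } from "node:crypto";

// Derive a stable key from the document content and the options that
// affect rendering. JSON.stringify is key-order sensitive, so callers
// should build the options object consistently.
function contentKey(
  html: string,
  options: Record<string, unknown> = {}
): string {
  return createHash("sha256")
    .update(html)
    .update("\u0000") // Separator so html and options cannot blur together.
    .update(JSON.stringify(options))
    .digest("hex");
}

// Hypothetical in-memory cache that also de-duplicates concurrent renders.
class PdfCache {
  private store = new Map<string, Uint8Array>();
  private inFlight = new Map<string, Promise<Uint8Array>>();

  async getOrRender(
    key: string,
    render: () => Promise<Uint8Array>
  ): Promise<Uint8Array> {
    const cached = this.store.get(key);
    if (cached) return cached;

    const pending = this.inFlight.get(key);
    if (pending) return pending;

    const job = render()
      .then((pdf) => {
        this.store.set(key, pdf); // Only successful renders are cached.
        return pdf;
      })
      .finally(() => {
        this.inFlight.delete(key);
      });

    this.inFlight.set(key, job);
    return job;
  }
}
```

A caller might compute `contentKey(html, { test })` and pass a closure around `client.createDocument` as the `render` argument.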

Anti-Patterns

  • Using document_url for authenticated or per-request content. DocRaptor fetches the URL from its own servers, so a page that requires authentication or renders differently per request will not match what the user sees. Send the rendered HTML as document_content instead.
  • Ignoring the test flag. Test documents are free but watermarked, so generating them in production ships watermarked PDFs to users; generating paid documents in development wastes money. Toggle based on environment.
  • Embedding massive base64 images in HTML. This inflates payload size and can hit API limits. Host images at accessible URLs and reference them with <img> tags.
  • Using browser-specific CSS. DocRaptor uses Prince, not Chrome. -webkit- prefixes and browser-specific features will not work. Stick to standard CSS and Prince extensions.
  • Not handling async failures. Documents can fail to generate. Always check the status response and handle the "failed" state.
  • Sending sensitive data without HTTPS. DocRaptor's API is HTTPS-only, but if you use document_url, ensure the source URL is also HTTPS so data is encrypted in transit.
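
Two of the items above, the environment-based test flag and the HTTPS requirement for `document_url`, lend themselves to small guard functions. The following is a sketch under the assumption that `NODE_ENV` is the environment signal, as in the earlier `generateInvoice` example; `shouldUseTestMode` and `assertHttpsUrl` are hypothetical names.

```typescript
// Test mode everywhere except production, mirroring the NODE_ENV check
// used in generateInvoice.
function shouldUseTestMode(nodeEnv: string | undefined): boolean {
  return nodeEnv !== "production";
}

// Reject source URLs that would be fetched over an unencrypted connection.
function assertHttpsUrl(rawUrl: string): URL {
  const parsed = new URL(rawUrl); // Throws TypeError on malformed input.

  if (parsed.protocol !== "https:") {
    throw new Error(`document_url must use HTTPS, got "${parsed.protocol}"`);
  }

  return parsed;
}
```

Calling `assertHttpsUrl(options.url)` before building the request would surface a plain-HTTP source at the call site instead of after the document has been rendered.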
