# DALL-E Image Generation (OpenAI)

## Core Philosophy

DALL-E is OpenAI's image generation model, accessed through the same API and SDK used for GPT models. DALL-E 3 is the current generation, offering high-quality image creation with strong prompt adherence and text rendering capabilities. The API is intentionally simple: you provide a prompt and configuration parameters, and receive image URLs or base64 data. OpenAI rewrites your prompt internally to improve results, and returns the revised prompt alongside the image. DALL-E integrates naturally into applications already using the OpenAI SDK, making it the lowest-friction option for teams invested in the OpenAI ecosystem. The API supports generation, editing (inpainting), and variations, though DALL-E 3 currently supports only generation.

## Setup

Install and configure the OpenAI Node SDK:

```typescript
import OpenAI from "openai";

const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
});
```

The SDK reads `OPENAI_API_KEY` from the environment automatically if you omit the constructor argument, but explicit configuration is clearer.

For organization-scoped access:

```typescript
const openai = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  organization: process.env.OPENAI_ORG_ID,
});
```

## Key Techniques

### Text-to-Image with DALL-E 3

```typescript
interface GenerateImageOptions {
  prompt: string;
  size?: "1024x1024" | "1792x1024" | "1024x1792";
  quality?: "standard" | "hd";
  style?: "vivid" | "natural";
  responseFormat?: "url" | "b64_json";
}

async function generateImage(options: GenerateImageOptions): Promise<{
  url?: string;
  base64?: string;
  revisedPrompt: string;
}> {
  const response = await openai.images.generate({
    model: "dall-e-3",
    prompt: options.prompt,
    n: 1, // DALL-E 3 only supports n=1
    size: options.size ?? "1024x1024",
    quality: options.quality ?? "standard",
    style: options.style ?? "vivid",
    response_format: options.responseFormat ?? "url",
  });

  const image = response.data[0];
  return {
    url: image.url,
    base64: image.b64_json,
    revisedPrompt: image.revised_prompt!,
  };
}

// Usage
const result = await generateImage({
  prompt: "A watercolor painting of a cozy bookshop interior with warm lighting",
  size: "1792x1024",
  quality: "hd",
  style: "natural",
});

console.log("Revised prompt:", result.revisedPrompt);
console.log("Image URL:", result.url);
```
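Because DALL-E 3 accepts only `n: 1`, getting several candidates for one prompt means issuing parallel single-image requests. A minimal sketch of that pattern — `generateN` is an illustrative helper, not part of the SDK; it works with any single-image generator such as the `generateImage` wrapper above:

```typescript
// Run `n` single-image generations in parallel. `generate` is any
// function that produces one result per call (e.g. a wrapper around
// openai.images.generate with n: 1). Each image will differ slightly
// because DALL-E 3 rewrites the prompt independently per request.
async function generateN<T>(
  prompt: string,
  n: number,
  generate: (prompt: string) => Promise<T>
): Promise<T[]> {
  return Promise.all(Array.from({ length: n }, () => generate(prompt)));
}
```

For example, `generateN("a red fox", 4, (p) => generateImage({ prompt: p }))` yields four independent results; switch to `Promise.allSettled` if partial failures should not abort the whole batch.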

### DALL-E 2 for Multiple Images and Variations

DALL-E 2 supports generating multiple images per request and creating variations:

```typescript
async function generateMultiple(
  prompt: string,
  count: number = 4,
  size: "256x256" | "512x512" | "1024x1024" = "1024x1024"
): Promise<Array<{ url: string }>> {
  const response = await openai.images.generate({
    model: "dall-e-2",
    prompt,
    n: count, // DALL-E 2 supports up to 10
    size,
    response_format: "url",
  });

  return response.data.map((img) => ({ url: img.url! }));
}
```

### Image Variations

Generate variations of an existing image (DALL-E 2 only):

```typescript
import fs from "fs";

async function createVariation(
  imagePath: string,
  count: number = 1,
  size: "256x256" | "512x512" | "1024x1024" = "1024x1024"
): Promise<Array<{ url: string }>> {
  const imageFile = fs.createReadStream(imagePath);

  const response = await openai.images.createVariation({
    image: imageFile,
    n: count,
    size,
    response_format: "url",
  });

  return response.data.map((img) => ({ url: img.url! }));
}
```

### Image Editing (Inpainting)

Edit specific regions of an image using a mask (DALL-E 2 only):

```typescript
async function editImage(
  imagePath: string,
  maskPath: string,
  prompt: string,
  options?: {
    count?: number;
    size?: "256x256" | "512x512" | "1024x1024";
  }
): Promise<Array<{ url: string }>> {
  const response = await openai.images.edit({
    image: fs.createReadStream(imagePath),
    mask: fs.createReadStream(maskPath),
    prompt,
    n: options?.count ?? 1,
    size: options?.size ?? "1024x1024",
    response_format: "url",
  });

  return response.data.map((img) => ({ url: img.url! }));
}

// The mask must be a PNG with transparent regions indicating where to edit
const results = await editImage(
  "room.png",
  "room-mask.png",
  "A modern standing desk with a monitor and plant"
);
```
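Since the edit and variation endpoints reject non-square or non-PNG inputs, a cheap local preflight check avoids a wasted round trip. The sketch below uses only Node built-ins and reads the dimensions from the PNG's IHDR header; the helper names `readPngSize` and `isSquarePng` are illustrative, not part of the SDK:

```typescript
import fs from "fs";

const PNG_SIGNATURE = Buffer.from([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a]);

// The IHDR chunk directly follows the 8-byte PNG signature:
// 4-byte chunk length, the literal "IHDR", then big-endian width and height.
function readPngSize(buf: Buffer): { width: number; height: number } | null {
  if (buf.length < 24 || !buf.subarray(0, 8).equals(PNG_SIGNATURE)) return null;
  return { width: buf.readUInt32BE(16), height: buf.readUInt32BE(20) };
}

function isSquarePng(path: string): boolean {
  const size = readPngSize(fs.readFileSync(path));
  return size !== null && size.width === size.height;
}
```

Call `isSquarePng` before `editImage` or `createVariation` to fail fast with a clear local error; use an image library (sharp, for example) to actually convert and resize offending files.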

### Downloading and Storing Images

OpenAI image URLs expire after approximately one hour. Always download immediately:

```typescript
async function downloadImage(url: string, outputPath: string): Promise<void> {
  const response = await fetch(url);
  if (!response.ok) {
    throw new Error(`Failed to download image: ${response.statusText}`);
  }
  const buffer = Buffer.from(await response.arrayBuffer());
  await fs.promises.writeFile(outputPath, buffer);
}

async function generateAndSave(
  prompt: string,
  outputPath: string
): Promise<{ path: string; revisedPrompt: string }> {
  const result = await generateImage({
    prompt,
    quality: "hd",
    responseFormat: "url",
  });

  await downloadImage(result.url!, outputPath);

  return {
    path: outputPath,
    revisedPrompt: result.revisedPrompt,
  };
}
```

### Base64 Response for Direct Embedding

When you need to embed images without an intermediate download step:

```typescript
async function generateBase64Image(prompt: string): Promise<string> {
  const result = await generateImage({
    prompt,
    responseFormat: "b64_json",
    quality: "standard",
  });

  // Use as data URI
  return `data:image/png;base64,${result.base64}`;
}

// For direct Buffer usage
async function generateImageBuffer(prompt: string): Promise<Buffer> {
  const result = await generateImage({
    prompt,
    responseFormat: "b64_json",
  });

  return Buffer.from(result.base64!, "base64");
}
```

### Batch Generation with Rate Limiting

Throttle large jobs by running a few requests at a time with a pause between batches:

```typescript
async function batchGenerate(
  prompts: string[],
  options?: { concurrency?: number; delayMs?: number }
): Promise<Array<{ prompt: string; url: string; revisedPrompt: string }>> {
  const concurrency = options?.concurrency ?? 3;
  const delayMs = options?.delayMs ?? 1000;
  const results: Array<{ prompt: string; url: string; revisedPrompt: string }> = [];

  for (let i = 0; i < prompts.length; i += concurrency) {
    const batch = prompts.slice(i, i + concurrency);

    const batchResults = await Promise.allSettled(
      batch.map((prompt) =>
        generateImage({ prompt, quality: "standard" })
      )
    );

    for (let j = 0; j < batchResults.length; j++) {
      const result = batchResults[j];
      if (result.status === "fulfilled") {
        results.push({
          prompt: batch[j],
          url: result.value.url!,
          revisedPrompt: result.value.revisedPrompt,
        });
      } else {
        console.error(`Failed for prompt "${batch[j]}":`, result.reason);
      }
    }

    if (i + concurrency < prompts.length) {
      await new Promise((r) => setTimeout(r, delayMs));
    }
  }

  return results;
}
```

### Error Handling with Content Policy

Catch API errors and translate content-policy rejections and rate limits into user-friendly messages:

```typescript
async function safeGenerate(prompt: string): Promise<{
  success: boolean;
  url?: string;
  revisedPrompt?: string;
  error?: string;
}> {
  try {
    const result = await generateImage({ prompt, quality: "standard" });
    return {
      success: true,
      url: result.url,
      revisedPrompt: result.revisedPrompt,
    };
  } catch (error) {
    if (error instanceof OpenAI.APIError) {
      // Content-policy rejections surface as 400s with a
      // "content_policy_violation" error code.
      if (
        error.status === 400 &&
        (error.code === "content_policy_violation" ||
          error.message.includes("content_policy"))
      ) {
        return {
          success: false,
          error: "The prompt was rejected by the content policy. Please revise.",
        };
      }
      if (error.status === 429) {
        return {
          success: false,
          error: "Rate limit exceeded. Please try again shortly.",
        };
      }
    }
    throw error;
  }
}
```
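Slow generations deserve the same defensive treatment as errors. The SDK accepts a `timeout` client option (e.g. `new OpenAI({ timeout: 60_000 })`), but any call can also be guarded with a generic wrapper — a sketch, where `withTimeout` is a hypothetical helper rather than part of the SDK:

```typescript
// Reject if `promise` does not settle within `ms` milliseconds.
async function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms}ms`)), ms);
  });
  try {
    // Whichever settles first wins; the loser is ignored.
    return await Promise.race([promise, timeout]);
  } finally {
    clearTimeout(timer);
  }
}
```

For example, `withTimeout(generateImage({ prompt }), 30_000)` surfaces a timeout the caller can retry or report instead of hanging indefinitely.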

## Best Practices

- **Use DALL-E 3 for single high-quality images**: It has the best prompt comprehension and text rendering of any DALL-E version.
- **Read the revised prompt**: DALL-E 3 rewrites your prompt for better results. Log and display the revised prompt so users understand what was generated.
- **Download URLs immediately**: OpenAI image URLs expire within about an hour. Save to your own storage right after generation.
- **Use `b64_json` for server-side processing**: When you need to process the image immediately (resize, composite, store), base64 avoids the download step.
- **Use `hd` quality judiciously**: HD images cost more and take longer. Use `standard` for previews and drafts, `hd` for final output.
- **Choose `natural` style for photorealism**: The `vivid` style (default) produces more dramatic, hyper-real images. Switch to `natural` for realistic photographs.
- **Implement content policy fallbacks**: Always catch content policy errors and provide user-friendly messages rather than exposing raw API errors.
- **Set organization ID for billing control**: When your account has multiple organizations, set the org ID to ensure correct billing.

## Anti-Patterns

- **Requesting `n > 1` with DALL-E 3**: DALL-E 3 only supports `n: 1`. To generate multiple images, make parallel requests with the same prompt -- each will produce a unique result due to prompt rewriting.
- **Relying on URL persistence**: Never store OpenAI image URLs as permanent references. They expire. Always download and re-host.
- **Ignoring prompt rewriting**: DALL-E 3 modifies your prompt. If you need exact prompt control, prepend "I NEED to test how the tool works with extremely simple prompts. DO NOT add any detail, just use it AS-IS:" -- though this reduces quality.
- **Using DALL-E for rapid iteration**: At 10-20 seconds per image, DALL-E is not suited for real-time or interactive generation. Use faster services like fal.ai for that workflow.
- **Passing non-PNG images to edit/variation**: The DALL-E 2 edit and variation endpoints require square PNG images. Convert and resize before submitting.
- **Not handling timeouts**: Image generation can take 15-30 seconds. Set appropriate timeouts on your HTTP client and handle them gracefully.
- **Skipping cost estimation**: DALL-E 3 HD at 1792x1024 costs significantly more than standard 1024x1024. Track usage and set budget alerts.
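The last anti-pattern above is straightforward to address with a small lookup table. The per-image USD prices below reflect OpenAI's published rates at the time of writing and may change -- treat them as placeholders and verify against the current pricing page; `estimateCost` is an illustrative helper:

```typescript
// Per-image USD prices (placeholder values -- confirm against
// OpenAI's pricing page before relying on them).
const DALLE_PRICES: Record<string, number> = {
  "dall-e-3/standard/1024x1024": 0.04,
  "dall-e-3/standard/1792x1024": 0.08,
  "dall-e-3/standard/1024x1792": 0.08,
  "dall-e-3/hd/1024x1024": 0.08,
  "dall-e-3/hd/1792x1024": 0.12,
  "dall-e-3/hd/1024x1792": 0.12,
  "dall-e-2/standard/1024x1024": 0.02,
  "dall-e-2/standard/512x512": 0.018,
  "dall-e-2/standard/256x256": 0.016,
};

function estimateCost(
  model: string,
  quality: string,
  size: string,
  count = 1
): number {
  const price = DALLE_PRICES[`${model}/${quality}/${size}`];
  if (price === undefined) {
    throw new Error(`Unknown combination: ${model}/${quality}/${size}`);
  }
  return price * count;
}
```

Logging `estimateCost(...)` alongside each request makes it easy to aggregate spend per user or feature and wire up budget alerts.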
