Stability AI Image Generation
"Stability AI: Stable Diffusion API, text-to-image, image-to-image, inpainting, upscaling, SDXL, REST API"
Core Philosophy
Stability AI provides direct REST API access to their Stable Diffusion models, including SD3, SDXL, and specialized endpoints for inpainting, upscaling, and image-to-image transformations. Unlike aggregator platforms, Stability AI is the model creator, so their API offers the most current model versions and lowest-level control. The API uses multipart form data for image uploads and returns either base64-encoded images or direct binary data. Credit-based billing means you purchase credits upfront and each generation costs a fixed amount based on the model and resolution. Prefer their REST API with typed wrappers over unofficial SDKs for maximum reliability.
Setup
Configure the Stability AI REST client with your API key:
```typescript
const STABILITY_API_BASE = "https://api.stability.ai/v2beta";

interface StabilityConfig {
  apiKey: string;
  baseUrl?: string;
}

class StabilityClient {
  private apiKey: string;
  private baseUrl: string;

  constructor(config: StabilityConfig) {
    this.apiKey = config.apiKey;
    this.baseUrl = config.baseUrl ?? STABILITY_API_BASE;
  }

  async request(
    endpoint: string,
    formData: FormData,
    accept: "image/*" | "application/json" = "application/json"
  ): Promise<Response> {
    const response = await fetch(`${this.baseUrl}${endpoint}`, {
      method: "POST",
      headers: {
        Authorization: `Bearer ${this.apiKey}`,
        Accept: accept,
      },
      body: formData,
    });
    if (!response.ok) {
      // Error bodies are JSON even when binary output was requested;
      // fall back to the status text if the body is not parseable.
      const error = await response.json().catch(() => ({}));
      throw new Error(`Stability API error: ${error.message ?? response.statusText}`);
    }
    return response;
  }

  async getBalance(): Promise<number> {
    // The balance endpoint lives under /v1, not /v2beta.
    const res = await fetch(`${this.baseUrl.replace("/v2beta", "")}/v1/user/balance`, {
      headers: { Authorization: `Bearer ${this.apiKey}` },
    });
    const data = await res.json();
    return data.credits;
  }
}

const stability = new StabilityClient({
  apiKey: process.env.STABILITY_API_KEY!,
});
```
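Because billing is credit-based, a batch job should be gated on `getBalance()` before it starts. A minimal pre-flight sketch follows; the per-image credit costs in the table are illustrative placeholders, not authoritative pricing — look up the current rates for the models you actually use.

```typescript
// Rough pre-flight budget check before queuing a batch. The per-image
// credit costs below are ASSUMED placeholder values for illustration.
const ASSUMED_CREDITS_PER_IMAGE: Record<string, number> = {
  "sd3-large": 6.5,
  "sd3-large-turbo": 4,
  "sd3-medium": 3.5,
};

function estimateBatchCredits(model: string, count: number): number {
  const perImage = ASSUMED_CREDITS_PER_IMAGE[model];
  if (perImage === undefined) {
    throw new Error(`No cost estimate for model: ${model}`);
  }
  return perImage * count;
}

function hasSufficientCredits(balance: number, model: string, count: number): boolean {
  return balance >= estimateBatchCredits(model, count);
}
```

Usage: fetch the live balance once, then guard the queue — `const balance = await stability.getBalance(); if (!hasSufficientCredits(balance, "sd3-large", 200)) throw new Error("Top up credits first");`.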
Key Techniques
Text-to-Image with SD3
```typescript
interface TextToImageOptions {
  prompt: string;
  negativePrompt?: string;
  aspectRatio?: "1:1" | "16:9" | "21:9" | "2:3" | "3:2" | "4:5" | "5:4" | "9:16" | "9:21";
  model?: "sd3-large" | "sd3-large-turbo" | "sd3-medium";
  seed?: number;
  outputFormat?: "jpeg" | "png" | "webp";
}

async function textToImage(options: TextToImageOptions): Promise<Buffer> {
  const formData = new FormData();
  formData.append("prompt", options.prompt);
  formData.append("model", options.model ?? "sd3-large");
  formData.append("aspect_ratio", options.aspectRatio ?? "1:1");
  formData.append("output_format", options.outputFormat ?? "png");
  if (options.negativePrompt) {
    formData.append("negative_prompt", options.negativePrompt);
  }
  if (options.seed !== undefined) {
    formData.append("seed", options.seed.toString());
  }
  const response = await stability.request(
    "/stable-image/generate/sd3",
    formData,
    "image/*"
  );
  const arrayBuffer = await response.arrayBuffer();
  return Buffer.from(arrayBuffer);
}

// Usage
import fs from "fs";

const imageBuffer = await textToImage({
  prompt: "A serene mountain lake at golden hour, photorealistic",
  negativePrompt: "cartoon, illustration, low quality",
  aspectRatio: "16:9",
  model: "sd3-large",
});
fs.writeFileSync("output.png", imageBuffer);
```
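The anti-patterns below warn against discarding generation metadata. A small sidecar record alongside each saved image keeps results reproducible; `buildGenerationRecord` is a hypothetical helper, not part of the Stability API.

```typescript
// Sketch of a metadata sidecar so every generated image can be reproduced
// later. buildGenerationRecord is a hypothetical helper, not an API call.
interface GenerationRecord {
  prompt: string;
  negativePrompt?: string;
  model: string;
  aspectRatio: string;
  seed?: number;
  createdAt: string;
}

function buildGenerationRecord(options: {
  prompt: string;
  negativePrompt?: string;
  model?: string;
  aspectRatio?: string;
  seed?: number;
}): GenerationRecord {
  return {
    prompt: options.prompt,
    negativePrompt: options.negativePrompt,
    // Mirror the defaults applied by textToImage above.
    model: options.model ?? "sd3-large",
    aspectRatio: options.aspectRatio ?? "1:1",
    seed: options.seed,
    createdAt: new Date().toISOString(),
  };
}
```

Write the record next to the image, e.g. `fs.writeFileSync("output.json", JSON.stringify(buildGenerationRecord(options), null, 2))`.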
SDXL Generation with Fine Control
```typescript
async function generateSDXL(
  prompt: string,
  options?: {
    negativePrompt?: string;
    width?: number;
    height?: number;
    cfgScale?: number;
    steps?: number;
    samples?: number;
    seed?: number;
    stylePreset?: string;
  }
): Promise<Array<{ base64: string; seed: number }>> {
  const body = {
    text_prompts: [
      { text: prompt, weight: 1.0 },
      ...(options?.negativePrompt
        ? [{ text: options.negativePrompt, weight: -1.0 }]
        : []),
    ],
    cfg_scale: options?.cfgScale ?? 7,
    width: options?.width ?? 1024,
    height: options?.height ?? 1024,
    steps: options?.steps ?? 30,
    samples: options?.samples ?? 1,
    seed: options?.seed ?? 0,
    style_preset: options?.stylePreset,
  };
  // Note: the v1 SDXL endpoint takes JSON, unlike the multipart v2beta endpoints.
  const response = await fetch(
    "https://api.stability.ai/v1/generation/stable-diffusion-xl-1024-v1-0/text-to-image",
    {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${process.env.STABILITY_API_KEY}`,
        Accept: "application/json",
      },
      body: JSON.stringify(body),
    }
  );
  if (!response.ok) {
    throw new Error(`SDXL generation failed: ${response.statusText}`);
  }
  const data = await response.json();
  return data.artifacts.map((a: any) => ({
    base64: a.base64,
    seed: a.seed,
  }));
}
```
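Per the dimension rule in Best Practices, SDXL rejects widths and heights that are not multiples of 64. A client-side guard fails fast instead of burning an API call; check the current API reference for any additional per-engine resolution limits beyond the divisibility rule shown here.

```typescript
// Validate SDXL dimensions client-side before sending a request.
// Only the divisible-by-64 rule from the docs is enforced here.
function validateSDXLDimensions(width: number, height: number): void {
  if (width % 64 !== 0 || height % 64 !== 0) {
    throw new Error(`Dimensions must be divisible by 64, got ${width}x${height}`);
  }
}

function isValidSDXLDimensions(width: number, height: number): boolean {
  try {
    validateSDXLDimensions(width, height);
    return true;
  } catch {
    return false;
  }
}
```

Call `validateSDXLDimensions(options?.width ?? 1024, options?.height ?? 1024)` at the top of `generateSDXL` to reject bad sizes before the network round trip.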
Image-to-Image
```typescript
async function imageToImage(
  imageBuffer: Buffer,
  prompt: string,
  strength: number = 0.65,
  options?: { negativePrompt?: string; seed?: number }
): Promise<Buffer> {
  const formData = new FormData();
  const blob = new Blob([imageBuffer], { type: "image/png" });
  formData.append("image", blob, "input.png");
  formData.append("prompt", prompt);
  formData.append("strength", strength.toString());
  formData.append("output_format", "png");
  formData.append("mode", "image-to-image");
  if (options?.negativePrompt) {
    formData.append("negative_prompt", options.negativePrompt);
  }
  if (options?.seed !== undefined) {
    formData.append("seed", options.seed.toString());
  }
  const response = await stability.request(
    "/stable-image/generate/sd3",
    formData,
    "image/*"
  );
  return Buffer.from(await response.arrayBuffer());
}
```
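A small guard for the `strength` parameter can prevent avoidable 400 errors. The assumption here (verify against the current API reference) is that the API accepts values in [0, 1], where low values stay close to the input image and high values follow the prompt more freely.

```typescript
// Clamp the image-to-image strength parameter. ASSUMPTION: valid range
// is [0, 1]; NaN falls back to the 0.65 default used above.
function clampStrength(strength: number): number {
  if (Number.isNaN(strength)) return 0.65;
  return Math.min(1, Math.max(0, strength));
}
```

Applying `clampStrength` before `formData.append("strength", ...)` keeps user-supplied values inside the accepted range.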
Inpainting
```typescript
async function inpaintImage(
  imageBuffer: Buffer,
  maskBuffer: Buffer,
  prompt: string,
  options?: { negativePrompt?: string; outputFormat?: string }
): Promise<Buffer> {
  const formData = new FormData();
  formData.append("image", new Blob([imageBuffer], { type: "image/png" }), "image.png");
  formData.append("mask", new Blob([maskBuffer], { type: "image/png" }), "mask.png");
  formData.append("prompt", prompt);
  formData.append("output_format", options?.outputFormat ?? "png");
  if (options?.negativePrompt) {
    formData.append("negative_prompt", options.negativePrompt);
  }
  const response = await stability.request(
    "/stable-image/edit/inpaint",
    formData,
    "image/*"
  );
  return Buffer.from(await response.arrayBuffer());
}
```
Upscaling
```typescript
async function upscaleImage(
  imageBuffer: Buffer,
  prompt: string,
  outputFormat: "png" | "jpeg" | "webp" = "png"
): Promise<{ id: string }> {
  const formData = new FormData();
  formData.append("image", new Blob([imageBuffer], { type: "image/png" }), "image.png");
  formData.append("prompt", prompt);
  formData.append("output_format", outputFormat);
  const response = await stability.request(
    "/stable-image/upscale/creative",
    formData
  );
  const data = await response.json();
  return { id: data.id };
}

async function fetchUpscaleResult(generationId: string): Promise<Buffer | null> {
  const response = await fetch(
    `${STABILITY_API_BASE}/stable-image/upscale/creative/result/${generationId}`,
    {
      headers: {
        Authorization: `Bearer ${process.env.STABILITY_API_KEY}`,
        Accept: "image/*",
      },
    }
  );
  if (response.status === 202) {
    return null; // Still processing
  }
  return Buffer.from(await response.arrayBuffer());
}

// Poll for upscale completion
async function upscaleAndWait(imageBuffer: Buffer, prompt: string): Promise<Buffer> {
  const { id } = await upscaleImage(imageBuffer, prompt);
  for (let i = 0; i < 30; i++) {
    await new Promise((r) => setTimeout(r, 10_000));
    const result = await fetchUpscaleResult(id);
    if (result) return result;
  }
  throw new Error("Upscale timed out");
}
```
Search-and-Replace Edit
```typescript
async function searchAndReplace(
  imageBuffer: Buffer,
  searchPrompt: string,
  replacePrompt: string
): Promise<Buffer> {
  const formData = new FormData();
  formData.append("image", new Blob([imageBuffer], { type: "image/png" }), "image.png");
  formData.append("prompt", replacePrompt);
  formData.append("search_prompt", searchPrompt);
  formData.append("output_format", "png");
  const response = await stability.request(
    "/stable-image/edit/search-and-replace",
    formData,
    "image/*"
  );
  return Buffer.from(await response.arrayBuffer());
}

// Replace the sky in a photo
const edited = await searchAndReplace(
  originalImage,
  "overcast grey sky",
  "vivid sunset with orange and purple clouds"
);
```
Best Practices
- **Check credit balance before batch jobs**: Call `getBalance()` and estimate costs before queuing hundreds of generations.
- **Use appropriate models**: SD3 Large Turbo for speed, SD3 Large for quality, SDXL for established workflows with known prompt engineering.
- **Handle binary responses**: Set `Accept: image/*` for direct binary output to avoid base64 encoding overhead.
- **Validate dimensions**: SDXL requires dimensions divisible by 64. SD3 uses aspect ratios instead of pixel dimensions.
- **Use seeds for reproducibility**: Store the seed from successful generations to reproduce or iterate on results.
- **Implement retry with backoff**: The API returns 429 on rate limits. Use exponential backoff starting at 1 second.
- **Stream to storage**: For large batches, stream response buffers directly to cloud storage instead of holding them in memory.
- **Use style presets**: SDXL supports presets like `photographic`, `digital-art`, and `anime` to guide generation without lengthy prompts.
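The retry advice above can be sketched as a small wrapper: delays double from 1 second, and the sleep function is injectable so the logic stays testable. Retrying only on 429 (rather than on every error, as this simplified version does) is a reasonable refinement.

```typescript
// Exponential backoff starting at 1 second: 1s, 2s, 4s, ...
function backoffDelayMs(attempt: number, baseMs: number = 1000): number {
  return baseMs * 2 ** attempt;
}

// Retry wrapper for rate-limited calls. The sleep function is injectable
// for testing; in production the default setTimeout-based sleep is used.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts: number = 4,
  sleep: (ms: number) => Promise<void> = (ms) => new Promise((r) => setTimeout(r, ms))
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts - 1) {
        await sleep(backoffDelayMs(attempt));
      }
    }
  }
  throw lastError;
}
```

Usage: `const image = await withRetry(() => textToImage({ prompt }))`.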
Anti-Patterns
- **Ignoring content filter responses**: The API returns `finish_reason: "CONTENT_FILTERED"` when output is blocked. Always check this field instead of assuming success.
- **Using v1 endpoints for new projects**: The v2beta endpoints offer better models and more features. Only use v1 for SDXL-specific workflows.
- **Sending oversized images**: Resize input images to reasonable dimensions before uploading. The API has payload limits and charges more for larger inputs.
- **Polling without delay**: Creative upscale takes 1-5 minutes. Polling every second wastes API calls and risks rate limiting.
- **Mixing up multipart and JSON**: Text-to-image v2beta uses multipart form data, not JSON. The v1 endpoints use JSON. Mixing these causes 400 errors.
- **Not storing generation metadata**: Always save the seed, prompt, and model version alongside generated images for reproducibility.
- **Hardcoding aspect ratios**: Let users choose aspect ratios from the supported list rather than calculating custom width/height pairs.
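The content-filter check above can be made explicit with a small guard. Field naming varies by API version — v1 artifacts are typically `finishReason` while v2beta JSON responses use `finish_reason` — so treat the exact casing as an assumption and verify it against the responses you actually receive.

```typescript
// Guard against silently saving blocked output. Both field spellings are
// checked because naming differs between the v1 and v2beta responses
// (treat the exact field names as an assumption to verify).
interface FilterableArtifact {
  finishReason?: string;
  finish_reason?: string;
}

function isFiltered(artifact: FilterableArtifact): boolean {
  const reason = artifact.finishReason ?? artifact.finish_reason;
  return reason === "CONTENT_FILTERED";
}

function assertNotFiltered(artifact: FilterableArtifact): void {
  if (isFiltered(artifact)) {
    throw new Error("Generation was blocked by the content filter");
  }
}
```

For SDXL batches, run `data.artifacts.forEach(assertNotFiltered)` before decoding any base64 payloads.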