Stability AI Image Generation
"Stability AI: Stable Diffusion API, text-to-image, image-to-image, inpainting, upscaling, SDXL, REST API"
Core Philosophy
Stability AI provides direct REST API access to their Stable Diffusion models, including SD3, SDXL, and specialized endpoints for inpainting, upscaling, and image-to-image transformations. Unlike aggregator platforms, Stability AI is the model creator, so their API offers the most current model versions and lowest-level control. The API uses multipart form data for image uploads and returns either base64-encoded images or direct binary data. Credit-based billing means you purchase credits upfront and each generation costs a fixed amount based on the model and resolution. Prefer their REST API with typed wrappers over unofficial SDKs for maximum reliability.
Setup
Configure the Stability AI REST client with your API key:
```typescript
const STABILITY_API_BASE = "https://api.stability.ai/v2beta";

interface StabilityConfig {
  apiKey: string;
  baseUrl?: string;
}

class StabilityClient {
  private apiKey: string;
  private baseUrl: string;

  constructor(config: StabilityConfig) {
    this.apiKey = config.apiKey;
    this.baseUrl = config.baseUrl ?? STABILITY_API_BASE;
  }

  async request(
    endpoint: string,
    formData: FormData,
    accept: "image/*" | "application/json" = "application/json"
  ): Promise<Response> {
    const response = await fetch(`${this.baseUrl}${endpoint}`, {
      method: "POST",
      headers: {
        Authorization: `Bearer ${this.apiKey}`,
        Accept: accept,
      },
      body: formData,
    });
    if (!response.ok) {
      // Error bodies are JSON even when binary output was requested;
      // fall back to the status text if the body is not parseable.
      const error = await response.json().catch(() => ({}));
      throw new Error(`Stability API error: ${error.message ?? response.statusText}`);
    }
    return response;
  }

  async getBalance(): Promise<number> {
    // The balance endpoint lives under /v1, not /v2beta.
    const res = await fetch(`${this.baseUrl.replace("/v2beta", "")}/v1/user/balance`, {
      headers: { Authorization: `Bearer ${this.apiKey}` },
    });
    const data = await res.json();
    return data.credits;
  }
}

const stability = new StabilityClient({
  apiKey: process.env.STABILITY_API_KEY!,
});
```
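Because billing is credit-based, a batch job should be gated on `getBalance()` before it starts. A minimal pre-flight sketch follows; the per-image credit costs in the table are illustrative placeholders, not authoritative pricing — look up the current rates for the models you actually use.

```typescript
// Rough pre-flight budget check before queuing a batch. The per-image
// credit costs below are ASSUMED placeholder values for illustration.
const ASSUMED_CREDITS_PER_IMAGE: Record<string, number> = {
  "sd3-large": 6.5,
  "sd3-large-turbo": 4,
  "sd3-medium": 3.5,
};

function estimateBatchCredits(model: string, count: number): number {
  const perImage = ASSUMED_CREDITS_PER_IMAGE[model];
  if (perImage === undefined) {
    throw new Error(`No cost estimate for model: ${model}`);
  }
  return perImage * count;
}

function hasSufficientCredits(balance: number, model: string, count: number): boolean {
  return balance >= estimateBatchCredits(model, count);
}
```

Usage: fetch the live balance once, then guard the queue — `const balance = await stability.getBalance(); if (!hasSufficientCredits(balance, "sd3-large", 200)) throw new Error("Top up credits first");`.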
Key Techniques
Text-to-Image with SD3
```typescript
interface TextToImageOptions {
  prompt: string;
  negativePrompt?: string;
  aspectRatio?: "1:1" | "16:9" | "21:9" | "2:3" | "3:2" | "4:5" | "5:4" | "9:16" | "9:21";
  model?: "sd3-large" | "sd3-large-turbo" | "sd3-medium";
  seed?: number;
  outputFormat?: "jpeg" | "png" | "webp";
}

async function textToImage(options: TextToImageOptions): Promise<Buffer> {
  const formData = new FormData();
  formData.append("prompt", options.prompt);
  formData.append("model", options.model ?? "sd3-large");
  formData.append("aspect_ratio", options.aspectRatio ?? "1:1");
  formData.append("output_format", options.outputFormat ?? "png");
  if (options.negativePrompt) {
    formData.append("negative_prompt", options.negativePrompt);
  }
  if (options.seed !== undefined) {
    formData.append("seed", options.seed.toString());
  }
  const response = await stability.request(
    "/stable-image/generate/sd3",
    formData,
    "image/*"
  );
  const arrayBuffer = await response.arrayBuffer();
  return Buffer.from(arrayBuffer);
}

// Usage
import fs from "fs";

const imageBuffer = await textToImage({
  prompt: "A serene mountain lake at golden hour, photorealistic",
  negativePrompt: "cartoon, illustration, low quality",
  aspectRatio: "16:9",
  model: "sd3-large",
});
fs.writeFileSync("output.png", imageBuffer);
```
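The anti-patterns below warn against discarding generation metadata. A small sidecar record alongside each saved image keeps results reproducible; `buildGenerationRecord` is a hypothetical helper, not part of the Stability API.

```typescript
// Sketch of a metadata sidecar so every generated image can be reproduced
// later. buildGenerationRecord is a hypothetical helper, not an API call.
interface GenerationRecord {
  prompt: string;
  negativePrompt?: string;
  model: string;
  aspectRatio: string;
  seed?: number;
  createdAt: string;
}

function buildGenerationRecord(options: {
  prompt: string;
  negativePrompt?: string;
  model?: string;
  aspectRatio?: string;
  seed?: number;
}): GenerationRecord {
  return {
    prompt: options.prompt,
    negativePrompt: options.negativePrompt,
    // Mirror the defaults applied by textToImage above.
    model: options.model ?? "sd3-large",
    aspectRatio: options.aspectRatio ?? "1:1",
    seed: options.seed,
    createdAt: new Date().toISOString(),
  };
}
```

Write the record next to the image, e.g. `fs.writeFileSync("output.json", JSON.stringify(buildGenerationRecord(options), null, 2))`.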
SDXL Generation with Fine Control
```typescript
async function generateSDXL(
  prompt: string,
  options?: {
    negativePrompt?: string;
    width?: number;
    height?: number;
    cfgScale?: number;
    steps?: number;
    samples?: number;
    seed?: number;
    stylePreset?: string;
  }
): Promise<Array<{ base64: string; seed: number }>> {
  const body = {
    text_prompts: [
      { text: prompt, weight: 1.0 },
      ...(options?.negativePrompt
        ? [{ text: options.negativePrompt, weight: -1.0 }]
        : []),
    ],
    cfg_scale: options?.cfgScale ?? 7,
    width: options?.width ?? 1024,
    height: options?.height ?? 1024,
    steps: options?.steps ?? 30,
    samples: options?.samples ?? 1,
    seed: options?.seed ?? 0,
    style_preset: options?.stylePreset,
  };
  // Note: the v1 SDXL endpoint takes JSON, unlike the multipart v2beta endpoints.
  const response = await fetch(
    "https://api.stability.ai/v1/generation/stable-diffusion-xl-1024-v1-0/text-to-image",
    {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${process.env.STABILITY_API_KEY}`,
        Accept: "application/json",
      },
      body: JSON.stringify(body),
    }
  );
  if (!response.ok) {
    throw new Error(`SDXL generation failed: ${response.statusText}`);
  }
  const data = await response.json();
  return data.artifacts.map((a: any) => ({
    base64: a.base64,
    seed: a.seed,
  }));
}
```
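Per the dimension rule in Best Practices, SDXL rejects widths and heights that are not multiples of 64. A client-side guard fails fast instead of burning an API call; check the current API reference for any additional per-engine resolution limits beyond the divisibility rule shown here.

```typescript
// Validate SDXL dimensions client-side before sending a request.
// Only the divisible-by-64 rule from the docs is enforced here.
function validateSDXLDimensions(width: number, height: number): void {
  if (width % 64 !== 0 || height % 64 !== 0) {
    throw new Error(`Dimensions must be divisible by 64, got ${width}x${height}`);
  }
}

function isValidSDXLDimensions(width: number, height: number): boolean {
  try {
    validateSDXLDimensions(width, height);
    return true;
  } catch {
    return false;
  }
}
```

Call `validateSDXLDimensions(options?.width ?? 1024, options?.height ?? 1024)` at the top of `generateSDXL` to reject bad sizes before the network round trip.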
Image-to-Image
```typescript
async function imageToImage(
  imageBuffer: Buffer,
  prompt: string,
  strength: number = 0.65,
  options?: { negativePrompt?: string; seed?: number }
): Promise<Buffer> {
  const formData = new FormData();
  const blob = new Blob([imageBuffer], { type: "image/png" });
  formData.append("image", blob, "input.png");
  formData.append("prompt", prompt);
  formData.append("strength", strength.toString());
  formData.append("output_format", "png");
  formData.append("mode", "image-to-image");
  if (options?.negativePrompt) {
    formData.append("negative_prompt", options.negativePrompt);
  }
  if (options?.seed !== undefined) {
    formData.append("seed", options.seed.toString());
  }
  const response = await stability.request(
    "/stable-image/generate/sd3",
    formData,
    "image/*"
  );
  return Buffer.from(await response.arrayBuffer());
}
```
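A small guard for the `strength` parameter can prevent avoidable 400 errors. The assumption here (verify against the current API reference) is that the API accepts values in [0, 1], where low values stay close to the input image and high values follow the prompt more freely.

```typescript
// Clamp the image-to-image strength parameter. ASSUMPTION: valid range
// is [0, 1]; NaN falls back to the 0.65 default used above.
function clampStrength(strength: number): number {
  if (Number.isNaN(strength)) return 0.65;
  return Math.min(1, Math.max(0, strength));
}
```

Applying `clampStrength` before `formData.append("strength", ...)` keeps user-supplied values inside the accepted range.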
Inpainting
```typescript
async function inpaintImage(
  imageBuffer: Buffer,
  maskBuffer: Buffer,
  prompt: string,
  options?: { negativePrompt?: string; outputFormat?: string }
): Promise<Buffer> {
  const formData = new FormData();
  formData.append("image", new Blob([imageBuffer], { type: "image/png" }), "image.png");
  formData.append("mask", new Blob([maskBuffer], { type: "image/png" }), "mask.png");
  formData.append("prompt", prompt);
  formData.append("output_format", options?.outputFormat ?? "png");
  if (options?.negativePrompt) {
    formData.append("negative_prompt", options.negativePrompt);
  }
  const response = await stability.request(
    "/stable-image/edit/inpaint",
    formData,
    "image/*"
  );
  return Buffer.from(await response.arrayBuffer());
}
```
Upscaling
```typescript
async function upscaleImage(
  imageBuffer: Buffer,
  prompt: string,
  outputFormat: "png" | "jpeg" | "webp" = "png"
): Promise<{ id: string }> {
  const formData = new FormData();
  formData.append("image", new Blob([imageBuffer], { type: "image/png" }), "image.png");
  formData.append("prompt", prompt);
  formData.append("output_format", outputFormat);
  const response = await stability.request(
    "/stable-image/upscale/creative",
    formData
  );
  const data = await response.json();
  return { id: data.id };
}

async function fetchUpscaleResult(generationId: string): Promise<Buffer | null> {
  const response = await fetch(
    `${STABILITY_API_BASE}/stable-image/upscale/creative/result/${generationId}`,
    {
      headers: {
        Authorization: `Bearer ${process.env.STABILITY_API_KEY}`,
        Accept: "image/*",
      },
    }
  );
  if (response.status === 202) {
    return null; // Still processing
  }
  return Buffer.from(await response.arrayBuffer());
}

// Poll for upscale completion
async function upscaleAndWait(imageBuffer: Buffer, prompt: string): Promise<Buffer> {
  const { id } = await upscaleImage(imageBuffer, prompt);
  for (let i = 0; i < 30; i++) {
    await new Promise((r) => setTimeout(r, 10_000));
    const result = await fetchUpscaleResult(id);
    if (result) return result;
  }
  throw new Error("Upscale timed out");
}
```
Search-and-Replace Edit
```typescript
async function searchAndReplace(
  imageBuffer: Buffer,
  searchPrompt: string,
  replacePrompt: string
): Promise<Buffer> {
  const formData = new FormData();
  formData.append("image", new Blob([imageBuffer], { type: "image/png" }), "image.png");
  formData.append("prompt", replacePrompt);
  formData.append("search_prompt", searchPrompt);
  formData.append("output_format", "png");
  const response = await stability.request(
    "/stable-image/edit/search-and-replace",
    formData,
    "image/*"
  );
  return Buffer.from(await response.arrayBuffer());
}

// Replace the sky in a photo
const edited = await searchAndReplace(
  originalImage,
  "overcast grey sky",
  "vivid sunset with orange and purple clouds"
);
```
Best Practices
- **Check credit balance before batch jobs**: Call `getBalance()` and estimate costs before queuing hundreds of generations.
- **Use appropriate models**: SD3 Large Turbo for speed, SD3 Large for quality, SDXL for established workflows with known prompt engineering.
- **Handle binary responses**: Set `Accept: image/*` for direct binary output to avoid base64 encoding overhead.
- **Validate dimensions**: SDXL requires dimensions divisible by 64. SD3 uses aspect ratios instead of pixel dimensions.
- **Use seeds for reproducibility**: Store the seed from successful generations to reproduce or iterate on results.
- **Implement retry with backoff**: The API returns 429 on rate limits. Use exponential backoff starting at 1 second.
- **Stream to storage**: For large batches, stream response buffers directly to cloud storage instead of holding them in memory.
- **Use style presets**: SDXL supports presets like `photographic`, `digital-art`, and `anime` to guide generation without lengthy prompts.
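The retry advice above can be sketched as a small wrapper: delays double from 1 second, and the sleep function is injectable so the logic stays testable. Retrying only on 429 (rather than on every error, as this simplified version does) is a reasonable refinement.

```typescript
// Exponential backoff starting at 1 second: 1s, 2s, 4s, ...
function backoffDelayMs(attempt: number, baseMs: number = 1000): number {
  return baseMs * 2 ** attempt;
}

// Retry wrapper for rate-limited calls. The sleep function is injectable
// for testing; in production the default setTimeout-based sleep is used.
async function withRetry<T>(
  fn: () => Promise<T>,
  maxAttempts: number = 4,
  sleep: (ms: number) => Promise<void> = (ms) => new Promise((r) => setTimeout(r, ms))
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts - 1) {
        await sleep(backoffDelayMs(attempt));
      }
    }
  }
  throw lastError;
}
```

Usage: `const image = await withRetry(() => textToImage({ prompt }))`.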
Anti-Patterns
- **Ignoring content filter responses**: The API returns `finish_reason: "CONTENT_FILTERED"` when output is blocked. Always check this field instead of assuming success.
- **Using v1 endpoints for new projects**: The v2beta endpoints offer better models and more features. Only use v1 for SDXL-specific workflows.
- **Sending oversized images**: Resize input images to reasonable dimensions before uploading. The API has payload limits and charges more for larger inputs.
- **Polling without delay**: Creative upscale takes 1-5 minutes. Polling every second wastes API calls and risks rate limiting.
- **Mixing up multipart and JSON**: Text-to-image v2beta uses multipart form data, not JSON. The v1 endpoints use JSON. Mixing these causes 400 errors.
- **Not storing generation metadata**: Always save the seed, prompt, and model version alongside generated images for reproducibility.
- **Hardcoding aspect ratios**: Let users choose aspect ratios from the supported list rather than calculating custom width/height pairs.
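The content-filter check above can be made explicit with a small guard. Field naming varies by API version — v1 artifacts are typically `finishReason` while v2beta JSON responses use `finish_reason` — so treat the exact casing as an assumption and verify it against the responses you actually receive.

```typescript
// Guard against silently saving blocked output. Both field spellings are
// checked because naming differs between the v1 and v2beta responses
// (treat the exact field names as an assumption to verify).
interface FilterableArtifact {
  finishReason?: string;
  finish_reason?: string;
}

function isFiltered(artifact: FilterableArtifact): boolean {
  const reason = artifact.finishReason ?? artifact.finish_reason;
  return reason === "CONTENT_FILTERED";
}

function assertNotFiltered(artifact: FilterableArtifact): void {
  if (isFiltered(artifact)) {
    throw new Error("Generation was blocked by the content filter");
  }
}
```

For SDXL batches, run `data.artifacts.forEach(assertNotFiltered)` before decoding any base64 payloads.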