GPT-Image-Gen - AI Assistant Instructions
Tool Purpose
Generate images using OpenAI's GPT-Image-1 model. This tool is designed specifically for GPT-Image-1, which returns base64 data (unlike DALL-E, which can return URLs).
Critical Implementation Details
Model Specifications
- ALWAYS use model: gpt-image-1 (NOT dall-e-2 or dall-e-3)
- Response format: ONLY b64_json (GPT-Image-1 doesn't support URLs)
- Quality values: low, medium, high, auto (NOT standard or hd)
- Size options: 1024x1024, 1024x1536, 1536x1024
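The constraints above can be encoded as TypeScript types so that an unsupported quality or size fails at compile time. This is a minimal sketch, not the tool's actual code: the `ImageRequestBody` interface, the `buildRequestBody` helper, and the default values are assumptions made for illustration.

```typescript
// Allowed values, copied from the Model Specifications list.
const QUALITIES = ["low", "medium", "high", "auto"] as const;
const SIZES = ["1024x1024", "1024x1536", "1536x1024"] as const;

type Quality = (typeof QUALITIES)[number];
type Size = (typeof SIZES)[number];

// Hypothetical shape of the JSON body sent to the images endpoint.
interface ImageRequestBody {
  model: "gpt-image-1";
  prompt: string;
  quality: Quality;
  size: Size;
  n: number;
}

// Build a request body with the model pinned to gpt-image-1.
// No response format is requested: the model returns base64 only.
// The defaults ("auto", "1024x1024", 1) are assumptions.
function buildRequestBody(
  prompt: string,
  quality: Quality = "auto",
  size: Size = "1024x1024",
  n: number = 1,
): ImageRequestBody {
  return { model: "gpt-image-1", prompt, quality, size, n };
}
```

With these types, a call such as `buildRequestBody("logo", "hd")` is rejected by the compiler, because `"hd"` is not a member of `Quality`.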
When to Use This Tool
Automatic Triggers
- User asks to "generate an image" or "create a picture"
- User provides detailed visual descriptions
- Tasks requiring AI-generated imagery
- When DALL-E tools fail or aren't available
Use Cases
- Creative content generation
- Placeholder images for development
- Concept visualization
- Batch image generation for datasets
- Testing image processing pipelines
Integration Patterns
With b64img Tool
# Generate and immediately convert
gpt-image-gen "prompt" --base64 | b64img --auto
# Generate multiple and batch convert
gpt-image-gen "prompt" --count 5 --output-dir ./raw/ && \
b64img ./raw/*.b64 --outdir ./images/
With Cloud Storage
# Generate and upload to Cloudflare Images
gpt-image-gen "product photo" --output temp.png && \
cf-images upload temp.png --name "product-1"
# Batch generate and upload
for i in {1..5}; do
  gpt-image-gen "variation $i" --output "img-$i.png" && \
    cf-images upload "img-$i.png"
done
Pipeline Examples
# Generate → Convert → Optimize → Upload
# (jpg:- makes convert write the JPEG to stdout so it can be piped on)
gpt-image-gen "logo" --base64 | \
  b64img --stdout | \
  convert - -quality 85 jpg:- | \
  aws s3 cp - s3://bucket/logo.jpg
API Key Management
Check for API Key
- First check environment: $OPENAI_API_KEY
- Then check config: gpt-image-gen config get api-key
- If missing, prompt user to set it
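The lookup order above (environment first, then config, then prompt the user) could be sketched as follows. The `resolveApiKey` function and its `readConfigKey` callback are hypothetical names. The callback stands in for whatever returns the value that `gpt-image-gen config get api-key` would print.

```typescript
type KeySource = "environment" | "config" | "missing";

interface KeyLookup {
  source: KeySource;
  apiKey?: string;
}

// env: a map of environment variables (for example process.env in Node or Bun).
// readConfigKey: hypothetical callback returning the configured key, if any.
function resolveApiKey(
  env: Record<string, string | undefined>,
  readConfigKey: () => string | undefined,
): KeyLookup {
  // 1. The environment variable wins when it is set and non-blank.
  const fromEnv = env["OPENAI_API_KEY"]?.trim();
  if (fromEnv) return { source: "environment", apiKey: fromEnv };

  // 2. Otherwise fall back to the config value.
  const fromConfig = readConfigKey()?.trim();
  if (fromConfig) return { source: "config", apiKey: fromConfig };

  // 3. Neither source has a key: the caller should prompt the user.
  return { source: "missing" };
}
```

Passing the environment in as a parameter keeps the function free of runtime-specific globals and easy to test.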
Setting API Key
# Preferred: environment variable
export OPENAI_API_KEY=sk-...
# Alternative: config file
gpt-image-gen config set api-key sk-...
Error Handling
Common Issues and Solutions
| Error | Solution |
|-------|----------|
| "Invalid API key" | Check/update OpenAI API key |
| "Rate limit exceeded" | The tool retries automatically; if it still fails, wait and retry |
| "Invalid model" | Use gpt-image-1, not a DALL-E model |
| "Invalid quality" | Use low/medium/high/auto, not standard/hd |
| "Prompt too long" | Max 4000 characters |
| "Invalid size" | Must be one of the three supported sizes |
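Several of the errors in the table can be caught locally before any request is sent. The sketch below assumes a hypothetical `validateRequest` helper and `RequestOptions` shape. It reuses the table's error labels and the limits stated in this document.

```typescript
const VALID_QUALITIES = ["low", "medium", "high", "auto"];
const VALID_SIZES = ["1024x1024", "1024x1536", "1536x1024"];
const MAX_PROMPT_CHARS = 4000;

// Hypothetical, loosely typed options as they might arrive from CLI flags.
interface RequestOptions {
  model: string;
  prompt: string;
  quality: string;
  size: string;
}

// Returns the table's error label for every problem found; an empty
// array means the request passes these local checks.
function validateRequest(opts: RequestOptions): string[] {
  const errors: string[] = [];
  if (opts.model !== "gpt-image-1") errors.push("Invalid model");
  if (!VALID_QUALITIES.includes(opts.quality)) errors.push("Invalid quality");
  if (opts.prompt.length > MAX_PROMPT_CHARS) errors.push("Prompt too long");
  if (!VALID_SIZES.includes(opts.size)) errors.push("Invalid size");
  return errors;
}
```

API key and rate limit problems cannot be detected this way, since they only surface once the API responds.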
Retry Logic
- Automatic exponential backoff for rate limits
- Max 3 retries by default
- Wait times: 1s, 2s, 4s between retries
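The retry schedule above (up to 3 retries, waiting 1s, 2s, then 4s) is a standard exponential backoff. The following is one way it might look, not the tool's actual implementation. `backoffDelaysMs`, `withRetry`, and the caller-supplied `isRateLimit` predicate are hypothetical names.

```typescript
// Delays before each retry: 1000, 2000, 4000 ms for the default of 3 retries.
function backoffDelaysMs(maxRetries: number = 3, baseMs: number = 1000): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    delays.push(baseMs * 2 ** attempt);
  }
  return delays;
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs `task`; when it fails with a rate-limit error, waits and tries again.
// `isRateLimit` is a hypothetical predicate supplied by the caller.
async function withRetry<T>(
  task: () => Promise<T>,
  isRateLimit: (err: unknown) => boolean,
  sleep: (ms: number) => Promise<void> = defaultSleep,
): Promise<T> {
  const delays = backoffDelaysMs();
  for (let attempt = 0; ; attempt++) {
    try {
      return await task();
    } catch (err) {
      // Give up on other errors, or once all retries are used.
      if (!isRateLimit(err) || attempt >= delays.length) throw err;
      await sleep(delays[attempt]);
    }
  }
}
```

With the defaults, a task that keeps hitting the rate limit runs four times in total (one initial attempt plus three retries) before the error is rethrown.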
Performance Optimization
For Speed
- Use --quality low for drafts
- Single image generation is faster than batch
- Pre-compile binary with ./build.sh
For Quality
- Use --quality high for production
- Size 1536x1024 or 1024x1536 for more detail
- Generate multiple variations with --count
For Cost
- Use --quality low (~$0.01/image)
- Batch similar prompts together
- Preview with low quality, finalize with high
Prompt Engineering Tips
Best Practices
- Be specific about style, colors, composition
- Include artistic references (e.g., "oil painting style")
- Specify lighting and mood
- Add technical details (e.g., "8K resolution", "photorealistic")
Example Prompts
# Detailed and specific
"A serene Japanese garden with a red wooden bridge over a koi pond, cherry blossoms in full bloom, soft morning light, watercolor painting style"
# Technical specification
"Modern minimalist logo design, geometric shapes, blue and white color scheme, flat design, vector art style, centered composition"
# Photorealistic
"Professional product photography of a luxury watch, black background, dramatic lighting, macro lens, high detail, commercial style"
Batch Processing
From File
# Create prompt file
cat > prompts.txt << EOF
A cute robot assistant
A magical forest at twilight
An abstract representation of joy
EOF
# Process all
gpt-image-gen batch prompts.txt --output-dir ./results/
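For callers that prepare prompt files in code, the file format shown above can be read with a small parser. This sketch assumes one prompt per line and that blank lines are skipped. The document does not say how the batch command treats blank lines, and `parsePromptFile` is a hypothetical helper.

```typescript
// Split the contents of a prompt file into individual prompts.
// Assumption: one prompt per line; blank lines are ignored.
function parsePromptFile(contents: string): string[] {
  return contents
    .split(/\r?\n/) // accept both Unix and Windows line endings
    .map((line) => line.trim())
    .filter((line) => line.length > 0);
}
```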
Programmatic Generation
# Generate variations
for style in "oil painting" "watercolor" "pencil sketch"; do
  gpt-image-gen "Portrait in $style style" \
    --output "portrait-${style// /-}.png"
done
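The same variation loop can be expressed in TypeScript when jobs are planned in code rather than in the shell. This is an illustrative sketch: the `Job` shape and `portraitJobs` helper are hypothetical, and the filename rule mirrors the `${style// /-}` substitution in the shell loop.

```typescript
interface Job {
  prompt: string;
  output: string;
}

// Spaces in the style name become hyphens in the output filename,
// like bash's ${style// /-}.
function portraitJobs(styles: string[]): Job[] {
  return styles.map((style) => ({
    prompt: `Portrait in ${style} style`,
    output: `portrait-${style.replace(/ /g, "-")}.png`,
  }));
}
```

For the three styles in the shell loop, this yields the output names `portrait-oil-painting.png`, `portrait-watercolor.png`, and `portrait-pencil-sketch.png`.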
Output Formats
Binary Image (Default)
- Automatic base64 to binary conversion
- Supports PNG, JPG, WebP output
- Preserves image quality
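The base64-to-binary step can be illustrated with a short decoder that also checks which of the three formats the bytes contain. This is a sketch under the assumption that a global `atob` is available (browsers, Bun, Deno, and recent Node versions). `decodeBase64` and `detectFormat` are hypothetical helper names.

```typescript
// Decode a base64 string into raw bytes.
function decodeBase64(b64: string): Uint8Array {
  const binary = atob(b64);
  const bytes = new Uint8Array(binary.length);
  for (let i = 0; i < binary.length; i++) {
    bytes[i] = binary.charCodeAt(i);
  }
  return bytes;
}

// Identify the image container from its leading magic bytes.
function detectFormat(bytes: Uint8Array): "png" | "jpeg" | "webp" | "unknown" {
  const startsWith = (sig: number[], offset = 0): boolean =>
    sig.every((b, i) => bytes[offset + i] === b);

  if (startsWith([0x89, 0x50, 0x4e, 0x47, 0x0d, 0x0a, 0x1a, 0x0a])) return "png";
  if (startsWith([0xff, 0xd8, 0xff])) return "jpeg";
  // WebP: "RIFF" at offset 0 and "WEBP" at offset 8.
  if (startsWith([0x52, 0x49, 0x46, 0x46]) && startsWith([0x57, 0x45, 0x42, 0x50], 8)) {
    return "webp";
  }
  return "unknown";
}
```

Checking the signature after decoding is a cheap way to confirm that the payload is an image before writing it to disk with a given extension.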
Raw Base64
- Use --base64 flag
- For piping to other tools
- For embedding in JSON/HTML
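For the embedding case, raw base64 is typically wrapped in a data URL. The helpers below are an illustrative sketch with hypothetical names (`toDataUrl`, `toImgTag`). The PNG default is an assumption about the image format.

```typescript
type ImageMime = "image/png" | "image/jpeg" | "image/webp";

// Wrap raw base64 in a data URL so it can be used as an <img> src
// or stored as a string field in JSON.
function toDataUrl(b64: string, mime: ImageMime = "image/png"): string {
  // Strip whitespace that may come from line-wrapped base64 output.
  return `data:${mime};base64,${b64.replace(/\s+/g, "")}`;
}

function toImgTag(b64: string, alt: string): string {
  // Escape characters that would break out of the alt attribute.
  const safeAlt = alt
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;");
  return `<img src="${toDataUrl(b64)}" alt="${safeAlt}">`;
}
```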
Metadata JSON
- Use --json flag
- Includes prompt, size, timestamp
- Useful for cataloging
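A catalog built from this metadata might use a record like the one below. The field names are assumptions derived from the "prompt, size, timestamp" description. The actual keys emitted by the tool are not specified here, and `makeMetadata` and `toCatalogLine` are hypothetical helpers.

```typescript
// Hypothetical metadata record; field names are assumptions.
interface ImageMetadata {
  prompt: string;
  size: string;
  timestamp: string; // ISO 8601, UTC
}

function makeMetadata(prompt: string, size: string, now: Date = new Date()): ImageMetadata {
  return { prompt, size, timestamp: now.toISOString() };
}

// Serialize one record per line (JSON Lines) so a catalog file can be
// appended to without rewriting it.
function toCatalogLine(meta: ImageMetadata): string {
  return JSON.stringify(meta);
}
```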
Cost Awareness
Estimated Costs
- Low quality: ~$0.01 per image
- Medium quality: ~$0.02 per image
- High quality: ~$0.04 per image
- Auto quality: ~$0.03 per image
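These per-image estimates make it easy to budget a run before starting it. The sketch below works in integer cents and uses only the approximate figures listed above. `estimateCostCents` and `formatUsd` are hypothetical helpers.

```typescript
type Quality = "low" | "medium" | "high" | "auto";

// Approximate per-image cost in US cents, taken from the estimates above.
const CENTS_PER_IMAGE: Record<Quality, number> = {
  low: 1,
  medium: 2,
  high: 4,
  auto: 3,
};

// Integer cents avoid floating-point rounding when totals are summed.
function estimateCostCents(quality: Quality, count: number): number {
  if (!Number.isInteger(count) || count < 0) {
    throw new RangeError("count must be a non-negative integer");
  }
  return CENTS_PER_IMAGE[quality] * count;
}

function formatUsd(cents: number): string {
  return `$${(cents / 100).toFixed(2)}`;
}
```

For example, previewing ten drafts at low quality and finalizing one image at high quality comes to 10 × 1 + 1 × 4 = 14 cents, which `formatUsd` renders as `$0.14`.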
Cost Optimization
- Test with low quality first
- Use specific sizes (not auto)
- Batch similar requests
- Cache generated images
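Caching can be as simple as an in-memory map keyed by the parameters that affect the result. This is a minimal sketch: the `ImageCache` class is hypothetical, and a real cache would likely persist to disk.

```typescript
interface CacheKeyParts {
  prompt: string;
  quality: string;
  size: string;
}

// In-memory cache keyed by prompt, quality, and size.
// A cache hit returns the earlier image rather than a fresh variation.
class ImageCache {
  private store = new Map<string, string>();

  private key(parts: CacheKeyParts): string {
    // JSON encoding keeps the three fields unambiguous within the key.
    return JSON.stringify([parts.prompt.trim(), parts.quality, parts.size]);
  }

  get(parts: CacheKeyParts): string | undefined {
    return this.store.get(this.key(parts));
  }

  set(parts: CacheKeyParts, base64Image: string): void {
    this.store.set(this.key(parts), base64Image);
  }
}
```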
Testing Commands
Basic Functionality
# Test API connection
gpt-image-gen "test" --quality low --quiet
# Test batch processing
printf 'cat\ndog\n' | gpt-image-gen batch /dev/stdin --output-dir ./test/
# Test configuration
gpt-image-gen config
Important Reminders
- Model Name: Always gpt-image-1, never DALL-E models
- Base64 Only: GPT-Image-1 only returns base64, not URLs
- Quality Values: Use low/medium/high/auto (not standard/hd)
- Size Limits: Only three sizes supported
- Rate Limits: Tool handles automatically with retries
- Max Prompt: 4000 characters
- Max Images: 10 per request
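When more than 10 images are needed, the total has to be split across several requests. The helper below is a hypothetical sketch of that arithmetic, using the 10-per-request limit stated above.

```typescript
const MAX_IMAGES_PER_REQUEST = 10;

// Split a total image count into per-request counts of at most 10.
function splitIntoRequests(total: number): number[] {
  if (!Number.isInteger(total) || total < 1) {
    throw new RangeError("total must be a positive integer");
  }
  const counts: number[] = [];
  for (let remaining = total; remaining > 0; remaining -= MAX_IMAGES_PER_REQUEST) {
    counts.push(Math.min(remaining, MAX_IMAGES_PER_REQUEST));
  }
  return counts;
}
```

A request for 25 images becomes three calls with counts 10, 10, and 5.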
Maintenance Notes
- Keep TypeScript types in sync with API changes
- Update cost estimates quarterly
- Test with latest Bun version
- Monitor OpenAI API deprecations
- Maintain backward compatibility