API Configuration
⚠️ Security: Keys are stored locally in your browser and sent directly to the API providers. No intermediate server is used.
OpenAI
The Conversational Creator
Best when you need excellent prompt adherence and text-heavy designs.
Strengths
- Unmatched prompt adherence
- Extremely beginner-friendly
- Best-in-class for text in images
- Strong, harmonious compositions
Limitations
- Loss of direct control due to auto-prompting
- Photorealism can lag behind competitors
- Strict safety filters block public figures
Sample Prompts
Text-to-Image
A futuristic sci-fi weapon prop, metallic silver with glowing blue energy cores, tactical grip, floating in dramatic studio lighting
Image-to-Image
Transform this into a cyberpunk aesthetic with neon accents and weathered metal texture
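To make the text-to-image sample concrete, the following is a minimal sketch of a request that carries the prompt straight to the provider, with the key in the `Authorization` header and no intermediate host involved. It assumes OpenAI's public Images endpoint and the `gpt-image-1` model ID listed under Available Models; the helper name `build_image_request` and the placeholder key are hypothetical, and the page itself does not show how it builds its requests.

```python
import json
import urllib.request

# Assumption: endpoint and payload shape follow OpenAI's public Images API.
OPENAI_IMAGES_URL = "https://api.openai.com/v1/images/generations"


def build_image_request(api_key, prompt, model="gpt-image-1"):
    """Build (but do not send) a text-to-image request addressed to the provider."""
    payload = json.dumps({"model": model, "prompt": prompt}).encode("utf-8")
    return urllib.request.Request(
        OPENAI_IMAGES_URL,
        data=payload,
        headers={
            # The key travels only in this header, directly to the provider host.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_image_request(
    api_key="sk-your-key-here",  # placeholder, not a real key
    prompt=(
        "A futuristic sci-fi weapon prop, metallic silver with glowing blue "
        "energy cores, tactical grip, floating in dramatic studio lighting"
    ),
)
# Sending is left out on purpose; urllib.request.urlopen(request) would perform the call.
```

Building the request separately from sending it keeps the key handling easy to inspect: the only place the key appears is the header of a request whose URL is the provider's own.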
Available Models
| Model Name | Model ID | Capabilities | Description |
| --- | --- | --- | --- |
| GPT-5 | gpt-5-chat-latest | Analysis, Text | Smartest non-reasoning model. |
| GPT-5 Nano | gpt-5-nano | Analysis, Text | Fast, cost-efficient reasoning. |
| GPT-4.1 | gpt-4.1 | Analysis, Text | Previous generation flagship. |
| GPT-Image 1 | gpt-image-1 | Txt2Img, Img2Img | Latest flagship image generation with editing. |
| DALL-E 3 | dall-e-3 | Txt2Img | Popular text-to-image with excellent prompt adherence. |
Google Gemini
The Photorealistic Powerhouse
Best for polished, photoreal visuals and production-ready commercial art.
Strengths
- Superior photorealism (Imagen)
- Professional-grade editing suite
- Unique multimodal reasoning
- Allows negative prompts
Limitations
- Dual-model system (Gemini and Imagen) can be confusing
- Very strict content filters
- Can miss fine details
Sample Prompts
Text-to-Image
Photorealistic medieval sword, damascus steel blade with intricate engravings, leather-wrapped handle, museum quality lighting, 8K detail
Image-to-Image
Enhance this prop with realistic metal textures and professional studio lighting. Negative: cartoon, plastic, low quality
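The image-to-image sample above embeds its negative terms inline after a `Negative:` marker. A minimal sketch of how such a prompt could be split into a positive instruction and a list of negative terms follows. The function name `split_negative_prompt` is hypothetical, and the `Negative:` convention is taken only from the sample; whether the app or the provider parses it this way is an assumption.

```python
def split_negative_prompt(prompt, marker="Negative:"):
    """Split a prompt into (positive_text, negative_terms) at the marker.

    Returns the whole prompt and an empty list when no marker is present.
    """
    positive, found, negative = prompt.partition(marker)
    if not found:
        return prompt.strip(), []
    # Negative terms are assumed to be comma-separated, as in the sample prompt.
    terms = [term.strip() for term in negative.split(",") if term.strip()]
    return positive.strip(), terms


positive, negatives = split_negative_prompt(
    "Enhance this prop with realistic metal textures and professional "
    "studio lighting. Negative: cartoon, plastic, low quality"
)
print(positive)   # the instruction without the negative clause
print(negatives)  # ['cartoon', 'plastic', 'low quality']
```

Keeping the two parts separate makes it possible to pass the negative terms in whatever form a given model expects, or to drop them for providers that do not support negative prompts.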
Available Models
| Model Name | Model ID | Capabilities | Description |
| --- | --- | --- | --- |
| Gemini 2.5 Flash | gemini-2.5-flash | Analysis, Text | Fastest and most cost-efficient multimodal model. |
| Imagen 4 (Standard) | imagen-4.0-generate-001 | Txt2Img, Img2Img | Production-ready for photoreal props and product renders. |
| Imagen 4 Ultra | imagen-4.0-ultra-generate-001 | Txt2Img, Img2Img | Highest fidelity for hero renders and marketing artwork. |
| Imagen 3 | imagen-3.0-generate-002 | Txt2Img, Img2Img | Previous generation high-quality model with editing. |
xAI Grok
The Unfiltered Realist
Best for boundary-pushing, loosely moderated images informed by real-time culture.
Strengths
- Excellent photorealism for people
- Creative freedom (public figures, styles)
- Strong on real-world entities
- Real-time intelligence and vision
Limitations
- Highly inconsistent results
- Poor instruction adherence
- Ethical and legal risks
Sample Prompts
Text-to-Image
Realistic tactical gear inspired by modern special forces, modular pouches, weathered fabric, dramatic action shot
Advanced
Celebrity cosplay prop, screen-accurate replica with battle damage and LED effects
Available Models
| Model Name | Model ID | Capabilities | Description |
| --- | --- | --- | --- |
| Grok 4.1 Fast (Reasoning) | grok-4-1-fast-reasoning | Analysis, Text | Best tool-calling with 2M context, includes reasoning mode. |
| Grok 4 | grok-4 | Analysis, Text | Flagship with real-time intelligence and vision. |
| Grok 2 Image (Latest) | grok-2-image | Txt2Img | Latest text-to-image generation model. |
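The three Available Models tables can be treated as one catalog that is filtered by capability, for example to find every model that accepts an input image. The sketch below copies the model IDs and capability tags from the tables; the `Model` class, the `CATALOG` list, and the `models_with` helper are hypothetical names used for illustration, not part of the app.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    provider: str
    model_id: str
    capabilities: tuple  # tags as shown in the tables: Analysis, Text, Txt2Img, Img2Img


# IDs and capabilities copied from the Available Models tables.
CATALOG = [
    Model("OpenAI", "gpt-5-chat-latest", ("Analysis", "Text")),
    Model("OpenAI", "gpt-5-nano", ("Analysis", "Text")),
    Model("OpenAI", "gpt-4.1", ("Analysis", "Text")),
    Model("OpenAI", "gpt-image-1", ("Txt2Img", "Img2Img")),
    Model("OpenAI", "dall-e-3", ("Txt2Img",)),
    Model("Google Gemini", "gemini-2.5-flash", ("Analysis", "Text")),
    Model("Google Gemini", "imagen-4.0-generate-001", ("Txt2Img", "Img2Img")),
    Model("Google Gemini", "imagen-4.0-ultra-generate-001", ("Txt2Img", "Img2Img")),
    Model("Google Gemini", "imagen-3.0-generate-002", ("Txt2Img", "Img2Img")),
    Model("xAI Grok", "grok-4-1-fast-reasoning", ("Analysis", "Text")),
    Model("xAI Grok", "grok-4", ("Analysis", "Text")),
    Model("xAI Grok", "grok-2-image", ("Txt2Img",)),
]


def models_with(capability, provider=None):
    """Return catalog entries offering a capability, optionally for one provider."""
    return [
        model
        for model in CATALOG
        if capability in model.capabilities
        and (provider is None or model.provider == provider)
    ]


# Models that can take an existing image as input.
for model in models_with("Img2Img"):
    print(model.provider, model.model_id)
```

Per the tables, only `gpt-image-1` and the Imagen models list Img2Img, so an image-to-image prompt would need to be routed to one of those IDs.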