API Configuration
⚠️ Security: Keys are stored locally in your browser and sent directly to the API providers. No intermediate server is used.
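As a rough illustration of what "stored locally and sent directly" could look like, the sketch below keeps a key in a `Storage`-like object (in a browser this would typically be `window.localStorage`) and prepares a request aimed straight at the provider. The `KeyStore` interface, the `MemoryStore` stand-in, the helper names, and the storage key name `apiKey:openai` are assumptions for illustration, not taken from this app. The endpoint and the `Authorization: Bearer` header follow OpenAI's public Images API.

```typescript
// Minimal subset of the Web Storage API, so the sketch also runs outside a browser.
interface KeyStore {
  getItem(name: string): string | null;
  setItem(name: string, value: string): void;
}

// In-memory stand-in; in a browser you would pass window.localStorage instead.
class MemoryStore implements KeyStore {
  private items = new Map<string, string>();
  getItem(name: string): string | null {
    const value = this.items.get(name);
    return value === undefined ? null : value;
  }
  setItem(name: string, value: string): void {
    this.items.set(name, value);
  }
}

interface PreparedRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical storage key name, not the one this app uses.
const OPENAI_KEY_NAME = "apiKey:openai";

// Save the key on the user's machine only.
function saveOpenAiKey(store: KeyStore, key: string): void {
  store.setItem(OPENAI_KEY_NAME, key.trim());
}

// Build a request that goes directly to the provider, with no proxy in between.
function buildImageRequest(store: KeyStore, prompt: string): PreparedRequest {
  const key = store.getItem(OPENAI_KEY_NAME);
  if (!key) {
    throw new Error("No OpenAI key saved in this browser");
  }
  return {
    url: "https://api.openai.com/v1/images/generations",
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${key}`,
    },
    body: JSON.stringify({ model: "gpt-image-1", prompt }),
  };
}
```

The prepared object could then be sent with `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`. Note that anything in browser storage is readable by scripts running on the same origin, so a key kept this way is only as safe as the page itself.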
OpenAI
The Conversational Creator
Best when you need excellent prompt adherence and text-heavy designs.
Strengths
- Unmatched prompt adherence
- Extremely beginner-friendly
- Best-in-class for text in images
- Strong, harmonious compositions
Limitations
- Loss of direct control due to auto-prompting
- Photorealism can lag behind competitors
- Strict safety filters block public figures
Sample Prompts
Text-to-Image
A refined modern living room with warm oak finishes, layered lighting, soft shadows, and clean editorial composition
Image-to-Image
Transform this into a cyberpunk aesthetic with neon accents and weathered metal texture
Available Models
| Model Name | Model ID | Capabilities | Description |
| --- | --- | --- | --- |
| GPT-5 | gpt-5-chat-latest | Analysis, Text | Smartest non-reasoning model. |
| GPT-5 Nano | gpt-5-nano | Analysis, Text | Fast, cost-efficient reasoning. |
| GPT-4.1 | gpt-4.1 | Analysis, Text | Previous generation flagship. |
| GPT-Image 1 | gpt-image-1 | Txt2Img, Img2Img | Latest flagship image generation with editing. |
| DALL-E 3 | dall-e-3 | Txt2Img | Popular text-to-image with excellent prompt adherence. |
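The Capabilities column can drive model selection in code. The sketch below transcribes the OpenAI table into a typed list and filters it by capability. The `ModelInfo` type, the `OPENAI_MODELS` constant, and the `modelsWith` helper are hypothetical names for illustration and are not part of any provider SDK.

```typescript
type Capability = "Analysis" | "Text" | "Txt2Img" | "Img2Img";

interface ModelInfo {
  name: string;
  id: string;
  capabilities: Capability[];
}

// Rows copied from the Available Models table for OpenAI.
const OPENAI_MODELS: ModelInfo[] = [
  { name: "GPT-5", id: "gpt-5-chat-latest", capabilities: ["Analysis", "Text"] },
  { name: "GPT-5 Nano", id: "gpt-5-nano", capabilities: ["Analysis", "Text"] },
  { name: "GPT-4.1", id: "gpt-4.1", capabilities: ["Analysis", "Text"] },
  { name: "GPT-Image 1", id: "gpt-image-1", capabilities: ["Txt2Img", "Img2Img"] },
  { name: "DALL-E 3", id: "dall-e-3", capabilities: ["Txt2Img"] },
];

// Return the IDs of all models that list the requested capability, in table order.
function modelsWith(models: ModelInfo[], capability: Capability): string[] {
  return models
    .filter((m) => m.capabilities.indexOf(capability) !== -1)
    .map((m) => m.id);
}
```

For example, `modelsWith(OPENAI_MODELS, "Img2Img")` returns `["gpt-image-1"]`, so an app could hide the Image-to-Image prompt box when `dall-e-3` is selected.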
Google Gemini
The Photorealistic Powerhouse
Best for polished, photoreal visuals and production-ready commercial art.
Strengths
- Superior photorealism (Imagen)
- Professional-grade editing suite
- Unique multimodal reasoning
- Allows negative prompts
Limitations
- Dual-model system can be confusing
- Very strict content filters
- Can miss fine details
Sample Prompts
Text-to-Image
Photorealistic boutique hotel lobby with stone flooring, bronze accents, sculptural seating, natural daylight, and premium material detail
Image-to-Image
Enhance this room with realistic material textures, balanced daylight, and premium finish detail. Negative: cartoon, plastic, low quality
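The Image-to-Image sample appends its exclusions after a `Negative:` marker. One way an app might separate the two parts before building a request is sketched below. The `Negative:` convention comes from the sample prompt; the `splitPrompt` helper and its return shape are assumptions. Whether a given Imagen model accepts a separate negative-prompt field, and under what name, should be checked against the provider's current documentation.

```typescript
interface SplitPrompt {
  prompt: string;     // the positive instruction
  negative: string[]; // terms the image should avoid
}

// Split "main text. Negative: a, b, c" into its positive and negative parts.
function splitPrompt(raw: string): SplitPrompt {
  const match = /\bnegative:/i.exec(raw);
  if (!match) {
    // No marker: the whole input is the positive prompt.
    return { prompt: raw.trim(), negative: [] };
  }
  const prompt = raw.slice(0, match.index).trim();
  const negative = raw
    .slice(match.index + match[0].length)
    .split(",")
    .map((term) => term.trim())
    .filter((term) => term.length > 0);
  return { prompt, negative };
}
```

Applied to the sample above, this yields the positive prompt ending in "premium finish detail." and the negative list `["cartoon", "plastic", "low quality"]`.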
Available Models
| Model Name | Model ID | Capabilities | Description |
| --- | --- | --- | --- |
| Gemini 2.5 Flash | gemini-2.5-flash | Analysis, Text | Fastest and most cost-efficient multimodal model. |
| Imagen 4 (Standard) | imagen-4.0-generate-001 | Txt2Img, Img2Img | Production-ready for photoreal rooms, furniture, and commercial visualization. |
| Imagen 4 Ultra | imagen-4.0-ultra-generate-001 | Txt2Img, Img2Img | Highest fidelity for flagship renders and polished marketing visuals. |
| Imagen 3 | imagen-3.0-generate-002 | Txt2Img, Img2Img | Previous generation high-quality model with editing. |
xAI Grok
The Unfiltered Realist
Best for boundary-pushing, loosely moderated images that draw on real-time cultural context.
Strengths
- Excellent photorealism for people
- Creative freedom (public figures, styles)
- Strong on real-world entities
- Real-time intelligence and vision
Limitations
- Highly inconsistent results
- Poor instruction adherence
- Ethical and legal risks
Sample Prompts
Text-to-Image
A moody downtown cafe interior with cinematic contrast, reflective surfaces, bold styling, and realistic nighttime ambiance
Advanced
Turn this room into a bold nightlife lounge with richer contrast, dramatic practical lighting, and more expressive material styling
Available Models
| Model Name | Model ID | Capabilities | Description |
| --- | --- | --- | --- |
| Grok 4.1 Fast (Reasoning) | grok-4-1-fast-reasoning | Analysis, Text | Best tool-calling with 2M context, includes reasoning mode. |
| Grok 4 | grok-4 | Analysis, Text | Flagship with real-time intelligence and vision. |
| Grok 2 Image (Latest) | grok-2-image | Txt2Img | Latest text-to-image generation model. |