AI Video & Image Generation API — Kling, Google, xAI, OpenAI

Unified API for 95+ AI models. Pay-per-use, no subscriptions. One integration for video, image, audio, and text generation.

Models

Video Generation

Kling

Kling 3.0 supports single-shot and multi-shot video generation with element references, start/end frames, and advanced mode controls.

Google

Google Veo 3.1 is Google DeepMind's state-of-the-art text-to-video model with both Quality and Fast generation modes. Ideal for filmmakers and content creators requiring realistic video from text descriptions. Generates videos at 16:9 or 9:16 aspect ratios with support for video extension chains.

xAI

Grok Imagine Text-to-Video generates videos from text prompts using xAI's video generation model with multiple creative modes. Perfect for social media content, animated scenes, and dynamic visual storytelling. Features Fun, Normal, and Spicy generation modes with multiple aspect ratio options.

OpenAI

Sora 2 is OpenAI's breakthrough text-to-video AI capable of creating realistic scenes from text descriptions. Perfect for content creators and marketers needing quick video generation. Generates 10-15 second clips in portrait or landscape orientation with character consistency support.

Midjourney

Midjourney Video (SD) creates standard-definition video animations from source images. A cost-effective solution for social media content and rapid video prototyping. Offers batch sizes of 1-4 videos with adjustable motion intensity and all standard Midjourney controls.

ByteDance

Seedance 1.5 Pro is ByteDance's high-quality text-to-video model with optional native audio generation. Best for marketing videos, social content, and promotional clips. Supports 4-12 second clips at up to 1080p with a fixed-camera option.

Hailuo

Hailuo 02 Standard Image-to-Video is MiniMax's budget-friendly animation model with duration and resolution control. Ideal for cost-effective video production and social content. Supports 6-10 second clips at 512p or 768p with an optional end frame.

Luma

Luma Ray 2 is Luma Labs' video-to-video AI model that transforms existing footage based on text prompts. Perfect for creative video editing, style transfer, and modifying video content without re-shooting. Accepts MP4/MOV/AVI input videos up to 10 seconds and 500MB, with optional watermark support.

Runway

Runway Gen-3 Alpha is Runway's advanced text-to-video model creating 5-10 second cinematic clips. Designed for filmmakers and video professionals requiring high-quality AI-generated footage. Offers 720p and 1080p HD output with 16:9, 9:16, 1:1, 4:3, and 3:4 aspect ratios plus video extension capability.

Wan

Wan 2.6 Text-to-Video is Alibaba's flagship AI video generation model creating high-quality videos from text descriptions in Chinese and English. Perfect for filmmakers, advertisers, and content creators needing professional video content. Generates 5-15 second videos at 720p or 1080p resolution.

Topaz

Topaz Video Upscale delivers AI-powered video enhancement, upscaling footage up to 4x resolution with exceptional detail preservation. Ideal for restoring old video content, improving low-resolution clips, and preparing videos for 4K displays. Supports MP4, MOV, and MKV formats with intelligent artifact removal.

LTXV

LTXV-2 is Lightricks' flagship text-to-video and image-to-video AI model delivering high-quality video generation with integrated audio support. Best for creating professional marketing content, social media videos, and visual storytelling at resolutions up to 4K (2160p). Supports 6-10 second clips at 25 or 50 fps with an optional starting frame for precise creative control.

Infinitalk

Infinitalk Audio-to-Video is a MeiGen-AI lip-sync video generation model that creates talking head videos from audio and portrait images. Ideal for virtual presenters, educational content, and social media creators needing realistic talking avatar videos. Supports 480p and 720p resolution with customizable prompts for scene guidance.

Image Generation

Google

Nano Banana Pro (Gemini 3 Pro) is Google's advanced text-to-image model featuring exceptional structural accuracy and precise text rendering capabilities. Ideal for creating images with embedded text, logos, or detailed compositions. Outputs up to 4K resolution with support for 10+ aspect ratios including ultra-wide 21:9.

xAI

Grok Imagine Text-to-Image is xAI's photorealistic image generation model creating high-quality visuals from text prompts. Ideal for generating realistic photographs, product mockups, and lifelike scenes. Supports multiple aspect ratios including 1:1, 2:3, 3:2, 16:9, and 9:16 formats.

OpenAI

GPT Image 1.5 is OpenAI's photorealistic text-to-image generation model with medium and high quality options. Designed for creators needing production-ready images from natural language prompts. Supports 1:1, 2:3, and 3:2 aspect ratios with detailed quality control.

Midjourney

Midjourney is the industry-leading AI image generator known for stunning artistic and photorealistic outputs. Ideal for concept artists, designers, and creative professionals seeking distinctive visual styles. Supports versions v5.1 through v7 plus Niji 6 for anime, with extensive customization via stylization, weirdness, and variety controls.

ByteDance

Seedream 5.0 Lite is ByteDance's budget-friendly text-to-image model offering solid quality at lower cost. Supports multiple aspect ratios with 2K to 3K resolution output. Great for rapid prototyping and high-volume image generation where cost efficiency matters.

Qwen

Qwen Text-to-Image is Alibaba's powerful image generation model producing high-quality visuals from text prompts with fine-grained control. Excellent for diverse creative applications with adjustable quality steps, guidance scale, and acceleration modes. Supports multiple image sizes with PNG and JPEG output formats.

Topaz

Topaz Image Upscaler uses industry-leading AI enhancement technology to upscale images up to 8x their original resolution. Best for restoring old photos, preparing images for large-format printing, and enhancing low-resolution graphics. Preserves fine details and textures while eliminating compression artifacts and noise.

Ideogram

Ideogram V3 Reframe intelligently expands images to different aspect ratios with AI-generated fill that seamlessly matches the original content. Essential for repurposing images across social media formats and adapting content for different platforms. Supports multiple output sizes with Turbo, Balanced, and Quality rendering speeds.

Recraft

Recraft Crisp Upscale enhances image resolution using AI-powered upscaling that preserves sharpness and detail. Perfect for enlarging images for print, improving low-resolution assets, and preparing visuals for high-DPI displays. Accepts PNG, JPG, and WEBP images up to 10MB.

Flux

FLUX.2 Pro is Black Forest Labs' premium text-to-image model delivering high-quality image generation from detailed prompts. Ideal for professional artwork, marketing visuals, and creative projects requiring exceptional fidelity. Supports 1K and 2K resolutions with multiple aspect ratios including square, portrait, and landscape formats.

Audio Generation

Suno

Suno Music is the flagship AI music generation model from Suno capable of creating complete songs with vocals and instrumentals from text prompts. Perfect for content creators, game developers, and musicians needing royalty-free original music. Features custom mode for full creative control, multi-track output, stem separation, WAV/MP4 export, and MIDI conversion with V4, V4.5, and V5 model options.

ElevenLabs

ElevenLabs TTS Multilingual v2 is a premium text-to-speech model from ElevenLabs featuring 21 natural-sounding voices across multiple languages. Perfect for audiobook production, video narration, and accessibility applications. Offers adjustable stability, similarity, style, and speed controls for fine-tuned voice output.

Text Generation

OpenAI

High-capability text model with tool calling and long context support.

Anthropic

Claude Opus 4.5 is Anthropic's model for advanced reasoning and coding workflows.

Pricing

Pay-per-use. 1 GL = $0.01 USD. No subscriptions, no monthly fees. Top up any amount.
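
As a quick illustration of the arithmetic, the sketch below converts a GL amount to US dollars at the stated rate. The helper name `gl_to_usd` is invented for this example and is not part of the Glio API, and the sketch assumes GL amounts are whole numbers, which the pricing line does not confirm.

```python
from decimal import Decimal

# Rate taken from the pricing line: 1 GL = $0.01 USD.
GL_TO_USD = Decimal("0.01")


def gl_to_usd(gl: int) -> Decimal:
    """Convert a whole-number GL amount to US dollars (hypothetical helper)."""
    return Decimal(gl) * GL_TO_USD


print(gl_to_usd(250))  # prints 2.50
```

Using `Decimal` rather than floats keeps the cents exact when amounts are summed across many jobs.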

Quick Start

Base URL: https://api.glio.io

Auth: Authorization: Bearer YOUR_API_KEY

Create job: POST /v1/jobs with {"model": "model-alias", "params": {...}}
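
A minimal sketch of the create-job call using only the Python standard library. It builds the request without sending it, so the pieces can be inspected first. The `Content-Type` header, the `prompt` key inside `params`, and the function name `build_create_job_request` are assumptions for illustration; the actual parameter names depend on the model and are not given here.

```python
import json
import urllib.request

BASE_URL = "https://api.glio.io"


def build_create_job_request(api_key: str, model: str, params: dict) -> urllib.request.Request:
    """Build (but do not send) a POST /v1/jobs request."""
    body = json.dumps({"model": model, "params": params}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/v1/jobs",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            # Assumed: the API expects a JSON request body.
            "Content-Type": "application/json",
        },
    )


# "model-alias" is the placeholder from the text above; "prompt" is a
# hypothetical parameter name used only for this example.
req = build_create_job_request("YOUR_API_KEY", "model-alias", {"prompt": "a red fox in snow"})
print(req.get_method(), req.full_url)
```

To actually submit the job, pass `req` to `urllib.request.urlopen` and decode the response body as JSON.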

Poll status: GET /v1/jobs/{id}
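
Polling could look roughly like the sketch below. The response schema is not documented in this section, so the `status` field and the terminal values `completed` and `failed` are assumptions, as are the helper names `fetch_job` and `wait_for_job`. Adjust them to match the real responses.

```python
import json
import time
import urllib.request
from typing import Callable

BASE_URL = "https://api.glio.io"

# Assumed terminal states; the source does not list the actual status values.
TERMINAL_STATUSES = {"completed", "failed"}


def fetch_job(api_key: str, job_id: str) -> dict:
    """GET /v1/jobs/{id} and return the decoded JSON body."""
    req = urllib.request.Request(
        f"{BASE_URL}/v1/jobs/{job_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def wait_for_job(
    fetch: Callable[[], dict],
    interval: float = 2.0,
    max_attempts: int = 150,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch() until the job reaches an assumed terminal status."""
    for _ in range(max_attempts):
        job = fetch()
        if job.get("status") in TERMINAL_STATUSES:
            return job
        sleep(interval)
    raise TimeoutError(f"job not finished after {max_attempts} polls")
```

A typical call would be `wait_for_job(lambda: fetch_job("YOUR_API_KEY", job_id))`. Passing the fetch function in keeps the loop testable without network access.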
