Model Reference

AI Model Type Overview

This page covers the most common text, image, and video models so you can quickly see how the model types differ and pick one to start with.

Not sure where to start? We recommend reading the beginner's guide first — it'll help you make a more informed decision.

Text Models

Language · Code · Reasoning

Best for article writing, customer service, Q&A, document processing, and coding assistance.

View Text Models →

Image Models

Generation · Editing · Design

Best for illustration generation, social media assets, concept art, and visual design work.

View Image Models →

Video Models

Generation · Animation · Clips

Best for short video generation, animated content, dynamic ads, and motion graphics.

View Video Models →

Text Models

The most widely used AI model type for content generation, translation, summarization, coding, and conversational AI.

Model Name	Best For / Use Case
`gpt-4o`	General-purpose flagship. Ideal for complex reasoning, multi-step tasks, and high-quality content generation.
`gpt-4.5-nano`	Lightweight customer service model. Fast, low-cost, and optimized for high-volume simple tasks.
`gpt-5.3-chat`	General conversation, daily writing tasks, and high-quality interactive dialogue.
`gpt-5.3-codex`	Code writing, debugging, refactoring, and development assistance.
`claude-opus-4.6`	Long-form content, in-depth analysis, complex problem solving, and reasoning tasks.
`claude-sonnet-4.6`	Long document processing, report writing, content creation, and knowledge Q&A.
`deepseek-v3.2`	Daily use, general content generation, recommendations, and high-quality text output at competitive cost.
`doubao-seed-2.0-pro`	Comprehensive Chinese text tasks — general Q&A and document generation.
`doubao-seed-2.0-code`	Programming assistance, code generation, debugging, and development support.
`doubao-seed-2.0-lite`	Short text generation, fast replies, and lightweight content tasks.
`doubao-seed-2.0-mini`	Basic question answering, lightweight generation, and simple content tasks.
`gemini 3 pro`	Multi-modal understanding, complex Q&A, creative writing, and cross-modal output.
`gemini-3-flash-preview`	Fast multi-modal tasks, smart Q&A, and lightweight output at speed.
`gemini-3.1-pro-preview`	Advanced reasoning, comprehensive tasks, and long-context document processing.
`GLM-4.7`	General conversation, Q&A, and reasoning tasks.
`grok4.2`	General text Q&A, content generation, and comprehensive tasks.
`Kimi-K2.5`	Long document processing, reading comprehension, and information retrieval.
`MiniMax-M1`	Customer service, content generation, and routine daily tasks.
`MiniMax-M2.7`	Comprehensive Q&A, content generation, and text processing.
`qwen3-vl-chat`	Document understanding, visual Q&A, and multi-modal content generation.
`qwen3-vl-plus`	More demanding visual tasks and advanced cross-modal reasoning.
`qwen3.5`	General text tasks, content generation, and comprehensive Q&A.
`qwen3.5-flash`	Low-cost fast output, simple Q&A, and lightweight content generation.
`qwen3.5-plus`	Comprehensive generation, content refinement, and single-task optimization.
`seed-2-0-mini`	Lightweight Q&A, simple generation, and quick short responses.

Image Models

Primarily used for illustration, social media assets, design drafts, and visual content creation. Essential for anyone needing high-quality visual output.

Model Name	Best For / Use Case
`imagen 4 fast`	Fast, high-quality visual generation for asset concepts, illustrations, and social media images.
`imagen-4-image-01`	High-quality image generation, creative concepts, and design drafts.
`kling-v3-omni-image`	Comprehensive image generation with multiple style applications and rich visual content.
`nano banana2`	Lightweight image generation with fast processing and quick output.
`qwen-image-2.0`	General image generation and illustration assets.
`qwen-image-2.0-pro`	Design proposal generation, high-quality image output, and advanced visual elements.
`qwen-image-max`	High-quality flagship images, social media assets, and professional visual content.
`qwen-image-plus`	Comprehensive image generation for everyday design requirements.
`seedream-4.5`	Illustration generation, brand visuals, style assets, and creative image generation.
`seedream-5.0-lite`	Fast image generation, lightweight asset creation, and simple visual concepts.
`wan2.6-t2i`	Text-to-image generation, concept illustrations, and asset creation.

Video Models

Primarily used for AI video clips, image-to-video, and dynamic ad content creation. Ideal for anyone needing AI-generated motion content.

Model Name	Best For / Use Case
`kling-v3`	Short video clip generation, dynamic content, and short-form advertising material.
`seedance-1-5-pro`	Text-to-video, animated short films, and dynamic advertising content.
`seedance-2.0`	General video generation, dynamic animations, and ad content creation.
`veo 3.1`	High-quality video generation with realistic scenes and cinematic visual output.
`wan2.5-i2v-preview`	Image-to-video generation — bring still images to life with motion.
`wan2.6-i2v-flash`	Fast image-to-video conversion with audio generation capabilities.
`wan2.6-r2v-flash`	Reference-image-to-video conversion with high-quality output.
`wan2.6-t2v`	Text-to-video generation, short clips, and script-driven visualization.

Common Questions About Model Types

If you're just getting started with AI, first identify what you want to do rather than memorizing model names. Look at the model categories (text, image, video), then read the beginner's guide on AI Token King. From there, try a few models and compare their outputs before committing.

The beginner's guide also includes a decision tree to help you pick a starting point based on your specific goal.

The three model types handle fundamentally different kinds of output:

Text models — Read text input, generate text output. Used for Q&A, writing, summarization, translation, and code.
Image models — Generate images from text prompts or other images. Used for design, illustration, and visual content.
Video models — Generate short video clips from text or images. Used for ads, animation, and social content.

Video models are generally the most expensive; text models tend to be the cheapest and most versatile.

You don't need to know every model. Think of the table like a menu: you don't need to try everything, just the dishes that match what you're hungry for. For most beginners, picking 2–3 models from the same category and comparing them is more than enough. The table is a reference, not a curriculum.
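A side-by-side comparison of a few models can be sketched as follows. This is a minimal illustration, not a real client: `compare_models` and `fake_generate` are hypothetical names, and `fake_generate` is a placeholder for whatever API call your provider actually offers.

```python
from typing import Callable, Dict, List


def compare_models(
    prompt: str,
    model_names: List[str],
    generate: Callable[[str, str], str],
) -> Dict[str, str]:
    """Send the same prompt to each model and collect outputs keyed by model name."""
    return {name: generate(name, prompt) for name in model_names}


def fake_generate(model: str, prompt: str) -> str:
    # Hypothetical placeholder: swap in a real API call for your provider.
    return f"[{model}] response to: {prompt}"


if __name__ == "__main__":
    # Three text models taken from the table above.
    candidates = ["gpt-4o", "claude-sonnet-4.6", "deepseek-v3.2"]
    results = compare_models(
        "Summarize this product update in two sentences.",
        candidates,
        fake_generate,
    )
    for name, output in results.items():
        print(f"{name}: {output}")
```

Keeping the prompt fixed and varying only the model makes the outputs directly comparable.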

If your primary need is written content (blogs, emails, scripts, SEO), start with text models. We recommend beginning with established models such as `gpt-4o` or `claude-sonnet-4.6`, which tend to have extensive documentation and large user communities.

Once you're comfortable with text generation, you can layer in image or video models for visual assets. But for pure content creation, text models alone will cover the vast majority of your needs.

Price and performance are important, but they are not the only things to weigh. Other factors matter too:

Context window — How much text can the model handle at once?
Language support — Some models are stronger in specific languages.
API reliability — Uptime, rate limits, and latency matter for production apps.
Fine-tuning availability — Can you customize the model for your use case?

The AI Token King comparison tool covers all of these dimensions, not just price per token.

Many production workflows chain multiple model types together. A common pattern: use a text model to generate a script or description, pass that to an image model to create visuals, then feed the image into a video model to animate it. This multi-model pipeline approach is increasingly common for content teams and agencies.
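The text, image, and video chain described above might look roughly like this in code. It is only a sketch under stated assumptions: `run_pipeline`, `PipelineResult`, and the three `stub_*` functions are hypothetical names, and each stub stands in for a real model call that would return generated content or a file reference.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PipelineResult:
    script: str     # output of the text model
    image_ref: str  # reference (path or URL) to the generated image
    video_ref: str  # reference (path or URL) to the generated video


def run_pipeline(
    brief: str,
    text_model: Callable[[str], str],
    image_model: Callable[[str], str],
    video_model: Callable[[str], str],
) -> PipelineResult:
    """Chain text -> image -> video, passing each stage's output to the next."""
    script = text_model(brief)
    image_ref = image_model(script)
    video_ref = video_model(image_ref)
    return PipelineResult(script, image_ref, video_ref)


# Hypothetical stand-ins for real API calls.
def stub_text(brief: str) -> str:
    return f"Script for: {brief}"


def stub_image(script: str) -> str:
    return "image-001.png"


def stub_video(image_ref: str) -> str:
    return image_ref.replace(".png", ".mp4")


result = run_pipeline(
    "30-second ad for a coffee brand", stub_text, stub_image, stub_video
)
print(result)
```

Passing the stages in as plain callables keeps the pipeline independent of any one provider, so a single stage can be swapped for a different model without touching the others.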

Ready to compare API pricing?

Now that you know the model types, see exactly how much each one costs per million tokens — and find the best fit for your budget.
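As a rough illustration of how per-million-token pricing turns into a bill, the arithmetic can be sketched as below. The function name `estimate_cost` and the prices used are hypothetical, and the split into separate input and output prices is an assumption; check the pricing table for real figures.

```python
def estimate_cost(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float,
    output_price_per_million: float,
) -> float:
    """Estimate the cost of a request from token counts and per-million-token prices."""
    input_cost = (input_tokens / 1_000_000) * input_price_per_million
    output_cost = (output_tokens / 1_000_000) * output_price_per_million
    return input_cost + output_cost


# Hypothetical prices: 2.00 per million input tokens, 8.00 per million output tokens.
cost = estimate_cost(
    input_tokens=200_000,
    output_tokens=50_000,
    input_price_per_million=2.00,
    output_price_per_million=8.00,
)
print(f"Estimated cost: {cost:.2f}")  # prints "Estimated cost: 0.80"
```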

View Pricing Table →