A new video presenting the current ranking of the top 150 large language models (LLMs) for text-based tasks has been published on YouTube. The video is based on data from LMArena, one of the most popular platforms for comparing modern GenAI models on real-world text tasks. On the platform, language models compete against each other on the same tasks, and the results are presented as a dynamic, continuously updated ranking.
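To illustrate how head-to-head comparisons can turn into a ranking, here is a minimal Python sketch that applies an Elo-style update to pairwise results. It is an assumption made for illustration only: the text above does not describe LMArena's actual scoring method, and the model names, battle records, and the `elo_ratings` and `leaderboard` helpers are all hypothetical.

```python
from collections import defaultdict


def elo_ratings(battles, k=32.0, base=1000.0):
    """Compute Elo-style ratings from (model_a, model_b, winner) records.

    `winner` is "a", "b", or "tie". Illustrative only; this is not
    claimed to be the method used by any real leaderboard.
    """
    ratings = defaultdict(lambda: base)
    for model_a, model_b, winner in battles:
        ra, rb = ratings[model_a], ratings[model_b]
        # Expected score of model_a given the current rating gap.
        expected_a = 1.0 / (1.0 + 10 ** ((rb - ra) / 400.0))
        score_a = {"a": 1.0, "b": 0.0, "tie": 0.5}[winner]
        # Zero-sum update: what one model gains, the other loses.
        delta = k * (score_a - expected_a)
        ratings[model_a] = ra + delta
        ratings[model_b] = rb - delta
    return dict(ratings)


def leaderboard(ratings):
    """Return (model, rating) pairs sorted from highest to lowest rating."""
    return sorted(ratings.items(), key=lambda item: item[1], reverse=True)


# Hypothetical battle records, not real data.
battles = [
    ("model-x", "model-y", "a"),
    ("model-y", "model-z", "a"),
    ("model-x", "model-z", "a"),
    ("model-z", "model-y", "tie"),
]

ratings = elo_ratings(battles)
for rank, (name, rating) in enumerate(leaderboard(ratings), start=1):
    print(f"{rank}. {name}: {rating:.1f}")
```

Because each update adds to one rating exactly what it subtracts from the other, the total stays constant, and the order shifts as new results arrive, which is what makes such a ranking dynamic.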
The ranking is based on data from December 30, 2025. The live ranking is available on the LMArena website in the “Text Arena” section. Below is a list of the models included in the ranking.
Google — Gemini
- Gemini 3 Pro
- Gemini 3 Flash
- Gemini 3 Flash (Thinking-Minimal)
- Gemini 2.5 Pro
- Gemini 2.5 Flash
- Gemini 2.5 Flash Preview (09-2025)
- Gemini 2.5 Flash Lite Preview (No Thinking)
- Gemini 2.5 Flash Lite Preview (Thinking)
- Gemini 2.0 Flash
- Gemini 2.0 Flash Lite Preview
- Gemini 1.5 Pro (001 / 002)
- Gemini Advanced (0514)
OpenAI — GPT / ChatGPT / o-Series
- GPT-5.2 / GPT-5.2 High
- GPT-5.1 / GPT-5.1 High
- GPT-5 High / GPT-5 Mini High / GPT-5 Nano High
- GPT-5 Chat
- GPT-4.5 Preview
- GPT-4.1 / GPT-4.1 Mini / GPT-4.1 Nano
- GPT-4o / GPT-4o Mini
- GPT-4 Turbo
- ChatGPT-4o (Latest)
- o4-Mini
- o3 / o3-Mini / o3-Mini High
- o1 / o1 Preview / o1 Mini
Anthropic — Claude
- Claude Opus 4.5 (Standard & Thinking)
- Claude Opus 4.1 (Standard & Thinking)
- Claude Opus 4 (2025-05-14)
- Claude Sonnet 4.5 (Standard & Thinking)
- Claude Sonnet 4 (2025-05-14)
- Claude Haiku 4.5
- Claude 3.7 Sonnet (Standard & Thinking)
- Claude 3.5 Sonnet
- Claude 3.5 Haiku
- Claude 3 Opus
xAI — Grok
- Grok 4.1 / Grok 4.1 Thinking
- Grok 4 Fast / Grok 4 Fast Reasoning
- Grok 4 (0709)
- Grok 3 Preview
- Grok 3 Mini High / Mini Beta
- Grok 2
Alibaba — Qwen / QwQ
- Qwen3 Max / Qwen3 Max Preview
- Qwen3-235B (Instruct / Thinking / No-Thinking)
- Qwen3-VL-235B (Instruct & Thinking)
- Qwen3 Next-80B (Instruct & Thinking)
- Qwen3 Coder-480B
- Qwen3-32B / Qwen3-30B
- Qwen 2.5 Max
- Qwen Plus
- Qwen Max
- QwQ-32B
DeepSeek
- DeepSeek-V3.2 (Standard & Thinking)
- DeepSeek-V3.1 (Standard / Thinking / Terminus)
- DeepSeek-V3
- DeepSeek-V2.5
- DeepSeek-R1 / R1-0528
Zhipu AI — GLM
- GLM-4.7 / 4.6 / 4.5
- GLM-4.5 Air / GLM-4.5V
- GLM-4 Plus / GLM-4 Plus-0111
Baidu — ERNIE
- ERNIE 5.0 Preview (1203)
- ERNIE 5.0 Preview (1103)
Moonshot AI — Kimi
- Kimi K2 Thinking Turbo
- Kimi K2 Preview (0905)
- Kimi K2 Preview (0711)
Mistral AI
- Mistral Large 3
- Mistral Medium (2508 / 2505)
- Mistral Small (2506)
Tencent — Hunyuan
- Hunyuan Vision 1.5 (Thinking)
- Hunyuan T1
- Hunyuan TurboS
- Hunyuan Turbo
- Hunyuan Large
MiniMax
- MiniMax-M2.1 Preview
- MiniMax-M2
- MiniMax-M1
Amazon — Nova
- Amazon Nova Experimental Chat (11-10 / 10-20 / 10-09)
- Nova 2 Lite
Meta — LLaMA
- LLaMA-4 Maverick
- LLaMA-4 Scout
- LLaMA-3.3-70B
- LLaMA-3.1-405B (BF16 / FP8)
NVIDIA — Nemotron
- LLaMA-3.1 Nemotron Ultra
- Nemotron Super-49B
- Nemotron Nano-30B
Other Models
- Command-A (03-2025)
- Gemma 3 (27B / 12B / N-E4B)
- GPT-OSS-120B / GPT-OSS-20B
- Yi-Lightning
- LongCat Flash Chat
- MiMo-V2 Flash
- Intellect-3
- Ling Flash 2.0
- Ring Flash 2.0