XSCT Bench LLM Evaluation Leaderboard

Capability evaluation and ranking of large AI models, based on real-world scenarios


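What is XSCT Bench?

XSCT Bench is an independently operated, scenario-based evaluation platform for large models. We test models on real business scenarios to help users find the AI model that best fits their needs. The evaluation covers several dimensions, including text generation, image generation, web page generation, and visual understanding.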

Current leaderboard

The scores and rankings of each AI model on four measures (overall, basic, advanced, and hard) are listed below:

Top 20 models

kimi-k2.6 - Overall: 91.0 - Basic: 91.3 - Advanced: 90.9 - Hard: 90.8
Anthropic: Claude Sonnet 4.6 - Overall: 90.2 - Basic: 90.7 - Advanced: 90.2 - Hard: 89.8
Claude Opus 4.6 - Overall: 89.6 - Basic: 91.2 - Advanced: 89.6 - Hard: 88.1
qwen3.6-plus-preview - Overall: 88.3 - Basic: 89.8 - Advanced: 88.1 - Hard: 87.2
GLM-5.1 - Overall: 88.1 - Basic: 89.1 - Advanced: 88.0 - Hard: 87.3
kimi-k2.5 - Overall: 88.0 - Basic: 89.5 - Advanced: 87.8 - Hard: 86.8
GLM-5v-turbo - Overall: 87.7 - Basic: 89.0 - Advanced: 87.4 - Hard: 86.5
Google: Gemma 4 26B A4B - Overall: 87.4 - Basic: 88.6 - Advanced: 87.4 - Hard: 86.3
OpenAI: GPT-5.4 - Overall: 87.1 - Basic: 87.5 - Advanced: 87.2 - Hard: 86.7
Claude Opus 4 7 - Overall: 87.0 - Basic: 88.1 - Advanced: 86.9 - Hard: 86.1
kimi-k2-thinking-turbo - Overall: 86.7 - Basic: 87.7 - Advanced: 86.5 - Hard: 86.1
GPT-5.2 - Overall: 86.3 - Basic: 86.8 - Advanced: 86.3 - Hard: 85.7
qwen3.5-plus-2026-02-15 - Overall: 86.3 - Basic: 88.3 - Advanced: 86.1 - Hard: 84.5
Google: Gemini 3.1 Pro Preview - Overall: 86.1 - Basic: 87.7 - Advanced: 85.9 - Hard: 84.8
glm-5-turbo - Overall: 85.8 - Basic: 87.2 - Advanced: 85.6 - Hard: 84.7
Google: Gemma 4 31B - Overall: 85.5 - Basic: 87.3 - Advanced: 85.3 - Hard: 83.8
Elephant - Overall: 85.4 - Basic: 87.4 - Advanced: 85.1 - Hard: 83.9
qwen3.5-omni-plus - Overall: 85.3 - Basic: 87.0 - Advanced: 85.0 - Hard: 84.1
mimo-v2-pro - Overall: 84.7 - Basic: 86.7 - Advanced: 84.4 - Hard: 83.1
Qwen: Qwen3.5-9B - Overall: 84.6 - Basic: 86.7 - Advanced: 84.4 - Hard: 82.9

XSCT Bench

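Before you start building, find the model that best fits your product.

Whether an AI product succeeds is often decided the moment you choose its model. We test with real product scenarios covering text, image, and web page generation, so that before you spend time polishing your product you can find the model that fits best in capability, output quality, and cost.

Finding Product Model Fit starts with 小山出题 (xsct.ai).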

90 models covered

1,281 test cases

164,040 total evaluations

¥110,324 spent on evaluation so far

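Help me pick a model

Tell me what product you are building and what features you need, and I will find the most suitable model for you.

Write marketing copy, generate product images, write code, understand images, generate web pages, knowledge base Q&A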

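Scenario-based leaderboard

Cost-effectiveness model selection leaderboard.

Based on real product test cases, it weighs capability against cost to help you find the model that best fits your own scenario.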


Overall ranking based on 164,040 evaluations

🥇 kimi-k2.6: 91.0
🥈 Anthropic: Claude Sonnet 4.6: 90.2
🥉 Claude Opus 4.6: 89.6
qwen3.6-plus-preview: 88.3
GLM-5.1: 88.1

72 more models

爽看图 HOT

Same prompt, differences at a glance.

Compare the actual outputs of the major models on the same task side by side. Seeing is believing.


Model leaderboard

Sorted by: overall capability score (Basic × 30% + Advanced × 40% + Hard × 30%)
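The weighting above can be checked against individual rows. The following is a minimal sketch, not XSCT Bench's own code: the function name `overall_score` and the `WEIGHTS` mapping are hypothetical, and the inputs are the one-decimal sub-scores published on the leaderboard, so a recomputed value may differ slightly from a listed overall score because of rounding.

```python
# Weights taken from the formula stated on the leaderboard:
# Basic x 30% + Advanced x 40% + Hard x 30%.
WEIGHTS = {"basic": 0.30, "advanced": 0.40, "hard": 0.30}


def overall_score(basic: float, advanced: float, hard: float) -> float:
    """Weighted average of the three difficulty tiers (hypothetical helper)."""
    return (
        WEIGHTS["basic"] * basic
        + WEIGHTS["advanced"] * advanced
        + WEIGHTS["hard"] * hard
    )


# Sub-scores copied from the kimi-k2.6 and Claude Opus 4.6 rows.
print(round(overall_score(91.3, 90.9, 90.8), 1))  # 91.0
print(round(overall_score(91.2, 89.6, 88.1), 1))  # 89.6
```

Because the advanced tier carries the largest weight, a model with a strong basic score but a weaker advanced score can rank below one with flatter scores across tiers.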

Rank	Model	Provider	Cost	Cost-effectiveness	Overall	Basic	Advanced	Hard	Dimensions	Updated
1	kimi-k2.6	Moonshot AI	$0.59 / $2.34	46.5	91.0	91.3	90.9	90.8	24	2026-04-22
2	Anthropic: Claude Sonnet 4.6	OpenRouter	$3.00 / $15.00	6.0	90.2	90.7	90.2	89.8	24	2026-04-12
3	Claude Opus 4.6	PipeLLM	$5.00 / $25.00	3.0	89.6	91.2	89.6	88.1	24	2026-04-12
4	qwen3.6-plus-preview	Alibaba Cloud Bailian	$0.29 / $1.75	28.4	88.3	89.8	88.1	87.2	24	2026-04-12
5	GLM-5.1	Zhipu Open Platform	$0.58 / $2.63	17.6	88.1	89.1	88.0	87.3	24	2026-04-12
6	kimi-k2.5	Moonshot AI	$0.59 / $3.07	14.4	88.0	89.5	87.8	86.8	24	2026-04-16
7	GLM-5v-turbo	Zhipu Open Platform	$0.58 / $2.63	14.6	87.7	89.0	87.4	86.5	24	2026-04-12
8	Google: Gemma 4 26B A4B	OpenRouter	$0.08 / $0.35	100.0	87.4	88.6	87.4	86.3	24	2026-04-12
9	OpenAI: GPT-5.4	OpenRouter	$2.50 / $15.00	2.0	87.1	87.5	87.2	86.7	24	2026-04-12
10	Claude Opus 4 7	PipeLLM	$5.00 / $25.00	1.2	87.0	88.1	86.9	86.1	24	2026-04-22
11	kimi-k2-thinking-turbo	Moonshot AI	$1.17 / $8.49	3.0	86.7	87.7	86.5	86.1	22	2026-04-22
12	GPT-5.2	PipeLLM	$1.75 / $14.00	1.4	86.3	86.8	86.3	85.7	24	2026-04-12
13	qwen3.5-plus-2026-02-15	Alibaba Cloud Bailian	$0.12 / $0.70	28.1	86.3	88.3	86.1	84.5	25	2026-04-03
14	Google: Gemini 3.1 Pro Preview	OpenRouter	$2.00 / $12.00	1.5	86.1	87.7	85.9	84.8	24	2026-04-12
15	glm-5-turbo	Zhipu Open Platform	$0.59 / $2.64	5.8	85.8	87.2	85.6	84.7	24	2026-04-12
16	Google: Gemma 4 31B	OpenRouter	$0.13 / $0.38	31.1	85.5	87.3	85.3	83.8	24	2026-04-12
17	Elephant	OpenRouter	$0.00 / $0.00	—	85.4	87.4	85.1	83.9	24	2026-04-22
18	qwen3.5-omni-plus	Alibaba Cloud Bailian	$0.00 / $0.00	—	85.3	87.0	85.0	84.1	24	2026-04-02
19	mimo-v2-pro	Xiaomi MiMo	$0.97 / $2.90	2.1	84.7	86.7	84.4	83.1	24	2026-04-12
20	Qwen: Qwen3.5-9B	OpenRouter	$0.10 / $0.15	37.7	84.6	86.7	84.4	82.9	24	2026-04-16
21	glm-5	Zhipu Open Platform	$0.59 / $2.64	2.0	84.6	86.7	84.3	82.9	25	2026-04-12
22	MiniMax-M2.7	MiniMax	$0.31 / $1.23	4.2	84.6	86.0	84.4	83.4	24	2026-04-12
23	qwen3.5-flash	Alibaba Cloud Bailian	$0.03 / $0.29	16.2	84.5	86.7	84.3	82.5	24	2026-04-16
24	qwen3.5-27b	Alibaba Cloud Bailian	$0.09 / $0.70	5.0	84.2	86.8	84.0	82.0	24	2026-04-12
25	glm-4.7	Zhipu Open Platform	$0.44 / $2.05	1.1	84.0	85.7	83.7	82.5	24	2026-04-12
26	qwen3.5-35b-a3b	Alibaba Cloud Bailian	$0.06 / $0.47	4.4	83.9	86.5	83.6	81.7	24	2026-04-12
27	OpenAI: GPT-5 Mini	OpenRouter	$0.25 / $2.00	0.9	83.8	85.3	83.5	82.8	24	2026-04-02
28	StepFun: Step 3.5 Flash	OpenRouter	$0.10 / $0.30	4.2	83.7	85.8	83.2	82.1	24	2026-04-12
29	qwen3-max	Alibaba Cloud Bailian	$0.37 / $1.46	0.8	83.6	85.9	83.3	81.8	25	2026-04-22
30	doubao-seed-1-8	Volcano Engine	$0.12 / $1.17	0.8	83.5	85.8	83.3	81.5	24	2026-04-16
31	doubao-seed-1-6	Volcano Engine	$0.12 / $1.17	0.7	83.5	86.0	83.1	81.5	24	2026-04-12
32	mimo-v2-omni	Xiaomi MiMo	$0.39 / $1.93	0.3	83.4	85.6	82.9	81.8	24	2026-04-12
33	deepseek-v3.2	Alibaba Cloud Bailian	$0.29 / $0.44	0.6	83.2	85.5	82.8	81.2	24	2026-04-16
34	Meituan: LongCat Flash Chat	OpenRouter	$0.20 / $0.80	0.0	82.9	85.3	82.5	81.0	25	2026-04-02
35	MiniMax-M2.1	MiniMax	$0.30 / $1.21	—	82.8	84.8	82.4	81.2	24	2026-04-16
36	MiniMax-M2.5	MiniMax	$0.30 / $1.21	—	82.7	84.7	82.5	81.1	24	2026-04-22
37	qwen3-coder-next	Alibaba Cloud Bailian	$0.15 / $0.58	—	82.6	85.1	82.2	80.6	24	2026-04-12
38	Anthropic: Claude Haiku 4.5	OpenRouter	$1.00 / $5.00	—	82.5	84.7	82.3	80.6	25	2026-04-02
39	xAI: Grok 4.20 Beta	OpenRouter	$2.00 / $6.00	—	82.0	85.0	81.6	79.4	24	2026-04-12
40	xAI: Grok 4.1 Fast	OpenRouter	$0.20 / $0.50	—	81.7	84.2	81.3	79.7	24	2026-04-12
41	mimo-v2-flash	Xiaomi MiMo	$0.10 / $0.29	—	81.4	84.2	80.9	79.3	25	2026-04-16
42	qwen3.5-omni-flash	Alibaba Cloud Bailian	$0.00 / $0.00	—	80.7	83.4	80.3	78.4	24	2026-04-02
43	NVIDIA: Nemotron 3 Super (free)	OpenRouter	$0.00 / $0.00	—	80.5	82.3	80.1	79.3	24	2026-04-12
44	Google: Gemini 3 Flash Preview	OpenRouter	$0.50 / $3.00	—	80.1	83.1	79.7	77.5	25	2026-04-02
45	OpenAI: gpt-oss-120b	OpenRouter	$0.04 / $0.19	—	80.0	83.0	79.6	77.7	24	2026-04-12
46	Grok 4	PipeLLM	$3.00 / $15.00	—	80.0	82.5	79.7	78.0	24	2026-04-16
47	doubao-seed-2-0-mini	Volcano Engine	$0.03 / $0.29	—	79.1	82.7	78.1	76.8	25	2026-04-12
48	qwen3-coder-plus	Alibaba Cloud Bailian	$0.58 / $2.34	—	77.9	81.9	77.2	74.9	24	2026-04-02
49	doubao-seed-2-0-code	Volcano Engine	$0.47 / $2.34	—	77.7	81.0	77.3	75.1	24	2026-04-12
50	glm-4.5-air	Zhipu Open Platform	$0.12 / $0.88	—	77.6	81.2	77.0	75.0	25	2026-04-12
51	qwen3-235b-a22b	Alibaba Cloud Bailian	$0.29 / $1.17	—	77.2	80.9	76.9	74.0	24	2026-04-02
52	OpenAI: GPT-5 Nano	OpenRouter	$0.05 / $0.40	—	76.3	79.3	75.9	73.9	24	2026-04-02
53	doubao-seed-2-0-pro	Volcano Engine	$0.47 / $2.34	—	74.8	77.0	74.7	72.8	25	2026-04-12
54	OpenAI: gpt-oss-20b	OpenRouter	$0.03 / $0.14	—	74.0	77.9	73.4	70.9	24	2026-04-12
55	qwen3-14b	Alibaba Cloud Bailian	$0.15 / $0.58	—	73.8	78.7	73.2	69.6	24	2026-04-02
56	qwen3-coder-flash	Alibaba Cloud Bailian	$0.15 / $0.58	—	71.0	76.3	70.3	66.5	24	2026-04-12
57	qwen3-8b	Alibaba Cloud Bailian	$0.07 / $0.29	—	70.8	76.1	70.1	66.4	24	2026-04-12
58	doubao-seed-1-6-flash	Volcano Engine	$0.02 / $0.22	—	70.5	75.1	69.9	66.8	24	2026-04-12
59	doubao-seed-2-0-lite	Volcano Engine	$0.09 / $0.53	—	70.1	73.7	69.9	66.8	25	2026-04-12
60	hunyuan-large	Tencent Hunyuan	$0.35 / $1.39	—	69.3	73.9	68.7	65.5	24	2026-04-02
61	hunyuan-turbo	Tencent Hunyuan	$0.12 / $0.29	—	66.3	72.9	65.5	60.8	25	2026-04-02
62	hunyuan-pro	Tencent Hunyuan	$0.35 / $1.39	—	66.2	72.1	65.3	61.4	24	2026-04-02
63	qwen3-4b	Alibaba Cloud Bailian	$0.04 / $0.18	—	65.8	71.5	65.0	61.0	24	2026-04-02
64	OpenAI: GPT-4o-mini	OpenRouter	$0.15 / $0.60	—	65.2	71.7	64.3	60.0	24	2026-04-02
65	google/gemma-4-26b-a4b	小山 local deployment	—	—	63.2	58.5	65.3	65.1	1	2026-04-03
66	Meta: Llama 3.3 70B Instruct	OpenRouter	$0.12 / $0.38	—	62.3	68.6	61.4	57.1	24	2026-04-12
67	Google: Gemini 2.5 Flash Lite	OpenRouter	$0.10 / $0.40	—	57.8	62.8	57.1	53.9	25	2026-04-02
68	Mistral: Mistral Nemo	OpenRouter	$0.02 / $0.04	—	56.0	61.1	55.3	51.8	21	2026-04-12
69	qwen3-0.6b	Alibaba Cloud Bailian	$0.04 / $0.18	—	38.4	44.3	37.2	33.9	24	2026-04-12
70	wan2.7-image	Alibaba Cloud Bailian	—	—	—	—	—	—	—	2026-04-12
70	qwen3-omni-flash	Alibaba Cloud Bailian	$0.26 / $1.01	—	—	—	—	—	—	2026-03-30
70	wan2.7-image-pro	Alibaba Cloud Bailian	—	—	—	—	—	—	—	2026-04-22
70	OpenAI: GPT-5.4 Mini	OpenRouter	$0.75 / $4.50	—	—	—	—	—	—	2026-03-18
70	qwen3-vl-flash	Alibaba Cloud Bailian	$0.02 / $0.22	—	—	—	—	—	—	2026-03-15
70	OpenAI: GPT-5.4 Nano	OpenRouter	$0.20 / $1.25	—	—	—	—	—	—	2026-03-18
70	Google: Nano Banana Pro (Gemini 3 Pro Image Preview)	OpenRouter	$2.00 / $12.00	—	—	—	—	—	—	2026-04-22
70	Inception: Mercury 2	OpenRouter	$0.25 / $0.75	—	—	—	—	—	—	2026-04-12
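
The leaderboard rows can also be filtered programmatically when shortlisting models. The following is a minimal sketch under stated assumptions: rows are copied as tab-separated text in the column order of the table above (with the model name only in the second column), and a dash in a cell means no value was published. The names `parse_row` and `shortlist` and the 85.0 threshold are hypothetical, chosen for illustration. How the cost-effectiveness figure is computed is not stated on the page, so the sketch only reads it.

```python
from typing import List, NamedTuple, Optional

DASH = "\u2014"  # the dash the table uses for "no value"

# Four rows copied from the leaderboard (tab-separated).
SAMPLE = "\n".join([
    "2\tAnthropic: Claude Sonnet 4.6\tOpenRouter\t$3.00 / $15.00\t6.0\t90.2\t90.7\t90.2\t89.8\t24\t2026-04-12",
    "8\tGoogle: Gemma 4 26B A4B\tOpenRouter\t$0.08 / $0.35\t100.0\t87.4\t88.6\t87.4\t86.3\t24\t2026-04-12",
    "17\tElephant\tOpenRouter\t$0.00 / $0.00\t" + DASH + "\t85.4\t87.4\t85.1\t83.9\t24\t2026-04-22",
    "35\tMiniMax-M2.1\tMiniMax\t$0.30 / $1.21\t" + DASH + "\t82.8\t84.8\t82.4\t81.2\t24\t2026-04-16",
])


class Row(NamedTuple):
    rank: int
    model: str
    provider: str
    cost_effectiveness: Optional[float]  # None when the cell is a dash
    overall: Optional[float]             # None when the cell is a dash


def to_float(cell: str) -> Optional[float]:
    """Convert a numeric cell, mapping a dash or empty cell to None."""
    cell = cell.strip()
    return None if cell in (DASH, "") else float(cell)


def parse_row(line: str) -> Row:
    """Parse one tab-separated leaderboard row (assumed column order)."""
    cells = line.split("\t")
    return Row(int(cells[0]), cells[1], cells[2],
               to_float(cells[4]), to_float(cells[5]))


def shortlist(rows: List[Row], min_overall: float) -> List[Row]:
    """Keep rows at or above min_overall that have a cost-effectiveness
    value, sorted from most to least cost-effective."""
    kept = [r for r in rows
            if r.overall is not None
            and r.overall >= min_overall
            and r.cost_effectiveness is not None]
    return sorted(kept, key=lambda r: r.cost_effectiveness, reverse=True)


rows = [parse_row(line) for line in SAMPLE.splitlines()]
for r in shortlist(rows, min_overall=85.0):
    print(r.model, r.overall, r.cost_effectiveness)
```

Rows without a cost-effectiveness value are dropped rather than treated as zero, since a dash here means the figure is missing, not that the model is poor value.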