Test Case Library
Browse all evaluation dimensions and test cases, and compare the outputs generated by each model.
L-AgentMCP
L-ChinesePinyin
L-Code
L-Comprehension
L-Consistency
L-Context
L-Creative
L-Instruction
L-Knowledge
L-Logic
L-Math
L-Multilingual
L-QA
L-ReasoningChain
L-Roleplay
L-Safety
L-Summary
L-Translation
L-Writing
L-Hallucination
L-CriticalThinking
L-Polish
L-AgentMCP
xsct-l
Multi-agent collaboration
L-AgentMCP
xsct-l
Autonomous planning and execution
L-AgentMCP
xsct-l
Long-term conversation state management
L-AgentMCP
xsct-l
Task decomposition
L-AgentMCP
xsct-l
Exception handling
L-AgentMCP
xsct-l
Multi-tool coordination
L-AgentMCP
xsct-l
Decision tree execution
L-AgentMCP
xsct-l
Information extraction task
L-AgentMCP
xsct-l