Bad cases of top models
gpt-4o
gemini-1.5-pro
qwen-max
ERNIE-4.0
ERNIE-4.0-Turbo
xunfei-4.0Ultra
Baichuan4
GLM4
GLM-4-Plus
yi-large
moonshot-v1-8k
Doubao-pro-32k
deepseek-chat-v2
Llama-3.1-70B-Instruct
qwen2.5-72b-instruct
classification badcase
information extraction badcase
reading comprehension badcase
tableQA badcase
text2SQL badcase
arithmetic badcase
GSM8K badcase
BBH badcase
IFEval-zh (Instruction following) badcase