Badcases of top models

gpt-4o
qwen-max
ERNIE-4.0
xunfei-4.0Ultra
Baichuan4
GLM4
yi-large
moonshot-v1-8k
Doubao-pro-32k
deepseek-chat-v2
Llama-3.1-70B-Instruct
qwen2-72b-instruct

  classification badcase


  information extraction badcase


  reading comprehension badcase


  tableQA badcase


  text2SQL badcase


  arithmetic badcase


  GSM8K badcase


  BBH badcase


  IFEval-zh (instruction following) badcase
