TechCrunch

Technology

Meta's benchmarks for its new AI models are a bit misleading | TechCrunch

Summary
Nutrition label

77% Informative

Maverick, a new flagship AI model from Meta, ranks second on LM Arena, a test in which human raters compare the outputs of models and choose which they prefer.

But it seems the version of Maverick that Meta deployed to LM Arena differs from the version that’s widely available to developers.

The LM Arena version seems to use a lot of emojis and give incredibly long-winded answers.

VR Score

75

Informative language

74

Neutral language

7

Article tone

formal

Language

English

Language complexity

58

Offensive language

not offensive

Hate speech

not hateful

Attention-grabbing headline

not detected

Known propaganda techniques

not detected

Time-value

medium-lived

Source diversity

3

Affiliate links

no affiliate links