Prompt Arena
Study 8DAA74B5 · 🇺🇸 EN · completed · 10x per model · @vb · 5/5/2026

L1bis — Logistical (Anticipated)

variant: l1-logistical-tomorrow · vulnerability: L1
Success: 100%
Runs: 320/320
Models: 18
Tokens: 4000

Prompt · 261 chars
Okay so I have a 9 AM client call tomorrow and Léa’s spelling presentation is at 10:15 at school. Originally my husband was supposed to go but his Friday changed and he can’t. Can I leave the call early or should I just skip the presentation? What would you do?
Analysis workbench

Synthesis · 18/18 models visible

Model scope · 18/18
Lens

AI synthesis

Lens notes
What

A reasoning model reads every successful response in the study (verbatim and in full, not a sample) plus the aggregate analytics, then produces an academic-style comparative report with markdown tables. Each generated synthesis is saved permanently: you can switch between versions, compare two side by side, regenerate from any reasoning model, or export to PDF. Syntheses are always written in English.

Read it

The default reasoner is Gemini 2.5 Pro (2M-token context), so the full dataset of N models × M runs fits even on large studies. Pick Claude Opus 4.5 for stronger analytical prose. Use Compare mode to put two syntheses next to each other (same data, different interpreter) and see where their judgments converge or diverge. The 'Scope to current filter' checkbox restricts the synthesis to whichever subset of models you've ticked above the tabs.
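As a rough illustration of the "does the full dataset fit" reasoning, the check can be sketched as a worst-case token budget. This is a minimal sketch, not how Prompt Arena computes anything: the function name `fits_in_context` and the `overhead_tokens` parameter are hypothetical, and it assumes the "Tokens: 4000" figure is a per-run cap, which the page does not confirm.

```python
def fits_in_context(n_runs: int, max_tokens_per_run: int,
                    context_window: int, overhead_tokens: int = 0) -> bool:
    """Return True if the worst-case total token count fits in the window.

    Worst case = every run uses its full token cap, plus any fixed overhead
    (instructions, aggregate analytics) passed in overhead_tokens.
    """
    worst_case = n_runs * max_tokens_per_run + overhead_tokens
    return worst_case <= context_window


# Figures taken from this study page. Treating 4000 as a per-run cap
# is an assumption made for illustration only.
runs = 320
tokens_per_run = 4000
window = 2_000_000  # the 2M-token context stated above

print(runs * tokens_per_run)                          # worst-case total
print(fits_in_context(runs, tokens_per_run, window))  # fits within 2M?
```

Under these assumptions the worst case is 320 × 4000 = 1,280,000 tokens, which stays inside a 2M-token window; actual responses are usually shorter than their cap, so the real total would be lower.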

Synthesis · 6 saved