

maxdo 37 days ago | on: Show HN: A new benchmark for testing LLMs for dete...

gpt 5.5 seems to be the recent leader overall, so it makes sense to include it, just to see what you trade off for speed and open-source nature vs. the cutting-edge leader.

khurdula 34 days ago

hey! we've evaluated gpt 5.5 as well, along with other frontier models. gemini and gemma models outperform it across all three modalities.

Open source models like glm 4.7 still compete closely with table toppers.

khurdula 37 days ago

Yep, we will be adding it soon as well.
