Hacker News .hn
maxdo 37 days ago | on: Show HN: A new benchmark for testing LLMs for dete...
gpt 5.5 seems to be the recent leader overall, so it makes sense to include it, just to see what you trade off for speed and open-source nature versus the cutting-edge leader.
khurdula 34 days ago
Hey! We've evaluated gpt 5.5 as well, along with other frontier models. Gemini and Gemma models outperform it across all three modalities.
Open-source models like glm 4.7 still compete closely with the table toppers.
khurdula 37 days ago
Yep, we will be adding it soon as well.