Ask HN: Why do LLMs struggle with the 24 puzzle? | Hacker News

		Ask HN: Why do LLMs struggle with the 24 puzzle?
		2 points by yegle 8 months ago | 2 comments

		I have yet to find an LLM that correctly solves the 24 puzzle for the numbers 6, 9, 9, 10. Many seemingly capable LLMs produce incorrect intermediate or final results, e.g. claiming 9 × 9 ÷ (10 - 6) = 24, when that expression is actually 81 ÷ 4 = 20.25. Did I miss something in my prompt? Or is there a reason LLMs are bad at these questions?
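
For comparison, this kind of puzzle is easy to check with a plain exhaustive search. What follows is only a minimal sketch under stated assumptions, not something from the original post: the names `solve` and `_search` are made up for illustration, and it assumes the usual rules (each number used exactly once, with +, -, ×, ÷ and any parenthesization). It uses exact fractions so that division does not introduce rounding errors.

```python
from fractions import Fraction
from itertools import combinations

TARGET = Fraction(24)


def solve(numbers, target=TARGET):
    """Return one expression string that evaluates to `target`, or None.

    Hypothetical helper for illustration; assumes each number is used once.
    """
    items = [(Fraction(n), str(n)) for n in numbers]
    return _search(items, target)


def _search(items, target):
    # Base case: a single value is left; it either hits the target or not.
    if len(items) == 1:
        value, expr = items[0]
        return expr if value == target else None

    # Pick any two values, combine them with every operation, and recurse
    # on the shorter list. This covers every possible parenthesization.
    for i, j in combinations(range(len(items)), 2):
        (a, ea), (b, eb) = items[i], items[j]
        rest = [items[k] for k in range(len(items)) if k not in (i, j)]
        candidates = [
            (a + b, f"({ea} + {eb})"),
            (a * b, f"({ea} * {eb})"),
            (a - b, f"({ea} - {eb})"),
            (b - a, f"({eb} - {ea})"),
        ]
        if b != 0:
            candidates.append((a / b, f"({ea} / {eb})"))
        if a != 0:
            candidates.append((b / a, f"({eb} / {ea})"))
        for value, expr in candidates:
            found = _search(rest + [(value, expr)], target)
            if found is not None:
                return found
    return None


# The answer quoted in the post: 9 * 9 / (10 - 6) is 81/4, not 24.
claimed = Fraction(9) * 9 / (10 - 6)
print(claimed == 24, float(claimed))  # prints: False 20.25

# Search for a real solution to the numbers from the post.
solution = solve([6, 9, 9, 10])
print(solution)
```

The search does find a solution for these numbers; one valid answer is 9 × 10 ÷ 6 + 9 = 24, though the exact expression printed depends on the order in which pairs are tried.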

sp332 8 months ago

CISO of Anthropic: "No LLM has had an emergent calculator yet." https://news.ycombinator.com/item?id=39591527

resource0x 8 months ago

Because it's unsafe, according to [1]. I don't fully understand all the caveats listed there, but I won't take such a risk either :-)

[1] https://privacy.commonsense.org/evaluation/24-Game-Math-Card...
