Hacker News | new | past | comments | ask | show | jobs | submit | remilouf's comments

> Ironically LLMs solve the MxN problem he's complaining about

Enlighten me please


Oops, sorry

Author here. You're right, it's not a hard problem, but a particularly annoying one.

I haven't always done this, and the knowledge base used to visibly degrade over time. Reviewing a PR only takes a few minutes, and that small investment compounds.


This is actually pretty funny.


That’d be a pretty inefficient way to generate bullshit at scale


automating the creation of false testimonials is inefficient at scale? go on ...

what's the alternative?


LLM evaluations are very sensitive to the details of the prompt's structure. This post shows that using structured generation reduces both the variance of the results and the shifts in model rankings.


Looks like it’s quite the opposite: http://blog.dottxt.co/performance-gsm8k.html


What do you mean by "semantic dimension"?


That whole structured generation line of work looks promising. I hope someone else takes this and runs evaluations on other benchmarks. Curious to see if the results translate!


Agreed! While these results are very promising, there's still a lot to explore in this space.

In addition to the "prompt consistency" and "thought-control" ideas mentioned in the post, I'm definitely curious how the performance is on more complex structured data (things like codegen).

