Hacker News | past | comments | ask | show | jobs | submit | devnullbrain's comments

I don't see why this wouldn't just lead to model collapse:

https://www.nature.com/articles/s41586-024-07566-y

If you've spent any time using LLMs to write documentation you'll see this for yourself: the compounding effect is just rewriting valid information into less terse prose.

I find it concerning Karpathy doesn't see this. But I'm not surprised, because AI maximalists seem to find it really difficult to be... "normal"?

Rule of thumb: if you find yourself needing to broadcast the special LLM sauce you came up with instead of what it helped you produce, ask yourself why.


Here in 2026, many forms of training LLMs on (well-chosen) outputs of themselves, or other LLMs, have delivered gigantic wins. So 2024 & earlier fears of 'model collapse' will lead your intuition astray about what's productive.

It is unlikely you are accurately perceiving some limitation that Karpathy does not.


The article is not about training LLMs. It is about using LLMs to write a wiki for personal use, and it assumes a fully trained LLM such as ChatGPT or Claude already exists to be used.


Don't even try; after vibe coding, people seem to be adopting vibe thinking. "Model collapse sounds cool, I'm gonna use it without looking it up."


Vibe thinking... that's an interesting premise. I'll have to build up my new llm-wiki before I'll know what to think about "vibe thinking."


I was joking but also not joking; this llm-wiki idea is fun. I fed it its own llm-wiki.md, Foucault's Pendulum, randomly collected published papers about the philosophy of GiTS, several CCRU essays, and Manufacturing Consent. It drew fun red yarn between all of them, about the topic of red yarn (e.g. schizos drawing connections out of nothing, particularly through the use of computers, and how this relates to itself doing literally this as it does it).

I'll spare you most of the slop but.. "The Case That I Am Abulafia: The parallel is uncomfortable and precise. [...]"

Yeah... It's fun though.


Also, TFA prescribes putting ground truth source files into a /raw directory.

Everything is derived from them and backlinks to them, which is necessary to stay vigilant about staleness, correctness, drift, and more. Just like in a human-built knowledge base.


Edit for context: the sibling comment from karpathy is gone after being flagged to oblivion. Not sure if he deleted it or if it was just removed based on the number of flags? He had copy-pasted a few snarky responses from Claude, essentially saying "Claude has this to say to you:" followed by a super long run-on paragraph of slop.

————

Wow, I respect karpathy so much and have learned a ton from him. But WTF is the sibling comment he wrote as a response to you? Just pasting a Claude-written slop retort… it’s sad.

Maybe we need to update that old maxim about “if you don’t have something nice to say, don’t say it” to “if you don’t have something human to say, don’t say it.”

So many really smart people I know have seen the ‘ghost in the machine’ and as a result have slowly lost their human faculties. Ezra Klein, of all people, had a great article about this recently titled “I Saw Something New in San Francisco” (gift link if you want to read it): https://www.nytimes.com/2026/03/29/opinion/ai-claude-chatgpt...


It's not sad. He's a person like you and me. devnullbrain's comment is snarky: he invoked model collapse, which has nothing to do with the topic of a wiki/kb, he wrote that karpathy is not normal, and then seemed to imply that the idea was useless. I'd be pretty in my feels too, and the fact that he wrote it and then deleted it seems like a +1 normal-guy thing.


Yeah. I know you didn’t see it, but it was truly a substance-free response. Glad to see he deleted it and I know I’ve been guilty of the same kind of knee-jerk response before.


I saw it. It sucked, I agree. But like you said, we all get one (or a few) of those.


Lol at that.

It's weird how some people cover the whole range: sometimes putting out really good stuff, and other times the complete opposite.

Feels as if they were two different people ... or three, or four.


Emotional state, tiredness, drunkenness, a good night's sleep… the number of factors that drive our responses is ridiculous.


Eh, he’s just a person. I’m not surprised he posted a rude comment haha, and it got rightfully flagged off the site for being AI slop.

Appreciate the gift link, I’ll give it a read!


Aw, I missed it


I did a proof of concept for self-updating HTML files (polyglot bash/html) some weeks ago. It actually works quite well; with simple prompting it seems not to just go in circles (https://github.com/jahala/o-o)


Also my experience. It can't even keep up with a simple claude.md, let alone a whole wiki...


You seem to be blaming the OS for how you broke it?


I'm not the parent commenter, but I think you miss most of the pain points because you work on 'personal' projects.

There's not a compatible format between different compilers, or even different versions of the same compiler, or even the same versions of the same compiler with different flags.

This seems immediately to create too many permutations of builds for them to be distributable artifacts as we'd use them in other languages. More like a glorified object file cache. So what problem does it even solve?


BMIs are not considered distributable artifacts and were never designed to be, same as the PCHs and Clang modules that preceded them. Redistribution of interface artifacts was not a design goal of C++ modules, just as redistribution of CPython bytecode is not a design goal of Python's module system.

Modules solve the problem of text substitution (headers) as interface description; it's why we call the importable module units "interface units". The goals were to fix all the problems with headers (macro leakage, uncontrolled export semantics, the Static Initialization Order Fiasco, etc.) and improve build performance.

They succeeded at this rather wonderfully as a design. Implementation proved more difficult, but we're almost there.



> and the missing context is essential.

Oh yes, I'd recommend everyone who uses the phrase reads the rest of the paper to see the kinds of optimisations that Knuth considers justified. For example, optimising memory accesses in quicksort.


This shows how hard it is to create a generalized and simple rule regarding programming. Context is everything and a lot is relative and subjective.

Tips like "don't try to write smart code" are often repeated but useless (not to mention that "smart" here means over-engineered or overly complex, not actually smart).


I dunno, I've seen people try to violate "don't prematurely optimize" probably a thousand times (no exaggeration) and never ONCE seen this happen:

1. Somebody verifies with the users that speed is actually one of the most burning problems.

2. They profile the code and discover a bottleneck.

3. Somebody says "no, but we shouldn't fix that, that's premature optimization!"

I've heard all sorts of people like OP moan that "this is why pieces of shit like Slack are bloated and slow" (it isn't) when advocating skipping steps 1 and 2, though.

I don't think they misunderstand the rule, either; they just don't agree with it.

Did Pike really have to specify explicitly that you have to identify that a problem is a problem before solving it?


>1. Somebody verifies with the users that speed is actually one of the most burning problems.

Sometimes this is too late.

C++98 introduced `std::set` and `std::map`. Their public interfaces mean they are effectively constrained to being red-black trees, with poor cache locality and suboptimal lookup. It took until C++11 for `std::unordered_map` and `std::unordered_set`, which brought with them the adage that you should probably use them unless you know you want ordering. Only since C++23 do we finally have `std::flat_set` and `std::flat_map`, with contiguous memory layouts. 25 years to half-solve an optimisation problem, and naive developers will still be using the wrong thing.

As soon as the interface made contact with the public, the opportunity to follow Rob Pike's Rule 5 was lost. If you create something where you're expected to uphold a certain behaviour, you need to consider if the performance of data structures could be a functional constraint.

At this point, the rule becomes cyclical and nonsensical: it's not premature if it's the right time to do it. It's not optimisation if it's functional.


You've inadvertently made an argument for deprecation, not for ignoring Rob's rule.

When building interfaces you are bound to make mistakes which end users will end up depending on (not just regarding optimization).

The correct lesson to learn from this is not "just don't make mistakes" but to minimize migration costs, so these mistakes don't get tightly locked in, and to detect them earlier in the design process with more coordinated experimentation.

C++ seems pretty bad at both. It's not unusual, either - migration and upgrade paths are often the most neglected part of a product.


How would you have minimised migration costs for std::map?


> the opportunity to follow Rob Pike's Rule 5 was lost.

std::set/std::map got into trouble because they chose the algorithm first and then made the data model match. Rule 5 suggests choosing the right data model first, indicating that it is most important.


There's probably tons of stuff people use that is way slower than it needs to be but speed isn't one of their self-reported burning problems because they don't even realize it could be faster. pip vs uv?


Exactly!

I wish Knuth would come out and publicly chastise the many decades of abuse this quote has enabled.


To be fair, I think human nature is probably a bigger culprit here than the quote. Yes, it was one of the first things told to me as a new programmer. No, I don't think it influenced very heavily how I approach my work. It's just another small (probably reasonable) voice in the back of my head.


Yep. If one is implementing quicksort for a library where it will be used and relied on, I'd sure hope they're optimizing it as much as they can.


A state machine is a perfect example of a case where you would benefit from linear types.

Some things just need precise terminology so humans can communicate about them to other humans without ambiguity. That doesn't mean they're inherently complex: the article provides simple definitions. It's the same for most engineering, science and language. One of the most valuable skills I've learned in my career is differentiating between expressions, statements, items, etc. How often have you heard that the hardest problem in software development is coordinating with other developers? If you learn proper terminology, you can say exactly what you mean. Simple language doesn't mean clearer language.

I wasn't born knowing Rust; I had to learn it. So I'm always surprised by complaints that Rust is too complex, given the many unremarkable people who have undergone this process without issue. What does it say, really? That you're not as good as them at learning things? In what other context do software people on the internet so freely share self-doubt?

I also wonder about their career plans for the future. LLMs are already capable of understanding these concepts. The tide is rising.


...a far better place than 2011.


Come on, the 6th word after 5 English words is obviously going to be a rare word in your second keyboard language, not another common English word.


On top of that, my own language is Englishifying, so it's understandable that it gets confused. But still, the whole thing is infuriating when you type something correct and boom, it becomes a word in another language.


>You see, the shapes of roads in real life come from an underlying essential fact: the wheel axles of a vehicle. No matter how you drive a car, the distance between the left and right wheels remains constant. You can notice this in tyre tracks in snow or sand. Two perfectly parallel paths, always the same distance apart maintaining a consistent curved shape.

Emphasis mine - that's not really true

- Cars have differentials, so the wheel speed can differ between wheels

- Steering geometry isn't parallel! The tyres turn a different amount when you turn the wheel

- Unless you're going in a straight line, cars don't go where the tyres point! Tyres 'pull' the car towards their slip angle

What you will actually see in tracks in snow or sand is non-parallel tracks, describing curves of different radii. You can also see this in racetracks, where the path more closely resembles the limits of car physics without care for passenger comfort or tyre wear. The example 'fail' image looks not dissimilar from a hairpin turn.


Author here: I agree the article oversimplified here, and in practice vehicle tracks can sometimes be non-concentric. But it has nothing to do with differentials. Differentials just allow wheels to rotate at different speeds (since the perimeter of a smaller-radius circle is smaller than the outer one's). Non-parallelism usually comes from Ackermann steering or slipping. But the rear tracks of a vehicle going slowly through sand will always be concentric. Parallelism is important for engineers because it also allows multiple vehicles to travel in parallel on a multi-lane road. You can see how that fails with Bezier paths in the CS2 screenshot in the article. Also, hairpin turns are usually designed with arcs in mind as well.


Railroad cars don't have differentials and travel on parallel tracks. How?

https://www.reddit.com/r/Skookum/comments/47sbri/richard_fey...


For those who don't read the link: by having conical wheels.


It’s much cooler when Richard Feynman explains it.


See https://www.thecontactpatch.com/ for some interesting reading material related to that.


Is your take that the way we view sexuality today is not meaningfully different from the Victorian era?

