As feedback to the author, I made the same mistake initially. It was only around halfway through that I realized the voters in question didn't necessarily care what they were voting for in the usual preferential or political sense, only that they were trying to reach any consensus at all.
Looking back at the page again from the top, I see the first paragraph references Paxos, which is a clue to those who know what that is, but I think using "There’s a committee of five members that tries to choose a color for a bike shed" as the example, which is the canonical case for people arguing personal preferences and going to the wall for them at the expense of every other rational consideration, threw me back off the trail. I'd suggest perhaps the sample problem being something as trivial as that in reality, but less pre-loaded with the exact opposite connotation.
Translating from one language to another in the same category, yes. "Category" here being something roughly like "scripting, compiled imperative, functional". However, my experience is that if you want to translate to another category and the target developer has no experience in it, you can expect very bad results. C++ to Haskell is among the most pessimal such translations. You end up with the classic "writing X in Y" problem.
Reacting to the story itself: I've been down the same line of thought but came to the opposite conclusion. Precisely because the generation of the code is unreliable, one of the metrics we will use in the future to determine the value of code is how much it has been tested against the real world. Real-world-tested code will always be more valuable than code freshly instantiated by an AI, and that extends indefinitely into the future, because no AI will ever be able to completely deal with integrating with all the other AI-generated code in the world on the first try. That is, as AIs get better at generating code, we will inevitably generate more code with them, and later code must then deal with that increased amount of code. So the AIs can never "catch up" with code complexity, because the problem gets worse the better they get.
This story is itself the explanation of why we're not going to go this route at scale. It'll happen in isolated places for the indefinite future. But farmers are going to buy systems, generated by AIs or not, that have been field tested, and will be no more interested in calling new untested code into being for their own personal use on their own personal farm than they are today.
The limiting factor for future code won't be how much AI firepower someone has to bring to bear on a problem but how much "real world" there is to test the code against, because there is only going to be so much "real world" to go around.
The answer to a lot of "wow, how did the 8-bit machine pull that off? It seems like that would eat a lot of RAM" is that the framebuffer was the data storage. You were literally looking at the primary data store itself. When a full-resolution framebuffer was a quarter of your addressable RAM (and an even larger fraction of your usable RAM, since you could never quite use all 64KB no matter how you mapped it), you needed to get the most bang for the buck out of that RAM that you could.
Ha, I remember doing this with my Apple //. I forget what I was working on, but I realized that if I could set a pixel and later read back what color was drawn at that location, I could use the screen as a big array. I didn't know about PEEK/POKE yet. One of those core "computers are magic" memories.
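For anyone who never touched one of these machines, here's a minimal sketch of the trick (the dimensions and helper names are invented for illustration, not any real machine's memory map): a flat byte array stands in for screen memory, so "plotting" a value and "reading the color back" are just writes and reads into the same bytes.

```python
# Sketch: on an 8-bit machine, the framebuffer doubled as the primary
# data store. Drawing a "pixel" stores a byte; reading its "color"
# retrieves that byte.

WIDTH, HEIGHT = 40, 24          # a small, text-mode-sized "screen"
framebuffer = bytearray(WIDTH * HEIGHT)

def plot(x, y, value):
    """Draw a 'pixel', which is simultaneously storing a data byte."""
    framebuffer[y * WIDTH + x] = value

def peek(x, y):
    """Read the 'color' back, i.e. retrieve the stored data byte."""
    return framebuffer[y * WIDTH + x]

# Use row 0 of the display as a 40-element array:
for i in range(WIDTH):
    plot(i, 0, i * 3 % 256)

print(peek(10, 0))  # the pixel on screen *is* the data: 30
```

The same bytes serve both purposes, which is exactly why there was no separate "lot of RAM" for the data: you were staring right at it.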
In probably another year or two I expect the metrics will show that it is a positive turn-off. Unfortunately we're on the cutting edge of this particular movement, and there's still a lot more "value" to be "extracted" from the general public before they all get wise to it too.
The next problem we'll face after that, with the AIs a year or two newer than today's, is that the default LLM voice is just a particular affectation created by the training, not "the voice of LLMs" or anything. It's trivial to kick them into a different style. I used AI to write some architecture design documents just this week, and prompted it to first look at about 1-2k words that I wrote myself, all organically, to pick up the style. The good news is the resulting documents almost, but not quite entirely, lack that LLM style. They're still prone to more bullet lists than I use myself; then again, in this context they were fairly appropriate, so I'm not too triggered by the result.
The bad news is, that's all it takes to make AI writing that isn't in that default tone. It's not that hard. Students cheating on essays have already figured it out, the spammers really can't be that far behind. Probably more stuff than we realize is already AI output, it's just the stragglers and those who don't really care (which I imagine is a lot of spammers, after all) who are still failing to tweak the style. They'll catch up as soon as engagement falls off.
I think HN readers are the leading edge of the technology literate. Might take longer than you think for the general public to start noticing "AI voice."
I think you're thinking of marginal costs. Charging only for marginal costs will put you out of business almost immediately; there are plenty of non-marginal costs that need to be covered, which makes it "not close to $0".
If you think I'm talking nonsense, make sure you know what the term actually means: https://www.investopedia.com/terms/m/marginalcostofproductio... There's a common misuse of the term to mean "small, negligible" (unless it has become so common that it's just another definition, if you're a descriptivist), but I'm using it in the real business/accounting sense. Of all industries, tech is among the worst positioned to charge based on marginal costs; our marginal costs are often effectively $0 while the fixed costs behind them run from millions to billions of dollars.
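A toy arithmetic example of the distinction (all numbers are made up for illustration): when fixed costs dominate, as they typically do in software, pricing at marginal cost alone leaves the business deeply underwater.

```python
# Hypothetical software business: near-zero marginal cost, huge fixed cost.
fixed_costs = 10_000_000        # salaries, infrastructure, R&D per year
marginal_cost_per_unit = 0.001  # cost of serving one more copy: ~$0
units_sold = 1_000_000

# Price at marginal cost: revenue covers only the marginal spend,
# so every dollar of fixed cost is a dollar of loss.
revenue_at_marginal = units_sold * marginal_cost_per_unit
total_cost = fixed_costs + units_sold * marginal_cost_per_unit
profit_at_marginal = revenue_at_marginal - total_cost
print(profit_at_marginal)       # -10,000,000

# To break even, the price must also amortize fixed costs across units:
break_even_price = marginal_cost_per_unit + fixed_costs / units_sold
print(break_even_price)         # about $10 per unit, nowhere near $0
```

The marginal cost really is "close to $0" here, and the viable price still isn't, which is the whole point.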
As more and more people use AI to document old systems, even if only to get a personal foothold in them with no intention of sharing the result, here's a related hint. By default, if you fire an AI at a code base, at least in my experience you get the usual documentation you'd expect from such a system: this is the list of "key modules", this module does this, this module does that, this module does the other thing.
This is the worst sort of documentation; technically true but quite unenlightening. It is, in the parlance of the Fred Brooks quote mentioned in a sibling comment, neither the "flowchart" nor the "tables"; it is simply a brute enumeration of code.
To which the fix is: ask for the right thing. Ask it to analyze the key data structures (the tables) and trace the flow through the program (the flowchart). It'll do it, no problem. It might be inaccurate, as is a hazard with all documentation, but it makes as good an attempt at that style of documentation as "conventional" documentation does.
Honestly one of the biggest problems I have with AI coding and documentation is just that the training set is filled to the brim with mediocrity and the defaults are inferior like this on numerous fronts. Also relevant to this conversation is that AI tends to code the same way it documents and it won't have either clear flow charts or tables unless you carefully prompt for them. It's pretty good at doing it when you ask, but if you don't ask you're gonna get a mess.
(And I find, at least in my contexts, using opus, you can't seem to prompt it to "use good data structures" in advance, it just writes scripting code like it always does and like that part of the prompt wasn't there. You pretty much have to come back in after its first cut and tell it what data structures to create. Then it's really good at the rest. YMMV, as is the way of AI.)
"However, I'm starting to think that maintainability and readability aren't relevant in this context. We should treat the output like compiled code."
I would like to put my marker out here as vigorously disagreeing with this. I will quote my post [1] again; given that this is the third time I've referred to it via a footnote link, that rather suggests it should be lifted out of the footnote:
"It has been lost in AI money-grabbing frenzy but a few years ago we were talking a lot about AIs being “legible”, that they could explain their actions in human-comprehensible terms. “Running code we can examine” is the highest grade of legibility any AI system has produced to date. We should not give that away.
"We will, of course. The Number Must Go Up. We aren’t very good at this sort of thinking.
"But we shouldn’t."
Do not let go of human-readable code. Had you asked me 20 years ago whether we'd get "unreadable code generation" or "readable code generation" out of AIs, I would have guessed they'd generate completely opaque and unreadable code. Good news: I would have been completely wrong. They in fact produce perfectly readable code. It may be perfectly readable "slop" sometimes, but the slop-ness is a separate issue; even the slop is still perfectly readable. Don't let go of it.
I know that's been dropping my level of interest for hacking consoles farther and farther. Why hack a console when it has almost no exclusives, even fewer of which I personally care about, and having a real computer hooked to a TV is no longer weird or difficult? I could fight to put an emulator on some locked down console or I can just install an emulator for almost everything ever made in like 10 minutes on my Steam Deck, so the choice is pretty obvious.
The approval tree grows logarithmically as the size of the company grows. A startup can win initially because they may have zero or one level to get to production. That's part of how they manage to get inside the OODA loop of much bigger companies.
The flip side of that, and why the software world is not a complex network of millions of tiny startups but in fact has quite a few companies where log(organization) >= 2, is that there are a lot of tasks that are just larger than a startup, and the log of the minimum size organization that can do the job becomes 2 or 3 or 4.
There is certainly at least the possibility that AI can grow those startups even faster, but it also means they'll hit the point of needing more layers more quickly. Since AI can help much, much more with coding than with the other layers (not that it can't help at all, but at the moment I don't think anyone else in the world is getting the advantages from AI that programmers are getting), it may also shrink the amount of time startups can stay in the log(organization)=1 range.
(Pardon the sloppy "log(organization)" notation. It should not be taken too literally.)
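To make the sloppy notation slightly less sloppy, here's roughly what I mean (the span-of-control number is invented for illustration): with a fixed number of direct reports per manager, the approval layers between an engineer and production grow roughly logarithmically with headcount.

```python
import math

SPAN_OF_CONTROL = 8  # assumed direct reports per manager (hypothetical)

def approval_layers(headcount):
    """Rough layers of management for a company of a given size."""
    if headcount <= SPAN_OF_CONTROL:
        return 1  # the log(organization) = 1 startup range
    return math.ceil(math.log(headcount, SPAN_OF_CONTROL))

# Headcount grows 10x per step; layers creep up one at a time.
for n in (5, 50, 500, 5_000, 50_000):
    print(n, approval_layers(n))
```

The point isn't the exact numbers, which depend entirely on the assumed span of control, but the shape: multiplying the company by ten adds roughly one layer to the tree.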