It's not that the numbers are a farce; it's that different industry segments are doing better or worse than others.
HN being a tech forum that now increasingly skews East and Midwest (heck, it's not even 7am yet in the West, but look at the degree of engagement on here) means most HNers are impacted by a slowdown in tech hiring, which exacerbates the sense of pessimism.
And tbf, if you aren't working in a tech hub like the Bay or NYC, you are going to be screwed if you are laid off - employers increasingly restrict remote work to employees with proven internal track records, and inshoring hubs like RTP, Denver, Atlanta, etc. are on the chopping block.
2. Translating between two different programming languages (migration)
I have a game written in XNA
100% of the code is there, including all the physics that I hand-wrote.
All the assets are there.
I tried to get Gemini and Claude to do it numerous times, always with utter failure of epic proportions on anything that's actually detailed.
1 - my transition from the lobby screen into gameplay? 0% replicated on all attempts
2 - the actual physics in gameplay? 0% replicated; none of it works
3 - the lobby screen itself? non-functional
Okay, so what did it even do? It put together a sort of boilerplate main menu and barebones options screen with weird-looking text that isn't what I provided (even though I supplied a font file), a lobby that I had to manually adjust numerous times before it could get into gameplay, and then nonfunctional gameplay that only handles directional movement and nothing else, with sort-of-half-working fish traveling behavior.
I've tried this a dozen times with AI since 2023, as recently as late last year.
ALL of the source code is there; every single thing that could be translated into a functional game in another language is there. It NEVER once works, or even comes remotely close.
The entire codebase is about 20,000 lines, with maybe 3,000 of it being really important stuff.
So yeah, I don't really think AI is "really good" at anything complex. I haven't been proven wrong in my four years of using it.
I crave to see people saying "Here's the repo btw: ..." and others trying to port it over, just so we see all the ways AI fails (and how each model does), and maybe find a few ways to improve its odds along the way. Until it eventually gets included in training data, a bit like how LLMs are oddly good at making SVGs of pelicans on bicycles nowadays.
And then, maybe someone slightly crazy comes along and tries seeing how much they can do with regular codegen approaches, without any LLMs in the mix, but also not manual porting.
Agreed -- coding agents / LLMs are definitely imperfect, but it's always hard to contextualize "it failed at X" without knowing exactly what X was (or how the agent was instructed to perform X)
I'm sure someone who regularly programs games in my target language, and who has also worked with XNA as a game developer, could port it in a week or so, yeah.
It's okay: shills and those with stock in the AI companies have brainwashed themselves inside out and will spam forever on this website that agents can do anything if you just introduce the right test conditions. Never mind engineering considerations; if it passes the tests, it's good, homie! Just spend an extra few hundred or thousand a month on it! Especially on the company I have stock in! Give me money!
Yes, you are right: amongst the four points, migration is the most contentious one. You need to be fairly prudent about migration and depending on the project complexity, it may or may not work.
But I do feel this is a solvable problem long term.
In those situations you basically need to guide the LLM to do it properly. It rarely one-shots complex problems like this, especially outside web dev, but it can still be faster than doing it manually.
I've done this multiple times in various codebases, both medium sized personal ones (approx 50k lines for one project, and a smaller 20k line one earlier) and am currently in the process of doing a similar migration at work (~1.4 million lines, but we didn't migrate the whole thing, more like 300k of it).
I found success with it pretty easily for those smaller projects. They were gamedev projects, and the process was basically to generate a source-of-truth AST, diff it against a target-language AST, and then run some more verifier steps: comparing log output, comparing screenshot output, and getting it to write integration tests. I wrote up a bit of a blog post on it. I'm not sure if this will be of any use to you, and maybe your case is more difficult, but anyway, here you go: https://sigsegv.land/blog/migrating-typescript-to-csharp-acc...
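The AST-diff idea doesn't depend on any particular toolchain. Here's a minimal, hedged sketch in Python, using Python's own `ast` module as a stand-in for both languages' parsers (in a real cross-language migration you'd map two different parsers' node types onto a shared shape; the `shape`/`same_structure` names are illustrative, not from the linked blog post):

```python
import ast

def shape(node):
    """Reduce an AST to a nested tuple of node-type names, discarding
    identifiers and literal values so that only structure is compared."""
    return (type(node).__name__,
            tuple(shape(child) for child in ast.iter_child_nodes(node)))

def same_structure(src_a: str, src_b: str) -> bool:
    """True when two snippets have identical control/expression structure."""
    return shape(ast.parse(src_a)) == shape(ast.parse(src_b))

# Same control flow, different names -> structurally equal
print(same_structure("for i in xs:\n    total += i",
                     "for j in ys:\n    acc += j"))   # True

# A changed construct (if vs. while) shows up as a structural diff
print(same_structure("if a:\n    f()",
                     "while a:\n    f()"))            # False
```

The point of the normalization is that the verifier flags real divergences in translated logic instead of drowning in renames, which is roughly what makes a source-of-truth AST useful as ground truth during a port.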
For me it worked great, and I would use (and am using) a similar method for more projects.
"I also wanted to build a LOT of unit tests, integration tests, and static validation. From a bit of prior experience I found that this is where AI tooling really shines, and it can write tests with far more patience that I ever could. This lets it build up a large hoard of regression and correctness tests that help when I want to implement more things later and the codebase grows."
The tests it writes are, in my experience, extremely bad, even with verbose descriptions of what they should do. Every single test I've written with an LLM I've had to modify manually or straight-up redo. This was as recently as a couple of months ago, for a C# MAUI project, doing Playwright-style UI-based functionality testing.
I'm not sure your AST idea would work for my scenario. I'd be converting XNA gameplay code to PhaserJS. It wouldn't even be close to 95% similar; several things done manually in XNA would just be automated away by PhaserJS built-ins.
Ya, I could see how framework patterns and such would need a lot of corrections in post after that type of migration. Mine went the other direction, and covered only the server portion (an Express server written in TypeScript for a Phaser game, ported to Kestrel on C#, which could use pretty much identical code; at the end I just switched over and refactored a few things to make it more idiomatic C#).
For the tests, I'm not sure why we got such different results, but essentially it took a codebase I had no tests in and, as part of the port, one-shot a ton of tests that have already helped me when adding new features. My game server runs in Kubernetes and has an "auto-distribute" system that matches players to servers and redistributes them if one server is taken offline. The integration tests it wrote for that auto-distribute system found a legit race condition that existed in both the old and new code (it migrated the code accurately enough to carry over the same bugs), and as part of implementing that test it fixed the bug.
Of course I wouldn't use it if it weren't a good tool, but for me the difference between doing this port via this method and doing it manually, as in prior massive projects, was such an insane time save that I would have been crazy to do it any other way. I'm super happy with the new code, and after also getting the test infra and such set up, it's honestly a huge upgrade from my original code that I thought I had so painstakingly crafted.
The only model that works well for complex things is Opus, and even then barely (but it does work, and you need to use API/token pricing if you want to guarantee it's the real thing).
patterns that may help increase the subjective perception of reliability from non-deterministic text generators trained on the theft of millions of developers' work over the past 25 years.
I think it's nonsensical to insist that it would only be a subjective improvement. The tests either exist and ensure that there aren't bugs in certain areas, or they don't. The agent is either in a feedback loop with those tests and continues to work until it has satisfied them or it doesn't.
Red-Green TDD is one of the main "agent patterns" Simon proposes, so it seemed relevant.
Also, the same thing applies to feedback loops with compilers and linters as well: they provide objective feedback that then the AI goes and fixes, verifiably resolving the feedback.
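The feedback loop itself is mechanically simple, whether the verifier is a test suite, compiler, or linter. A minimal sketch, where `propose_fix` is a stub standing in for whatever model or tool generates the next candidate (the function names and the toy check are illustrative, not from any real agent framework):

```python
from typing import Callable, List

def feedback_loop(run_checks: Callable[[str], List[str]],
                  propose_fix: Callable[[str, List[str]], str],
                  code: str, max_rounds: int = 5) -> str:
    """Run objective checks; while they report failures, hand the
    failure list back to the generator and retry with its candidate."""
    for _ in range(max_rounds):
        failures = run_checks(code)
        if not failures:          # green: objectively verified, stop
            return code
        code = propose_fix(code, failures)
    raise RuntimeError("still failing after max_rounds")

# Toy stand-ins: the "check" demands a call to sorted(), and the
# "generator" just appends the line the failure message asks for.
checks = lambda c: [] if "sorted" in c else ["missing call to sorted()"]
fixer = lambda c, fails: c + "\nresult = sorted(result)"

patched = feedback_loop(checks, fixer, "result = [3, 1, 2]")
print(patched)
```

The key property is the termination condition: the loop exits only when the checks pass, so "done" is grounded in the verifier's output rather than the generator's own claim.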
Even with less verifiable things like using specifications, the fact that it relies on less objective grounding metrics doesn't mean there's no change in the model's behavior. I'm sure if you compared the code a model produced, and the amount of intervention needed to get there, with and without a specification, you would see an objective difference on average. We're already getting objective studies regarding AGENTS.md.
You should consider that there is a rather large gulf between low-effort crap and using agentic LLMs to make more sophisticated games faster before you downvote me.
It's just not black and white, and to treat it as such devalues the conversation.
ofc there’s a spectrum in LLM usage, but the usage scenario of LLMs as a “supervised force multiplier” (paraphrasing your linked comment) isn’t how nearly anyone uses them. I can’t imagine gamedev is any different.
the vast majority of LLM usage looks to me like an unsupervised slop multiplier. any social media platform, including hacker news, is rife with a deluge of unpolished LLM-generated turds that creators pass off as their own work when they can’t even explain half of how it works or what it does.
circling back to gamedev specifically, and art more generally: sure, if LLMs are just one part of the process to push out some grander, well-thought-out vision, who am I to really care. again though, that isn’t what I see. I only see untalented, lazy “excommunicated devs” passing off the most bottom-of-the-barrel trash as “games”
> the usage scenario of LLMs as a “supervised force multiplier” (paraphrasing your linked comment) isn’t how nearly anyone uses them. I can’t imagine gamedev is any different.
That's just it: you don't know this. You're speculating from within your confirmation bias bubble. Everything you're saying is completely anecdotal.