Hacker News .hnnew | past | comments | ask | show | jobs | submit | sornaensis's commentslogin

I'm curious what issues you had with haskell? I have had the opposite experience and find them dreadful at Java et al.

Surely, denser languages should be better for LLMs?


The context window also limits how deeply the model can "think", and it does this in natural language. So a language suited to LLMs would have balanced density, if it's too dense, the model spends many tokens working through the logic, if it's too sparse, it spends many tokens to read/write the code.

I think in the context of already trained LLMs, the languages most suited to LLMs are also the ones most suited to humans. Besides just having the most code to train on, humans also face similar limitations, if the language is too dense they have to be very careful in considering how to do something, if it's too sparse, the code becomes a pain to maintain.


I generally agree that humans and LLMs benefit similarly from programming language features. I would tweak that a bit and suggest that their ability floor is higher than the human lowest common denominator so I would skew towards the more advanced human programming languages. There are many typing / analyzer features that would be frustrating for humans to use given they’ll cause the type checking to be slower. This is much less of a problem for LLMs in that they’re very patient and are much better at internalizing the type system so they don’t need to trigger it anywhere nearly as often.

Density is a double edged sword. On the one hand you want to minimise context usage, but on the other hand more text on the page means more that the LLM can work with.

my (uninformed) speculation is that you want resilience and error correction, which implies some level of redundancy rather than pure density.

I define tools that perform individual tasks, like build the application, run the tests, access project management tools with task context, web search, edit files in the workspace, read only vs write access source control, etc.

The agent only has access to exactly what it needs, be it an implementation agent, analysis agent, or review agent.

Makes it very easy to stay in command without having to sit and approve tons of random things the agent wants to do.

I do not allow bash or any kind of shell. I don't want to have to figure out what some random python script it's made up is supposed to do all the time.


This is a cool idea, can you write more about how your tools work or maybe short descriptions of a few of them? I’m interested in more rails for my bots.


I just made MCP servers that wrap the tools I need the agents to use, and give no-ask permissions to the specific tools the agents need in the agent definition.

Both OpenCode and VsCode support this. I think in ClaudeCode you can do it with skills now.

The other benefit is the MCP tool can mediate e.g. noisy build tool output, and reduce token usage by only showing errors or test failures, nothing else, or simply an ok response with the build run or test count.

So far, I have not needed to give them access to more than build tools, git, and a project/knowledge system (e.g. Obsidian) for the work I have them doing. Well and file read/write and web search.


Cool, thanks for the additional details!

I use Cursor but it's getting expensive lately, so I'm trying to reduce context size and move to OpenCode or something like that which I can use with some cheaper provider and Kimi 2.5 or whatever.


Good thing I had just finished migrating all of my workflows to OpenCode for the time being!

It's a shame because the VsCode copilot experience is quite good out of the box compared to all of the other harnesses I've used. But with typical lack of transparency, and sudden, harsh changes... What are they thinking?

After the restrictive rate limiting they've already instituted, I'm simply cancelling and continuing by using providers directly.


Which providers do you use ? I find copilot's prices pretty hard to beat, but if there's something better I'll cancel too.


Without access to Opus, it seems to me that the limited context size you get isn't worth the subsidy, especially once you blow through the included requests, the cost seems kind of hard to predict because it's 'request' based not token based.

I just opted for ChatGPT Pro; I'm trying to take advantage of the increased usage limits they are offering, and from what I heard, Claude Pro was also having bad rate limiting for many people.


I've been happy with GLM 5.1


Which service provides this one for a decent cost ?


Someone else in another thread mentioned OpenCode, which looks neat at £10/mo: https://opencode.ai/go


Ah good catch. I like their CLI too. Though sometimes it feels like they take security a bit too loosely


they don't restrict you to using their opencode agent, you can use go in any other agent


IMO the solution is the same as org security: fine grained permissions and tools.

Models/Agents need a narrow set of things they are allowed to actually trigger, with real security policies, just like people.

You can mitigate agent->agent triggers by not allowing direct prompting, but by feeding structured output of tool A into agent B.


I've been working on my own thing with more of a 'management' angle to it. It lets me connect memories to tasks and projects across all of my workspaces, and gives me a live SPA to view and edit everything, making controlling what the models are doing a lot easier in my experience https://github.com/Sornaensis/hmem in a way that suits how I think vs other project management or markdown systems.

I would be interested in trying to make the models go into more of a research mode and organize their knowledge inside it, but I've found this turns into something like LLM soup.

For coding projects, the best experience I have had is clear requirements and a lot of refinement followed through with well documented code and modules. And only a few big 'memories' to keep the overall vision in scope. Once I go beyond that, the impact goes down a lot, and the models seem to make more mistakes than I would expect.


As a rule people do not read the linked content, they come to discuss the headline.

The first indication to me this was AI was simply the 'project structure' nonsense in the README. Why AI feel this strong need to show off the project's folder structure when you're going to look at it via the repo anyway is one of life's current mysteries.


Honestly, maybe this is the problem.

A web-of-trust-like implementation of votes and flags, as suggested below, might be a solution, but I feel like it's an overkill. I've recently flagged a different clickbait submission, about Android Developer Verification, whose title suggested a significant update but that merely linked to the same old generic page about the anti-feature that was posted here months prior. Around 100 points too, before a mod stepped in, changed the title, and took it down.

Maybe the upvote button is just too easy to reach? I have a feeling that hiding it behind CSS :visited could make a massive difference.


It all depends on the tools. AI will surely give a competitive advantage to people working with better languages and tooling, right? Because they can tell the AI to write code and tests in a way that quashes bugs before they can even occur.

And then they can ship those products much faster than before, because human hours aren't being eaten up writing out all of these abstractions and tests.

The better tooling will let the AI iterate faster and catch errors earlier in the loop.

Right?


I've been going heavily in the direction of globally configured MCP servers and composite agents with copilot, and just making my own MCP servers in most cases.

Then all I have to do is let the agents actually figure out how to accomplish what I ask of them, with the highly scoped set of tools and sub agents I give them.

I find this works phenomenally, because all the .agent.md file is, is a description of what the tools available are. Nothing more complex, no LARP instructions. Just a straightforward 'here's what you've got'.

And with agents able to delegate to sub agents, the workflow is self-directing.

Working with a specific build system? Vibe code an MCP server for it.

Making a tool of my own? MCP server for dev testing and later use by agents.

On the flipside, I find it very questionable what value skills and reusable prompts give. I would compare it to an architect playing a recording of themselves from weeks ago when talking to their developers. The models encode a lot of knowledge, they just need orientation, not badgering, at this point.


I’ve had success with this general approach too.

The best thing I’ve done so far is put GitHub behind an API proxy and reject pushes and pull requests that don’t meet a criteria, plus a descriptive error.

I find it forgets to read or follow skills a lot of the time, but it does always try to route around HTTP 400s when pushing up its work.


Weird clone of OpenFront but with no UI .. ? Custom game doesn't work, tons of errors, can't even see where I am, or understand what is going on.

Doesn't seem like you played your own game before submitting it.


This is a WIP version, and custom game seems to work for many players. I would've really appreciated if you told me how it didn't work so I can fix it.


why play when you can vibecode?


Trying to go the Spec -> LLM route is just a lost cause. And seems wasteful to me even if it worked.

LLM -> Spec is easier, especially with good tools that can communicate why the spec fails to validate/compile back to the LLM. Better languages that can codify things like what can actually be called at a certain part of the codebase, or describe highly detailed constraints on the data model, are just going to win out long term because models don't get tired trying to figure this stuff out and put the lego bricks in the right place to make the code work, and developers don't have to worry about UB or nasty bugs sneaking in at the edges.

With a good 'compilable spec' and documentation in/around it, the next LLM run can have an easier time figuring out what is going on.

Trying to create 'validated english' is just injecting a ton of complexity away from the area you are trying to get actual work done: the code that actually runs and does stuff.


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: