CharlieDigital's comments

The problem with this approach is that it is contextually expensive: even for a one-sentence change, the LLM needs to read whole docs into context.

A better solution, which I mentioned above: use inline, idiomatic code comments for your language and have the LLM dump its reasoning into the `<remarks>`, `@remarks`, etc. of the comment block.

Now you get free, always up-to-date, platform-agnostic, zero-infrastructure long-term memory that works with any agent, because every agent is forced to read the comments when it reads the method. It will never miss the way it can with secondary docs.

It saves context because instead of reading a 2,000-token document for 100 tokens of relevant context, the agent just reads the comments for the specific method and hydrates long-term memory just-in-time, with a near-certain activation rate and without additional prompting.


Here's an easier solution that actually works: it gives any agent FREE long-term memory (platform agnostic and zero infrastructure!) and always-accurate context that is self-maintained by the LLM.

Use the idiomatic comments for your language.

Here is a snippet of our prompt for C# (and similar one for TS):

    - Use idiomatic C# code comments when writing public methods and properties
    - `<summary>` concise description of the method or property
    - `<remarks>` the "why"; provide domain or business context, chain of thought, and reasoning; mention related methods, types, and files
    - `<param>` document any known constraints on inputs, special handling, etc.
    - `<returns>` note the expected return value
    - `<example>` provide a brief example of correct usage
    - Use inline comments sparingly where it adds clarity to complex code
    - Update comments as you modify the code; ensure they are consistent with the intent of the code
What happens: when the LLM stumbles upon this code in the future, it reads the comments and basically "re-hydrates" some past state into context. The `<remarks>` tag does the heavy lifting here because the LLM is asked to provide its train of thought and mention related artifacts (so the future LLM knows where else to look).
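As a sketch of what this style produces in the TS/JSDoc variant (the function, the 1.5% fee policy, and the "related artifacts" mentioned in `@remarks` are all invented for illustration):

```typescript
/**
 * Calculates the late fee for an overdue invoice.
 *
 * @remarks
 * Finance requires a flat 30-day grace period before any fee accrues.
 * The fee is applied per started month rather than pro-rated per day,
 * to match what customers already see on printed invoices. See also
 * the related `Invoice` type and the billing reconciliation job that
 * consumes this value.
 *
 * @param daysOverdue - Days past the due date; must be >= 0.
 * @param balanceCents - Outstanding balance in cents; must be >= 0.
 * @returns The late fee in cents (0 while inside the grace period).
 *
 * @example
 * lateFeeCents(45, 10_000); // one started month past grace at 1.5%
 */
function lateFeeCents(daysOverdue: number, balanceCents: number): number {
  const GRACE_PERIOD_DAYS = 30;
  const MONTHLY_RATE = 0.015; // 1.5% per started month past the grace period

  if (daysOverdue < 0 || balanceCents < 0) {
    throw new RangeError("daysOverdue and balanceCents must be non-negative");
  }
  if (daysOverdue <= GRACE_PERIOD_DAYS) return 0;

  const startedMonths = Math.ceil((daysOverdue - GRACE_PERIOD_DAYS) / 30);
  return Math.round(balanceCents * MONTHLY_RATE * startedMonths);
}
```

The grace period and per-started-month rounding are exactly the kind of business decisions that vanish from the code itself but survive in `@remarks` for the next agent (or human) that greps through.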

You already know the agents are going to read your code again when they gather context, so just leave the instructions and comments inline.

The LLM is very good at keeping these up-to-date on refactors (we are still doing human code reviews), and a bonus is that it makes the code very easy to review: the reasoning for why the LLM generated some function or property is right there for the human as well.


I wish IDEs had better support for quick toggling the display of comments. You make a good point about it making sense for the docs to live alongside the code, I just don't usually want to see giant comment blocks everywhere while I'm operating in a codebase.

JetBrains Rider has this, as does VS Code: Cmd+K, Cmd+/ to fold; Cmd+K, Cmd+J to unfold.

    > It’s very disputable whether BEVs are industry’s future 
As a technology, ICE is pretty much close to its peak. It's very hard to imagine significant improvements in ICE as far as efficiency, weight, and power output go.

The same cannot be said for batteries and electric motors. We are still quite far from technological limits for the platform. It doesn't seem disputable at all that a platform that can still evolve and improve with significant room for growth will eventually overtake one that has peaked.


Also, oil will only get more expensive in the long term, and electricity is going to get cheaper, with more and more solar panels generating electricity locally "for free".

> oil will only get more expensive in the long term

Will it? The oil price will reach an equilibrium: lower demand due to electrification will lead to lower prices, which will in turn increase consumption.

There are enough low cost oil producers like Saudi Arabia to keep the pipes filled with oil at whatever the prevailing market price is.


The price of gas is $9 per gallon in Germany right now.

Oil (prior to the current pointless war) was pretty cheap, though. Adjusted for inflation, it was cheaper than at any point between 1975 and '85.

False dichotomy. What ICEs are or aren't has zero impact on whether BEVs are the future.

You can pretty much see the writing on the wall; sure, one can bury one's head in the sand, but it is almost certain that in, say, 15-20 years, BEVs will become the majority of vehicles on the road even if the US and the Germans fight to the very end to keep ICE alive.

Yes, Betamax is undoubtedly the future.

> As a technology, ICE is pretty much close to its peak. It's very hard to imagine significant improvements in ICE as far as efficiency, weight, and power output goes

What makes you say that? Jet engine manufacturers are constantly making improvements, and one of the biggest recent breakthroughs has been around using proprietary alloys in the construction of some parts to make them lighter and able to operate at hotter temperatures, thus increasing efficiency. I'm not working on any engines, but from a layman's perspective I don't see why there couldn't be material science improvements made to combustion engines.


Your example is from a high-value engine where the switch to proprietary alloys yields significant savings.

ICE for cars? Do you think the same constraints apply?


No, but I think if material science improvements can be made in jet engines, there is no reason to think combustion engines for cars are "complete" and nothing around them could be improved. They're much less expensive, but produced at a much larger volume, and they have at least a few decades of future. Even if we assume all of the developed world moves to EVs in the next decade or so (which is unrealistic already), there is all of the rest of the world. Most African countries don't have stable power for all of their populations; EVs are simply not going to work there as the main vehicle type. Then add in trucks, where the weight of the batteries makes them impractical for heavy-duty long-distance trucking. There are improvements, but it will still be many years before they are available, and decades before they've replaced everything already existing.

    >  They're much less expensive, but at a much larger volume
It is precisely because they are much less expensive that they've reached a realistic ceiling of advancement. If you're going to produce an ICE that yields a 10% improvement in efficiency, decreased weight, increased reliability, decreased maintenance effort -- but at a cost of an extra $xxxx per unit, then it may as well never happen.

As a logical exercise, let's consider some of the top technological advancements in ICE and ICE drivetrain components over the last 20 years. CVTs? Nissan's VVEL? What do you think are noteworthy automotive ICE technological advancements we can look forward to in the consumer space? Where exactly do you see automotive applications of ICE advancing in the next 10 years?

On the other hand, it seems pretty easy to see the exact opposite with electric/hybrid drivetrains. Many innovations and advancements. It's also easy to see the roadmap for advancements (see what the Chinese are doing). Battery swaps? New chemistries? Solid state batteries? Compact axial flux motors? Faster charging electrical architectures? Endless space for technological advancement and growth because it's so early.


The easiest thing to do is to have the LLM leave its own comments.

This has several benefits because the LLM is going to encounter its own comments when it passes this code again.

    > - Apply comments to code in all code paths and use idiomatic C# XML comments
    > - `<summary>` be brief, concise, to the point
    > - `<remarks>` add details and explain "why"; document reasoning and chain of thought, related files, business context, key decisions
    > - `<param>` constraints and additional notes on usage
    > - inline comments in code sparingly where it helps clarify behavior
(I have something similar for JSDoc for JS and TS)

Several things I've observed:

1. The LLM is very good at then updating these comments when it passes it again in the future.

2. Because the LLM is updating this, I can deduce by proxy that it is therefore reading this. It becomes a "free" way to embed the past reasoning into the code. Now when it reads it again, it picks up the original chain-of-thought and basically gets "long term memory" that is just-in-time and in-context with the code it is working on. Whatever original constraints were in the plan or the prompt -- which may be long gone or otherwise out of date -- are now there next to the actual call site.

3. When I'm reviewing the PR, I can now see what the LLM is "thinking" and understand its reasoning to see if it aligns with what I wanted from this code path. If it interprets something incorrectly, it shows up in the `<remarks>`. Through the LLM's own changes to the comments, I can see in future passes if it correctly understood the objective of the change or if it made incorrect assumptions.


In my experience, LLM-added comments are too silly and verbose. It's going to pollute its own context with nonsense and its already limited ability to make sense of things will collapse. LLMs have plenty of random knowledge which is occasionally helpful, but they're nowhere near the standard of proper literacy of even an ordinary skilled coder, let alone Dr. Knuth who defined literate programming in the first place.

The output of an LLM is a reflection of the input and instructions. If you have silly and verbose comments, then consider improving your prompt.

Almost nothing in a Claude Code session has to do with "your prompt"; it works for an hour afterward and mostly talks to itself. I've noticed that if you give it small corrections, it will leave nonsensical comments referring to your small correction as if it's something everyone knows.

It has everything to do with your prompt and why Claude Code has a plan mode: because the quality of your planning, prompting, and inputs significantly affects the output.

Your assertion, then, is that even a one-sentence prompt is as good as a five-section markdown spec with detailed coding style guidance and a feature-by-feature specification. This is simply not true; the detailed spec and guidance will always outperform the one-sentence prompt.


No, I use plan mode and have several rounds of conversation with it, but lately I've been doing tasks where it does tons of independent research and reaches complicated conclusions about an existing old codebase. I don't really feel like either of those counts as "a prompt".

Plan mode is useful because if you make corrections during development, it does that silly thing where it leaves comments referring to your corrections.


"then consider improving all your training data and reinforcement feedback"

Fixed that for you.

The input is sooo much more than your prompt, that's kind of the point.


How do you deal with the comments sometimes being relatively noisy for humans? I tend to be annoyed by comments overly referring to a past correction prompt and not really making sense by themselves, but then again this IS probably the highest value information because these are exactly the things the LLM will stumble on again.

    > How do you deal with the comments sometimes being relatively noisy for humans?
To an extent, that is a function of tweaking the prompt to get the desired level of detail and signal-to-noise ratio from the LLM, e.g. constraining the word count it can use for comments.

We have a small team of approvers reviewing every PR. Since we can't see the original prompt and the flow of interactions with the agent, this approach lets us see that by proxy when reviewing the PR, so it is immensely useful.

Even for things like enum values, for example. Why is this enum here? What is its use case? Is it needed? Having the reasoning dumped out allows us to understand what the LLM is "thinking".

(Of course, the biggest benefit is still that the LLM sees the reasoning from an earlier session again when reading the code weeks or months later).


Inline comments in function body: for humans.

Function docs: for AI, with clear trigger (“use when X or Y”) and usage examples.


I really hate its tendency to leave those comments as well. I seem to have coached it out with some claude.md instructions but they still happen on occasion.

Interesting observation. After a human is done writing code, they still have a memory of why they made the choices they made. With an LLM, the context window is severely limited compared to a brain, so this information is usually thrown away when the feature is done, and so you cannot go back and ask the LLM why something is the way it is.

Yup; in the moment, you can just have the LLM dump its reasoning into the comments (we use idiomatic `<remarks></remarks>` for C# and JSDoc `@remarks`).

Future agents see the past reasoning as it `greps` through code. Good especially for non-obvious context like business and domain-level decisions that were in the prompt, but may not show in the code.

I can't prove this, but I'm also guessing that this improves the LLM's output: since it writes the comment first and then writes the code, it is producing a mini-spec right before it outputs the tokens for the function (it would make an interesting research paper).


This somehow made me think I should enforce a rule that agents sign their comments so they're identifiable at first glance.

MCP is the future in enterprise and teams.

It's as you said: people misunderstand MCP and what it delivers.

If you only use it as an API? Useless. If you use it on a small solo project? Useless.

But if you want to share skills across a fleet of repos? Deliver standard prompts to baseline developer output and productivity? Without having to sync them? And have them updated live? MCP prompts.

If you want to share canonical docs like standard guidance on security and performance? Always up to date and available in every project from the start? No need to sync and update? MCP resources.

If you want standard telemetry and observability of usage? MCP because now you can emit and capture OTEL from the server side.

If you want to wire execution into sandboxed environments? MCP.

MCP makes sense for org-level agent engineering but doesn't make sense for the solo vibe coder working on an isolated codebase locally with no need to sandbox execution.

People are using MCP for the wrong use cases and then declaring it excess, when the real use case is standardizing the remote delivery of skills and resources. Tool execution is secondary.


You sound more like you like skills than MCP itself. Skills encapsulate the behavior to be reused.

MCP is a protocol that may have been useful once, but it seems obsolete already. Agents are really good at discovering capabilities and using them. If you give it a list of CLI tools with a one line description, it would probably call the tool's help page and find out everything it needs to know before using the tool. What benefit does MCP actually add?


So just to clarify, in your case you're running a centralized MCP server for the whole org, right?

Otherwise I don't understand how MCP vs CLI solves anything.


Correct.

Centralized MCP server over HTTP that enables standardized doc lookup across the org, standardized skills (as MCP prompts), MCP resources (these are virtual indexes of the docs, similar to how Vercel formatted their `AGENTS.md`), and a small set of tools.

We emit OTEL from the server and build dashboards to see how the agents and devs are using context and tools and which documents are "high signal" meaning they get hit frequently so we know that tuning these docs will yield more consistent output.

OAuth lets us see the users because every call has identity attached.


Sandboxing and auth is a problem solved at the agent ("harness") level. You don't need to reinvent OpenAPI badly.

    > Sandboxing and auth is a problem solved at the agent ("harness") level
That works if you run a homogeneous set of harnesses/runtimes (we don't; some folks are on Cursor, some on Codex, some on Claude, some on OpenCode, some on VS Code GHCP). The only thing that works across all of them? MCP.

Everything about local CLIs and skill files works great as long as you are 1) running in your own env, 2) working on a small, isolated codebase, 3) working in a fully homogeneous environment, 4) each repo only needs to know about itself and not about a broader ecosystem of services and capabilities.

Beyond that, some kind of protocol is necessary to standardize how information is shared across contexts.

That's why my OP prefaced that MCP is critical for orgs and enterprises because it alleviates some of the friction points for standardizing behavior across a fleet of repos and tools.

    > You don't need to reinvent OpenAPI badly
You are only latching onto one aspect of MCP servers: tools. But MCP delivers two other critical features, prompts and resources, and it is here that MCP provides a contextual scaffold over an otherwise generic OpenAPI. Tools are perhaps the least interesting of MCP's features (though still useful in an enterprise context, because centralized tools allow for telemetry).

For prompts and resources to work, industry would have to agree on defined endpoints, request/response types. That's what MCP is.
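For a concrete sense of what those agreed-upon endpoints look like, here is a toy sketch. The JSON-RPC method names (`prompts/get`, `resources/read`) follow the MCP spec, but this is an in-memory stand-in, not a real server: the prompt name, resource URI, and content are invented, and real MCP payloads carry more fields than shown.

```typescript
// Toy in-memory stand-in for an MCP server's prompt/resource endpoints.
type JsonRpcRequest = {
  jsonrpc: "2.0";
  id: number;
  method: string;
  params?: Record<string, unknown>;
};

// A shared prompt: update it server-side and every repo in the org
// gets the new version on the next call, with nothing to sync.
const prompts: Record<string, string> = {
  "code-review-checklist":
    "Review this PR against our security and performance baselines...",
};

// Canonical docs resolved by URI, so there are no per-repo copies.
const resources: Record<string, string> = {
  "docs://standards/security":
    "# Security guidance\nAlways parameterize SQL queries...",
};

function handle(req: JsonRpcRequest): unknown {
  switch (req.method) {
    case "prompts/get": {
      const name = String(req.params?.name);
      return {
        jsonrpc: "2.0",
        id: req.id,
        result: {
          messages: [
            { role: "user", content: { type: "text", text: prompts[name] } },
          ],
        },
      };
    }
    case "resources/read": {
      const uri = String(req.params?.uri);
      return {
        jsonrpc: "2.0",
        id: req.id,
        result: { contents: [{ uri, text: resources[uri] }] },
      };
    }
    default:
      return {
        jsonrpc: "2.0",
        id: req.id,
        error: { code: -32601, message: "Method not found" },
      };
  }
}
```

The point is that any harness speaking the protocol can discover and fetch these without knowing anything about the server's internals, which is exactly what a heterogeneous fleet of agents needs.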


This is my take as well.

Way easier to set up, centralized auth and telemetry.

Just use it for the right use cases.


How do you think this compares to Google and the AI search?

Probably oversold here because if you read the fine print, the savings only come in cases when you don't need the bytes in context.

That makes sense for some of the examples they described (e.g. a QA workflow asking the agent to take a screenshot and put it into a folder).

However, this is not true for an active dev workflow where you actually do want it to see that elements are not lining up, are overlapping, or are not behaving correctly. So token savings are possible...if your use case doesn't require the bytes in context (which most active dev use cases probably do).


Just symlink CONTRIBUTING as AGENTS


This only works on systems that support symlinks. It also pollutes the root folder with more files.

I understand the sentiment, but it is really strange that the people who are pushing for agents.md haven't seen https://contributing.md/

It's even mentioned in the GitHub docs: https://docs.github.com/en/communities/setting-up-your-proje...


This has been my observation with self-generated docs as well.

I have seen some devs pull out absolutely bad guidance by introspecting the code with the LLM to define "best practices" and docs because it introduces its own encoded biases in there. The devs are so lazy that they can't be bothered to simply type the bullet points that define "good".

One example is that we had an extracted snippet for C#/.NET that was sprinkling in `ConfigureAwait(false)`, which should not be in application code and is generally not needed for ASP.NET. But the coding agent saw some code that looked like "library" code and decided to apply it; then someone ran the LLM against that code, pulled out "best practices", placed them into the repo, and started to pollute the rest of the context.

I caught this when I found the code in a PR and then found the source and zeroed it out. We've also had to untangle some egregious use of `Task.Run` (again, not best practice in C# and you really want to know what you're doing with it).

At the end of it, we are building a new system that is meant to compose and serve curated, best practice guidance to coding agents to get better consistency and quality. The usage of self-generated skills and knowledge seems like those experiments where people feed in an image and ask the LLM to give back the image without changing it. After n cycles, it is invariably deeply mutated from the original.

Agentic coding is the future, but people have not yet adapted. We went from punch cards to assembly to FORTRAN to C to JavaScript, each step adding more abstractions. The next abstraction is Markdown, and I think that teams that invest their time in writing and curating markdown will create better guardrails within which agents can operate without sacrificing quality, security, performance, maintainability, and other non-functional aspects of a software system.


> Agentic coding is the future, but people have not yet adapted. We went from punch cards to assembly to FORTRAN to C to JavaScript; each step adding more abstractions.

I don't completely disagree (I've argued the same point myself). But one critical difference between the LLM layer and all of the others you listed is that LLMs are non-deterministic and all those other layers are deterministic. I'm not sure how that changes the dynamic, but surely it does.


The LLM can be non-deterministic, but in the end, as long as we have compilers and integration tests, isn't it the same? You go from non-deterministic human interpretation of requirements and specs into a compiled, deterministic state machine. Now you have a non-deterministic coding agent doing the same and simply replacing the typing portion of that work.

So long as you supply the agent with a well-curated set of guidance, it should ultimately produce more consistent code with higher quality than if the same task were given to a team of random humans of varying skill and experience levels.

The key now is how much a team invests in writing the high quality guidance in the first place.


The unspoken truth is that tests were never meant to cover all aspects of a piece of software running and doing its thing; that's where the "human mind(s)" that actually built the system and brought it to life were supposed to come in and add the real layer of veracity. In other words, "if it walks like a duck and quacks like a duck" was never enough, no matter how much duck-related testing was in place.


Compilers can never be error-free for non-trivial statements. This is outlined in Rice's theorem. It's one of the reasons we have observability/telemetry as well as tests.


That's fine, but this also applies to human written code and human written code will have even more variance by skill and experience.

