epec254's comments

Huge +1. This loop consistently delivers great results for my vibe coding.

The “easy” path of “short prompt declaring what I want” works OK for simple tasks but consistently breaks down for medium to high complexity tasks.


Can you help me understand the difference between "short prompt for what I want (next)" vs medium to high complexity tasks?

What I mean is, in practice, how does one even get to a high complexity task? What does that look like? Because isn't it more common that one sees only so far ahead?


I’m curious if anyone has considered (or is) putting a proxy/gateway in front of every MCP used by their company to “guardrail” data that goes in and out, e.g. checks for sensitive PII, prompt injection, etc.?
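
To make the guardrail idea concrete, here is a minimal sketch of the kind of check such a gateway might run on each JSON message before forwarding it. The regex patterns and the `guard_message` helper are hypothetical and only illustrative; a real deployment would likely rely on a vetted PII/DLP library and would need separate handling for prompt injection.

```python
import json
import re

# Hypothetical patterns for illustration; not a complete PII detector.
PII_PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def redact(value):
    """Recursively redact PII-looking substrings in a JSON-like value."""
    if isinstance(value, str):
        for label, pattern in PII_PATTERNS.items():
            value = pattern.sub(f"[REDACTED:{label}]", value)
        return value
    if isinstance(value, list):
        return [redact(v) for v in value]
    if isinstance(value, dict):
        return {k: redact(v) for k, v in value.items()}
    return value  # numbers, booleans, and None pass through unchanged


def guard_message(raw: str) -> str:
    """Parse one JSON message, redact it, and re-serialize it (assumed helper)."""
    return json.dumps(redact(json.loads(raw)))
```

The same function could be applied in both directions (requests to the MCP server and responses back), which is the "in and out" part of the question.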


We've been exploring solutions. MCP registry/gateway, everything kind of sucks at the moment. The other problem is that unless you have an extremely good enterprise endpoint approach, nothing is going to stop users from not using your org's MCP gateway. GitHub has the MCP registry setting, but that only works if you are logged into VS Code. Any other MCP client can still do whatever, and as you probably know, worst case an MCP client can be vibed in no time.

Trying to catalog MCP use at a user's endpoint is an exercise in either scraping for client settings.json or traffic inspection. CrowdStrike recently acquired Pangea and is developing these capabilities, for example.
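
As a rough sketch of the "scraping for client settings.json" approach: walk a directory tree, open candidate config files, and list any MCP servers they declare. The file names and key names below are assumptions for illustration; each MCP client stores its server list in its own location and schema, so a real inventory tool would need per-client rules.

```python
import json
from pathlib import Path

# Assumption: illustrative guesses, not an authoritative list of client configs.
CANDIDATE_NAMES = ("settings.json", "mcp.json")
CANDIDATE_KEYS = ("mcpServers", "servers")


def find_mcp_servers(root: Path) -> dict[str, list[str]]:
    """Map each matching config file under root to the server names it declares."""
    found = {}
    for path in root.rglob("*.json"):
        if path.name not in CANDIDATE_NAMES:
            continue
        try:
            data = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            continue  # unreadable or not valid JSON; skip it
        if not isinstance(data, dict):
            continue
        for key in CANDIDATE_KEYS:
            servers = data.get(key)
            if isinstance(servers, dict) and servers:
                found[str(path)] = sorted(servers)
                break
    return found
```

This only sees clients that persist their configuration to disk in a recognizable form, which is exactly the limitation described above.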


The key part IMO is buried in the article - there happened to be an existing, perfectly accurate database containing all the required info about each employee - the same info that previously had to be manually found for each retirement.

Without this, this effort would not have been possible.

> Fortunately, we stumbled on a critical clue. While poring over old documentation, we discovered that OPM actually had data warehouses that stored historic information about every federal employee. Apparently, these warehouses were created as part of a modernization effort in 2007, and HR and payroll offices all across government have supposedly been regularly reporting into it.

> For some reason however, this was not well known at OPM, and those that knew about it didn’t know what data it held, nor considered how it could be used to simplify retirement processing. Not many had seen the data, and administrators were initially resistant to sharing access.

> From a software perspective, this was the holy grail: a single source of truth that held all the information that the manual redundant steps were meant to review. Because the information was regularly reported by HR and payroll, by the time an employee retired, OPM should already have everything needed to process the retirement, without anyone re-entering or re-verifying information.


Yes! Whoever built the data warehouses and keeps the data pipelines running would seem to be the real heroes of this story. I sure hope that group did not get gutted by DOGE.


If the people who set up the DW and set up the data to flow into it did not also build any applications to actually take advantage of the data, they didn’t complete their job properly. It’s been 18 years, that’s long enough to document the existence of it. Some might say 18 years is even enough time to build at least one useful application powered by it.

I’m sure in reality the people who built this system were smart, and wanted people to use it, but were just buried under layers of technology-unaware management and bureaucrats who felt threatened, afraid it would marginalize or eliminate their paper-pushing jobs. But this very likely reality is just more proof that the government needs significant restructuring. Most people in management at the government are there purely because of tenure, not because they’re great leaders, nor subject matter experts in how complex things are efficiently built and run outside the government world.


> If the people who set up the DW and set up the data to flow into it did not also build any applications […] they didn’t complete their job properly

1. That’s a whole extra level of responsibility / management / bureaucracy. At some point, somebody near the top needs to care or it doesn’t all get done. The existence of this DB says somebody cared, they just didn’t have enough power.

2. I’m curious how this compares to experiences at Big Old Corp, like IBM or GM, not just the SV darlings.


Thanks for pointing this out. I think it does a good job of also highlighting that most problems aren’t technical; they are either people or organizational.


Yeah, this stuck out at me - the hubris of the stateless web stack supersedes the 18 years of hard, unsung work building an end-to-end stateful pipeline that ties out to the penny and handles all the complex business logic and reconciliations seamlessly across god knows how many integrations. No fancy diagrams or pictures of the nameless, faceless heroes who accomplished that act of heroism. For sure recognizing the value is something to trumpet, but that’s the Herculean hero story I want to hear - the DOGE bros who tied it all together with JavaScript frameworks, yawn.


The only way the database could be harnessed to do something useful was after all the people who had been standing in the way in management for the last 18 years had likely been sacked. You can bet any useful project to put it to use was blocked by paper-pushers threatened by the spectre of automation, until most people had forgotten about it.

Nobody believes the database sprung forth from the earth or was created accidentally. The fact that 18 years later that project had borne no visible fruit, and that most people who could have used it, didn’t even know about it, is proof of the problem. It’s a problem of terrible management. That is what, regardless of your politics, is being slightly jostled by DOGE. Personally I have dealt with enough of our absurd government processes that I don’t think they can make anything much worse, and it cannot be less efficient.


How do you know what the people involved did? Let’s not pretend speculation is fact.


any process can be made less efficient, especially by firing those who are aware of how the system actually works...


They seemed to have replaced Mega Bloks with Legos, not skyscraper building materials.


[flagged]


> There's a dangerous trend I've noticed with GenZ, they're quick (sometimes to the point of seeming rushed) to show off hyperbolic-sounding achievements that are mostly hot air

Isn’t it true for every so-called edge that CEOs pitch to shareholders?


> show off hyperbolic-sounding achievements which are mostly hot air and many times even work stolen from others.

Steve Jobs was born in 1955, the ball has been rolling for a while now. Gen Z might just be the crowd that recognizes how lucrative it is to scam people.


We must have different definitions of hot air, if yours includes a 4-trillion-dollar company.

Edit: you edited your comment so now my reply doesn't make sense. I would re-post your old comment but I didn't save it. I won't change mine because I'm not like that.


The creation in the article appears to be genuinely useful and impressive as well. It certainly benefitted a great deal from other people's work, but so did apple and linux and whatever else.


It was 10x smaller when Jobs retired in 2011, not that it really matters for this analysis.

Musk was born in 1971, for example.


With all due respect, the edit only added the second sentence. Your response is not a refutation of what I meant, and I'm not going to retract the edit so you can better mock my point.


This is not new to Gen z.


Didn't say it was.

Edit:

>I've noticed a lot of crime in [city].

>Crime is not new to [city].

>Didn't say it was.

Come on, the quality of this discourse is abhorrent.


You're literally being downvoted for stereotyping an entire generation. The word stereotype implies it, but it's not remotely close to true.

Like, the easiest, most obvious example in the world is trump: he hyperbolically brags constantly about things he didn't do or actively tried to stop and it would be real hard to argue that he's genZ.

When you single out a specific group for your observation, it has strong implications about the other groups you didn't mention.

As in this case: why did you only mention genZ?


>[A]s are [B]s

>But that doesn't imply all [B]s are [A]s

Come on, dude. This gets covered in the first 10 minutes of any entry-level course to logic ...


Yeah and that would be a lot more relevant if we were talking about, dunno, programming circuits or constructing proofs.

Instead we're writing english language sentences to be read by humans. Where connotations and implications and other such "unspoken" things absolutely matter.


>[GenZ]s are [Hyperbolic]s

>But that doesn't imply all [Hyperbolic]s are [GenZ]s

Seems clear to me.


Are you trolling? The implication is clearly that GenZ is unusually hyperbolic. That their predilection for hyperbole is somehow unusual or notable, otherwise WHY MENTION IT.


Yes, I think GenZ is unusually hyperbolic.

Why'd you think otherwise?


Speaking personally, the Summer of Love and 1990s counterculture is much more unusual and hyperbolic. I'd be curious to hear where you're seeing Gen Z surpass those generations.


Unusual yes, but I wouldn't call them hyperbolic (in the context of its meaning in this thread).

Also, with respect to the Summer of Love, I would think its values are on the complete opposite side of what's being discussed here.

Excerpt from its Wikipedia page [1]:

"Many opposed the Vietnam War, were suspicious of government, and rejected consumerist values. In the United States, counterculture groups rejected suburbia and the American way and instead opted for a communal lifestyle. Some hippies were active in political organization, whereas others were passive and more concerned with art (music, painting, poetry in particular) or spiritual and meditative practices."

That doesn't sound compatible with "young people these days are so desperate to show off their skills, to the point of faking it, to get jobs in the government or the industry".

But I am now curious to hear about how you think both cohorts are related.

1. Although I think Wikipedia is trash.


That's what saying "noticed with Gen Z" means.

Reply to edit: generations are sequential; if you've noticed something with one generation it means that you're not accusing the prior generations of the same thing, otherwise you would've used different wording.


> There's a dangerous trend I've noticed with GenZ

Those are some mighty fine hairs you're splitting.


I can hear you adjusting that fedora from here.


> It's sad, but I think our generation is partly to blame, since we demanded that from them.

At least you can recognize that much. Too many people involved in building an economy based around narcissism are suddenly wondering why there are a ton of narcissists.


A key challenge with HTML is client side trust. How do I enable an agent platform (say Gemini, Claude, OpenAI) to render UI from an untrusted 3p agent that’s integrated with the platform? This is a common scenario in the enterprise version of these apps - eg I want to use the agent from (insert saas vendor) alongside my company’s home grown agents and data.

Most HTML is actually HTML+CSS+JS - IMO, accepting this is a code injection attack waiting to happen. By abstracting to JSON, a client can safely render UI without this concern.
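
A minimal sketch of what "abstracting to JSON" can look like on the client: the client owns a fixed catalog of components and refuses anything outside it, so the untrusted agent can only pick from what the client already knows how to draw. The component names, props, and `render` function are hypothetical and not taken from any real protocol.

```python
from html import escape

# Hypothetical component catalog: component type -> props it may carry.
ALLOWED = {
    "text": {"value"},
    "button": {"label", "action_id"},
    "column": set(),
}


def render(node) -> str:
    """Render an untrusted JSON UI node to HTML using only allowlisted pieces."""
    if not isinstance(node, dict):
        raise ValueError("node must be an object")
    kind = node.get("type")
    if kind not in ALLOWED:
        raise ValueError(f"unknown component type: {kind!r}")
    props = node.get("props", {})
    if not isinstance(props, dict):
        raise ValueError("props must be an object")
    extra = set(props) - ALLOWED[kind]
    if extra:
        raise ValueError(f"unexpected props for {kind}: {sorted(extra)}")
    if kind == "text":
        value = escape(str(props.get("value", "")))
        return f"<p>{value}</p>"
    if kind == "button":
        label = escape(str(props.get("label", "")))
        action = escape(str(props.get("action_id", "")), quote=True)
        return f'<button data-action="{action}">{label}</button>'
    # "column": a container whose children go through the same checks.
    children = "".join(render(child) for child in node.get("children", []))
    return f"<div>{children}</div>"
```

All strings from the agent are escaped before they reach the page, and there is no component through which script or style could be supplied.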


If the JSON protocol in question supports arbitrary behaviors and styles, then you still have an injection problem even over JSON. If it doesn't support them you don't need to support those in an HTML protocol either, and you can solve the injection problem the way we already do: sanitizing the HTML to remove all/some (depending on your specific requirements) script tags, event listeners, etc.


> A key challenge with HTML is client side trust. How do I enable an agent platform (say Gemini, Claude, OpenAI) to render UI from an untrusted 3p agent that’s integrated with the platform?

Just like you do with your web browser. A web browser is a Remote Code Execution engine.


Perhaps the protocol, is then html/css/js in a strict sandbox. Component has no access to anything outside of component bounds (no network, no dom/object access, no draw access, etc).


I think you can do that with an iframe, but it always makes me nervous


Right this makes sense, I wonder if it would then be a good idea to abstract html to JSON, making it impossible to include css and js into it


Curious to learn more what you are thinking?

One challenge is you do likely want JS to process/capture the data - for example, taking the data from a form and turning it into json to send back to the agent
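
One way to handle the form case without shipping agent-authored JS is for the trusted client to do the capture itself: the agent declares the fields, and the client builds the JSON reply from those declarations. This is only a sketch under that assumption; the field schema and `collect_form` are hypothetical.

```python
import json


def collect_form(fields: list[dict], values: dict) -> str:
    """Build the JSON reply from declared fields only, ignoring anything else."""
    payload = {}
    for field in fields:
        name = field["name"]
        if field.get("required") and name not in values:
            raise ValueError(f"missing required field: {name}")
        if name in values:
            payload[name] = values[name]
    return json.dumps(payload, sort_keys=True)
```

Here `fields` comes from the agent's UI description and `values` from what the user typed; anything not declared never makes it into the message sent back to the agent.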


If you play with A2UI's generator, that's effectively what it does, just a layer of abstraction or two above what you're describing.


That's what I thought too, skimming through the documentation. My thinking is that since it does that, which makes sense to avoid script injection, why not do it with "jsonized" HTML?


I was thinking that raw html might be too verbose, but canned components have signatures and types.


Same team! AG-UI uses A2UI as the protocol under the hood.


Hi, one of the AG-UI authors here.

AG-UI is a launch partner of A2UI, but it is a separate project by CopilotKit, not Google.

We have a day-0 handshake between AG-UI & A2UI


And thank you CopilotKit team!

I think AG UI is great if you are building the UI and the Agent at the same time and want a high bandwidth sync between them and the UI supports AG UI as an adaptor layer (they have done a lot of work making this easier for folks).

A2UI is most interesting for its LLM generation options (not tools but structured output), its remote message passing options (if you don't own the UI), and its general-purpose-ness (the same fairly simple standard can work for many models, transports, and renderers).

They do fit nicely together. Sorry the naming conventions are complicated. There are 2 hard things in computer science: naming things, cache invalidation, off by one errors.


Plus 1! We’d love community contributions here!


I tell you what, I'll add Svelte/Kit to the list we want to target:

https://github.com/google/A2UI/pull/352

Thanks for the recommendation.


Google PM here. Right now, it’s designed for rendering UI widgets inline with a chat conversation - it’s an extension to a2a that lets you stream JSON defining UI components in addition to chat messages.


Google SWE working in this space here. Look up my username (minus the digit) on Moma, let's talk. I can't ID you from your HN handle.


I LOVE this, exactly what I’ve been looking for.

Here’s the issue - all my meetings have confidential, sensitive info. I can’t use a version you host (or well, I could, but you won’t be willing to do the 6 month security review I need).

Can you give me a version I can host (or run locally) and I give you some $ one time or per year?


yes, absolutely. I've been thinking about this too, I did plan to offer self hosted if there's demand and have built it with that in mind.

My email is davnicwil at gmail, please email me and let's talk about it!


Same here: no self-host, no way. Best would be a one-time payment option.


thank you. I will definitely offer this. Please email me (davnicwil at gmail) just to connect and I'll update you when self host is available.


Same issue. Sent you an email just now. Awesome product!


thank you very much, replied :-)


What 6 month security review?

