That's very true, and I think about it often, because even an everyday task - e.g. "We need a new feature to download reports" - arrives as a ticket, but depending on how much the feature is used, desired, invested in, or marketed, you still face a million small or bigger decisions: how flexible should I make it, what are all the error cases, what's the data, how secure does it need to be.
The point isn't how user stories are written, but realizing that everything is a compromise for practicality - since if I wanted it to be the most secure thing ever, I'd say "Let's just not offer that feature at all".
But now back to cars: I like that some companies have promised, or started, to build cars with real buttons again. In our world, it's when I try to use and support the tools which aren't built on Electron (and slow as hell) but actually care about performance. Open source and decent enough security are already the minimum requirement.
I dunno if Steve was autistic or not, but there's a subtype of people who develop a "special interest" in reading, influencing and manipulating people, and get quite good at it. They study people carefully - each encounter is a new way to find out how people tick, their motivations and levers. They tend to have unusually good memories of specific encounters and conversations. They can quickly and reliably size people up. You need many, many encounters to build this kind of intuition, so they cluster in specific professions that give them lots of practice - cops/detectives, sales, even retail - but you need lots of (ideally non-trivial) interactions to supply the intuition pump.
My main complaint with MCP is that it doesn't compose well with other tools or code. Like, if I want to pull 1000 Jira tickets and do some custom analysis, I can do that with a CLI or API just fine, but not with MCP.
Right, that feels like something you'd do with a script and some API calls.
MCP is more for back-and-forth communication between an agent and an app/service, or for providing tool/API awareness during other tasks. Like, an MCP for Jira would let the AI know it can grab tickets from Jira when needed while working on other things.
I guess it's more like: the MCP isn't for us - it's for the agent to decide when to use.
I just find that e.g. CLI tools scale naturally from tiny use cases (view 1 ticket) to big use cases (view 1000 tickets), and I don't have to have two ways of doing things.
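To make the scaling point concrete, here's a toy sketch in Python: with a plain API/CLI wrapper, the same code path handles 1 ticket or 1000. Everything here (`fetch_ticket`, the fake status logic) is hypothetical, not a real Jira client.

```python
# Sketch of the "one way of doing things" point: the same call path
# handles one ticket or a thousand. All names are hypothetical.

def fetch_ticket(ticket_id):
    # Stand-in for `jira issue view <id>` or a REST GET; returns a dict.
    return {"id": ticket_id, "status": "open" if ticket_id % 2 else "done"}

def analyse(ticket_ids):
    tickets = [fetch_ticket(t) for t in ticket_ids]  # scales: 1 id or 1000 ids
    open_count = sum(t["status"] == "open" for t in tickets)
    return {"total": len(tickets), "open": open_count}

small = analyse([1])          # tiny use case: view 1 ticket
big = analyse(range(1000))    # big use case: same code path
```

With a per-call tool protocol, the big case instead becomes a thousand separate round trips through the agent's context.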
Where I DO see MCPs getting actual use is when the auth story for something (looking at you slack, gmail, etc) is so gimped out that basically, regular people can't access data via CLI in any sane or reasonable way. You have to do an oauth dance involving app approvals that are specifically designed to create a walled garden of "blessed" integrations.
The MCP provider then helpfully pays the integration tax for you (how generous!) while ensuring you can't do inconvenient things like say, bulk exporting your own data.
As far as I can tell, that's the _actual_ sweet spot for MCPs. They're sort of a technology of control, providing you limited access to your own data, without letting you do arbitrary compute.
I understand this can be considered a feature if you're on the other side of the walled garden, or you're interested in certain kinds of enterprise control. As a programmer however I prefer working in open ecosystems where code isn't restricted because it's inconvenient to someone's business model.
The auth angle is pretty interesting here. I spend a fair amount of time helping nontechnical people set up AI workflows in Claude Cowork, and MCP works pretty well for giving them an isolated external system where I can tightly control their workflow guardrails, while also, interestingly, giving them the freedom to treat what IS exposed as a generic API automation tool. Combined with skills, that lets these non-technical people string together Zapier-like workflows in natural language, which is absolutely huge for the agency and autonomy it affords them. So I find it quite interesting for the use case of providing auth-encapsulated API access to systems that would normally require an engineer to unlock. The story of "wrap this REST API into a controlled variant scoped to the end user's use case, and let them complete auth challenges in every which way" has been super useful. Some of my MCP servers go through an OAuth challenge/response; others guide the user to navigate to the system, generate an API key, and paste it into the server on initial connection.
>while ensuring you can't do inconvenient things like say, bulk exporting your own data
I think this is the key; I want my analysts to be able to access the 40% of the database they need to do their job, but not the other 60% that would let them dump the business-secrets part of the DB and start up a business across the street. You can do this to some extent with roles etc., but MCP in some ways is the data firewall as your last line of protection/auth.
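A minimal sketch of that "data firewall" idea: the tool layer, not the DB role, decides what the analyst-facing agent may query. Table names, `run_query`, and the limits are all hypothetical stand-ins.

```python
# The tool exposed to the agent enforces the whitelist and caps bulk export,
# independent of whatever DB roles exist underneath. All names hypothetical.

ALLOWED_TABLES = {"orders", "shipments", "support_tickets"}  # the ~40%

def run_query(sql):
    # Stand-in for a real DB client call.
    return [{"sql": sql}]

def query_tool(table: str, limit: int = 100) -> list:
    """MCP-style tool: refuses anything off the whitelist."""
    if table not in ALLOWED_TABLES:  # checked before any SQL is built
        raise PermissionError(f"table '{table}' is not exposed via this tool")
    limit = min(limit, 1000)  # also blocks bulk-dumping what IS exposed
    return run_query(f"SELECT * FROM {table} LIMIT {limit}")
```

The agent can compose whatever it likes on top of this, but the business-secrets tables simply aren't reachable through the tool surface.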
Give the model a REPL and let it compose MCP calls, either by using the tool calls' structured output, doing string processing, or piping results to a fast, cheap model to produce structured output.
This is the same as a CLI. Bash is nothing but a programming language; you can take the same approach by giving the model JavaScript and having it call and compose MCP tools. If you do that, you can even throw in composing with CLIs as well.
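A toy sketch of that idea (in Python for consistency with the other examples here; all tool names are hypothetical): expose each MCP tool as a plain function inside a code-execution environment, and composition falls out of ordinary code.

```python
# Expose MCP tools as plain functions in a code-execution environment, so
# the model composes them with ordinary code instead of chained tool calls.
# Tool names and behavior are hypothetical stand-ins.

MCP_TOOLS = {
    "list_tickets": lambda project: [f"{project}-{i}" for i in range(3)],
    "get_ticket":   lambda key: {"key": key, "title": f"Title of {key}"},
}

def run_model_code(code: str):
    """Execute model-written code with MCP tools in scope (sandbox this in
    any real deployment!); only `result` returns to the agent's context."""
    scope = dict(MCP_TOOLS)
    exec(code, scope)
    return scope.get("result")

result = run_model_code(
    "result = [get_ticket(k)['title'] for k in list_tickets('PROJ')]"
)
```

The composition (map one tool's output through another) happens entirely inside the execution environment, not in the model's context window.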
You can make it compose by also giving the agent the necessary tools to do so.
I encountered a similar scenario using the Atlassian MCP recently, where someone needed to analyse hundreds of Confluence child pages from the last couple of years, all built from the same starter template. I gave the agent a tool that lets it call any other tool in batch and expose the results for subsequent tools to use as inputs, rather than dumping them straight into the context (e.g. another tool which gives each page to a sub-agent with a structured output schema and a prompt with extraction instructions, or pipes the results into a code execution tool).
It turned what would have been hundreds of individual tool calls filling the context with multiple MBs of raw Confluence pages into a couple of calls returning relevant low-hundreds of KBs of JSON the agent could work with further.
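A rough sketch of that batch-call tool: one tool call fans out to other tools and pipes each result through a reducer, so only compact structured output reaches the agent's context. The tool registry, page contents, and `extract_fields` are all hypothetical.

```python
# One batch tool call replaces hundreds of per-page calls; the raw pages
# never enter the agent's context. All names here are hypothetical.

def get_page(page_id):
    return {"id": page_id, "body": "x" * 50_000}  # stand-in for a raw page

def extract_fields(page):
    # Stand-in for a sub-agent with a structured-output extraction schema.
    return {"id": page["id"], "size": len(page["body"])}

TOOLS = {"get_page": get_page, "extract_fields": extract_fields}

def batch_call(tool_name, inputs, then=None):
    """Run one tool over many inputs; optionally pipe each result into a
    second tool so only the reduced output is returned."""
    results = [TOOLS[tool_name](i) for i in inputs]
    if then:
        results = [TOOLS[then](r) for r in results]
    return results

summaries = batch_call("get_page", range(3), then="extract_fields")
```

The agent sees `summaries` (a few small dicts), not the ~150 KB of raw bodies the intermediate calls produced.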
What it can do is call multiple MCPs, dumping tons of crap into the context and then separately run some analysis on that data.
Composable MCPs would require some sort of external sandbox in which the agent can write small bits of code to transform and filter the results from one MCP to the next.
This is confusing to me. What is composability if not calling a program, getting its output, and feeding it into another program as input? Why does it matter if that output is stored in the LLM's context, or if it's stored in a file, or if it's stored ephemerally?
Maybe I'm misunderstanding the definition of composability, but it sounds like your issue isn't that MCP isn't composable, but that it's wasteful because it adds data from interstitial steps to the context. But there are numerous ways to circumvent this.
For example, it wouldn't be hard to create a tool that just runs an LLM, so when the main LLM convo calls this tool it's effectively a subagent. This subagent can do work, call MCPs, store their responses in its context, and thereby feed that data as input into other MCPs/CLIs, and continue in this way until it's done with its work, then return its final result and disappear. The main LLM will only get the result and its context won't be polluted with intermediary steps.
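A toy sketch of that subagent-as-a-tool pattern: the main conversation makes one tool call, the subagent does the noisy intermediate MCP/CLI work in its own scratch context, and only the final answer comes back. `call_llm` and `fetch_from_mcp` are hypothetical stand-ins, not a real API.

```python
# The subagent's scratch context dies with it; only the return value
# reaches the main conversation. All names are hypothetical.

def call_llm(prompt, context):
    # Stand-in for a real model call; here it just "summarises" the context.
    return f"summary of {len(context)} intermediate results for: {prompt}"

def fetch_from_mcp(i):
    return {"step": i, "payload": "lots of raw data"}  # noisy intermediate output

def research_subagent(task: str) -> str:
    """Tool exposed to the main LLM."""
    scratch = [fetch_from_mcp(i) for i in range(5)]  # pollutes only this scope
    return call_llm(task, scratch)                    # only this string returns

answer = research_subagent("analyse last quarter's tickets")
```

From the main LLM's point of view this is a single tool call with a short string result, regardless of how many MCP round trips happened inside.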
It cannot do "anything" with the tools. Tools are very constrained in that the agent must insert the tool call into its context, and it can only receive the response of the tool directly back into its context.
Tools themselves also cannot be composed in any SOTA models. Composition is not a feature the tool schema supports and they are not trained on it.
Models obviously understand the general concept of function composition, but we don't currently provide the environments in which this is actually possible outside of highly generic tools like Bash or sandboxed execution environments like https://agenttoolprotocol.com/
But in the context of this discussion, Atlassian has a CLI tool, acli. I'm not quite following why that wouldn't have worked here. As a normal CLI you have all the power you need over it, and the LLM could have used it to fetch all the relevant pages and save to disk, sample a couple to determine the regular format, and then write a script to extract out what they needed, right? Maybe I don't understand the use case you're describing.
Not all agents are running in your CLI or even in any CLI, which is why people are arguing past each other all over the topic of MCP.
I implemented this in an agent which runs in the browser (in our internal equivalent of ChatGPT or Claude's web UI), connecting directly to Atlassian MCP.
Hmm, but you can't write a standard MCP (e.g. batch_tool_call) that calls other MCPs because the protocol doesn't give you a way to know what other MCPs are loaded in the runtime with you or any means to call them? Or have I got that wrong?
So I guess you had to modify the agent harness to do this? or I guess you could use... mcp-cli ... ??
The DCs are in VA because it's fed / spook central, which induces a large number of well paid policy / compliance / paperwork / tech jobs etc.
NoVa is literally one of the richest regions on the planet, and it's all sorta tied up in the same thing. Seems unfair to say the DCs "bring nothing"; the whole ecosystem is a manifestation of concentrated defense spending.
My political views are pretty simplistic: whoever initiates violence is the bad guy.
That's problematic here, because neither the US nor Iran are strangers to initiating violence. So a further refinement is necessary, based on a principle that I haven't had to fall back on very often: whoever initiates violence during negotiations is the bad guy.
What are those objectives? Can they be achieved by dropping bombs from the sky?
The US, and the rest of the world, are still waiting to know the answer to the first. Why was this war on a whim needed now? What was so imminent that the ongoing negotiations could not have continued?
The second answer is a definitive no; it never has been, except for that one time at the end of WW2 when no one else had nukes.
Not really. I'm simply noting the characteristics of the vehicle as seen in the videos released by the War Department. Are you a fisherman? Usually when I go fishing I take, ya know, fishing gear with me.
Is joining a military that has constantly been engaged in terror a "political view"? I'm not the one killing innocent people, it's just basic humanity to feel justice when murderers are stopped. I'm neither a Democrat nor Republican, this is not a partisan issue. I would vote for any politician that said they were closing down our bases all over the world and sanctioning Israel.
We're not really calorie constrained anymore, and most humans live in much denser environments than they used to. You would expect the rate of exposure, the rate of mutation/change, and the rate at which new pathogens appear to be higher than in the past.
Consequently, you wouldn't necessarily expect ancestral "defaults" to be optimal for modern environments.
> you wouldn't necessarily expect ancestral "defaults" to be optimal
I like the term ancestral defaults and indeed, we've come a long way since then and our biological and environmental reality is substantially different.
There is this book series Mortal Coil by Emily Suvada which imagines a future where technology has advanced enough to allow one to tweak their genome as easily as we use apps on our phone today. It was a fascinating read.
Not the OP, but evolution will often select for lower energy expenditure, as most animals are calorie constrained, at least during certain times of the year (cf. animals that hibernate to survive low calorie availability).
It seems plausible to me that keeping the immune system on full alert all the time might be calorie intensive. However, I suspect that a more active immune system will likely lead to other complications such as autoimmune disorders, or even something as common as hayfever.
An always-on, full-alert immune system won't have any leeway left to ramp up against the onset of an infection, and considering that pathogens also evolve to bypass the increased alertness, this would probably be catastrophic for the species.
That is, unless, like in bats, it was the result of evolution and natural selection. But bolting it on the way this vaccine does? Yeah, that's going to be pretty bad.
I don't really see why most existing equipment would be usable in this way. When you automate a thing you often have to rethink the entire problem. But more generally, automation is for _repeatable_ things and a lot of research is... not that.
The expensive equipment is usually a small (but crucial!) part of research activity, which involves things like talking to a lot of people, getting permission to do weird or new things, going out into the environment and collecting things in very specific ways, storing and transporting them carefully, observing, etc. Building or modifying existing lab instruments, doing various things with animals that are not co-operative ... and CLEANING. Who does all the cleaning?
There are definitely use cases when you have a specific protocol you want to scale, but I'm also not sure how safe I would feel around an AI with a license to experiment and access to dangerous reagents, high temperatures, etc. Or, god help us, an oligonucleotide synthesizer. Which is definitely going to happen (if it hasn't already).
In some cases that would be the same person that does the most advanced innovative and/or creative work.
The idea behind the fully automated system is that fewer hired hands are needed for efforts that are routine enough. But not zero, you still need one person who can do everything at a minimum, if called upon for mission-critical operation.
In the case of creative work and planning that is out of AI's league, these things always need to be done too, but they are not exactly "routine".
Once most of the tedious routine tasks are well automated, though, the human brain behind the lab can finally relax a bit, with eurekas flowing at the same rate without needing a full 40 or 50 additional hours at the bench, while even more results are generated than they could produce single-handedly.
Which gives them the time to do the cleaning as well; otherwise they would need two humans to serve their one automated system.
Great discussion of colour theory, but I think it skips over the fact that chemistry and economics were also real constraints. Paints that don't fade easily, are chemically inert, durable, etc., and can be produced in enormous volume are not that common. Practical stuff like: how easily does it catch fire, do solvents degrade it. Most important: is it CHEAP?
Also by 1944 there would have been a ready made supply chain due to demand from the navy, which would have picked it for similar reasons and consumed it in enormous volume.
I think, practically, control rooms are chrome oxide green (you get to add as much titanium as you like - that's cheap as dirt too. EDIT: it would have been lead, actually, in the 40s) for much the same reason that barns are red.
Think for a minute about why they worked in shacks.
Also worth noting that Birren was a paid consultant for DuPont - the company that made the paint. From that perspective alone you'd kinda expect he's gonna pick colours they already produce reliably at scale.
He also consulted for the army, coast guard, and navy, so whatever colours you pick for hazard marking in an industrial setting have to be _vaguely_ congruent or you're going to cause accidents.
I'm super confused. The small model "cost field" `rag-api/geometric_lens/cost_field.py` was trained on PASS_TASKS like "Write a function that counts vowels in a string." and FAIL_TASKS like "Write a function that converts a regular expression string to an NFA using Thompson's construction, then converts the NFA to a DFA.".
So it seems like it's a difficulty classifier for task descriptions written in English.
This is then used to score embeddings of Python code, which is a completely different distribution.
Presumably it's going to look at a simple solution, figure out it lands kinda close to simple problems in embedding space and pass it.
But none of this helps you solve harder problems, or distinguish between a simple solution which is wrong, and a more complex solution which is correct.
I think the goal is to have a light heuristic that helps find plausibly useful solutions. They're still going to go through a testing phase as a next step, so this is just a very simple filter to decide what's even worth testing.
> But none of this helps you solve harder problems, or distinguish between a simple solution which is wrong, and a more complex solution which is correct.
It does, because hallucinations and low confidence share characteristics in the embedding vector which the small neural net learns to recognize. And the fact that it continuously learns from the feedback loop is pretty slick.
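For what it's worth, here is a hedged, generic sketch of the kind of "cost field" being debated: a tiny classifier over embedding vectors, updated online from pass/fail feedback. This is a reconstruction of the general pattern under my own assumptions, not the actual `rag-api/geometric_lens/cost_field.py`.

```python
# Tiny online logistic-regression "cost field" over embeddings: score says
# whether a candidate looks worth testing; the testing phase feeds back
# pass/fail labels. Clusters and dimensions are fabricated for illustration.
import numpy as np

rng = np.random.default_rng(0)
DIM = 16

class CostField:
    def __init__(self, dim=DIM, lr=0.1):
        self.w = np.zeros(dim)
        self.b = 0.0
        self.lr = lr

    def score(self, emb):
        # Probability-like score that a candidate is worth testing.
        return 1.0 / (1.0 + np.exp(-(self.w @ emb + self.b)))

    def update(self, emb, passed):
        # One SGD step on logistic loss, driven by the test-phase feedback.
        err = float(passed) - self.score(emb)
        self.w += self.lr * err * emb
        self.b += self.lr * err

cf = CostField()
pass_embs = rng.normal(+1.0, 0.5, size=(50, DIM))  # fake "passing" cluster
fail_embs = rng.normal(-1.0, 0.5, size=(50, DIM))  # fake "failing" cluster
for p, f in zip(pass_embs, fail_embs):
    cf.update(p, True)
    cf.update(f, False)
```

Note this only works to the extent that pass/fail really do separate in embedding space, which is exactly the point under dispute upthread.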
Having shell is extremely handy for further discovery. SO handy that if they were just gonna patch the bug and lock you out, you would simply not disclose it.
This is what happened. Tesla security received tons of bug reports that required root access to identify, yet they got a vanishingly small number of root vulnerability reports. This policy fixes that misincentive.
You could make a car that's safer than others at 10x the price but what would the demand look like at that price?
Would you pay 2x for your favourite software and forego some of the more complex features to get a version with half the security issues?