It’s interesting to compare how the agentic search performs, with these targeted reads and lots of tool calls in the stream, versus the older but still valid paradigm of using a high-reasoning model like GPT-X-pro and feeding in all the relevant files at once with no tools.
I have found that the “pro” approach is much more holistic and can tackle rather “creative” problems that require very careful design; the overall artifact is tight and self-consistent. Claude Code, by comparison, is incredible at exploration and targeted implementation but is indeed not great at seeing the forest.
Do you know if this exploit works on Docker containers? And if so, I assume it just allows escalation WITHIN the container? So this attack is scary for Linux desktops and servers, but a fully containerized system, as is common in CI/CD, should be fine. Right?
I’ve had great success structuring requirements using Johnny Decimal. Well organized and numbered requirements for each project is mega helpful for agents and humans throughout the progress. Oh and I love açaí.
They also (at the cantonal level) have disparate education systems, with classes and grade levels mismatching between neighboring cantons. Yet, if you check what typical Swiss high school students are actually learning (say at College de Candolle in Geneva), they are learning 3–5 languages, real literary analysis, and set theory. So somehow it’s working despite not having some perfect plan handed down by a central authority. Hmm.
OK, also pretty wild to just say "typical Swiss high school" without mentioning the selective system that steers people into and, overwhelmingly, away from the collèges.
Yeah, basically 20-25% are going to gymnase, and the rest are split between professional and "generalist" students.
In Vaud, they merged the generalist class with the professional ones.
Literacy is dog shit even in the so-called native language.
Until age 11-12, what they cover at school is barely better than what kids learn at 8-9 in other countries. The change in middle school for kids 12 and older is huge, and 2-3 years are caught up in less than a year.
Kids often struggle because of that huge difference. Needless to say, the bottom 75% are in an even worse place, trying to study with kids who have no place at school.
The flip side of this is that you can't possibly use a canton Zurich 1st grade arithmetic exercise book in a school in canton Aargau, despite 2+2 not depending on the canton (it would if the Swiss had any choice in the matter).
I love going to Geneva and seeing the personification statues of the Republic of Geneva and of the Swiss Confederation standing side by side with the same height.
Well, without advocating that municipalities would be compelled to use it, isn't there at least some national service that they could opt into? I am sure that most of the red on this map is because it's a cinch to get Microsoft or Google to host your email. Of course in California we consider GSuite itself to be the green choice.
There’s also a paper [0] from many well known researchers that serves as a kind of informal agreement not to make the CoT unmonitorable via RL or neuralese. I also don’t think Anthropic researchers would break this “contract”.
> There is no cohort of senior product leaders who developed their judgment in conditions where their teams were expected to demonstrate financial return, because those conditions did not exist during the years when that cohort was learning the craft.
There totally is such a cohort. There are plenty of bootstrapped companies or startups that took only an angel round and did not benefit from the low-rate environment; in fact, they suffered because of the very high price of SWE labor. But those engineering managers exist and are out there right now still building efficiently, quietly growing, passionately serving customers, and keeping a close eye on the bottom line and risks because that’s their livelihood.