One of the issues being explored is that although US radar is aged, surface vehicles can be equipped with ASDE-X transponders to be more visible to ATC systems. https://www.faa.gov/air_traffic/technology/asde-x
The vehicle that crashed into the plane did not have one and thus no automated alert was triggered.
LLMs have access to the same tools --- they run on a computer.
The problem here is the basic implementation of LLMs. It is non-deterministic (i.e. probabilistic) which makes it inherently inadequate and unreliable for *a lot* of what people have come to expect from a computer.
You can try to obscure the problem but you can't totally eliminate it without redesigning LLMs. At this time, the only real cure is to verify everything --- which nullifies a lot of the incentive to use LLMs in the first place.
> LLMs have access to the same tools --- they run on a computer.
That doesn't give them access to anything. Tool access is provided either by the harness that runs the model or by downstream software, if it is provided at all, either to specific tools or to common standard interfaces like MCP that allow the user to provide tool definitions for tools external to the harness. Otherwise LLMs have no tools at all.
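To make that concrete, here's a toy sketch of the harness side (all names here --- `TOOLS`, `harness_step`, the message shape --- are hypothetical, not any real API): the model only ever emits a structured request, and the harness decides whether a matching tool exists and runs it. If the harness never wired a tool up, the model has no access, no matter what machine it runs on.

```python
import json

# Hypothetical harness-side tool registry; the model never touches it directly.
TOOLS = {
    "echo": lambda args: args["text"],
}

def harness_step(model_output: str) -> str:
    """Parse a (hypothetical) structured tool-call message and run it if exposed."""
    msg = json.loads(model_output)
    if msg.get("type") != "tool_call":
        # Plain text from the model: no tool involved.
        return msg.get("text", "")
    tool = TOOLS.get(msg["name"])
    if tool is None:
        # The model asked for a tool the harness never exposed;
        # without the harness wiring it up, there is no access at all.
        return f"error: unknown tool {msg['name']}"
    return tool(msg["args"])
```

Standard interfaces like MCP are essentially a protocol for populating that registry from outside the harness; the gatekeeping structure is the same.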
> The problem here is the basic implementation of LLMs. It is non-deterministic (i.e. probabilistic) which makes it inherently inadequate and unreliable for a lot of what people have come to expect from a computer.
LLMs, run with the usual software, are deterministic [ignoring hardware errors and cosmic-ray bit flips, which if considered make all software non-deterministic] (having only pseudorandomness if non-zero temperature is used) but hard to predict. However, because implementations can allow interference from separate queries processed in the same batch, and the end user doesn't know what other queries are in that batch, typical hosted models are non-deterministic when considered from the perspective of the known input being only what is sent by one user.
But your problem is probably actually that the results of untested combinations of configuration and input are not analytically predictable because of complexity, not that they are non-deterministic.
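The determinism point can be shown with a toy decoder (a sketch, not any real inference stack): greedy decoding at temperature 0 always picks the same token for the same logits, and even temperature sampling is only *pseudo*-random --- seed the generator and the run reproduces exactly.

```python
import math
import random

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def decode_token(logits, temperature=0.0, rng=None):
    # Temperature 0 -> greedy argmax: the same logits always give the same token.
    if temperature == 0.0:
        return max(range(len(logits)), key=lambda i: logits[i])
    # Non-zero temperature -> sampling, but only pseudo-random:
    # a seeded generator makes the run exactly reproducible.
    probs = softmax([x / temperature for x in logits])
    r = rng if rng is not None else random
    return r.choices(range(len(logits)), weights=probs)[0]

logits = [2.0, 1.0, 0.5]
assert decode_token(logits) == decode_token(logits) == 0  # greedy is deterministic
a = decode_token(logits, temperature=1.0, rng=random.Random(42))
b = decode_token(logits, temperature=1.0, rng=random.Random(42))
assert a == b  # seeded sampling reproduces exactly
```

What hosted deployments lose is control over the batch context, not determinism of the math itself.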
LLMs absolutely do not have access to the same tools unless they're explicitly given access to them. Running on a computer means nothing.
It sounds like you don't like LLMs! In that case, you may be more interested in our REST API. All the same functions, but designed for edge computing, where dependency bloat is a real issue: https://tinyfn.io/edge
They're building a moat with data. They're building their own datasets of trusted sources, using their own teams of physicians and researchers. They've got hundreds of thousands of physicians asking millions of questions every day. None of the labs have this sort of data coming in or this sort of focus on such a valuable niche.
> They're building their own datasets of trusted sources, using their own teams of physicians and researchers.
Oh so they are not just helping in search but also in curating data.
> They've got hundreds of thousands of physicians asking millions of questions every day. None of the labs have this sort of data coming in or this sort of focus on such a valuable niche
I don't take this too seriously because lots of physicians use ChatGPT already.
Saying they aren't pioneering is very different from saying they aren't a major player in the space. There are only like 5-7 players with a foundational model that they can serve at scale, and xAI is one of them.
This is an interesting read, and while I support being nice to every _thing_ in principle, most of the research into this actually shows that being mean yields better results.
I've read the blurb from previous years about doing one-shots with threats of death, etc. - but I've never seen that for long, many-prompt sessions.
I wonder - if you hired a programmer for a day, trapped them in a cage, and then threatened them, maybe it would be more productive for a while. I mean, if I were writing that book, I could see how they would do great work for a bit.
This looks pretty cool. I keep seeing people (and am myself) using Claude Code for more and more _non-dev_ work. Managing different aspects of life, work, etc. Anthropic has built the best harness right now. Building out the UI makes sense to get genpop adoption.
Yeah, the harness quality matters a lot. We're seeing the same pattern at Gobii - started building browser-native agents and quickly realized most of the interesting workflows aren't "code this feature" but "navigate this nightmare enterprise SaaS and do the thing I actually need done." The gap between what devs use Claude Code for vs. what everyone else needs is mostly just the interface.
how do you think it works today? some guy with binoculars?