Daniel is a very impressive guy. Well within the realm of “fund the people not the idea” that YC seems to do. Got a few bucks from them and probably earning from collaborations etc. Odds of them not figuring out a business model seem slim.
Companies have no idea what they are doing: they know they need it, they know they want it, engineers want it, and they don’t have it in their ecosystem, so this is a perfect opportunity to come in with a professional-services play. “We’ve got you on inference, training, running your models, all of that; just focus on your business.” Pair that with Hugging Face’s storage and it’s a win/win.
I found that when I have “infinite” tokens my behaviour changes: 3-5 tabs so I’m not waiting, free side quests, whole-codebase review passes, skills that wrap 10 other skills. It’s like going from expensive metered data to an uncapped plan.
I think these token doublings are there to kick you into an abundance mindset (for want of a better term) so that going back feels painful. Stop counting tokens; focus on your project and the cost of your own time.
I have this as a skill Claude created to run the rest. It mentions each skill in turn, see below. It’s not deterministic but it definitely runs each skill and it’s raised a bunch of issues, which I then selectively deal with. Where I can, once an issue is identified, I make deterministic tests.
Text includes:
Invoke each review/audit skill in sequence. Each skill runs its own comprehensive checks and returns findings. Capture the findings from each and incorporate them into the final report.
IMPORTANT: Invoke each skill using the Skill tool. Each skill is independently runnable and will produce its own detailed output. Summarize findings per skill into the unified report format.
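The orchestration the skill describes is essentially a sequential fan-out-and-aggregate loop. A minimal Python sketch of that shape, where `run_skill` is a hypothetical stand-in for the actual Skill tool invocation and the skill names are illustrative, not real:

```python
# Sketch of the orchestrator's logic: invoke each review skill in
# sequence, capture its findings, and merge them into a unified report.
# `run_skill` is a hypothetical placeholder for the Skill tool call.

def run_skill(name):
    # Placeholder: a real run would invoke the named skill and
    # return whatever issues it raised.
    return [f"{name}: example finding"]

# Illustrative skill names; the real list comes from the user's setup.
SKILLS = ["security-review", "dependency-audit", "style-review"]

def unified_report(skills):
    report = {}
    for skill in skills:
        report[skill] = run_skill(skill)  # capture findings per skill
    return report

report = unified_report(SKILLS)
for skill, findings in report.items():
    print(f"## {skill}")
    for finding in findings:
        print(f"- {finding}")
```

The non-determinism lives entirely inside each skill run; the loop itself is deterministic, which is also what makes it easy to later replace any individual finding with a hard-coded test.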
And China is likely to do to Tesla robots what they’ve done to the cars. I assume the bans will be incoming, because the US can’t have millions of Chinese kung fu robots sitting about pouring tea, waiting for critical mass.
Every so often my YouTube logs out and I’m exposed to the view a “random visitor” would see. Instantly visible because it’s filled with stupid content and sexual provocation.
I manage the shit out of FB and YouTube. You need to block a few things so it stops testing audience-segment ideas on you.
Got a friend who is in the high frequency trading industry and uses both Java and C#. I asked about GC. Turns out you just write code that doesn’t need to GC. Object pools, off-heap memory etc.
It won’t do the absolute fastest tasks in the stack quite as well but supposedly the coding speed and memory management benefits are more important, and there’s no GC so it’s reliable.
> Turns out you just write code that doesn’t need to GC. Object pools, off-heap memory etc.
Some GCd languages make this easier than others. Java and C# allow you to use primitive types. Even just doing some basic arithmetic in Python (at least CPython) is liable to create temporary objects; locals don't get stack-allocated.
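The thread is about Java and C#, but the object-pool idea translates to any GC'd language: allocate your objects once up front, then reuse them, so the hot path never creates garbage for the collector to chase. A minimal Python sketch (class and field names are my own, purely illustrative):

```python
# Object-pool sketch: pre-allocate message objects once, then reuse
# them instead of allocating (and later collecting) one per event.
class Message:
    __slots__ = ("price", "qty")  # fixed layout, no per-instance dict

    def __init__(self):
        self.price = 0.0
        self.qty = 0

class MessagePool:
    def __init__(self, size):
        # All allocation happens here, at startup.
        self._free = [Message() for _ in range(size)]

    def acquire(self):
        return self._free.pop()   # hand out an existing object

    def release(self, msg):
        self._free.append(msg)    # return it for reuse, never garbage

pool = MessagePool(1024)
m = pool.acquire()
m.price, m.qty = 101.5, 10
# ... process the message on the hot path, zero allocations ...
pool.release(m)
```

In Java and C# the same pattern pairs with primitive fields and off-heap buffers (e.g. `ByteBuffer.allocateDirect` in Java) so the latency-critical path never triggers a collection at all.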
Dictionaries are not a great analogy, because the standout feature of LLMs is that their output can change based on the context provided by individual users.
Differences between dictionaries are decided by the authors and publishers of the dictionaries without taking individual user queries into account.
Totally. Surely IDEs like Antigravity are meant to give the LLM more tools to use, e.g. for refactoring or dependency management? I haven’t used it, but it seems a quick win to move from token generation to deterministic tool use.
As if. I’ve had Gemini stuck on AG because it couldn’t figure out how to use only one version of React. I managed to detect that the build failed because two versions of React were being used, but it kept saying “I’ll remove React version N” and then proceeding to add a new dependency on the latest version. Loops and loops of this. On a similar note, AG really wants to parse code with weird grep commands that don’t make any sense given the directory context.
It’s a $90k engineer that sometimes acts like a vandal, who never has thoughts like “this seems to be a bad way to go. Let me ask the boss” or “you know, I was thinking. Shouldn’t we try to extract this code into a reusable component?” The worst developers I’ve worked with have better instincts for what’s valuable. I wish it would stop with “the simplest way to resolve this is X little shortcut” -> boom.
It basically stumbles around generating tokens within the bounds (usually) of your prompt, and rarely stops to think. Goal is token generation, baby. Not careful evaluation. I have to keep forcing it to stop creating magic inline strings and rather use constants or config, even though those instructions are all over my Claude.md and I’m using the top model. It loves to take shortcuts that save GPU but cost me time and money to wrestle back to rational. “These issues weren’t created by me in this chat right now so I’ll ignore them and ship it.” No, fix all the bugs. That’s the job.
Still, I love it. I can hand code the bits I want to, let it fly with the bits I don’t. I can try something new in a separate CLI tab while others are spinning. Cost to experiment drops massively.
Claude Code has those "thoughts" you say it never has. In plan mode, it isn't uncommon for it to ask: do you want to do this the quick and simple way, or would you prefer to "extract this code into a reusable component"? It also will back out and say "Actually, this is getting messy, 'boss', what do you think?"
I could just be lucky that I work in a field with a thorough specification and numerous reference implementations.
I agree that Claude does this stuff. I also think the Chinese menus of options it provides are weak in their imagination. For thoroughly specified problem spaces with reference implementations you're in good shape, but if you want to come up with a novel system, experience is required; otherwise you will end up in design hell. The danger is in juniors thinking the Chinese menu of options provided are "good" options in the first place. Simply because they are coherent does not mean they are good, and the "a little of this, a little of that" game of tradeoffs during design is lost.
I recently asked Claude to make some kind of simple data structure and it responded with something like "You already have an abstraction very similar to this in SourceCodeAbc.cpp line 123. It would be trivial to refactor this class to be more generic. Should I?" I was pretty blown away. It was like a first glimpse of an LLM play-acting as someone more senior and thoughtful than the usual "cocaine-fueled intern."
Great. There’s no reason why all countries don’t start preferring locally or regionally developed software. Of course interoperability is always a thing but there needs to be another option between “one company” and “everyone host your own instance”.
https://www.ycombinator.com/companies/unsloth-ai