Claude is an Electron app because this is a cultural issue, not a technological one. Electron could make sense if you are a startup with limited development resources. For big companies that want to make a difference, hiring N developers and maintaining N native apps is one of the best bets on quality and UX you can make, yet people don't do it even in large companies that, in theory, have the ability. Similarly, even though automatic programming would let you do it more easily, it is still not done. It's a cultural issue, part of the fact that those who make software no longer try to make the best possible software.
What I love about the Pico overclock story is that, sure, not at 870MHz, but otherwise you can basically take for granted that at 300MHz, without any cooling, it is rock solid, and many units at 400MHz too.
Thanks for your comment! Maybe it's time for picol2.c: the same number of lines of code, but with floats and [expr]. I'm going to use a few hours today to get there. Such small changes would make it more practical for real use cases where one wants to retain understandability.
It depends on the use case. For instance, you open a socket, write any value you have there without serialization, read it on the other side, and the data transfer is done.
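A minimal Python sketch of that idea (my example, not from the thread): when every value has a natural string form, "serialization" is just writing that string to the socket and reading it back.

```python
import socket

# A connected pair of sockets standing in for client and server.
a, b = socket.socketpair()

# Any value: just send its string form, no serialization layer needed.
value = 3.14
a.sendall(str(value).encode())
a.shutdown(socket.SHUT_WR)

# On the other side, read the string and use it directly.
received = b.recv(1024).decode()
print(received)             # "3.14"
print(float(received) * 2)  # and it is usable as a number again

a.close()
b.close()
```

This is the Tcl "everything is a string" model in miniature: the wire format and the in-language representation coincide, at the cost of parsing when you need the typed value back.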
The convenience of not having to marshal data over a network is certainly a use case. But I'll admit that two of the worst programs I ever saw were written in Perl and TCL. Somehow just a big jumble of inscrutable regexes.
When "everything is a string" then you have no choice but to literally treat everything as a string. Painful. Very cool project though.
The objections to tcl I see most often seem to reflect an outdated or incomplete understanding of the language and its capabilities.
As with many interpreted languages, maybe it's just too easy to iterate interactively with code until it "just works", rather than taking the time to design the code before writing it, leading to the perception that it is a "write-only" language.
However, despite its reputation, even Perl can be written in a way that is human-readable. I agree with you that it is more a reflection of the programmer or the work environment than of the language itself.
> When "everything is a string" then you have no choice but to literally treat everything as a string.
As someone who has been developing tcl almost daily for more than 30 years, both object oriented and imperative code, I have not found it necessary to think this way.
Can you explain what leads you to this conclusion?
It is absolutely possible, and not even that hard. If you use Redis Vector Sets you will easily see 20k-50k queries per second (depending on hardware) with tens of millions of entries, and the results don't get much worse as you scale further. Of course, all of that is serving data from memory, as Vector Sets do. Note: I'm not talking about the RediSearch vector store, but the new "vector set" data type I introduced a few months ago. The HNSW implementation of vector sets (AGPL) is quite self-contained and easy to read if you want to check how to achieve similar results.
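For intuition about what a single query computes, here is a brute-force Python baseline (my sketch, not the vector sets HNSW code): score every stored vector by cosine similarity against the query and keep the top k. HNSW exists precisely to get comparable answers without this linear scan over all entries.

```python
import math
import random

def cosine(a, b):
    # Cosine similarity: dot product over the product of the norms.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def knn(query, vectors, k=3):
    # Exact search: score every vector, keep the k most similar.
    scored = sorted(vectors.items(),
                    key=lambda kv: cosine(query, kv[1]),
                    reverse=True)
    return [name for name, _ in scored[:k]]

random.seed(42)
vectors = {f"item:{i}": [random.gauss(0, 1) for _ in range(8)]
           for i in range(100)}

query = vectors["item:7"]
top = knn(query, vectors, k=3)
print(top)  # "item:7" ranks first: identical vector, cosine 1.0
```

The cost here is O(N) per query; an HNSW index replaces the full scan with a greedy walk over a layered graph of neighbors, which is why throughput stays high as the entry count grows.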
Author here. Thanks for sharing—always great to see different approaches in the space. A quick note on QPS: throughput numbers alone can be misleading without context on recall, dataset size, distribution, hardware, distance metric, and other relevant factors. For example, if we relaxed recall constraints, our QPS would also jump significantly. In the VectorDBBench results we shared, we made sure to maintain (or exceed) the recall of the previous leader while running on comparable hardware—which is why doubling their throughput at 8K QPS is meaningful in that specific setting.
You're absolutely right that a basic HNSW implementation is relatively straightforward. But achieving this level of performance required going beyond the usual techniques.
Yep, you are right. Also: quantization is a big issue here. For instance, int8 quantization has minimal effects on recall, but makes the dot product between vectors much faster and speeds things up a lot. The number of components in the vectors also makes a huge difference. Another thing I didn't mention is that the Redis implementation (vector sets) is threaded, so the numbers I reported are not about a single core. Btw, I agree with your comment, thank you. What I wanted to say is simply that the results you get, and the results I get, are not "out of this world", and are very credible. Have a nice day :)
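A toy Python sketch of the int8 idea (illustrative only, not the Redis implementation): quantize each vector with its own per-vector scale, compute the dot product in integers, and rescale at the end. Real implementations do the integer dot product with SIMD, which is where the speedup comes from.

```python
def quantize_int8(v):
    # Map floats in [-max, max] onto signed 8-bit integers, keeping
    # the scale so integer dot products can be mapped back to floats.
    m = max(abs(x) for x in v) or 1.0
    scale = m / 127.0
    return [round(x / scale) for x in v], scale

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

a = [0.12, -0.50, 0.33, 0.91]
b = [-0.08, 0.44, 0.27, 0.61]

qa, sa = quantize_int8(a)
qb, sb = quantize_int8(b)

exact = dot(a, b)
approx = dot(qa, qb) * sa * sb  # integer dot product, rescaled

print(exact, approx)  # close: quantization error stays small
```

The quantized result tracks the exact one closely, which is why recall barely moves: ranking by dot product only needs relative order, not exact values.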
Appreciate the thoughtful breakdown—you're absolutely right that quantization, dimensionality, and threading all play a big role in performance numbers. Thanks for the kind words and for engaging in the discussion. Wishing you a happy Year of the Horse—新春快乐，马年大吉 (Happy New Year, and great fortune in the Year of the Horse)!
Things are a bit more complicated. Actually, Redis the company (Redis Labs, and previously Garantia Data) offered from the start to hire me, but I was at VMware, later at Pivotal, and just didn't care; I wanted to stay "super partes" out of idealism. But Pivotal and Redis Labs actually shared the same VC, so it made a lot more sense to move to Redis Labs and work there with the same level of independence, and so that happened. However, once I moved to Redis Labs a lot of good things happened, and Redis matured much faster: we had a core team all working on the core, and I was no longer alone when there were serious bugs, improvements to make, and so forth. During those years many good things shipped, including Streams, ACLs, memory reduction work, modules, and in general things that made Redis more solid.

To be maintained at scale, open source software needs money, so in the past we tried hard to avoid moving away from BSD. But eventually, in the new hyperscaler situation, it was impossible to avoid, I guess. I was no longer with the company; I believe the bad call was going SSPL, a license very similar to AGPL but not accepted by the community. Now we are back to AGPL, and I believe that in the current situation this is a good call. Nobody ever stopped: 1. Providing the source on GitHub and continuing development. 2. Releasing it under a source-available license (not OSI approved, but practically very similar to AGPL). 3. Looking for a different way to do it... and indeed Redis returned to AGPL a few months after I was back, maybe because I helped a bit, though inside the company there was from the start a big slice of people who didn't accept the change. So Redis is still open source software and still maintained. I can't see a parallel here.
The search for speed is vain. Often Claude Code Opus 4.6, on hard enough problems, can give the impression of acting fast without really making progress, because of a lack of focus on what matters. Then you spin up the much slower GPT 5.3-Codex and it fixes everything in 3 minutes by doing the right thing.
What Codex often does for this is write a small Python script and execute it, to bulk rename files for example.
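For example, such a throwaway script might look like this (a hypothetical sketch of the pattern, not actual Codex output): do the rename with `pathlib`, and keep a dry-run mode so the plan can be inspected before anything is touched.

```python
import tempfile
from pathlib import Path

def bulk_rename(directory, dry_run=True):
    """Rename every *.TXT file to *.txt; return the (old, new) pairs."""
    renamed = []
    for path in sorted(Path(directory).glob("*.TXT")):
        target = path.with_suffix(".txt")
        if not dry_run:
            path.rename(target)
        renamed.append((path.name, target.name))
    return renamed

# Demo in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("A.TXT", "B.TXT", "notes.md"):
        (Path(tmp) / name).touch()
    plan = bulk_rename(tmp, dry_run=True)   # preview only, nothing renamed
    bulk_rename(tmp, dry_run=False)         # apply for real
    result = sorted(p.name for p in Path(tmp).iterdir())

print(plan)    # [('A.TXT', 'A.txt'), ('B.TXT', 'B.txt')]
print(result)  # ['A.txt', 'B.txt', 'notes.md']
```

The dry-run default is the useful part when an agent generates the script: you get to read the plan before any file is modified.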
I agree that there is a use for fast "simpler" models; there are many tasks where the regular Codex 5.3 is not necessary. But I think it's rarely worth the extra friction of switching from regular 5.3 to 5.3-spark.
I will always take more speed. My use of LLMs always comes back to doing something manually, from reviewing code to testing it to changing direction. The faster I can get the LLM part of the back-and-forth to complete, the more I can stay focused on my part.
Disagree. While intelligence is important, speed is especially important when productionizing AI. It's difficult to formalize the increase in user experience per increase in TPS, but it most definitely exists.
Hi! This model is great, but it is too big for local inference. Whisper medium (the "base" model, IMHO, is not usable for most things, and "large" is too large) is a better deal for many environments, even if the transcription quality is noticeably lower (and even if it does not have a real online mode). But... it's time for me to check the new Qwen 0.6 transcription model. If it works as well as their benchmarks claim, it could be the target for very serious optimizations and a no-deps inference chain conceived from the start for CPU execution, not just for MPS. Many times, after all, you want to install such transcription systems on servers rented online via Hetzner and other similar vendors. So I'm going to handle it next, and if it delivers, it's really time for big optimizations covering specifically the Intel, AMD, and ARM instruction sets, potentially also thinking about 8-bit quants if the performance remains good.
Same experience here with Whisper: medium is often not good enough. The large-turbo model, however, is pretty decent, and on Apple silicon fast enough for real-time conversations. The prompt parameter can also help with transcription quality, especially when using domain-specific vocabulary. In general, whisper.cpp is better at transcribing full phrases than at streaming.
And not to forget: for many use cases, more than just English is needed. Unfortunately, right now most STT/ASR and TTS models focus on English plus 0-10 other languages. Being able to add more languages or domain-specific vocabulary with reasonable effort would thus be a huge plus for any STT or TTS system.