Note that wasm is still lean and fast - WASI is not part of core wasm, but layered on top.
That is, it is possible to implement wasm without WASI. That is also true for other wasm proposals like WasmGC. It is very possible that parts of the ecosystem will not implement certain proposals if they don't make sense there (e.g. parts of the embedded ecosystem may never add GC, etc.).
I recently made a serious attempt at writing my own WebAssembly 3.0 implementation, and while I did finish the whole thing [1], it left me with a bitter taste about WasmGC, which turned out to be very annoying to implement. In fact, I originally wanted to avoid GC, but spectest assumed that GC is always available, so I had no option but to implement one in order to use spectest in the first place.
> I believe that artificial intelligence has three quarters to prove itself before the apocalypse comes, and when it does, it will be that much worse, savaging the revenues of the biggest companies in tech. Once usage drops, so will the remarkable amounts of revenue that have flowed into big tech, and so will acres of data centers sit unused, the cloud equivalent of the massive overhiring we saw in post-lockdown Silicon Valley.
We have seen 8 quarters since. Has any of that come to pass?
If you can't predict when it will pop, then you really should not predict anything. I can also predict that Google will pop. I won't tell you when, but I'll tell you that it will. I'll remain thoroughly unfalsifiable and I'll keep pushing the dates.
> Suppose one copies an LLM into AoE II and feeds into the AoE II-LLM ‘I feel lonely’ as an input. This AoE II-LLM replies: ‘I feel bad for you, maybe catch up with a friend? Closeness always helps in these situations’. One would be hard-pressed to make a convincing argument that, because of this response, an AoE II-LLM knows what helps in these situations
I don't see why one would be any more hard-pressed to make that conclusion about this system than a "normal" LLM.
That it is harder to "read" the data out is the only difference (the AoE II-LLM's output is encoded in game elements). But is ease of decoding an actual issue? If we can't understand a group of people that speak another language, does that say anything about them, or about us?
> [LLMs], as turbo-charged statistical models (recall their formal relation to logistic regression) can only but provide correlations.
And, of course, the Stochastic Parrot paper is the classic example in this area. It is from 5 years ago, but "LLMs only do statistics / can't understand" is very much alive and active among academics, even if it is a minority position.
I had the same question. I think that could be answered by using the predicted activation, but I don't see that in the paper.
That is, rather than just translate activation to text, then text to activation, that final activation could then be applied to the neural network, and it would be allowed to continue running from there.
If it kept running in a similar way, that would show that the predicted activation is close enough to the original one. Which would add some confidence here.
But a lot better would be to then do experiments with altered text. That is, if the text said "this is true" and it was changed to "this is false", and that intervention led to the final output implying it was false, that would be very interesting.
This seems obvious but I don't see it mentioned as a future direction there, so maybe there is an obvious reason it can't work.
> But a lot better would be to then do experiments with altered text. That is, if the text said "this is true" and it was changed to "this is false", and that intervention led to the final output implying it was false, that would be very interesting.
They do essentially that with the rhyming example, changing "rabbit" in the explanation to "mouse" and generating text that's consistent with that change.
A carb counting app might use API calls to these frontier models and then do some kind of analysis. It could see if different models agree or not, or multiple calls, and with how much variance.
So it would be more accurate to test the apps rather than the APIs, unless the goal is to warn people who just open ChatGPT and ask there.
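To make the "multiple calls, and with how much variance" idea concrete, here is a minimal sketch of what such an app-side check might look like. Everything in it is an assumption for illustration: the function name `summarize_estimates`, the `max_cv` threshold, and the sample numbers are hypothetical, and nothing here is taken from any real app or from the paper.

```python
from statistics import mean, stdev


def summarize_estimates(estimates_g, max_cv=0.15):
    """Summarize repeated carb estimates (in grams) for a single photo.

    estimates_g: numbers returned by repeated model calls (hypothetical input).
    max_cv: hypothetical cutoff on the coefficient of variation above which
            the app would treat the answers as too inconsistent to trust.
    """
    if len(estimates_g) < 2:
        raise ValueError("need at least two estimates to measure spread")
    avg = mean(estimates_g)
    spread = stdev(estimates_g)  # sample standard deviation
    # Coefficient of variation: spread relative to the mean.
    cv = spread / avg if avg else float("inf")
    return {"mean_g": avg, "stdev_g": spread, "cv": cv, "consistent": cv <= max_cv}


# Made-up numbers: one photo where the calls agree, one where they do not.
print(summarize_estimates([40, 42, 38, 41]))
print(summarize_estimates([20, 60, 45, 90]))
```

An app doing something like this could refuse to show a number, or show a range, when `consistent` is false. Whether any given app actually does so is exactly what testing the app rather than the API would reveal.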
The open source app could in theory do that, but the paper's authors would be able to determine whether it did or not by reading its code, which they evidently did to replicate the API calls it made with their own script.
(And of course it would also be far more tedious to submit each picture 500 times manually through an app and log each response by hand than to use a script designed to collect the data automatically, as fast as API rate limits permit.)
Another way to put it: if training a model cost 72,000 tons of carbon, and it then gets used by 100 million people (typical of major models), the cost per person is 0.00072 tons.
Per the article, the average human uses over 5 tons per year (Americans: 18). Adding 0.00072 to 5 is not really noticeable.
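The arithmetic, spelled out as a quick sketch. The inputs are the figures quoted above; the variable names are just illustrative.

```python
# Figures as quoted in the comment; names are illustrative only.
TRAINING_EMISSIONS_TONS = 72_000
USERS = 100_000_000
AVG_PERSON_TONS_PER_YEAR = 5

# Training cost amortized evenly over every user.
per_user_tons = TRAINING_EMISSIONS_TONS / USERS

# That amortized cost as a fraction of one person's yearly footprint.
share_of_annual = per_user_tons / AVG_PERSON_TONS_PER_YEAR

print(f"{per_user_tons:.5f} tons per user ({per_user_tons * 1000:.2f} kg)")
print(f"{share_of_annual:.4%} of an average person's annual footprint")
```

That works out to about 0.72 kg per user, or roughly 0.014% of the 5-ton annual average.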