psarna's comments


First of all, let me start by reiterating the first sentence from the bottomless repo: it's a work in heavy progress (and we'll move it under the sqld/ repo soon, to keep everything in one place). Now, answers:

> - Looks like bottomless does automatic restoring of the database? Litestream seems to avoid this, I assume because of concerns about accidentally creating multiple writers on a rolling-deploy (or similar mistake). Any concerns about this possible footgun?

It's a valid concern, but what always happens on boot is starting a new generation (a generation is basically a snapshot of the main file + its continuously replicated WAL), distinguished by a UUID v7, the timestamped one. So even if a collision happens, it would be recoverable, e.g. by deleting one of the stray generations.

> - Bottomless is a sqlite extension, not a separate process? Pros and cons compared to litestream?

The only con I see is that a bug in the extension could interfere with the database. As for pros: way less maintenance work, because everything is already embedded. We're also hooked into the database via a virtual WAL interface (libSQL-only), so we have full control over when to replicate, without having to observe the .wal file and reason about it from a separate process.

> - Similar questions for sqld. How does sqld handle the lifecycle of deploys and multiple readers/writers talking to the database? Anything a new user should be concerned about?

There are going to be multiple flavors of sqld, but the rough idea would be to trust the users to only launch a single primary. In the current state of the code, replicas contact the primary in order to register themselves, so the model is centralized. Once we build something on top of a consensus algorithm, leader election will be pushed to the algorithm itself.


Very cool, thank you for the insights


Indeed - 1ms was a nice round number, but it was also not that far from the real, sub-1ms number. Note that by "latency" here I mean not only the round trip, but the total measured time of executing a single request, including the time its task spends in Tokio (and its queues), until its response is fully processed. Since the program sent requests with configurable concurrency set to ~1024 or more, the overall throughput was still satisfactory.

On the one hand it was (and still is) concerning that the observed latency per simple request was that high; on the other, it never really came up in our distributed tests, since there the network latency imposes a few milliseconds anyway. When we figure out the real source of this behavior, I'll be happy to describe the investigation process in a blog post (:


That's great news! Notably, though, even after the FuturesUnordered-based test program stopped being quadratic, it was still considerably slower than the task::unconstrained one, which suggests there's room for improvement - probably because you still pay with a constant number of polls (32) each time you go over budget.


IMO FuturesUnordered should stop executing futures when it sees a "yield". An explicit yield signals control should be returned to the runtime. FuturesUnordered does not respect this.


IMHO the main problem is Tokio introducing preempting behaviour which can, in subtle ways, mess with all kinds of normally fully valid Rust futures.

Sure, it sometimes magically fixes problematic code, but it's in effect still a magic extension to how Rust polling works, one which can have surprising side effects.

In a certain way, Tokio's cooperative preempting is not adequate to handle any future which multiplexes multiple futures, but such futures have been a staple of Rust async since the get-go.


I don't think it is an issue w/ the pre-emption code. I believe FuturesUnordered is just doing the wrong thing: not respecting yields.


wrt. preemption - in Seastar we have maybe_yield(), which gives up the CPU, but only if the task quota (more or less a semantic equivalent of Tokio's budget) has passed. Wouldn't it make sense to have a similar capability in Tokio? Then, if somebody is not a big fan of the default preemption, they could run their tasks under tokio::task::unconstrained and only check the budget in very specific, explicitly chosen places - where they call maybe_yield(). That could of course also be open-coded by trying to implement maybe_yield on top of yield_now and some time measurements, but since the whole budgeting code is already there... Do you think it's feasible?


The problem is that it requires you to write your code in a way which will now _only_ work with Tokio.

Which isn't an option for a lot of libraries.

Meanwhile, the rest of the Rust ecosystem is increasingly moving toward making more and more parts runtime-independent...

Furthermore, I think `maybe_yield` wouldn't be quite the right solution. The problem is that Tokio's magic is based on the assumption that a single task (a future scheduled by the runtime) represents a single logical strand of execution. (Which isn't guaranteed in Rust.)

So I think a better Tokio-specific solution would be to teach Tokio about the multiplexing effect in some way.

For example, you could have some way to snapshot the budget when reaching the multiplexer, and reset the budget to the snapshot before each multiplexed future is polled. With this, each logical strand of execution would have its "own" budget (more or less).
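The snapshot/restore idea can be sketched with a hand-rolled budget counter standing in for Tokio's internal (private) coop budget - `Budget`, `consume`, and `poll_children` are all illustrative names, not real Tokio API; children are modeled as plain poll-cost counters:

```rust
/// Illustrative stand-in for a per-task poll budget (not Tokio's real,
/// private coop budget).
#[derive(Clone, Copy)]
struct Budget {
    remaining: u32,
}

impl Budget {
    /// Try to spend one unit; `false` means "forced yield" territory.
    fn consume(&mut self) -> bool {
        if self.remaining == 0 {
            return false;
        }
        self.remaining -= 1;
        true
    }
}

/// A multiplexer driving N children under one task budget. Snapshotting
/// on entry and restoring before each child gives every logical strand
/// its "own" budget, so an early child cannot starve a later one.
fn poll_children(budget: &mut Budget, costs: &[u32]) -> Vec<bool> {
    let snapshot = *budget; // the budget as it was when we entered
    costs
        .iter()
        .map(|cost| {
            *budget = snapshot; // each child starts from the same budget
            (0..*cost).all(|_| budget.consume()) // child "spends" its polls
        })
        .collect()
}

fn main() {
    // Three children each needing 3 units against a budget of 4:
    // with the per-child restore, all of them fit. Without it,
    // 3 + 3 + 3 = 9 units against one shared budget of 4 would force
    // the second and third children to fail.
    let mut budget = Budget { remaining: 4 };
    let ok = poll_children(&mut budget, &[3, 3, 3]);
    assert_eq!(ok, vec![true, true, true]);
    println!("{:?}", ok);
}
```

The same loop without the `*budget = snapshot` line reproduces the shared-budget starvation being described above.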


Any extension of executors will require a trait abstracting over the executor used, and there just isn’t one in std yet. Your code already has to be Tokio-specific if you do something as mundane as spawn a task.


> mundane as spawn a task.

There are only a few things you need the specific runtime for:

- spawn

- IO

- timeout

But you can mix the executor with a reactor from another runtime doing the IO (not recommended, but you can).

Similarly, you can run your own timer.

And you can abstract over all of this in various ways, sure, with limitations, hence why there is no std abstraction. But there are enough high-profile libraries which do support multiple runtimes just fine.

But Tokio's preempting-cooperative scheduling requires any code which does any form of future multiplexing to:

- be Tokio-specific (which, btw, doesn't fully solve the problem)

- add a bunch of additional complexity, including memory and runtime overhead

If you multiplex futures on Tokio you must:

- use custom wakers to detect yields

- (and) not poll futures in a "repeating" order (preferably a fully random order).

This is a lot of additional complexity for something like a join_all (for a "small" N).
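The "non-repeating order" point from the list above can be sketched as a rotating start offset - illustrative code, not FuturesUnordered's actual strategy; children are just indices here:

```rust
/// Poll children starting from a rotating offset so that, when the budget
/// runs out mid-loop, it isn't always the same tail of the list that gets
/// force-yielded and thereby starved.
struct RoundRobin {
    next_start: usize,
}

impl RoundRobin {
    /// Returns the order in which `n` children would be polled this round.
    fn poll_order(&mut self, n: usize) -> Vec<usize> {
        let start = self.next_start % n;
        self.next_start = (start + 1) % n;
        (0..n).map(|i| (start + i) % n).collect()
    }
}

fn main() {
    let mut rr = RoundRobin { next_start: 0 };
    assert_eq!(rr.poll_order(3), vec![0, 1, 2]);
    assert_eq!(rr.poll_order(3), vec![1, 2, 0]); // the tail moves each round
    assert_eq!(rr.poll_order(3), vec![2, 0, 1]);
    println!("ok");
}
```

With a fixed order, a budget that covers only the first k polls would hit the same last `n - k` children every round; the rotation spreads the forced yields evenly.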

(Reminds me I should check if I need to open an issue with futures-rs, as their join_all impl. was subtly broken last time I checked.)

And even with that, you have the problem that the multiplexed futures are subtly de-prioritized, as they share a budget.

The problem I have with this feature is not that it's a bad idea; in general it's a good one. The problem is that it completely overlooks the multiplexing case. And worse, it further divides the ecosystem in subtle ways (that is what I'm worried about).

So maybe we could find a way to provide a std (or just common-library) standardized way for just that feature. (I mean, it's a task-local int; it might not even need to be atomic. So there might be a way which doesn't have the problems async-std standardization has.)


maybe_yield may or may not be the right solution here, but I think it may be useful in general, e.g. when you have long I/O-less computations. In such a case, I'd like to be able to say "yield here if my budget is drained, but otherwise continue and don't put my task at the end of the queue". Although for that, the only thing I really need is a way to peek at the budget - with that, open-coding maybe_yield is trivial.


It wouldn’t be too hard. The trickiest bit would be putting together a consistent API.


Now that I think of it, it would probably be beneficial even outside of the unconstrained scope, especially for long computations. When iterating over millions of elements, it would be great to have a mechanism for maybe yielding if we're past the budget, since we don't really want to force-yield every X iterations and put the task at the back of the queue. If the maybe_yield API is potentially controversial, a sufficient building block would be a function that allows peeking into the state of your budget - then, if you're out of it, you just explicitly call yield_now().
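Open-coding maybe_yield on top of such a budget peek could look roughly like this. `budget_remaining` is a hypothetical stand-in for the proposed peeking primitive (Tokio keeps its coop budget private), stubbed with a thread-local counter so the sketch compiles and runs std-only:

```rust
use std::cell::Cell;
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

thread_local! {
    // Stub for the task's poll budget; a real version would live in the runtime.
    static BUDGET: Cell<u32> = Cell::new(128);
}

/// Hypothetical "peek at my remaining budget" primitive.
fn budget_remaining() -> u32 {
    BUDGET.with(|b| b.get())
}

/// maybe_yield: yield (self-wake + Pending) only when the budget is gone,
/// instead of unconditionally going to the back of the run queue.
struct MaybeYield {
    yielded: bool,
}

impl Future for MaybeYield {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded || budget_remaining() > 0 {
            Poll::Ready(()) // budget left: keep running, no reschedule
        } else {
            self.yielded = true;
            cx.waker().wake_by_ref(); // budget drained: yield for real
            Poll::Pending
        }
    }
}

/// No-op waker, just enough to poll futures by hand.
struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn main() {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);

    // Budget available: maybe_yield completes immediately, no reschedule.
    BUDGET.with(|b| b.set(10));
    let mut fut = Box::pin(MaybeYield { yielded: false });
    assert_eq!(fut.as_mut().poll(&mut cx), Poll::Ready(()));

    // Budget drained: it behaves like yield_now (Pending once, then Ready).
    BUDGET.with(|b| b.set(0));
    let mut fut = Box::pin(MaybeYield { yielded: false });
    assert_eq!(fut.as_mut().poll(&mut cx), Poll::Pending);
    assert_eq!(fut.as_mut().poll(&mut cx), Poll::Ready(()));
    println!("ok");
}
```

A million-element loop would then `MaybeYield { yielded: false }.await` once per iteration and only actually give up the CPU when over budget.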


How could it respect them? The Future trait doesn't let it distinguish between "please yield now" and "please poll again".


It wouldn’t be too hard to tell them apart. A yield is defined as the task waking itself, vs. something else waking it. The yield methods already do this.


Again, FuturesUnordered cannot know the difference between a task wanting to yield and a task that wants to be polled immediately. The waker does not get this information, either. It cannot distinguish.


Here is the PR: https://github.com/rust-lang/futures-rs/pull/2551

Yield = wake the `waker_ref`. Avoiding the yield would be clone().wake().

That said, "poll immediately" isn't actually a thing nor was it ever a thing except in incorrect implementations.
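The "a yield is the task waking itself during its own poll" heuristic can be shown with a flag-recording waker - a std-only sketch, not the linked PR's implementation (which additionally distinguishes waking the `waker_ref` from a cloned waker, a distinction this coarse version cannot make):

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Waker that records whether it was woken. If the child wakes it *during*
/// its own poll and returns Pending, the child is treated as yielding.
struct FlagWaker {
    woke: AtomicBool,
}

impl Wake for FlagWaker {
    fn wake(self: Arc<Self>) {
        self.woke.store(true, Ordering::SeqCst);
    }
    fn wake_by_ref(self: &Arc<Self>) {
        self.woke.store(true, Ordering::SeqCst);
    }
}

/// Polls `fut` once and reports (poll result, did it yield?).
fn poll_and_detect<F: Future + Unpin>(fut: &mut F) -> (Poll<F::Output>, bool) {
    let flag = Arc::new(FlagWaker { woke: AtomicBool::new(false) });
    let waker = Waker::from(flag.clone());
    let mut cx = Context::from_waker(&waker);
    let res = Pin::new(fut).poll(&mut cx);
    let yielded = res.is_pending() && flag.woke.load(Ordering::SeqCst);
    (res, yielded)
}

/// yield_now-style future: wakes itself, returns Pending once.
struct YieldNow {
    yielded: bool,
}

impl Future for YieldNow {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.yielded {
            Poll::Ready(())
        } else {
            self.yielded = true;
            cx.waker().wake_by_ref(); // the self-wake that signals a yield
            Poll::Pending
        }
    }
}

fn main() {
    let mut fut = YieldNow { yielded: false };
    let (res, yielded) = poll_and_detect(&mut fut);
    assert!(res.is_pending());
    assert!(yielded); // self-wake during poll was observed
    let (res, yielded) = poll_and_detect(&mut fut);
    assert_eq!(res, Poll::Ready(()));
    assert!(!yielded);
    println!("ok");
}
```

As the sibling comments point out, a future that stores a cloned waker and wakes it from elsewhere would also trip this flag, which is exactly the ambiguity being argued about.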


But polling multiple futures independently of each other inside of a future is a thing (like join, race, etc.).

And that means that just because you get "Pending" (i.e. not ready) from one of the futures doesn't mean you should return Pending now. It only means this future is not ready, but other futures might still be ready.

But in Tokio it means this future is not ready, and, magically as a side effect, we might have forced all other futures to report not-ready even if they are.

Which means Tokio redefined what Pending means in a subtle but potentially massively breaking way.

Which is a problem.

And not a problem of futures-rs, but one of tokio.

And forcing the whole ecosystem to increase the complexity of their code by trying to subtly detect whether something yielded or was force-yielded IS NOT OK. That's not how Rust standardized yielding or polling.


> wrong thing: not respecting yields

This is not quite right.

As far as I understand, they did respect yield in the way it's defined by Rust.

There, yielding means just that your future returns `Pending`.

Which means only _this_ future is not ready.

But in Tokio, returning `Pending` means "not ready" and "maybe, as a _magic side effect_, also make all other futures return Pending even if they are ready internally".

So let's step away from FuturesUnordered for a moment and instead just look at a future which multiplexes X futures, polls each not-yet-completed one once, and then yields, which should be 100% fine.

But with Tokio it isn't: after just polling the first few, the "budget" might be consumed, and polling all the others will forcefully fail where it shouldn't, adding a lot of overhead. Worse, if you poll them in the same order every time, you will de facto starve all futures later in the list. Which means you need to add a bunch of complexity that should be unnecessary, just because Tokio changed what `Poll::Pending` means.
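The "poll each incomplete child once, then yield" shape can be made concrete with a toy multiplexer - a std-only sketch (the budget interference itself can't be reproduced without Tokio's private coop machinery, so the comment in the code only states where it would bite):

```rust
use std::future::Future;
use std::pin::Pin;
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

type Child = Pin<Box<dyn Future<Output = u32>>>;

/// Toy multiplexer: each poll drives every not-yet-completed child exactly
/// once. A child's Pending only means *that* child isn't ready; the others
/// are still polled. (Under a depleted Tokio budget, those later polls
/// would be forced to return Pending too, even for ready children.)
struct JoinAll {
    children: Vec<Option<Child>>,
    results: Vec<Option<u32>>,
}

impl Future for JoinAll {
    type Output = Vec<u32>;
    fn poll(self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<Vec<u32>> {
        let this = self.get_mut();
        for (slot, result) in this.children.iter_mut().zip(this.results.iter_mut()) {
            if let Some(child) = slot {
                if let Poll::Ready(v) = child.as_mut().poll(cx) {
                    *result = Some(v);
                    *slot = None; // completed, never poll again
                }
            }
        }
        if this.children.iter().all(Option::is_none) {
            Poll::Ready(this.results.iter_mut().map(|r| r.take().unwrap()).collect())
        } else {
            Poll::Pending // only because *some* child is still pending
        }
    }
}

/// Child that yields once (self-wake + Pending), then returns its value.
struct YieldThenReady(u32, bool);

impl Future for YieldThenReady {
    type Output = u32;
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<u32> {
        if self.1 {
            Poll::Ready(self.0)
        } else {
            self.1 = true;
            cx.waker().wake_by_ref();
            Poll::Pending
        }
    }
}

struct NoopWaker;
impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

fn main() {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let children: Vec<Option<Child>> = vec![
        Some(Box::pin(std::future::ready(1_u32))),
        Some(Box::pin(YieldThenReady(2, false))),
    ];
    let mut join = JoinAll { children, results: vec![None, None] };
    // First poll: child 0 completes; child 1 yields, so the join is Pending.
    assert_eq!(Pin::new(&mut join).poll(&mut cx), Poll::Pending);
    // Second poll: child 1 completes; the join is Ready.
    assert_eq!(Pin::new(&mut join).poll(&mut cx), Poll::Ready(vec![1, 2]));
    println!("ok");
}
```

Note that child 1's Pending in the first poll did not stop child 0 from being driven to completion in the same pass, which is exactly the property a depleted budget would break.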

Also, if you write scheduler-independent futures (you should if you can), then you cannot opt out of it. Generally, in a multiplexed future you don't want to opt out anyway; you want a budget per logical strand of execution, which Tokio doesn't provide.

Instead, Tokio assumes a task (which in the end is just a future polled by it) represents a single logical strand of execution. But that is simply NOT how Rust futures are designed!! (Though it often happens to play out that way.)

This doesn't mean that Tokio's idea is bad; it's just not compatible with Rust's future design in subtle edge cases.

I think it would be nice to add it to the future design, but then you would need a standardized way to properly handle multiplexing. (Which, as a side note, Tokio doesn't provide; it only provides the opt-out `unconstrained`. What you need is to snapshot and restore the budget or something similar, i.e. snapshot it when entering the multiplexer and restore it before each multiplexed future. The only solution FuturesUnordered could take is speculatively adding yields, but that's _not_ a proper solution, as it subtly de-prioritizes multiplexed futures compared to spawned futures and can easily fall apart if you nest multiplexed futures....)

Also, I have no idea why they call it cooperative scheduling. Futures are cooperative scheduling. What they do is preempting futures (i.e. forcefully yielding them), but only in places where they could have yielded in the context of cooperative scheduling. So it's some in-between solution.


Hi, author of the post here. I'll also be talking about this issue in a little more detail at an online Rust Warsaw meetup tomorrow, feel free to join, especially if you're up for a live discussion later: https://www.meetup.com/Rust-Warsaw/events/282879405/


Can you please shed some light on Rust's adoption at Scylla, given the team's cumulative expertise in C++ and unsafe programming patterns like thread-per-core architectures?

In my experience so far, highly proficient C++ programmers tend to find Rust's constraints overly restrictive, because they have kind of internalised the borrow checker and are comfortable enough to bend the rules sometimes.

How receptive are fellow Scyllians (is that the term?) to Rust? How much scope is there for future projects to be in Rust over C++?

Thanks for the honest and detailed write-up!


> In my experience so far, highly proficient C++ programmers tend to find Rust's constraints overly restrictive

For me it's the exact opposite - I find Rust righteously restrictive exactly in the places where I wish the C++ compiler would complain. For instance, the fact that variables are legal to use after a move is an obvious C++ footgun - you almost never want that to actually happen in your code. And then, if one really knows better, Rust has `unsafe`, which I've actually never used yet, except for providing C/C++ bindings to link a Rust project with Scylla.

So, to sum up, I think Rust has much better defaults (e.g. values are moved by default, everything is const by default, etc.), but it still lets you bend the rules explicitly, while in C++ the rules are already slightly bent for you, just in case you need it, which is instead a common source of bugs.
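The use-after-move contrast fits in a two-line sketch: where C++ would compile the equivalent code silently (leaving the moved-from string in a valid-but-unspecified state), Rust rejects the access outright:

```rust
fn main() {
    let s = String::from("scylla");
    let t = s; // moved by default: `s` is no longer usable from here on
    // println!("{}", s); // does not compile: "borrow of moved value: `s`"
    assert_eq!(t, "scylla");
    println!("{}", t);
}
```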

And the adoption of Rust is going great at ScyllaDB. More of it is hopefully coming soon, including a rewrite of user-defined functions support in Rust, which would allow us to fully utilize wasmtime (a very neat Rust project) as the WebAssembly engine.


> comfortable enough to bend the rules sometimes.

Mmm. The world we actually have suggests that this comfort is dangerous.

Robert Browning was a famous poet, certainly competent enough with English that you'd assume he knew what he was doing and could bend the rules. So dictionary makers were a little confused by the passage in Pippa Passes: "Then owls and bats / Cowls and twats / Monks and nuns in a cloister's moods / Adjourn to the oak-stump pantry". Why did Browning use the word "twat" in this context? Turns out he had just assumed it was a word for a hat nuns wear...

Oops. Of course, this goof just means a high school literature teacher covering Browning has to decide whether to skip this part of Pippa Passes, maybe "for time", and hope nobody asks about it, or explain to the class that even the best of us screw up sometimes. In computer software, our mistakes are often not treated so kindly.


In case somebody stumbles upon this old thread, here's the recording of the meetup: https://youtu.be/pgzjPeIQ3Us

