
Then the test is still flaky. If there's a bug, you want the test to fail consistently, not just sometimes.



The parent is talking about when the implementation is flaky, not the test. When you go to fix the problem under that scenario there is no reason for you to modify the test. The test is fine.

What you're describing is the everyday reality, but what you WANT is that if your implementation has a race condition, there is a test that detects it 100% of the time (rather than 1% of the time).
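For example, one way to get that 100% detection rate is to force the racy interleaving instead of hoping the scheduler lands on it. A sketch in Python (the `Counter` class and the test-only `pause` hook are made up for illustration):

```python
import threading

class Counter:
    """A counter with a deliberate lost-update race: the
    read-modify-write in increment() is not atomic."""
    def __init__(self):
        self.value = 0

    def increment(self, pause=None):
        v = self.value          # read
        if pause is not None:
            pause()             # test-only hook: widen the race window
        self.value = v + 1      # write back a possibly stale value

def test_increment_race_deterministically():
    c = Counter()
    t1_read = threading.Event()
    t2_done = threading.Event()

    def pause_t1():
        t1_read.set()           # signal: t1 has read the (soon stale) value
        t2_done.wait()          # block until t2's increment has landed

    t1 = threading.Thread(target=c.increment, kwargs={"pause": pause_t1})
    t1.start()
    t1_read.wait()              # wait until t1 is inside the race window
    c.increment()               # this increment lands while t1 holds a stale read
    t2_done.set()
    t1.join()

    # With the lost-update bug, t1's write clobbers the other increment:
    assert c.value == 1, "lost update reproduced deterministically"
```

The events pin down the exact interleaving, so the test fails every run while the bug exists, not 1% of the time.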

If your test can deterministically trigger the race condition 100% of the time, is it still a race condition? Assuming that we're talking about a unit test here, and not a race condition detector (which is not foolproof).

> Assuming that we're talking about a unit test here

I think the categorisation of tests is sometimes counterproductive and moves the discussion away from what's important: What groups of tests do I need in order to be confident that my code works in the real world?

I want to be confident that my code doesn't have race conditions in it. This isn't easy to do, but it's something I want. If that's the case then your unit test might pass sometimes and fail sometimes, but your CI run should always be red because the race test (however it works) is failing.

This also hints at a limitation of unit tests, and why we shouldn't be over-reliant on them: often unit tests won't show a race. In my experience, it's two independent modules interacting that causes the race. The same can be true of a memory bug caused by a mismatch over who owns a pointer and who should free it, or any of the other issues caused by interactions between modules.


> I think the categorisation of tests is sometimes counterproductive

"Unit test" refers to documentation for software-based systems that has automatic verification. The term is used to differentiate that kind of testing from, say, what you wrote in school with a pencil. It is true that the categorization is technically unnecessary here due to the established context, but "counterproductive" is a stretch. It would be useful in another context: "We did testing in CS class" could mean exams, while "We did unit testing in CS class" clarifies that you aren't referring to them.

Yeah, Kent Beck argues that "unit test" needs a bit more nuance: that it is a test that operates in isolation. However, who the hell is purposefully writing tests that are not isolated? In reality, that's a distinction without a difference. It is safe to ignore the old man yelling at clouds.

But a race detector isn't rooted in providing verifiable documentation. It only observes. That is what the parent was trying to separate.

> I want to be confident that my code doesn't have race conditions in it.

Then what you really WANT is something like TLA+. Testing is often much more pragmatic, but pragmatism ultimately means giving up what you want.
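The reason a model checker like TLA+ gives you that confidence is that it exhaustively explores every interleaving rather than sampling a few. As a toy illustration of that idea only (plain Python, not actual TLA+; all names are invented for the sketch), here is an exhaustive check of every interleaving of two non-atomic increments:

```python
from itertools import combinations

# Each "thread" is a list of atomic steps; a step maps state -> state.
# We explore every interleaving, the way a model checker would
# (a toy sketch of the idea, not real TLA+/TLC).

def read_then_write(tmp_key):
    """Two-step non-atomic increment: read x into tmp, then write tmp + 1."""
    return [
        lambda s: {**s, tmp_key: s["x"]},
        lambda s: {**s, "x": s[tmp_key] + 1},
    ]

def all_interleavings(a_len, b_len):
    """Yield every ordering of a_len A-steps and b_len B-steps."""
    total = a_len + b_len
    for a_positions in combinations(range(total), a_len):
        order = ["B"] * total
        for i in a_positions:
            order[i] = "A"
        yield order

def check_invariant():
    """Return the interleavings that violate the invariant x == 2."""
    threads = {"A": read_then_write("ta"), "B": read_then_write("tb")}
    bad = []
    for order in all_interleavings(2, 2):
        state = {"x": 0, "ta": 0, "tb": 0}
        idx = {"A": 0, "B": 0}
        for t in order:
            state = threads[t][idx[t]](state)
            idx[t] += 1
        if state["x"] != 2:           # invariant: both increments must land
            bad.append((order, state["x"]))
    return bad
```

Of the six interleavings, only the two fully serialized ones keep the invariant; the other four lose an update, and the exhaustive search finds them every single run, which is exactly the property a sampling test lacks.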

> often unit tests won't show a race.

That entirely depends on what behaviour your test is trying to document and validate. A test validating properties unrelated to race conditions often won't consistently show a race, but that isn't its intent, so there is no expectation of it validating something unrelated. A test that is validating that there isn't a race condition will show the race if there is one.


You can use deterministic simulation testing to reproduce a real-world race condition 100% of the time while under test.
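A minimal sketch of the idea (in Python, with hypothetical names throughout): run the "threads" as generators driven by a scheduler whose choices come from a seeded PRNG, so a given seed always replays the same interleaving, and a seed that exposed the race once exposes it every time.

```python
import random

def lossy_increment(state):
    """A task as a generator of atomic steps; the yield between the
    read and the write is a scheduling point where others may run."""
    tmp = state["x"]      # read
    yield                 # scheduling point
    state["x"] = tmp + 1  # write back a possibly stale value

def run_schedule(seed):
    """Deterministic simulation: a seeded PRNG picks which task steps
    next, so the same seed always produces the same interleaving."""
    rng = random.Random(seed)
    state = {"x": 0}
    tasks = [lossy_increment(state), lossy_increment(state)]
    while tasks:
        task = rng.choice(tasks)
        try:
            next(task)
        except StopIteration:
            tasks.remove(task)
    return state["x"]

# Search for a seed that exposes the lost update, then pin that seed
# in the regression test so the failure reproduces on every run:
failing_seed = next(s for s in range(100) if run_schedule(s) != 2)
```

`run_schedule(failing_seed)` now loses an update on every invocation, turning a 1%-of-the-time flake into a 100%-of-the-time red test.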

But that's not the kind of test that will expose a race condition 1% of the time. The kinds of tests that are inadvertently finding race conditions 1% of the time are focused on other concerns.

So it is still not a case of a flaky test, but maybe a case of a missing test.


But a flaky test is also a bug in itself.



