Real-world examples of unit test failure due to race conditions in modern codebases
Examples of unit test failure due to race conditions in shared state
Let’s start with the most common example of unit test failure due to race conditions: shared mutable state. The code “works” most of the time, but a tiny timing difference between threads or async tasks exposes a bug.
Shared in-memory cache updated by multiple tests
Imagine a singleton in-memory cache used across your application:
```java
import java.util.HashMap;
import java.util.Map;

public class UserCache {
    private static final Map<String, User> CACHE = new HashMap<>();

    public static void put(String id, User user) {
        CACHE.put(id, user);
    }

    public static User get(String id) {
        return CACHE.get(id);
    }

    public static void clear() {
        CACHE.clear();
    }
}
```
Two tests run in parallel:
```java
@Test
void testUserIsCached() {
    UserCache.clear();
    UserCache.put("42", new User("Alice"));
    assertEquals("Alice", UserCache.get("42").getName());
}

@Test
void testCacheIsInitiallyEmpty() {
    UserCache.clear();
    assertNull(UserCache.get("42"));
}
```
On your laptop, they often pass. On CI, where the test runner executes tests concurrently, these become textbook examples of unit test failure due to race conditions:
- Test A clears, puts “Alice”.
- Test B clears right after A puts.
- Test A asserts and gets null, failing intermittently.
The fix is not just adding synchronized everywhere. A better pattern is to avoid process‑wide singletons in tests and to inject a fresh cache per test instance. Frameworks that support per‑test dependency injection (e.g., Spring’s test context in Java or fixtures in Python’s pytest) help isolate state and eliminate this class of failure.
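For illustration, here is a minimal sketch of the per‑test isolation pattern, assuming UserCache has been refactored into a plain instance class (no static state) and JUnit 5 is the test framework:

```java
import static org.junit.jupiter.api.Assertions.assertEquals;
import static org.junit.jupiter.api.Assertions.assertNull;

import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;

class UserCacheTest {

    private UserCache cache;

    @BeforeEach
    void setUp() {
        // A fresh, isolated cache for every test: no shared state to race on.
        cache = new UserCache();
    }

    @Test
    void userIsCached() {
        cache.put("42", new User("Alice"));
        assertEquals("Alice", cache.get("42").getName());
    }

    @Test
    void cacheIsInitiallyEmpty() {
        assertNull(cache.get("42"));
    }
}
```

Because each test owns its own instance, execution order and the degree of parallelism no longer matter.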
Race conditions in static counters and global IDs
Another classic example of unit test failure due to race conditions is a global counter used to generate IDs:
```python
_counter = 0

def next_id():
    global _counter
    _counter += 1
    return _counter
```
Two tests call next_id() concurrently and assert on exact values. Under CPython’s GIL this might appear safe, but _counter += 1 is a read‑modify‑write rather than an atomic operation, so a thread switch can interleave the load and the store; runtimes and builds without a GIL (e.g., free‑threaded CPython, or C extensions that release it) make the interleaving even more likely. Your test suite now contains hidden race‑driven failures that only emerge in certain environments.
The better design: use thread‑safe primitives (like AtomicInteger in Java or itertools.count with proper locking in Python), or avoid asserting specific ID values in tests. Instead, assert properties such as uniqueness and ordering.
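As a sketch of that advice in Python (a plain threading setup is assumed; the lock placement and the property‑based assertion are the point, not the exact API):

```python
import threading
from concurrent.futures import ThreadPoolExecutor

_counter = 0
_lock = threading.Lock()

def next_id():
    # Guard the read-modify-write so concurrent callers cannot interleave it.
    global _counter
    with _lock:
        _counter += 1
        return _counter

def test_ids_are_unique():
    # Assert a property (uniqueness) rather than exact ID values.
    with ThreadPoolExecutor(max_workers=8) as pool:
        ids = list(pool.map(lambda _: next_id(), range(1000)))
    assert len(ids) == len(set(ids))
```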
Async and event-loop based examples of unit test failure due to race conditions
Race conditions aren’t just about low‑level threads. Async/await and event loops introduce their own flavors of timing bugs.
JavaScript Jest tests that forget to await
A surprisingly common example of unit test failure due to race conditions appears in JavaScript tests:
```js
test('saves user profile', () => {
  saveUserProfile({ id: 1, name: 'Alice' });
  const user = getUserProfile(1);
  expect(user.name).toBe('Alice');
});
```
If saveUserProfile is async and writes to IndexedDB, a remote API, or even just setTimeout, then getUserProfile might run before the write completes. On a fast local machine, the event loop timing masks the bug. On slower CI runners, this becomes an example of flakiness driven by race conditions.
The test should be:
```js
test('saves user profile', async () => {
  await saveUserProfile({ id: 1, name: 'Alice' });
  const user = await getUserProfile(1);
  expect(user.name).toBe('Alice');
});
```
The underlying pattern: tests that don’t properly synchronize with async work are prime examples of unit test failure due to race conditions. They pass when the event loop cooperates and fail when timers or I/O behave differently.
Python asyncio tests with lingering background tasks
In Python, asyncio tests often spawn background tasks:
```python
import asyncio

async def start_worker(queue):
    while True:
        item = await queue.get()
        await process(item)

async def test_worker_processes_item(asyncio_queue):
    worker_task = asyncio.create_task(start_worker(asyncio_queue))
    await asyncio_queue.put("hello")
    await asyncio.sleep(0.01)
    assert was_processed("hello")
```
On some runs, process("hello") hasn’t finished by the time the assertion executes. Increasing the sleep just hides the race; it doesn’t fix it. These are subtle examples of unit test failure due to race conditions because they rely on arbitrary timing instead of deterministic synchronization.
A better approach:
- Expose a hook or event to await when processing is complete.
- Or use a queue and wait until every item has been processed before asserting (see the sketch below).
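A minimal sketch of the queue‑based approach, assuming the worker calls task_done() so the test can await queue.join() (process and was_processed are the application functions from the example above):

```python
import asyncio

async def start_worker(queue):
    while True:
        item = await queue.get()
        await process(item)
        queue.task_done()  # signal completion so queue.join() can unblock

async def test_worker_processes_item():
    queue = asyncio.Queue()
    worker_task = asyncio.create_task(start_worker(queue))

    await queue.put("hello")
    await queue.join()  # deterministic: returns only once every item is processed

    assert was_processed("hello")
    worker_task.cancel()  # don't leak the background task into other tests
```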
The broader lesson: if your test uses sleep() to “wait for things to settle,” you probably have a race. Modern testing guidelines from organizations like NIST emphasize deterministic, repeatable tests in software assurance; sleeping and hoping is the opposite of that.
File system and I/O based race condition examples
Disk and network I/O are slower and more variable than memory. That variability is exactly what makes them fertile ground for examples of race‑driven unit test failures.
Tests that assert on files before writes flush
Consider a logging component:
```go
func (l *Logger) Log(msg string) {
    go func() {
        l.file.WriteString(msg + "\n")
        l.file.Sync()
    }()
}
```
And a test:
```go
func TestLoggerWritesToFile(t *testing.T) {
    logger := NewLogger("test.log")
    logger.Log("hello")

    data, _ := os.ReadFile("test.log")
    if !strings.Contains(string(data), "hello") {
        t.Fatalf("expected log message")
    }
}
```
Log spawns a goroutine; the test immediately reads the file. Sometimes the goroutine wins the race, sometimes the test does. This is a textbook example of unit test failure due to race conditions in I/O.
The fix: avoid fire‑and‑forget goroutines in code paths you want to test deterministically. Provide a synchronous path for tests, or expose a Flush() method that blocks until all buffered writes complete.
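One possible shape for that fix, sketched with a sync.WaitGroup (the Logger fields and package name here are assumptions; the article’s snippet doesn’t show the full type):

```go
package logging

import (
    "os"
    "sync"
)

// Logger tracks in-flight writes so tests can wait for them deterministically.
type Logger struct {
    file *os.File
    wg   sync.WaitGroup
}

func (l *Logger) Log(msg string) {
    l.wg.Add(1)
    go func() {
        defer l.wg.Done()
        l.file.WriteString(msg + "\n")
        l.file.Sync()
    }()
}

// Flush blocks until every pending write has completed.
func (l *Logger) Flush() {
    l.wg.Wait()
}
```

The test then calls logger.Flush() after logger.Log("hello") and before reading the file, so it never races the goroutine.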
Temporary directories reused across parallel tests
Many test runners now default to parallel execution to speed up suites. That’s great for throughput, but a nightmare when multiple tests write to the same temp directory or file.
Imagine two tests both writing to /tmp/app-config.json. Each expects to read back its own config. On a single‑threaded run, they pass. Under parallelism, they become real examples of unit test failure due to race conditions:
- Test A writes config A.
- Test B writes config B.
- Test A reads and sees config B.
The right pattern is per‑test isolation: randomize temp paths, use test framework fixtures to create unique directories, and aggressively clean up. This mirrors patterns recommended in secure coding and testing guidance from organizations like CISA and NIST.
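In Go, for example, the standard testing package already provides this isolation; a minimal sketch (the file name and contents are made up for illustration):

```go
package config_test

import (
    "os"
    "path/filepath"
    "testing"
)

func TestWritesIsolatedConfig(t *testing.T) {
    t.Parallel() // safe: nothing below touches a shared path

    dir := t.TempDir() // unique per-test directory, removed automatically
    path := filepath.Join(dir, "app-config.json")

    if err := os.WriteFile(path, []byte(`{"env":"test"}`), 0o600); err != nil {
        t.Fatal(err)
    }

    data, err := os.ReadFile(path)
    if err != nil {
        t.Fatal(err)
    }
    if string(data) != `{"env":"test"}` {
        t.Fatalf("unexpected config: %s", data)
    }
}
```

pytest’s tmp_path fixture and JUnit 5’s @TempDir play the same role in Python and Java.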
Multi-threaded service examples of unit test failure due to race conditions
Modern backends rely heavily on concurrency: worker pools, message queues, and reactive pipelines. That means more opportunities for subtle timing bugs.
Thread pool workers updating shared metrics
Say you have a worker pool that processes jobs and increments a shared metric:
```csharp
public class JobProcessor {
    private int _processedCount = 0;

    public void Process(Job job) {
        // work...
        Interlocked.Increment(ref _processedCount);
    }

    public int ProcessedCount => _processedCount;
}
```
The implementation uses Interlocked.Increment, which is thread‑safe. But your unit test is not:
```csharp
[Fact]
public async Task ProcessesAllJobs() {
    var processor = new JobProcessor();
    var jobs = Enumerable.Range(0, 100).Select(_ => new Job()).ToList();
    var tasks = jobs.Select(job => Task.Run(() => processor.Process(job))).ToList();

    // Oops: we forgot to await all tasks deterministically
    await Task.Delay(10);
    Assert.Equal(100, processor.ProcessedCount);
}
```
On a busy CI agent, 10 ms is not enough; some tasks are still running when the assertion fires. These are subtle examples of unit test failure due to race conditions because the production code is correct, but the test’s synchronization is not.
Fix: await Task.WhenAll(tasks) instead of Task.Delay. Never rely on arbitrary delays to “wait” for concurrency.
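For completeness, a sketch of the stabilized test (same setup as above, with the arbitrary delay replaced by a deterministic wait):

```csharp
[Fact]
public async Task ProcessesAllJobs() {
    var processor = new JobProcessor();
    var jobs = Enumerable.Range(0, 100).Select(_ => new Job()).ToList();
    var tasks = jobs.Select(job => Task.Run(() => processor.Process(job))).ToList();

    // Deterministic synchronization: assert only after every task has completed.
    await Task.WhenAll(tasks);

    Assert.Equal(100, processor.ProcessedCount);
}
```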
Event-driven microservices and flaky integration-style unit tests
In 2024–2025, more teams are blurring the line between unit and integration tests, especially around message brokers like Kafka, RabbitMQ, or cloud queues. A common pattern:
- Test publishes a message.
- Service under test consumes it on a background thread.
- Test immediately queries the database and asserts on the side effect.
Under light load this works; under heavier load or on a slower CI node, the consumer lags behind. These tests are modern examples of unit test failure due to race conditions in distributed systems.
Stabilizing them usually requires:
- Polling with a timeout instead of a single immediate query (a minimal helper is sketched after this list).
- Exposing an internal hook for tests to await processing.
- Or categorizing them as integration tests and giving them a different execution model and timeout budget.
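As a sketch of the polling option, here is a plain Java helper with no third‑party library assumed (the repository query in the usage comment is hypothetical):

```java
import java.time.Duration;
import java.util.function.BooleanSupplier;

final class TestWaits {

    // Re-check the condition until it holds or the deadline passes, instead of
    // asserting on a side effect immediately after publishing a message.
    static void waitUntil(BooleanSupplier condition, Duration timeout) throws InterruptedException {
        long deadline = System.nanoTime() + timeout.toNanos();
        while (!condition.getAsBoolean()) {
            if (System.nanoTime() > deadline) {
                throw new AssertionError("condition not met within " + timeout);
            }
            Thread.sleep(50); // short poll interval, bounded by the overall timeout
        }
    }
}

// Usage in a test (orderRepository.exists(...) is a hypothetical query):
// TestWaits.waitUntil(() -> orderRepository.exists("order-123"), Duration.ofSeconds(5));
```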
Data races in modern languages: Go and Rust examples
Modern languages try to make data races more visible, but they don’t magically prevent race‑driven unit test failures.
Go tests that pass until you run -race
Go’s race detector is great at surfacing data races that may not crash but still cause flaky behavior. An example of unit test failure due to race conditions in Go:
```go
var cache = map[string]string{}

func Set(k, v string) { cache[k] = v }
func Get(k string) string { return cache[k] }
```
A test fires off concurrent writers and readers. Without -race, the test might appear to pass. With go test -race, you get warnings about concurrent map writes. While this doesn’t always show up as a failing assertion, it’s still an example of race‑condition‑driven test instability that can manifest later as sporadic panics.
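One common fix, sketched here with a sync.RWMutex guarding the map (channels or sync.Map are reasonable alternatives depending on the access pattern):

```go
package kv

import "sync"

var (
    mu    sync.RWMutex
    cache = map[string]string{}
)

func Set(k, v string) {
    mu.Lock()
    defer mu.Unlock()
    cache[k] = v
}

func Get(k string) string {
    mu.RLock()
    defer mu.RUnlock()
    return cache[k]
}
```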
Best practice in 2024–2025: run your Go unit tests with -race in CI at least periodically. Many teams run a nightly or pre‑release pipeline with race detection enabled to catch these examples of failure before production.
Rust tests and logical races
Rust’s type system prevents data races at the memory level, but it doesn’t prevent logical races. For instance, two async tasks might both check a condition and then act on it, assuming exclusivity. If your unit test asserts that only one task performs the action, but your code lacks proper synchronization (e.g., using Mutex or channels), you can still get intermittent test failures.
These are newer examples of unit test failure due to race conditions: not memory corruption, but incorrect ordering of logically concurrent operations.
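To make the check‑then‑act pattern concrete, here is a minimal Rust sketch using plain threads and a std::sync::Mutex (an async version with tasks would follow the same shape); the lock is held across both the check and the action, so the test’s "only one winner" assertion holds deterministically:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

#[test]
fn only_one_task_performs_the_action() {
    let state = Arc::new(Mutex::new((false, 0u32))); // (initialized, number of actors)
    let mut handles = Vec::new();

    for _ in 0..8 {
        let state = Arc::clone(&state);
        handles.push(thread::spawn(move || {
            // Hold the lock across the check *and* the act; without it, two
            // threads can both observe `false` and both perform the action.
            let mut guard = state.lock().unwrap();
            if !guard.0 {
                guard.0 = true;
                guard.1 += 1;
            }
        }));
    }

    for h in handles {
        h.join().unwrap();
    }

    assert_eq!(state.lock().unwrap().1, 1);
}
```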
How to recognize and prevent these examples of race-condition test failures
Across all these scenarios, the patterns repeat:
- Tests share mutable global or static state.
- Tests rely on arbitrary timing (sleep, Delay, setTimeout) instead of explicit synchronization.
- Tests assume synchronous behavior from inherently async or concurrent code.
- Tests run in parallel without isolating resources like files, ports, or environment variables.
In 2024–2025, with more CI systems defaulting to parallel runs and more codebases adopting async and microservices, you should expect more—not fewer—examples of unit test failure due to race conditions.
Some practical strategies:
- Isolate state per test using dependency injection, fixtures, and factory functions.
- Avoid global singletons in tests; if you must have them, reset them safely before each test.
- Replace sleeps with signals: events, latches, futures, channels, or explicit await on known completion points.
- Run with sanitizers and race detectors (e.g., Go -race, ThreadSanitizer, or language‑specific tools), at least in a scheduled pipeline.
- Tag and quarantine flaky tests so they don’t block deploys while you investigate the underlying race.
Organizations that care about software reliability—including government and academic institutions—have been pushing for more formal testing and verification practices. For instance, NIST’s guidance on software verification and validation highlights the need for repeatable, deterministic tests as part of software assurance and supply chain security efforts (NIST V&V overview). These ideas map directly onto the messy, real‑world examples of race‑condition‑driven unit test failures you see in modern codebases.
FAQ: Common questions about race-condition unit test failures
Q: Can you give a simple example of a race condition causing a unit test to fail?
Yes. A simple example of this is two tests writing to and reading from the same global variable or singleton at the same time. One test sets the value and asserts it, while another test resets it or sets a different value. Depending on timing, the first test might see the wrong value and fail intermittently.
Q: How do I know if a flaky test is caused by a race condition or something else?
Look for patterns: failures only on CI, only under high load, or only when tests run in parallel are strong indicators. If adding arbitrary delays makes the test “more stable,” that’s another red flag. Tools like Go’s race detector or ThreadSanitizer can also reveal low‑level data races.
Q: Are these examples of race-condition failures limited to multi-threaded code?
No. Async/await, event loops, background tasks, and even process‑level concurrency (multiple processes sharing files or network ports) can all produce examples of unit test failure due to race conditions. The common thread is non‑deterministic ordering of operations, not just threads.
Q: What’s the best way to write tests for concurrent code?
Design your APIs so tests can observe clear completion points—callbacks, promises, futures, or explicit Flush/Stop methods. Avoid relying on timing. Use per‑test instances of stateful components. And where possible, structure concurrency through higher‑level primitives (actors, channels, queues) that are easier to reason about.
Q: Should I disable flaky tests that are failing due to race conditions?
Temporarily quarantining them can be reasonable to unblock deployments, but treat that as a short‑term move. Each flaky test is a signal of a real concurrency or test‑design issue. Track them, prioritize them, and fix the underlying race rather than living with permanent nondeterminism.