Processes

You run your tests in two terminal windows at once. What happens? Why?

Nov 11, 2023

Let’s suppose you’re working in a codebase that has some tests (lucky you!). Whenever you want to run the tests, you type a command into the terminal. One day, just for fun, you open two terminals at the same time and run the tests in both windows simultaneously.

This post tells the story of what happens next, and why. There are several possible endings to the story, and each one implies something different about your tests. Along the way, we’ll learn a bit about how operating systems keep track of concurrent computations as processes, and how processes can collaborate (and conflict) with each other.

What might happen?

Your mad science experiment has a few possible results:

1. The tests might run completely normally, and pass in both terminals.
2. You might see test failures in one or both terminals.
3. The tests might completely crash. I.e. instead of a “pass” or a “fail,” you might get an error message from the test runner (e.g. jest or mocha) complaining that what you did is SO BONKERS that it can’t even run your tests.

Leaving aside scenarios 1 and 2 for the moment, I can pretty confidently state that scenario 3 is highly unlikely. Two instances of the same test runner should be able to coexist on one machine in peace and harmony.

Now, why exactly am I so confident in making that prediction?

Processes: Isolated Computations

Linux, macOS, and Windows all have a concept of processes. A process is the running instantiation of a program.

A program is just code: a static sequence of instructions. It doesn’t do anything, on its own. Like an armchair philosopher, it just describes actions that might hypothetically be done someday.

While programs are static, processes are dynamic. Processes put programs’ plans into action. Each process is like a little virtual organism that is born, lives for a while, and eventually dies. While the process lives, it faithfully executes the instructions in the program given to it at birth.

[Image: a photograph of a tardigrade under a microscope. "Tardigrade" by Philippe Garcelon is licensed under CC BY 2.0.]

When you launch a program (whether through the GUI or the terminal) the operating system creates a process and hands it two things: the program’s code, and a chunk of memory[1]. The process executes each instruction of the program in turn, using the memory to keep track of the variables and data structures the program tells it to create. It keeps executing instructions until either the program itself or a signal from outside tells it to stop. (The “signal from outside” mechanism is how Ctrl+C in your terminal kills runaway processes.) At that point the process dies, and the operating system reclaims its memory so future processes can use it.
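Here is a minimal sketch of that lifecycle in Node.js, assuming a Unix-like system. The one-line "runaway" script and the 500 ms timeout are made up for illustration: the timeout stands in for you pressing Ctrl+C, and `SIGINT` is the signal a terminal sends when you do.

```javascript
const { spawnSync } = require("node:child_process");

// The "program": a script that would keep running forever if left alone.
const runawayProgram = "setInterval(() => {}, 1000);";

// Birth: the OS creates a process to run the program.
// Death: after 500 ms, spawnSync sends the process SIGINT (our stand-in for Ctrl+C).
const result = spawnSync(process.execPath, ["-e", runawayProgram], {
  timeout: 500,
  killSignal: "SIGINT",
});

// A process that was stopped by a signal reports that signal instead of an exit code.
console.log(`exit code: ${result.status}, stopped by signal: ${result.signal}`);
```

The script never decides to stop on its own. The process ends only because something outside it sent a signal.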

A key fact about processes is that they are, by default, isolated from all other processes. One process cannot reach willy-nilly into another’s memory to change its variables or call its functions. If two processes want to talk to each other, they have to do so in a civilized, orderly way. More on exactly how that works later.
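A small sketch can make this isolation concrete. The parent script below and the two child processes it launches all have a variable named `counter` (a name picked just for this example), yet each process only ever sees its own copy.

```javascript
const { spawnSync } = require("node:child_process");

// The parent process's own variable.
let counter = 0;

// The child program declares a variable with the same name and changes it.
const childProgram = "let counter = 100; counter += 1; console.log(counter);";

// Launch a fresh process running the child program and capture what it prints.
function runChild() {
  const result = spawnSync(process.execPath, ["-e", childProgram], {
    encoding: "utf8",
  });
  return result.stdout.trim();
}

const firstChildOutput = runChild();
const secondChildOutput = runChild();

// Each child started from scratch, and neither one touched the parent's counter.
console.log({ firstChildOutput, secondChildOutput, parentCounter: counter });
```

Both children print 101, and the parent's `counter` is still 0 afterwards: three processes, three separate memories.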

Implications for concurrent test runs

What does all this have to do with our double-test-run thought experiment?

Well, our test processes are probably just reading some files off disk and running the code they find therein. If that’s all they do, they’re not going to step on each other’s toes. The mere fact that a program is being run by a process doesn’t affect the program file in any externally visible way. Nor does it affect any other process running the same program.

In general, any number of processes can be concurrently running the same program, and they won’t conflict. macOS and iOS go to great lengths to obscure this fact by letting you launch only one instance of each app. But the OS’s policies don’t change the fundamental truth of how processes work.
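If you'd like to see this for yourself, here is a rough sketch that launches three instances of the same tiny program at once. The program (a script that waits 200 ms and exits) and the helper name `launchInstance` are invented for the example; a real test runner would take the program's place.

```javascript
const { spawn } = require("node:child_process");

// The "program": wait a moment, then exit successfully.
const program = "setTimeout(() => process.exit(0), 200);";

// Start one process running the program; resolve once it has exited.
function launchInstance() {
  return new Promise((resolve, reject) => {
    const child = spawn(process.execPath, ["-e", program]);
    child.on("error", reject);
    child.on("exit", (exitCode) => resolve({ pid: child.pid, exitCode }));
  });
}

// All three instances are started before any of them has finished.
const instances = Promise.all([
  launchInstance(),
  launchInstance(),
  launchInstance(),
]);

instances.then((results) => {
  for (const { pid, exitCode } of results) {
    console.log(`process ${pid} exited with code ${exitCode}`);
  }
});
```

Each instance gets its own process ID from the operating system, and each one finishes without noticing the others.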

So, now that we know why the test runner itself won’t complain, let’s consider the next obvious question. Why might the tests fail? Why might they pass? And whichever they do… what does that imply about the tests themselves?

Why might the tests fail?

Earlier, I hinted that processes are not totally isolated from one another. They can communicate—just in restricted ways. One way that processes can interact is by reading and writing files.

If I’m editing a file in VS Code and then I open the same file in TextEdit, those two editors are, naturally, two different processes. If I change the file in one editor and save it, the other editor can notice the change and refresh. So the two processes are interacting, via the contents of the file.
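The same kind of interaction can be sketched in a few lines of Node.js. In this made-up example, a child process plays the role of the first editor and writes a file, and the script itself plays the second editor and reads it. The file name, the `SHARED_FILE` environment variable, and the scratch directory are all assumptions of the sketch.

```javascript
const { spawnSync } = require("node:child_process");
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// A scratch directory, so the demo does not touch any real files.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "file-chat-"));
const sharedFile = path.join(dir, "note.txt");

// Process A (a child process) writes a message into the shared file.
const writerProgram =
  "require('node:fs').writeFileSync(process.env.SHARED_FILE, 'hello from process ' + process.pid);";
spawnSync(process.execPath, ["-e", writerProgram], {
  env: { ...process.env, SHARED_FILE: sharedFile },
});

// Process B (this script) reads whatever process A left behind.
const message = fs.readFileSync(sharedFile, "utf8");
console.log(`process ${process.pid} read: "${message}"`);

// Clean up the scratch directory.
fs.rmSync(dir, { recursive: true, force: true });
```

Neither process touched the other's memory. The only thing they share is the file.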

By the same token, if tests fail when run concurrently, one possible cause is that the tests are reading and writing files.

An example: imagine our test suite contains the following test for an appendToFile function:

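// Node's promise-based file system API
const fs = require("node:fs/promises");
// Hypothetical module path: import appendToFile from wherever it lives in your codebase
const { appendToFile } = require("./appendToFile");

describe("appendToFile", () => {
  it("adds a line to the end of a file", async () => {
    // Arrange: create a one-line file
    await fs.writeFile("/tmp/foo.txt", "Hello!");

    // Act: append a second line
    await appendToFile("/tmp/foo.txt", "Goodbye!");

    // Assert: the file now has both lines
    const fileContents =
        await fs.readFile("/tmp/foo.txt", "utf8");
    expect(fileContents).toBe("Hello!\nGoodbye!\n");
  });
});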

Now, let’s suppose two processes were executing this test concurrently. It might happen that immediately after the first process runs the line:

await fs.writeFile("/tmp/foo.txt", "Hello!");

the second process does so too. In that case, the second process’s writeFile would have no effect: it would simply overwrite “Hello!” with “Hello!”.

The first process would then run the next line:

await appendToFile("/tmp/foo.txt", "Goodbye!");

adding a line to the end of the file. And a moment later, the second process would do so too.

After this sequence of events, the file would read:

Hello!
Goodbye!
Goodbye!

Egads! Three lines instead of two! The test assertion is not going to like this. Both test runs will fail.
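To watch this happen without relying on lucky timing, we can replay the interleaving by hand inside a single script. This is only a simulation: the `appendToFile` below is a hypothetical stand-in for the real function under test, and the shared file lives in a scratch directory rather than at /tmp/foo.txt.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Hypothetical stand-in for the function under test: appends `line` as a new
// line, first adding a newline if the file does not already end with one.
function appendToFile(filePath, line) {
  const existing = fs.readFileSync(filePath, "utf8");
  const separator = existing.length > 0 && !existing.endsWith("\n") ? "\n" : "";
  fs.appendFileSync(filePath, separator + line + "\n");
}

// One file shared by both simulated test runs.
const dir = fs.mkdtempSync(path.join(os.tmpdir(), "pollution-demo-"));
const sharedPath = path.join(dir, "foo.txt");

fs.writeFileSync(sharedPath, "Hello!"); // run 1: arrange
fs.writeFileSync(sharedPath, "Hello!"); // run 2: arrange (same text, no visible change)
appendToFile(sharedPath, "Goodbye!");   // run 1: act
appendToFile(sharedPath, "Goodbye!");   // run 2: act

// Both runs now read the same polluted file.
const fileContents = fs.readFileSync(sharedPath, "utf8");
console.log(JSON.stringify(fileContents));
console.log(
  fileContents === "Hello!\nGoodbye!\n" ? "assertion passes" : "assertion fails"
);

// Clean up the scratch directory.
fs.rmSync(dir, { recursive: true, force: true });
```

With this ordering the file ends up with three lines, so the script reports that the assertion fails, just as it would in both real test runs.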

The name for this phenomenon is test pollution. A test that mucks with global state (in this case, a file) can pollute other tests’ inputs and cause them to fail.

Concurrently running tests can pollute each other in a multitude of ways, but the most common pollution vectors are files, databases, and web services. Any reservoir of state that is external to the test process can be polluted by other processes and cause test failures.

The polluting process doesn’t even have to be a test. E.g. if your tests invoke code that uses a database, and you manually make changes to that database while the tests are running, you’re likely to see failures.

Why might the tests pass?

Now let’s consider the happier scenario, where all the tests pass in the face of concurrent runs. Let’s suppose we prove to ourselves that we didn’t just get lucky: we can repeat our double-test-run experiment many times, and even run dozens of test processes at once, and all the tests stay green.

If that happens, there are a couple of likely explanations:

1. Maybe the tests have no assertions, so they pass no matter what! (Just trying to cover all my bases :D)
2. The tests might truly be isolated.

Isolated tests are a beautiful thing. An isolated test is one that can neither pollute nor be polluted by other tests. We can achieve test isolation in two ways:

1. We can design our code and tests so they don’t interact with shared state (like files, databases, and web services) at all.
2. We can give each test its own “sandbox” of files/databases/services to play in.

In practice, we almost always have to use a combination of these two approaches. The first approach is best, but it’s not always practical. Some code in our system is probably going to have to interact with files, databases, or web services. If we want that code to have isolated tests, we have to figure out how to sandbox it.

How might we sandbox the appendToFile test we saw earlier? Here’s one way: randomly generate the name of the file we use. That way, each test run will choose a different file. Et voilà, no pollution.

describe("appendToFile", () => {
  it("adds a line to the end of a file", async () => {
    const path = `/tmp/${Math.random()}.txt`;
    // Arrange: create a one-line file
    await fs.writeFile(path, "Hello!");

    // Act: append a second line
    await appendToFile(path, "Goodbye!");

    // Assert: the file now has both lines
    const fileContents =
        await fs.readFile(path, "utf8");
    expect(fileContents).toBe("Hello!\nGoodbye!\n");
  });
});

There’s still a small chance that two concurrently running tests will choose the same filename and pollute each other, but in practice, this strategy works well enough. A more reliable solution would be to use something like Node.js’s fs.mkdtemp() function, which appends random characters to a prefix you supply and creates a temporary directory with a unique name.

What’s the point?

When our tests are isolated from each other, new possibilities open up. We can speed up test runs by splitting the tests into groups and running them concurrently (some test frameworks, like Jest, do this automatically). We also gain confidence that we can run just a subset of our tests, or reorder them, without causing failures.

Test isolation is neat, but it’s only one part of a bigger picture. Test isolation is possible only because the operating system isolates processes from one another—and the OS does that because isolation-by-default is just a heck of a good idea. As we’ll see in future posts, tending the balance between isolation and communication is fundamental to the sustainable growth of all complex systems.

[1] The OS actually gives the process (many) other resources, too, but let’s keep our description simple for now.

Ben’s Guide to Software Development
