Generalizing the Code

Testing around effects, Part 3

Dec 11, 2023

In the last two posts, we explored backdoor stubbing, a technique for testing code that has effects. That was the first of five techniques we’ll be looking at in this series on “testing around effects.” Today’s post is about the second technique: redesigning the code to solve a more general problem.

"Hubble Views a Tranquil Galaxy with an Explosive Past" by NASA Hubble is licensed under CC BY 2.0.

Generalization

Before we dive into testing, we should address the elephant in the room: the idea of generalization itself. Programmers’ habit of over-generalizing solutions has become something of a running joke in our industry.

An XKCD comic. Panel 1: a person sitting at a table eating says "can you pass the salt?" Panel 2: the person waits with fork poised. Panel 3: "I said—". From off camera: "I know! I'm developing a system to pass you arbitrary condiments." "It's been 20 minutes!" "It'll save time in the long run!" — “The General Problem.” From XKCD. Licensed under CC BY-NC 2.5.

Generalization can have long-term benefits, but often comes with short-term costs: solving the general problem is sometimes significantly harder and more expensive than solving just the problem that’s right in front of you. The trend of over-generalizing has been countered in recent decades: In Extreme Programming Explained, Kent Beck urged readers to bias their designs toward simplicity, and envision “the simplest thing that could possibly work.” That exhortation then generated its own backlash as readers misinterpreted what Kent was saying.

I don’t really think I can clear up all the misunderstanding in a few short paragraphs, but I’m idealistic enough to try. Here are my thoughts on generalization, distilled from experience:

Generalization is not all or nothing. A solution can be generalized on some dimensions and not others.
Generalization is not just about flexibility, or adapting code to new use cases. Some types of generalization—especially those that separate concerns—can also make code easier to understand.
Solutions can be made more general over time. You can start by passing the salt, then generalize to handle both salt and pepper, and later generalize to all condiments.
The cost of each incremental generalization can be very low. A few minutes of effort can yield just enough generalization for the present moment.
General-purpose code does not have to be fancy or complicated.
Generalizing does not always make the design better.
Generalizing does not always make the design worse.

As we work through the code example, I hope it will become clear how these ideas apply in practice.

Recap: sendSignupReport

Here’s the sendSignupReport function we tasked ourselves with testing in the post on Effects:

async function sendSignupReport() {
  const newUsers = await database.query("users")
    .where("signup_date", ">", weeksAgo(1));

  const reportData = breakdownByCountry(newUsers);

  const reportHtml = formatReportAsHtml(reportData);

  await sendEmail({
    from: "noreply@example.com",
    to: "eliza@example.com",
    subject: "Weekly signup report",
    body: reportHtml,
  });
}

This code is hard to test because it interacts with the outside world in three places:

It looks at the current time to compute weeksAgo(1).
It fetches user records from the production database.
It sends an email.

Generalizing to different recipients

The main difficulty with testing this code is that sendEmail sends a real email to our colleague Eliza. In order to verify that the correct email was sent, we’d need access to her inbox.

What if, instead of hardcoding Eliza’s email address, we generalized our code to be able to send mail to an arbitrary recipient?

async function sendSignupReportTo(recipient: string) {
  // ...

  await sendEmail({
    from: "noreply@example.com",
    to: recipient,
    subject: "Weekly signup report",
    body: reportHtml,
  });
}

After this redesign, Eliza’s email address will be hardcoded at the callsite: the production code that uses this function will look like sendSignupReportTo("eliza@example.com") instead of sendSignupReport(). Our tests can pass any email address that’s convenient, e.g. sendSignupReportTo("testaccount@example.com").

We’ve solved the problem of spamming our coworker with test emails, but now we need to create a separate email account just for testing. We need to figure out how to log into that account from our tests. And actually, we don’t want emails from different test runs to be sent to the same inbox, since that could cause test pollution if multiple developers are running the tests. And we still don’t have a way to pin down what data the email should contain! And…

Stop. Deep breath. This feeling of exploding complexity is a sign that we’re trying to solve too many problems at once. Let’s take a step back and re-assess our options.

Who said anything about automated tests?

This whole time, we’ve been assuming (well, at least, I’ve been implying) that in order to test our code properly, we need fully automated tests right from the get-go.

That simply isn’t true.

Yes, I know: manual testing is tedious, slow, error-prone, easy to “forget” to do under time pressure, etc. etc. None of that is particularly relevant at this moment. If we’re drowning in uncertainty, any ability to test is a breath of fresh air. At this moment, we’re not seeking perfection. We’ll take any perceptible improvement.

With that in mind, here’s how I might write a test for our new code:

test("sendSignupReportTo", {
  async "delivers an email"() {
    // This is a semi-manual test. To run it, set testEmailAccount
    // to your own email, and comment out the return below.
    const testEmailAccount = "changeme@example.com";
    return;
    await sendSignupReportTo(testEmailAccount);
  },
});

The return statement ensures that the test does nothing during automated test runs. In order to run this test, you need to manually intervene and comment out the return. But once you’ve done that, you can use your usual test-runner tools to run this test. The signup report will appear in your inbox, where you can inspect it to see if the data looks reasonable.

Is this test technical debt? Absolutely. But the untested code was tech debt, too. Pick your poison. One way to look at it is that by writing this test, we’ve restructured our debt so we can pay it off more easily. The semi-automated test gives us just enough confidence to refactor the code into a shape that’s more amenable to fully-automated testing.

That said, one aspect of this test in particular makes me uneasy: it’s decidedly not exemplary. That is, I wouldn’t want newbie developers to treat this test as a model of “good testing”, and duplicate its approach all over the codebase. There are multiple ways to mitigate this issue, but they all boil down to communication. A one-line comment, or a 5-minute presentation to the team, would probably suffice. For extra credit, you could also try to convey the following ideas:

“Best practices” are really just good defaults. Expert programmers know when to override the defaults.
To become a master, you have to learn the rules, then break them—and finally, transcend them. See: ShuHaRi.
Don’t let the perfect be the enemy of the good—but also, don’t let “good enough” get in the way of “getting better all the time”.

Generalizing to different data sources

We’ve just seen how to generalize our code to different email recipients. The next problem we have to deal with is that our test is fetching user data from the production database. There are many reasons not to use production data in tests, ranging from security to performance to test reproducibility.

To solve this problem, let’s generalize our code some more, so we can fetch data from an arbitrary database connection:

async function sendSignupReportTo(
  recipient: string,
  database: DatabaseConnection,
) {
  const newUsers = await database.query("users")
    .where("signup_date", ">", weeksAgo(1));

  const reportData = breakdownByCountry(newUsers);

  const reportHtml = formatReportAsHtml(reportData);

  await sendEmail({
    from: "noreply@example.com",
    to: recipient,
    subject: "Weekly signup report",
    body: reportHtml,
  });
}

After making database a parameter, we now have many more options for testing. We can pass in a real connection to a local database, or write a stub or a fake for the database instead. All of these options give us more control over the test setup than connecting to the production database did. Our tests can now determine exactly what data is in the database, so we can test various edge cases. What happens when there are no users at all? Are users who signed up 8 days ago correctly excluded from the report? Our newfound control over the data means we can write tests that precisely answer these questions.
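As a rough illustration of the "fake" option, here is a minimal sketch of an in-memory fake and the data for the 8-day edge case. Everything in it is an assumption made for the example: the User shape, the DatabaseConnection and Query interfaces, the FakeDatabase class, the daysAgo helper, this version of weeksAgo, and the fetchNewUsers helper (which isolates just the query step so the sketch can skip the email). The real query builder may look quite different.

```typescript
// Hypothetical types: the real User and DatabaseConnection may differ.
interface User {
  email: string;
  country: string;
  signup_date: Date;
}

interface Query {
  where(field: "signup_date", op: ">", value: Date): Promise<User[]>;
}

interface DatabaseConnection {
  query(table: "users"): Query;
}

// An in-memory fake that supports only the one query shape the report uses.
class FakeDatabase implements DatabaseConnection {
  private readonly users: User[];

  constructor(users: User[]) {
    this.users = users;
  }

  query(_table: "users"): Query {
    const users = this.users;
    return {
      async where(field, _op, value) {
        return users.filter(
          (user) => user[field].getTime() > value.getTime(),
        );
      },
    };
  }
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Test-data helper (hypothetical): a date the given number of days in the past.
function daysAgo(days: number): Date {
  return new Date(Date.now() - days * MS_PER_DAY);
}

// Stand-in for the weeksAgo helper used by sendSignupReportTo.
function weeksAgo(weeks: number): Date {
  return daysAgo(7 * weeks);
}

// The query step of sendSignupReportTo, pulled out for this sketch.
async function fetchNewUsers(database: DatabaseConnection): Promise<User[]> {
  return database.query("users").where("signup_date", ">", weeksAgo(1));
}

// The test decides exactly what is "in the database".
const fakeDatabase = new FakeDatabase([
  { email: "recent@example.com", country: "FR", signup_date: daysAgo(2) },
  { email: "stale@example.com", country: "US", signup_date: daysAgo(8) },
]);
```

A test could then await fetchNewUsers(fakeDatabase) and check that only the user who signed up two days ago comes back, or pass new FakeDatabase([]) to cover the "no users at all" case.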

Another way to generalize

Parameterizing the database is not the only way to decouple our tests from production data. Here’s an alternative: we can make newUsers a parameter, and move the database query to our caller:

async function sendSignupReportTo(
  recipient: string,
  newUsers: User[],
) {
  const reportData = breakdownByCountry(newUsers);

  const reportHtml = formatReportAsHtml(reportData);

  await sendEmail({
    from: "noreply@example.com",
    to: recipient,
    subject: "Weekly signup report",
    body: reportHtml,
  });
}

Our tests for sendSignupReportTo will not be able to cover as much functionality as before. We’ll only be able to test the creation of the report and the sending of the email. Depending on our objectives, this could be good or bad. The upside is that our test setup can now be simpler: we just have to create an array of users instead of wrangling a database stub or a real database connection. However, the tests will not be able to tell us if our database query has a bug.
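To make the simpler setup concrete, here is a sketch of what the test data might look like. The User shape and this implementation of breakdownByCountry are assumptions for illustration only, since the real ones aren't shown in this series.

```typescript
// Hypothetical User shape; the real type may have more fields.
interface User {
  email: string;
  country: string;
}

// Assumed implementation: counts signups per country.
function breakdownByCountry(users: User[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const user of users) {
    counts[user.country] = (counts[user.country] ?? 0) + 1;
  }
  return counts;
}

// Test setup is just an array literal: no database, stub, or fake required.
const sampleUsers: User[] = [
  { email: "a@example.com", country: "FR" },
  { email: "b@example.com", country: "FR" },
  { email: "c@example.com", country: "US" },
];

const breakdown = breakdownByCountry(sampleUsers);
```

Nothing in this setup touches the query, which is exactly the tradeoff: a bug in the date filter would go unnoticed by a test written this way.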

The Dishonest Parameter Antipattern

Testing concerns aside, I can think of at least one reason to be wary of this new design. The email template probably contains language like “here’s a summary of who signed up last week”. The concept of “last week” is thus encoded in two different places: in the text of the email, and in the database query. Those encodings need to stay in sync. E.g. if we change our reporting to run every two weeks, both the email and the query need to change. The farther apart the two encodings are in code-space, the more likely they are to get out of sync because someone updated one and forgot the other.

(If you’re interested in exploring this idea further, I recommend reading up on connascence.)
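One way to picture keeping the two encodings close together is to derive both from a single value. This is only a sketch under assumptions: REPORT_PERIOD_WEEKS, reportCutoff, and reportIntro are hypothetical names, and the email wording is invented for the example.

```typescript
// Hypothetical single source of truth for the reporting period.
const REPORT_PERIOD_WEEKS = 1;

const MS_PER_WEEK = 7 * 24 * 60 * 60 * 1000;

// Encoding 1: the cutoff date the database query would compare against.
function reportCutoff(now: Date, weeks: number = REPORT_PERIOD_WEEKS): Date {
  return new Date(now.getTime() - weeks * MS_PER_WEEK);
}

// Encoding 2: the wording used in the email body.
function reportIntro(weeks: number = REPORT_PERIOD_WEEKS): string {
  const period = weeks === 1 ? "last week" : `in the last ${weeks} weeks`;
  return `Here's a summary of who signed up ${period}.`;
}
```

With both functions reading the same constant and living side by side, switching to a two-week report is a one-line change rather than a hunt through distant files.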

Thus, the newUsers parameter is an example of an antipattern that I call dishonest parameter. Generally, when you see a parameter, it means you can pass in any value (subject to typechecking and other validations). However, in this case, the parameter is lying to us: we are not allowed to pass in any value we like. The only correct value we can pass is the result of a specific database query. If we pass in anything else in production, we’ll end up sending an email with false, misleading information.

Why is newUsers a dishonest parameter, while recipient and database weren’t? One clue is in the name. The parameter is called newUsers, but sendSignupReportTo really has no way of knowing whether the value it’s given represents new users or not. The word new hints that sendSignupReportTo is making assumptions about its caller—assumptions that it cannot verify. You know what they say about assumptions.

The moral of the story? Avoid creating dishonest parameters when generalizing your code. Consider what will happen when callers pass different values, and ensure that they can sensibly pass different values.
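To see the dishonesty in code, consider the sketch below. It assumes a hypothetical User shape and a stand-in describeReport function that plays the role of the report-building part of sendSignupReportTo; neither comes from the real codebase. The type checker is equally happy with both calls, yet only one of them produces a truthful report.

```typescript
// Hypothetical User shape for this example.
interface User {
  email: string;
  signup_date: Date;
}

// Stand-in for the report: its wording assumes every user passed in is new.
function describeReport(newUsers: User[]): string {
  return `Signups last week: ${newUsers.length}`;
}

// A fixed "now" keeps the example deterministic.
const now = new Date("2023-12-11T00:00:00Z");
const oneWeekAgo = new Date(now.getTime() - 7 * 24 * 60 * 60 * 1000);

const allUsers: User[] = [
  { email: "old@example.com", signup_date: new Date("2021-03-01T00:00:00Z") },
  { email: "new@example.com", signup_date: new Date("2023-12-08T00:00:00Z") },
];

const actuallyNew = allUsers.filter(
  (user) => user.signup_date.getTime() > oneWeekAgo.getTime(),
);

// Both calls typecheck; the parameter name is the only guard against the second.
const honestReport = describeReport(actuallyNew);
const misleadingReport = describeReport(allUsers);
```

The second call counts a user from 2021 as a signup from last week, and nothing in the function's signature could have prevented it.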

In addition, try to make calls to your code intelligible at the callsite. For example, sendSignupReportTo("eliza@example.com") is self-explanatory in a way that sendSignupReport(await getNewUsers()) isn’t. The correctness or incorrectness of the latter isn’t obvious; it depends on what getNewUsers() does.

Wrap-up

In this episode of “testing around effects”, we learned how to make code testable by redesigning it to solve a more general problem. In the next post (or two?), I’ll explore how this idea of generalization relates to separation of concerns, dependency inversion, and dependency injection.

Takeaways from this post:

Generalization, as a “best practice,” is maligned as much as it is recommended. But generalization per se is neither good nor bad.
Generalization isn’t all or nothing, and doesn’t have to happen all at once. To get a test for sendSignupReport working, we didn’t have to radically change the code. We just had to find the right generalization (extracting a recipient parameter) to write a test that barely worked. Then we were able to iterate from there.
There’s often more than one way to generalize. When we wanted to decouple our code from its data source, we had a choice between accepting a database parameter and accepting an array of users. These options make different tradeoffs.
Beware of creating dishonest parameters when generalizing!

See you in the next episode.

Ben’s Guide to Software Development
