Great code is CEVICHE
The title of my work-in-progress book is Process to Processes. That’s because software development starts with a human process (team structure and development workflow) and ends with computer processes that effect some desired behavior. The first process fully determines the nature of the second.
Somewhere in between, we have code. There are lots of ways we can structure the code for any given software process. Many, many (infinitely many?) different structures can create identical behavior. Yet the thesis of software design, as a discipline, is that some of these structures are “better” than others, even though behaviorally they are all the same.
This claim seems to take us into the realm of subjective opinions. Who’s to say what’s better? What does “better structure” really mean?
Recall from the previous post that our high-level goal is to make software development less stressful and more joyful for everyone involved. And that a lot of the stress in our line of work is ultimately caused by mental models that don’t match reality. If we could build accurate mental models, software development would be easier and more predictable, and we would suffer fewer mishaps along the way.
As it turns out, the structure of the code can help or hinder our modeling efforts. For the purpose of Process to Processes, “better code” means code that helps us build mental models of the system.1
However, that definition is too abstract to be practical. We need more specifics. It would be nice if we could look at any given chunk of code and say specifically what’s good or bad about it. It would be even nicer if we could turn those evaluations into a to-do list of how to improve the code.
I think we can get there, but it will probably take a book’s worth of text and examples to explain how. In this post, we’ll start our journey with a high level overview.
Introducing CEVICHE
Software people have invented various acronyms to try to characterize what makes code good. We have (at least) SOLID, CUPID, TRUE, and WARMED. Coming up with a coherent acronym seemed like a fun challenge, so I thought I’d try my hand at it. CEVICHE is the (somewhat raw) result. It’s a list of the properties of code that enable developers to build accurate mental models of the system.

Good code is:
Controllable: You can make the code do what you want.
Efficient: It doesn’t waste your time.
Verifiable: It inspires trust (via tests or otherwise).
Instructive: It might teach you something.
Consequential: It’s load-bearing.
Harmonious: It plays well with the rest of the system.
Extricable: It makes sense in isolation. You can confidently use it, repurpose it, and ultimately delete it.
CEVICHE is meant to complement the other acronyms (SOLID, CUPID, TRUE, WARMED), not to replace them. Comparing and contrasting them is left as an exercise for the reader.
Let’s explore the CEVICHE properties in more detail.
Controllable
Great software is controllable. You can find all the knobs and levers, and you know what’s going to happen when you pull them. Whether you’re changing the code, or using existing code, it’s easy to get the software to do what you want. The result is that controllable code does what its authors intended it to do.
Another C-word that I considered for this spot was correct. This seems like an obvious choice. We all want code to be correct, right?
The question is, though: correct by what standard? Usually, in application development, there is no formal spec for what we’re doing, and even if there is, it will probably change by the next release. This kind of “correctness” isn’t stable; rather, it’s based on the whims of users and product managers. The desired state of the software is…
a moving target
not fully knowable (we can’t see inside our users’ heads to know what they want)
Whatever definition of “correct” we set up today is likely to be gone by tomorrow. So instead of asking “is the code correct,” let’s consider a slightly different and more useful question: will it do what we intended?
As David Bryant Copeland writes in “Actual Reasons to Use TDD”:
We can all agree software should do what we expect. Set aside “correctness” (a meaningless term if I’ve ever heard one). Don’t worry about “working software”. Instead think about the question on our minds as we write code, the question we had from our first moment of coding, and that we still ask as we do our jobs today: is the software doing what I expect?
If we can keep the software doing what we expect it to do, moment by moment as we program, then we can easily adapt it to whatever definition of “correct” our customers throw at us. In order to do that, we have to control the software and keep it controllable as it grows.
Efficient
I have some qualms about including “efficient” on this list, because programmers are already prone to “optimizing” all the wrong things in all the wrong ways. So remember Donald Knuth’s warning:
Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.
Yet we should not pass up our opportunities in that critical 3% [. . .] but only after that code has been identified. It is often a mistake to make a priori judgments about what parts of a program are really critical, since the universal experience of programmers who have been using measurement tools has been that their intuitive guesses fail.
—Donald E. Knuth, “Structured Programming with go to Statements”
In spite of my doubts, I am going to advocate for writing fast programs — for a particular definition of “fast.” My motive is simply that I think many programs today are too slow to support an optimal development process. For instance, the popular JavaScript test framework Jest takes longer to start up than Google Chrome! That’s way too slow to run as part of a programmer’s “inner loop.”
Indeed, speed is one of my most common complaints with modern software, and one of the main factors I consider when making “buy vs. build” decisions. I’ve written a test framework and a static site generator because the alternatives were either too slow or too cumbersome for me to feel really comfortable using them.
This isn’t just my pet peeve. In 1982, two researchers at IBM discovered that shortening computer response times to 0.4 seconds (from the accepted standard of 2 seconds) resulted in a disproportionately large improvement in productivity. When people no longer had to wait for the machine, they typed commands faster and spent much less time collecting their thoughts. The critical threshold of 400 milliseconds has become known as the Doherty threshold, after one of the researchers.
Another important threshold is 80–100 milliseconds, the “threshold of noticeability.” While the Doherty threshold is a good target for “conversational” interactions like command line interfaces and web browsing, response times below 80 milliseconds are required for an interaction to feel immediate, like you’re manipulating a physical object. For instance, clicking a button should give you some kind of feedback within 100 milliseconds — at the very least, an animation confirming the click. Similarly, characters typed in a text field should appear in under 80 milliseconds. It’s pretty inexcusable for something so basic to take longer than that.
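As a sketch of what hitting that threshold can look like in practice (the element ID and API call here are hypothetical), acknowledge the click immediately and do the slow work asynchronously:
// Hypothetical: a save button that acknowledges the click right away,
// then does the slow work (network, database) in the background.
const saveButton = document.querySelector<HTMLButtonElement>("#save")!;

saveButton.addEventListener("click", async () => {
  // Immediate feedback, well under the 100 ms threshold.
  saveButton.disabled = true;
  saveButton.textContent = "Saving…";
  try {
    await fetch("/api/drafts", { method: "POST" }); // the slow part
    saveButton.textContent = "Saved";
  } catch {
    saveButton.textContent = "Save failed";
  } finally {
    saveButton.disabled = false;
  }
});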
If you’re writing a game or an interface that needs to be animated, there’s a third threshold at 16 milliseconds—your budget for rendering one frame.
In light of these thresholds, I can explain what I mean when I say that programs should be “efficient.” An efficient program is one that responds fast enough to support an optimal user experience. That means commands should run and webpages should load in 400 milliseconds, GUIs should give you feedback in 100 milliseconds, and animations should run at 60 frames per second.
How can we hit these thresholds? For the most part, don’t worry about optimizing application code. The performance of a typical application is largely determined by three things (in no particular order):
Feature complexity. With fewer, simpler features, your code can run faster. Focus on the most valuable features and drop everything else.
Third-party libraries. The inner loops of your program will probably not be in code you wrote yourself. Choose fast libraries.
Architecture.
“Architecture” includes stuff like:
Is your UX event-driven, request-response, or batch?
What programming language are you using?
What database are you using? (or: how do you organize data at rest?)
When and how does I/O happen? Is it parallel or sequential?
How far does data have to travel over the network?
Are you making many small web requests or a few large ones?
Are you using one thread/process, or many? What is your strategy for inter-thread or inter-process communication?
The “fast” answers to these questions are often counterintuitive. For now, the best heuristic I can offer is the KISS principle:
Keep it simple, silly!
Haskell can be faster than C. Flat files can be faster than a database. Single-threading is often faster than multi-threading unless you’re operating at an astronomical scale. You always want to keep in mind how much data or traffic you really need to handle, and not overbuild. Simple architectures scale farther than you think. Start simple, measure, and optimize the bottlenecks.
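To make one of those questions concrete, here’s a sketch (the endpoint and types are hypothetical) of the difference between many small web requests and one large one. The batched version pays the network round trip once instead of once per item:
type User = { id: string; name: string };

// Many small requests: one network round trip per user.
async function fetchUsersOneByOne(ids: string[]): Promise<User[]> {
  const users: User[] = [];
  for (const id of ids) {
    const response = await fetch(`/api/users/${id}`);
    users.push(await response.json());
  }
  return users;
}

// One large request: a single round trip for the whole batch.
async function fetchUsersBatched(ids: string[]): Promise<User[]> {
  const response = await fetch(`/api/users?ids=${ids.join(",")}`);
  return response.json();
}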
Verifiable
Trust, but verify.
—Russian proverb, often quoted by Ronald Reagan
A verifiable program is one that gives us a reason to trust it. When we think of “verification” we often think of “testing,” but the concepts are not synonyms. Software verification includes testing, but also some other techniques:
Static verification techniques
Typechecking
Linting
Code review
Formal proofs
Dynamic verification techniques
Manual testing / QA
Unit testing
Automated system testing
Monitoring
Runtime assertions
You don’t need to use all of these techniques for every program. But they are all useful. As always, pick and choose the most valuable techniques for your context.
Together, the production code and the systems that verify it can be considered a self-verifying system. A codebase that contains its own tests, types, linter rules, etc. is self-verifying.
There’s more to verifiability than just putting tests, types, and other checks into a system. The code itself needs to be shaped to make verification easy. In order to be easy to test, code needs to have a simple, consistent interface, a cohesive purpose, deterministic behavior, and, ideally, no effects. There are also constraints on the shapes of code that can be reliably typechecked, as anyone who has tried to retrofit TypeScript onto a legacy JavaScript codebase can attest.
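For example, here’s a small sketch (the names are hypothetical) of reshaping code for verifiability. The first version reads the clock and logs, so a test can’t pin down its behavior; the second is deterministic and effect-free:
// Hard to verify: nondeterministic (reads the clock) and has a side effect.
function isExpiredImpure(expiresAt: Date): boolean {
  const expired = expiresAt.getTime() < Date.now();
  if (expired) console.log("session expired");
  return expired;
}

// Easy to verify: deterministic and effect-free. The caller supplies "now".
function isExpired(expiresAt: Date, now: Date): boolean {
  return expiresAt.getTime() < now.getTime();
}

// A test can now state the expected behavior exactly.
console.assert(isExpired(new Date("2024-01-01"), new Date("2024-06-01")));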
Verifiability and controllability go together. A self-verifying system resists attempts to bungle it up, and is thus easier to control. When controllability degrades, verifiability helps us get back on track, letting us refactor without fear of breaking everything. When the code isn’t doing exactly what we want, we can isolate the problem in a test. That way, we can be sure the problem is fixed and won’t come back.
Yet controllability and verifiability are also somewhat independent. While controllability is about having reliable knobs and levers on our figurative control panel, verifiability is about the gauges and dials. In a self-verifying system, we can get information about the current state and feedback about our actions. A self-verifying system can even teach us how to use it. Don’t know what a line of code does or whether it’s needed? Delete it and see what tests fail. Want to see where a function is called? Comment it out and see what compile errors you get. Trying to figure out how a function handles an edge case? Read the tests (or write a new one). With the right techniques, verifiability can be more than a safety net: it can help us feel happy, capable, and confident at every moment while we’re programming.
Instructive
You know you’re reading good code when you can’t tell what problem it’s solving or how it solves it. Good code makes you feel confused, stupid, and ignorant.
—no one ever
According to Ward Cunningham, “You know you are working on clean code when each routine you read turns out to be pretty much what you expected.” But what if you’re a noob like me and you don’t know what to expect? It would be nice if the code could teach me how it works.
Much of the code I work with is “easy to read” only if I already know what it does and why. But sometimes, I come across unfamiliar code and have an “aha!” moment. The code is so lucid, so clear and simple, that it has communicated something new to me and I now understand both the problem and its solution more clearly.
Dan North says that good code is domain-based: it uses the language and concepts of the business domain. It’s nice when variable and function names are recognizable to a nontechnical team member, but we can go one step further. Ideally, programmers should be able to learn the business domain by reading the code. If we need to write and read separate documentation to learn about the domain, we’ve missed an opportunity for efficiency.
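As a hypothetical sketch of what that could look like, suppose we’re modeling a refund policy. A nontechnical teammate could read the rule straight out of the code:
type Order = {
  deliveredOn: Date;
};

const RETURN_WINDOW_DAYS = 30;

// The business rule reads like the policy itself: an order is refundable
// within thirty days of delivery.
function isRefundable(order: Order, today: Date): boolean {
  const msPerDay = 1000 * 60 * 60 * 24;
  const daysSinceDelivery =
    (today.getTime() - order.deliveredOn.getTime()) / msPerDay;
  return daysSinceDelivery <= RETURN_WINDOW_DAYS;
}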
Code can, of course, communicate in technical domains as well as business ones. Great code teaches new programmers old tricks. I’ve mentioned this example before, but I’ll repeat it because I love it so much: this regex for extracting C strings from a file taught me how to think about backslash escape-sequences:
// A string consists of:
// - a double-quote character
// - any number of "units", where a unit is either:
//   - an escape sequence: a backslash followed by any character
//   - a literal character other than a backslash, quote, or newline
// - a closing double-quote.
const cString = /"(\\.|[^\\"\n])*"/
Another example, more recently discovered: an algorithm for detecting whether two 2-D line segments intersect. The details of the code are obscure, but the explanation on Stack Overflow makes the high-level approach delightfully clear.
// Assuming a representation like:
//   type Point = [number, number]; type Segment = [Point, Point];
export const segmentsIntersect = (
  [[a, b], [c, d]]: Segment,
  [[p, q], [r, s]]: Segment,
): boolean => {
  const determinant = (c - a) * (s - q) - (r - p) * (d - b);
  if (determinant === 0) {
    return false;
  } else {
    const lambda = ((s - q) * (r - a) + (p - r) * (s - b)) / determinant;
    const gamma = ((b - d) * (r - a) + (c - a) * (s - b)) / determinant;
    return 0 < lambda && lambda < 1 && 0 < gamma && gamma < 1;
  }
};
Is it significant that both of these examples have commentary longer than the code itself? Maybe. I don’t always comment my code, but when I do, I aim for comments like these: paragraphs that give a high-level explanation of what’s going on. In my experience, these comments are the most empowering for future readers, and the most durable. They’re also a lot of fun to write.
Still, if you were to criticize the examples above for being terse to the point of obfuscation, I wouldn’t disagree. Maybe what you should take away from this section is not just “Good code is instructive” but also “Good instruction comes with code examples.”
Consequential
Good code is consequential. That is, it does something. There is a reason for it to exist.
This point may seem too trivial to mention, but I’ve seen and written a lot of dead, tautological, and otherwise useless code in my career. It’s easier to write inconsequential code than you might think.
Inconsequential code can be subtle. For example, in the following Ruby snippet, the check for transactions.empty? is not needed, because transactions.all? returns true when transactions is an empty array.2
if transactions.empty? || transactions.all?(&:complete?)
  # ...
end
Do-nothing checks like this can morph into actual bugs as the code changes. I recently fixed a bug in some web server code that looked very similar to the above. It went something like this (pseudocode):
if count(jobs) is 0 {
  return
}
for each job in jobs {
  job.perform()
}
doNextStep()
You can probably spot the bug: doNextStep was supposed to happen even when there were no jobs, but the early return caused it to be skipped. The fix was simply to delete the unnecessary empty-array check.
The examples above are simple, but in complex codebases, whole sections of logic can “die” (when their outputs are no longer used) and become inconsequential code. These sections of inconsequential code are hard to find, so they’re rarely removed. In a codebase with low verifiability, inconsequential code is a Chesterton’s Fence. It can be hard to tell that the code isn’t actually needed, so deleting it seems risky. The cost of proving the code safe to delete is higher, in the short term, than the cost of keeping it around. So it sticks around.
When inconsequential code piles up, the codebase dies by a thousand cuts. No single bit of derelict code hurts much, but in aggregate they are a huge impediment to change.
Harmonious
Good code harmonizes with:
Team norms (code style)
Development tools
Architecture
Disharmonious code compromises the wholeness or coherence of the system, making it harder to describe and thus harder to reason about.
The idea of harmonizing code with development tools deserves further explanation. There are lots of different tools to read and modify code: editors, scripts, version control systems, etc. Code needs to be designed to work with the tools the team is actually using.
One of the most common tools people use to work with code is text search, or grep. Code should therefore be structured for searchability. That all but rules out dynamic approaches like this:
// Bad example: this code isn't greppable.
Database["create_" + entityName](id, attributes)
If you want to find out where Database.create_user is called, this code makes it really difficult.
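Here’s a sketch of one more greppable alternative, still using the hypothetical Database API and entityName from above. Spelling out each full method name means a search for create_user finds its call site:
// Greppable: searching for "create_user" now finds this call site.
function createEntity(entityName: string, id: string, attributes: object) {
  switch (entityName) {
    case "user":
      return Database.create_user(id, attributes);
    case "order":
      return Database.create_order(id, attributes);
    default:
      throw new Error(`unknown entity: ${entityName}`);
  }
}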
Another simple example of harmonizing code with tools, from my essay on Christopher Alexander:
I was once on a team writing C++. We had a coding convention that dictated that all member variables should be prefixed with an underscore, like _someName. I suggested that we drop this rule, since our IDEs highlighted member variables in a unique shade of purple that made them easy to spot even without a special prefix. My teammates objected: some contributors to our open-source codebase might be using editors that didn’t have that feature. The prefix was for them, not for us. What seemed like a good fit to me (accommodating our code style to the capabilities of our tools) wasn’t a good fit in the bigger scheme of things.
Harmony is related to the notion of conceptual integrity emphasized by Fred Brooks in The Mythical Man-Month:
I will contend that conceptual integrity is the most important consideration in system design. It is better to have a system omit certain anomalous features and improvements, but to reflect one set of design ideas, than to have one that contains many good but independent and uncoordinated ideas.
—Fred Brooks, The Mythical Man-Month
Conceptual integrity impacts the development process, because when a design choice is at odds with the wholeness of the system, it tends to lock in other choices and make the code more rigid.
A thought experiment: say we have a codebase with features A, B, and C. These features are independent, in the sense that we can understand, test, use, and improve each one in relative isolation from the others. We can also remove any feature if we find it’s no longer valuable.
Now imagine someone wants to add a disharmonious feature, X. Feature X doesn’t slot nicely into the design. Instead, it cuts across the functionality of A, B, and C, requiring them all to change in different ways. So after adding X, we have a system A’, B’, C’, X. Now, awkwardly, none of the features can be understood in isolation. You need to understand X to understand A’, B’, and C’ in their entirety. Moreover, A’, B’, and C’ can no longer be removed, because they are needed to support X.
With only four features, this might be manageable, but of course real systems are more complex. If you start with a dozen or more harmonious features, and then tack on a few disharmonious ones that lock together the others in various combinations, you’re well on your way to spaghetti code.
Unfortunately, I don’t have a real-world example handy, and so this explanation of why disharmonious changes hurt might be totally bogus (though impressionistically, I’m pretty sure I’ve experienced it happening). It’s just my current hypothesis. An area for future research, I suppose.
One way to deal with this situation is simply not to build feature X. But that’s not very satisfying. What if we really need it? Think about what design would accommodate A, B, C, and X easily. Refactor the system to that design first. Then add feature X. Kent Beck said it best:
For each desired change, make the change easy (warning: this may be hard), then make the easy change.
Extricable
One more attribute of great code, and then we’ll wrap up. Great code is extricable from its context. Because it is extricable, it fits in your head, you can understand it in isolation, and you can reuse it in situations its author never imagined.
My first full-time programming gig was working on a Ruby on Rails app for an e-commerce site. One of the nice / scary things about Ruby on Rails is that it makes it really easy to run ad-hoc code in production. You just shell into your server, type rails console, and bam, you’ve got an interactive Ruby interpreter, ready with all your web server code and a connection to your production database. The nice thing about this, compared to running SQL statements directly on your data, is that you have access to all your business logic: validation, after_save hooks, and even whole workflows if you need them. So the risk of getting your data into an inconsistent state is (arguably) less than if you had a SQL prompt.
On one memorable occasion, there was a bug that got a bunch of customer accounts into a weird state, and it was my job to get them unstuck. I knew there was a Ruby method I could call that would do what I wanted; it was called something like finish_signup. There was just one problem: finish_signup would do a bunch of things that I didn’t want, as well as the thing that I did want. E.g., it would re-send the welcome emails that all of those customers had already received.
I eventually got the situation untangled, but it wasn’t as easy as I would have liked. And I narrowly avoided substituting one problem for another. finish_signup was a footgun.3
My takeaway from that experience was this guideline: “write code that you’d trust in a production incident.” When you’re dealing with an emergency, you don’t want to have to think about what hidden side effects might be invoked by a piece of code. You want code that’s predictable and side-effect-free.
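To make that guideline concrete, here’s a hypothetical sketch (none of these names come from the real codebase) of how a finish_signup-style method could be structured so that the part you need in an incident is predictable and side-effect-free:
type Account = { id: string; state: "pending" | "active"; email: string };

// A pure, predictable state change. Safe to run from a production console,
// because it does nothing but compute the new account.
function completeSignup(account: Account): Account {
  return { ...account, state: "active" };
}

// The side effects live here, where they are easy to see and easy to skip
// when you don't want them.
async function finishSignup(account: Account): Promise<void> {
  const activated = completeSignup(account);
  await saveAccount(activated); // hypothetical persistence call
  await sendWelcomeEmail(activated); // the step I didn't want in the incident
}

// Hypothetical stand-ins for real infrastructure code.
async function saveAccount(account: Account): Promise<void> {}
async function sendWelcomeEmail(account: Account): Promise<void> {}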
In the years that followed, I came to appreciate how this guideline fits in with approaches like ports-and-adapters (a.k.a. hexagonal) architecture, “functional core, imperative shell,” and functional programming more generally. When code is made of pure functions and data is immutable, both code and data are easier to move around.
Data can be sent over the wire or persisted.
Logic can move from the client to the server or vice versa.
Application code can be reused in development tools, like a making app or database migration script.
The core of your web app can be extracted to a library and repackaged as a command line tool, desktop app, or mobile app… or simply distributed as a standalone product.4
Mobility of logic and data makes it possible to redesign the architecture of your application — the arrangement of processes and how they communicate, the data storage technology, the web framework you’re using — without making any changes to your business logic code. Likewise, it ensures that changes to business logic don’t cascade into infrastructural code. That’s good, because business and architecture are separate concerns. They change at different rates, at different times, and for different reasons. Having to change one when you change the other makes every change much more expensive.
The property of extricability, though, isn’t just about writing functional code and immutable data, nor about separating business concerns from technical ones. It’s about separation of concerns more generally. Extricable code is meaningful and comprehensible in isolation. You can run it without running the entire system, and you can understand it without knowing how it’s currently used. This property allows you to change code safely, even if you don’t understand every bit of code in the system.
Wrap-up
We all want software development to be more predictable and less stressful. To make it so, we need an accurate mental model of the system. The best way I know of to start building that model is to write code with the CEVICHE properties.
CEVICHE is far from perfect. You’ve probably already thought of things I’ve left out. What are they? Are there any properties you disagree with? Let me know in the comments.
In subsequent posts, we’ll look at specific techniques that help create the CEVICHE properties.
1. Some might argue “no, good code is code that’s easy to change.” But all code can be changed, trivially. Just open a file and change it. The question is, what effect did that change have? Our ability to answer that question relies on mental modeling.
2. This is logically sound; see vacuous truth.
3. That is, a gun to shoot yourself in the foot with.
4. It might be more realistic to imagine going the other way, from a CLI (written for yourself and other programmers) to a web app for the general public.