When your codebase is smaller than a few hundred lines, everything can go in one file. But as the code gets more complicated, you’ll probably want to split it into multiple files, so that:
You can more easily see how the larger centers relate (e.g. through imports) and how centers in the same file cohere.
You don’t get distracted by extraneous code as you’re programming.
Once you start creating multiple files, though, you have to answer some hard questions. What should the files be called? How should they be organized?
In this post, I want to talk about my current “Platonic ideal” of how to organize application source code, which for the purposes of this post I’ll call diamond design. Diamond design organizes code into four “facets:”
Product-specific code
Domain and business logic
Platform code
Library code
In this post, we’ll cover product-specific and library code.
“Diamond design” is a work in progress
The ideas in this post are likely to undergo revision over the next few years. One sign: while I’ve tried to employ this organization scheme on several projects, I haven’t adhered to it exactly on any of them. Maybe the concept is fundamentally flawed, or maybe reality is just too messy for any rigid organization scheme to work perfectly. In any case, use these ideas with caution, adapt them to your context, and expect change.
I’m publishing this post anyway, in the hope of clarifying my own thoughts and getting feedback. If you figure out a variant of diamond design that works better, I’d love to hear about it!
Product-specific code
Product-specific means the code knows about both…
the platform, i.e. how your product is delivered to users (Is it a web app? A command line tool? A mobile game? A GraphQL API? A browser extension?)
the domain, i.e. what the product is “about.” (Music production? Graphic design? Finance? Dungeons and Dragons?)
Product-specific code is the glue that holds these other facets together. It generally includes the entrypoint (main()), any application frameworks, installers, or other scaffolding, the user interface, and possibly other stuff.
Product-specific code depends on and connects every other part of your program. Because it depends on so many things, it is constantly changing.[1] If not managed carefully, these dependencies make it complex, hard to reason about, and hard to test.
In spite of these downsides, the great majority of application code is product-specific. That’s probably inevitable to some extent — gluing a domain to a user interface just seems to require a lot of words somehow. Still, “big” is not the same thing as “complicated,” and it’s complexity that really hurts. The purpose of diamond design is to reduce the complexity of product-specific code as much as possible, by moving logic into the other facets. The hypothesis (supported by my experience) is that this will result in a system that is easier to maintain and improve.
Naming
I typically put product-specific code in a folder called app/. Not the most creative, I know. But it gets the job done.
Organizing product-specific code
Since product-specific code tends to be bulky, it is likely to need some amount of internal organization. We can apply a simple principle here:
Things that change together should live together.
Things that change at different times, at different rates, or for different reasons should live apart.
I remember hearing this advice from Sandi Metz, though I don’t have the exact source handy. The reason we want things that change together to live together is simple: it makes it easier to find the code you have to change. Plus, if you want to cleanly remove a feature (something I suspect software companies will be doing a lot of in a few years) it’s much easier to do if all the code is in one folder.
This is an argument against the Ruby on Rails approach of e.g. having separate folders for models, controllers, and views. You’re rarely going to change all of your models at the same time, or all your controllers at the same time. But almost every feature you build will touch a model, a controller, and a view.
Therefore, if you want changes to cluster, it’s better to have a separate folder per feature, or per page. You might, for example, have a bunch of code related to logging in and out, resetting passwords, and so on. That can go in login/. The code for user settings can go in settings/. The code for discussion posts can go in discussions/. Et cetera.
What about code that’s used by multiple features? Well, maybe you need an app/shared/ directory… but quite possibly that code shouldn’t go in app/ at all. Remember, there are three other facets of the diamond we haven’t even talked about yet! So let’s look at those alternatives.
Library code
We can go to the opposite extreme from product-specific code and write code that is coupled to neither our domain nor our platform. That’s library code. Library functions often deal with language-level primitives (e.g. strings and arrays) or open standards (e.g. JSON).
Elephant in the room: why can’t we use a third-party library? Why can’t we use the standard library that comes with the programming language?
If you can get the code you need that way, go ahead and do it. But:
If you’re shopping for third-party libraries: beware of supply-chain attacks. Choose libraries with few or no dependencies of their own. Choose something that’s been around long enough to have a track record. Check that the maintainer’s incentives align reasonably well with yours. Installing a library is a bigger commitment than most people realize, and an NPM package that someone wrote in a weekend and abandoned is probably not something you want to marry your application to.
If you’re using the standard library: consider whether your application code could express itself more eloquently in terms of a simpler abstraction or a differently-shaped API. Standard libraries often prioritize exposing low-level functionality over ergonomics, sometimes for performance reasons, sometimes because the general problem really is that complicated.[2] You know your needs better, and can build a library that’s just right for you.
When neither the standard library nor a third-party library fits your needs, the library facet gives you a place to put your own internal library code.
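As a rough illustration of a “differently-shaped API,” here is a minimal sketch of a wrapper around JavaScript’s in-place Array#sort. The name sortedBy and the idea of keeping it in a file like src/lib/arrays.js are assumptions made for this example, not something from a real project.

```javascript
// Hypothetical internal-library helper (e.g. for src/lib/arrays.js).
// Returns a sorted copy of `items`, ordered by the key that `keyOf`
// computes for each element. Keys are compared with < and >.
function sortedBy(items, keyOf) {
  // Copy first so the caller's array is left untouched.
  return [...items].sort((a, b) => {
    const keyA = keyOf(a)
    const keyB = keyOf(b)
    if (keyA < keyB) return -1
    if (keyA > keyB) return 1
    return 0
  })
}

const words = ["pear", "fig", "banana"]
const byLength = sortedBy(words, (word) => word.length)
// byLength is ["fig", "pear", "banana"]; `words` keeps its original order.
```

The wrapper trades a little efficiency (one extra copy) for an API that is harder to misuse, which is the kind of trade an application-specific library is free to make.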
Here’s an example of when you might write library code: in JavaScript, there’s no built-in function to find all occurrences of one string within another. Weird, right? It seems like that would be a pretty commonly-used function. If you want it, though, you’ll have to write it yourself.
// I, Ben Christel, am the author of this code block.
// I hereby release it into the public domain.
const NOT_FOUND_INDEX = -1
/**
 * Finds each occurrence of `needle` in `haystack`.
 *
 * @yields each index within `haystack` at which an occurrence of
 * `needle` starts. Occurrences can overlap. Indices are yielded in
 * ascending order.
 */
function *indicesOf(needle, haystack) {
  for (let i = 0; i <= haystack.length; i++) {
    i = haystack.indexOf(needle, i)
    if (i === NOT_FOUND_INDEX) {
      break
    }
    yield i
  }
}
When you write functions like this, you should put them in an easily-findable place separate from the rest of your code. The reason is simple: someday, someone else is going to want that indicesOf function you just wrote, and if they can’t find it they’re going to implement it all over again. If, on the other hand, they can easily find it by browsing src/lib/strings.js, they’ll be delighted. Even if the function they want turns out to be slightly different from the one you wrote (maybe they need nonOverlappingIndicesOf or reverseIndicesOf), seeing your code will ease the task of writing their own.
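To make that concrete, here is one way a later contributor might sketch nonOverlappingIndicesOf after reading indicesOf. Only the function name comes from this post; the body, including the choice to step one character at a time for an empty needle, is an assumption for illustration.

```javascript
/**
 * Finds occurrences of `needle` in `haystack` that do not overlap.
 * After each match, the search resumes just past the end of that match.
 *
 * @yields each starting index, in ascending order.
 */
function* nonOverlappingIndicesOf(needle, haystack) {
  // Advance by at least one character so an empty needle cannot loop forever.
  const step = Math.max(needle.length, 1)
  let from = 0
  while (from <= haystack.length) {
    const i = haystack.indexOf(needle, from)
    if (i === -1) {
      return
    }
    yield i
    from = i + step
  }
}

const found = [...nonOverlappingIndicesOf("ana", "banana")]
// found is [1]; the overlapping occurrence at index 3 is skipped.
```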
A word of caution: “library code” is not equivalent to the notorious utils/ folder. When I say “library code,” I’m using that as shorthand for code that…
is domain- and platform-agnostic
manipulates language primitives (strings, arrays), open standards (URLs, XML, JSON Schema), or other ubiquitous concepts (e.g. dates and times, or geospatial data)
Here’s the litmus test for deciding whether something is library code. When you look at just that code in isolation, can you tell what application it’s from? Can you infer anything about the domain, or the kind of program it is, whether it’s a web service or a Minecraft mod or what? If you can, it ain’t library code. It goes in one of the other facets — probably domain/ or platform/.
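Here is a small sketch of the litmus test applied to two functions. Both functions and all the names in them are hypothetical, invented for this example.

```javascript
// Passes the litmus test: nothing here hints at the application it came
// from. It only knows about arrays and integers, so it is library code.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer")
  }
  const chunks = []
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size))
  }
  return chunks
}

// Fails the litmus test: the names alone reveal a tabletop-gaming domain,
// so this belongs in another facet even though it is only one line long.
const MAX_PARTY_SIZE = 4
function splitIntoAdventuringParties(characters) {
  return chunk(characters, MAX_PARTY_SIZE)
}
```

Notice that the domain function becomes trivial once the general-purpose part has been pulled out into the library.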
Naming
I generally put library code in src/lib/. That name is confusing, though, if the project is itself a library (e.g. an NPM package) so I’ve sometimes used general-purpose/ or gp/ instead.
Organizing library code
I generally put library functions in a file named after the most structured type on which the function operates. For example, these functions operate on both numbers and strings:
function parseNumber(s: string): number | undefined
function formatNumber(x: number): string
Since numbers are “more structured” than strings, in the sense that some (but not all) strings represent numbers, but all numbers can be written as strings, the functions would live in src/lib/numbers.ts.[3]
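For illustration, here is a minimal sketch of what those two functions might look like, written as plain JavaScript with JSDoc types. The post gives only the signatures, so the bodies are assumptions, in particular the decisions to reject blank strings and non-finite values.

```javascript
/**
 * @param {string} s
 * @returns {number | undefined} the parsed number, or undefined if `s`
 * does not represent a finite number.
 */
function parseNumber(s) {
  // Number("") and Number("  ") are 0, so reject blank input explicitly.
  if (s.trim() === "") {
    return undefined
  }
  const n = Number(s)
  // Number() gives NaN for non-numeric text such as "12px".
  return Number.isFinite(n) ? n : undefined
}

/**
 * @param {number} x
 * @returns {string}
 */
function formatNumber(x) {
  return String(x)
}
```

The asymmetry in the return types mirrors the asymmetry described above: parsing can fail, so parseNumber may return undefined, while formatting always succeeds.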
Organizing library code that deals with open standards is easy: the file is named after the standard. Code that deals with URLs goes in src/lib/urls.ts.
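As an example of the kind of function that might live in such a file, here is a small sketch built on the standard URL class, written as plain JavaScript. The helper name withQueryParam is hypothetical.

```javascript
// Hypothetical helper for a URL-focused library file.
// Uses the standard URL class, available as a global in browsers and Node.js.
// Returns a new URL string with the query parameter `name` set to `value`,
// replacing any existing value for that parameter.
function withQueryParam(url, name, value) {
  const parsed = new URL(url) // throws a TypeError if `url` is not an absolute URL
  parsed.searchParams.set(name, value)
  return parsed.toString()
}

const nextPage = withQueryParam("https://example.com/search?q=cats", "page", "2")
// nextPage is "https://example.com/search?q=cats&page=2"
```

Nothing in the helper says what application it came from, so it passes the litmus test for library code.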
Up next: platform and domain code
In the next post, I’ll talk about the other two facets of diamond design: platform code and domain code. Stay tuned.
[1] This follows from the definition of dependency in one of my previous posts.
[2] This is why JavaScript’s Array#sort method sorts in-place rather than returning a sorted copy of the array. Sorting a copy is more often what you want, but if the standard lib made the copy for you, there would be no way to recover the more efficient in-place sort from that API.
[3] To get a sense of what “more structured” means, see Alexis King’s post “Parse, don’t validate.”