The Dependency Graph

...is not the call graph!

Jan 03, 2024

We’re midway through my detour from the “testing around effects” series. In the previous post, we learned about the difference between static and dynamic function calls. In this post, I want to explore a closely related topic: the difference between implementation dependencies and interface dependencies. We’ll see how each of these types of dependencies affects the structure of the dependency graph.

Things I want you to take away from this post:

The dependency relationships in our software form a graph. The dependency graph often mirrors the call graph, but they are not the same thing.
Dynamic calls cause the dependency graph to diverge from the call graph.
Codebases with deep dependency graphs spawn mysterious bugs. By contrast, debugging is easier in systems where the dependency graph is shallow.
Dynamic calls can help us create shallow dependency graphs, even when the call graph must be deep.

A Definition of Dependency

First things first, let’s clarify what I mean when I say “dependency”. It might seem like an elementary term, but I’m going to be using it in a specific way in this post, and I want to make sure I cover all my bases. So:

A depends on B if failures propagate from B to A

Thing A depends on thing B if changes to or failures in B might cause failures in A. This definition might seem vague or hand-wavy, but it is as specific and formal as I care to get at the moment. Dependency is one of those things that looks more complicated the longer you stare at it, so I prefer to lean back and squint. We’ll see some of those complications shortly.

The Dependency Graph

It’s straightforward to draw a graph of dependencies in a codebase:

The nodes of the graph are the things (units of code or concepts) that can be depended on.
The arcs of the graph are dependency relationships. We draw an arrow from A to B if A depends on B.

Note the direction of the arrows. Some people like to draw dependency graphs with the arrows going the other way. I’m not going to say that’s wrong—it’s an arbitrary choice, after all—but I will point out that my way of doing it is consistent with UML conventions, and is therefore more likely to be widely understood.

Two Types of Dependency

The most straightforward type of dependency occurs when one function statically calls another. If function A statically calls function B, then A depends on B, because any bugs in B might cause bugs in A. I’ll call this type of dependency an implementation dependency1. When all the dependencies in a codebase are implementation dependencies, the dependency graph will look identical to the call graph.

Things get a bit more interesting in the presence of dynamic calls. Each dynamic call is an opportunity for an implementation dependency to be replaced by an interface dependency. Interface dependencies form whenever a few conditions are met:

Our code makes a dynamic call to some other function.
We don’t know or care what specific function is on the other side of that dynamic call…
…but we need that function to adhere to certain guidelines, or have certain properties. At a minimum, we probably need it to accept the types of arguments we’re passing to it, and return the right type of value. The function may also need to have other properties, like referential transparency, idempotence, commutativity, etc. in order for the algorithm we’re using to guarantee correct results.

Under these conditions, the properties we require of the function we’re calling constitute an interface. Our code depends on that interface. We can imagine an interface as something like a contract that defines the interaction between two parties, an implementer and a client2. The code that calls the interface is the client, and the code that receives the call is the implementer.

An interface constrains its implementers and provides guarantees to its clients.

Now let’s think about which way the dependency arrows go in this situation. If the interface changes—that is, if the terms of the contract change—its clients might not be able to function. Thus, clients depend on the interface.

Implementers also depend on the interface, because if the terms of the contract change, then the implementers might not be able to fulfill the contract anymore—and fulfilling the contract is part of their job.

So, the dependency graph for an interface dependency looks like this:

Three circles: a solid circle labeled client, a solid circle labeled implementer, and a dashed circle labeled interface. Arrows go from client to interface and from implementer to interface.

In other words, the client and implementer are peers in the dependency graph. Neither one can be broken by changes to the other.

What’s the point?

As should be clear from the length of my description above, interface dependencies introduce more (apparent) complexity than implementation dependencies. So why should we ever prefer them?

The answer has to do with the dependency graph I drew above. I drew the interface node with a dotted border to indicate that it isn’t made out of procedural code. Indeed, in a dynamically-typed language, the interface might exist only in our heads. Since the interface contains no code, it can’t have any dependencies, which means nothing else in the system can break it. If a chunk of code depends only on interfaces, that means that nothing can break that code, either. Interfaces thus allow us to provide ironclad stability guarantees. Imagine if you could ship code changes with absolute confidence that certain parts of your system would not break! Interface dependencies can give you that confidence.

An Analogy

To illustrate why interface dependencies are so useful, consider a physical interface that exists all around us: the humble electrical outlet.

There are dozens of electrical outlets in my house. Some are connected to switches, some are part of power strips with surge protectors, some have ground fault circuit interrupters, and some have none of these features. But all of them implement a simple contract: they provide 120 volts of 60 Hz AC to a standard US plug. Because all of my electrical outlets implement a common interface, I can plug any appliance into any outlet and be confident that nothing will explode. And if the utility company upgrades the power lines, or replaces gas-fired power plants with solar, I don’t need to buy new appliances, as long as I’m still getting 120 volts.

There are subtler benefits, as well. The ubiquity of electrical outlets, and the appliances that plug into them, means that I can use different combinations of outlets and appliances to diagnose failures.

"Electric Kettle, Melbourne Museum" by Miss Dilettante is licensed under CC BY-SA 2.0.

Suppose I’m trying to make coffee one morning and discover that my electric kettle doesn’t work. Is the problem with the kettle or the power outlet? I can test the kettle by plugging it into a different outlet, and I can test the outlet by plugging in a different appliance. These tests let me quickly determine if I need to call an electrician or a kettle repair person3.

The same flexibility that lets me diagnose failures makes it easy to work around those failures. If one outlet stops working, I can just use another one, since there’s nothing wrong with the kettle. In the context of interface dependencies, if one part of the system breaks, the other parts are unaffected.

Imagine an alternate reality where, instead of an electric kettle, I had a water-heating device built into my kitchen counter. (This is equivalent to a kettle with an implementation dependency on a particular outlet). It might look slick, but if the power supply stops working one day, I’m stuck. Since I can’t isolate the broken part, I can’t work around the problem or even diagnose it. In the context of implementation dependencies, any component that uses a broken component is as good as broken.

The Dependency Graph Is Not the Call Graph

Enough philosophizing and analogizing. It’s high time for a code example.

Here’s a simple (if contrived) example to show how the dependency graph changes in the presence of interface dependencies. We’ll start with a function printMuppetNames, which logs the names of some characters from The Muppet Show:

const muppets = [
  {name: "Kermit"},
  {name: "Miss Piggy"},
  {name: "Gonzo"},
];

function printMuppetNames() {
  console.log(names(muppets));
}

function names(muppets) {
  const results = [];
  for (const muppet of muppets) {
    results.push(name(muppet));
  }
  return results;
}

function name(muppet) {
  return muppet.name;
}

printMuppetNames calls names, which calls name. All of these calls are static. Therefore, the dependency graph and the call graph both look like this:

three circles: print muppet names, names, and name. Arrows go from print muppet names to names, and from names to name.

Now, let’s consider an alternative implementation of the same behavior, using Array.map instead of a for loop:

const muppets = [
  {name: "Kermit"},
  {name: "Miss Piggy"},
  {name: "Gonzo"},
];

function printMuppetNames() {
  console.log(muppets.map(name));
}

function name(muppet) {
  return muppet.name;
}

In this implementation, the call graph still has the same linear shape as before: printMuppetNames calls map, and map calls name. However, the dependency graph has changed. printMuppetNames now has two direct dependencies, on map and name, and map doesn’t depend on anything in our code.

Three circles, labeled print muppet names, map, and name. Arrows go from print muppet names to map and from print muppet names to name.

Even though map is calling name, it doesn’t depend on name. Recall that our definition of dependency says that A depends on B if a bug in B can cause a bug in A. If name has a bug, that doesn’t create a bug in map. map might be used in thousands of places in our codebase, and we can confidently say that none of those usages are jeopardized by a problem with name.

How is this wizardry possible? The secret is that map and name both depend on an interface: one so tiny and elementary that if you blink, you’ll miss it. That interface is simply the concept of a function.

Four circles: three solid circles labeled print muppet names, map, and name, and a dashed circle labeled function. Arrows go from print muppet names to map and name, from map to function, and from name to function.

That is, map needs to be provided with a function in order to do its job, and name fits the bill. The only constraint is that the function passed to map needs to accept an element of the mapped array as its argument. name satisfies this constraint because it accepts a muppet, and our array contains muppets. There’s a way to formalize this constraint with TypeScript types, which allows the compiler to prove that everything’s copacetic, but we’ll leave that for a future post.

(If you’d like to be notified when I write about types, or anything else, why not hit the subscribe button?)

Why any of this matters

Bugs and breaking changes are contaminants that propagate backward through the dependency graph. E.g. say I’m the author of a library, and I release a new version that has a bug. That bug will affect both the systems that use my library directly, and everything that depends on those systems. Ouch.

Bugs are harder to diagnose when the symptom appears many steps away from the cause. If a dependency of a dependency of a dependency changes and causes a failure in my code, I’m going to have to do a lot of digging to find the source of the problem. When the dependency graph is shallow, on the other hand, bugs can’t propagate as far, and debugging is easier. This implies that deep dependency graphs can be problematic.

If we are used to using static calls for everything, the motivation to avoid deep dependency graphs puts us in a bit of a quandary. That’s because “divide and conquer” has traditionally been the overarching strategy when it comes to software design—it’s the approach I was taught in school, anyway. With divide and conquer, we solve big problems by recursively breaking them down into subproblems, until the problems are so tiny that we can solve them with a single gesture. Unfortunately, this strategy tends to produce deep call graphs, and when all calls are static, the dependency graph will also be deep. The difficulty of debugging a deep dependency graph makes it hard to continue growing a large4 software system designed via a top-down, divide-and-conquer approach.5

The solution is to “cut” the dependency graph—to introduce dynamic calls at appropriate abstraction boundaries. By doing this, we can keep the dependency graph shallow, and enable even our largest software systems to continue growing sustainably.

Tradeoffs

There is a tradeoff to using interface dependencies, of course. As I noted earlier, when two components interact via an interface, they are peers in the dependency graph. That peer relationship gives you a lot of flexibility to plug different parts together, but the cost is that something or someone has to do the plugging. Generally, that’s you, the programmer responsible for the system.

Going back to the electrical analogy—I happen to like my electric kettle + wall outlet setup, but the downside (compared to the hypothetical built-in countertop water heater) is that I need to know about two things, the kettle and the outlet, and I need to manage their interaction. That is, I have to be more involved in the system, and understand more about how it’s put together. Someday, I will write a philosophical ramble about being involved, and the value and costs thereof, but for now I will spare you the digression. In the meantime, you can visit your local library and check out Zen and the Art of Motorcycle Maintenance, which treats the issue in some depth.

Wrap-up

We covered a lot of ground in this post, and I’m sure I glossed over some details. As always, feel free to ask questions in the comments section below. I really value your feedback.

If I’m reading the tea leaves right, the next post will return to the “testing around effects” series. We now have all the pieces in place to talk about dependency injection and dependency inversion, two alternative views of the “generalization” strategy for making code testable. Stay tuned.

The terms “implementation dependency” and “interface dependency” are used in https://www.researchgate.net/publication/2565865_On_the_Notion_of_Variability_in_Software_Product_Lines

Indeed, the term “contract” has been used in the sense in which I’m using “interface” here. Maybe I should have said “contract” instead of “interface”, but I wanted to be consistent with “interface dependency”. Time will tell if this was a mistake.

The kettle story is a true story, and, as usual, truth is more interesting than fiction. It turned out that both the kettle and the outlet were okay. The outlet’s circuit breaker had just tripped. It was a weird GFCI outlet that didn’t have built-in test/reset switches—those were located on another outlet on the other side of the room!

“Large” is relative, but as a rule of thumb, by the time I have 5 or 6 levels of static calls, I am feeling the pain.

Compare what I wrote in my post on The Call Graph: “Once [the program] grows beyond a certain size, you can’t keep it all in your head at once, and navigating it starts to feel bewildering. […] Every few months or so, I feel the need to draw out a call graph for a section of code on paper. When I do, I take it as a strong signal that the code is a mess.”

Fei Yang

Jan 6, 2024

Nice post, Ben!

One question: do you think that an interface also has a dependency on the implementer? Say I’m a baker and I posted in my menu 2 buns for a buck. I run out of eggs so no buns. In that sense, doesn’t that error affect the contract I provided? Or would you argue that error is actually propagated to the customer?

Expand full comment

3 replies by Ben Christel and others

3 more comments...

Ben’s Guide to Software Development

Discussion about this post