Daniel O'Neel

Software’s labor model

Software is a unique domain where you can infinitely layer abstractions. Nothing in the physical world behaves like this. You can only build a brick wall so high before it collapses under its own weight. If you stick an extra transmission or axle into a gear shaft you lose torque and efficiency. You lose something at every step of every physical manufacturing process. Digital is different though. Errors and imperfections don’t accumulate and compound at each new layer of process like in the analog world, so you can keep adding new transformations indefinitely.

“Software is eating the world” is about two things.

  1. The ability to infinitely layer abstractions means software can do all information processing better than people.
  2. Moore’s Law has ensured software can do information processing cheaper than people.

Software can do a million translations in a second with barely any electricity (low cost) and without losing any data (perfect efficiency). This property is the source of software’s competitive advantage in many situations but it has some unintuitive properties.

To illustrate this let’s use the analogy of car manufacturing. Most people think of building a website or an app like building a car. Sure it’s complex, and sure you might change your mind on certain design choices and have to order new parts, but at the end of the day you’re building something piece by piece (or rather webpage by webpage) until you have a finished but modifiable product. At any moment, you expect a competent engineer to be able to paint it a different color or upgrade the headlights. Take my word, for a moment, that building software isn’t like building a car; it’s much more like building a car manufacturing plant. Imagine that for whatever app you’re using right now, the engineers didn’t so much build the app as build a factory that produces that app over and over again for many different people.

Imagine you want to replace your car’s fabric seats with leather. I’ve never done that, but I’m guessing there are auto shops that will install custom seat materials for you if you want. In software, you don’t have the option of one-off modifications because there is no durable item. To metaphorically install leather seats even once in your software, you have to do the equivalent of talking to the manufacturing plant and asking them to modify the production line to enable installing leather seats. This move from an individual item to a universal factory is an illustration of adding a layer of abstraction: you can do much more by modifying a factory than by modifying a single item, but every change is more complex. This is the magic and the cost of software. Once you do this modification, all cars you’ve ever produced off this assembly line suddenly have the option to upgrade to leather seats at no marginal cost, but there’s no way to do that without massive upfront costs.
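Here is a minimal sketch of the factory idea in Python. The names (CarSpec, build_car, SUPPORTED_SEATS) are hypothetical and exist only to illustrate the analogy: there is no stored car to modify, only a function that produces one fresh every time it is asked.

```python
from dataclasses import dataclass

# Hypothetical names, for illustration only.
SUPPORTED_SEATS = {"fabric", "leather"}  # "leather" is the new option


@dataclass(frozen=True)
class CarSpec:
    color: str
    seats: str = "fabric"  # default keeps every existing order working


def build_car(spec: CarSpec) -> str:
    """The 'factory': each car is produced from its spec, never edited in place."""
    if spec.seats not in SUPPORTED_SEATS:
        raise ValueError(f"unsupported seat material: {spec.seats}")
    return f"{spec.color} car with {spec.seats} seats"


# Existing customers are unaffected by the change to the factory...
old_car = build_car(CarSpec(color="red"))
# ...and every customer can now opt in, with no extra work per car.
new_car = build_car(CarSpec(color="red", seats="leather"))
```

The change lives in the factory (the spec and the build function), so there is no way to give just one car leather seats without teaching the whole production line about leather.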

Upfront costs

Starting with the obvious one, this is why software forces you to pay higher upfront costs. Most people with passing familiarity understand this dynamic, but nearly everyone underestimates the order of magnitude. They imagine how long it would take to install a leather seat instead of imagining the whole chain behind it: sourcing leather, shipping it to the facility, quality controlling it before installation, figuring out machines that don’t stretch the leather the same way they’d stretch fabric, temperature controlling the process so you don’t heat damage the leather after installation, etc. Your run of the mill software engineer doesn’t build web pages so much as build a factory to source information, compile and format it in a repeatable process, and ship it to a customer.

Marginal feature costs

Let’s extend the analogy. Take my word for it that many parts of software, even simple software like websites, are less like building a manufacturing plant and more like building capital machines that in turn automatically build manufacturing plants that in turn build the cars themselves. It’s an analogy for the software reality of stacking abstraction layers on top of each other to automate more and more. It’s factories all the way down! When you imagine capital machines building factories, if you decide to build in support for leather seating, you’re not just going to get the option to have leather seats on one type of car, you’re going to get leather seats on all models of cars you can manufacture. Incredible.

But let’s imagine what’s involved in doing that. We now need to worry about not just how to install leather, we need to worry about where in the manufacturing line that step takes place and potentially move around other steps. We need to make sure the seat installation machine doesn’t leave any grease that would smudge leather seats in a way that wouldn’t have shown on fabric seats. We need new QA steps to inspect the leather before we cut it and also after seat installation. And the more complex our factory already is, the more things we might need to rearrange. The end result is that in a sufficiently advanced product, the cost of supporting a new feature is dominated 99% by the complexity of modifying the machinery for the rest of the product, not by the complexity of the feature itself. When you ask an engineering team about adding one new button on the settings page and they grumble about a database migration and tell you it’ll take three months, this is what’s going on under the surface. The button is easy. Adding the information to the data model, API, database, and cache, and developing a life cycle for those things, is like adding new machinery to every step of a manufacturing process. As your product grows the cost grows larger, leading to the inevitable law that the larger your product, the further your intuition about feature complexity diverges from the reality of the build complexity.
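To make the settings-page button concrete, here is a rough sketch of the layers a single new toggle can touch. Everything in it is an assumption for illustration: the dark_mode field, the settings table, the cache key format, and the use of an in-memory SQLite database stand in for whatever a real product uses.

```python
import json
import sqlite3
from dataclasses import asdict, dataclass

# All names are hypothetical; the point is how many layers one field touches.


@dataclass
class Settings:
    user_id: int
    email_opt_in: bool
    dark_mode: bool = False  # 1. data model: the new field


def migrate(db: sqlite3.Connection) -> None:
    # 2. database: existing rows need a default for the new column
    db.execute(
        "ALTER TABLE settings ADD COLUMN dark_mode INTEGER NOT NULL DEFAULT 0"
    )


def load(db: sqlite3.Connection, user_id: int) -> Settings:
    # 2b. every query that reads settings has to learn about the column
    row = db.execute(
        "SELECT user_id, email_opt_in, dark_mode FROM settings WHERE user_id = ?",
        (user_id,),
    ).fetchone()
    return Settings(row[0], bool(row[1]), bool(row[2]))


def to_api(settings: Settings) -> str:
    # 3. API: the response shape changes for every client
    return json.dumps(asdict(settings), sort_keys=True)


CACHE_VERSION = 2  # 4. cache: bumped so old entries without dark_mode are not served


def cache_key(user_id: int) -> str:
    return f"settings:v{CACHE_VERSION}:{user_id}"


# A user who existed before the feature shipped.
db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE settings (user_id INTEGER PRIMARY KEY, email_opt_in INTEGER NOT NULL)"
)
db.execute("INSERT INTO settings VALUES (1, 1)")

migrate(db)
payload = to_api(load(db, 1))
```

The button itself never appears in this sketch. Every line is about the machinery around it, and a real product would add rollout, backfill, and rollback steps on top.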

Operational costs

Here’s a less obvious implication that should be mind boggling when you grasp it. I’ve spent the last 18 months as the primary engineer on a reasonably complex website and can confidently say that 99.9% of the code that’s running wasn’t written by me or my team. I do write lots of code, but it sits on a massive pyramid of abstractions of other people’s code all the way from web pages to negotiating internet requests to accessing files from a database to instructions running on a physical hardware chip. This is intentional and a good thing. If I had to write all those other things, it would quite literally take me a lifetime to write a single webpage. Some of them I rarely have to think about. For instance, I have no idea how my server handles connecting to the internet, which is great, I don’t want to think about it. Some of them I do worry about all the time, like the code package I use to connect to my database and translate data back and forth. But even the most experienced and broadly knowledgeable engineers have no idea how 99.9% of the software running in their websites actually works.

This tradeoff of being able to build without understanding every layer of the process naturally comes back to haunt us when something breaks. And here’s what should be mind boggling when you really grasp it: even though I wrote my whole website, I’m only familiar with a tiny fraction of the things that can go wrong. When something breaks:

  1. I have way less of an idea what’s going on than you’d think. I only know the very surface level logic.
  2. I have to start digging below my level of understanding and that means a whole new order of magnitude (10x, not 2x) more complexity to sift through.
  3. There’s immense exposure to slight incompatibilities with other people’s software that I’ll never even imagine until they blow up.

Please, be patient with software engineers when something goes wrong. They’re staring into an abyss of things they don’t understand, and that’s not comfortable for anyone and especially not engineers.

Offshoring and outsourcing

The offshoring fervor of the early 2000s came and went and software development prices never dropped in the US the way many expected. From 2003 to 2006, Infosys, a flagship tech outsourcing and offshoring firm, grew its employee base 4.5x from 16,000 to 72,000, a mind boggling feat that was a testament to the bullishness of the day. Over the next fifteen years, it continued to grow its workforce, but at a modest 6% annually. Its growth, as with other outsourcing firms, has been reliable but nothing like the explosive growth of domestic tech giants made up of in-house engineering teams.

The reason is straightforward in hindsight: software development of any non-trivial product doesn’t behave like labor in the same way that factory design doesn’t behave like labor. Mistaking developers for outsourceable labor is the same mistake as conflating the software with the car itself. Companies whose core product is the ability to produce and sell cars don’t offshore expertise in producing cars.

That isn’t to say software should never be outsourced. Firms which aren’t at the scale to stomach upfront costs shouldn’t be developing their own software, and firms whose core competency lies elsewhere should certainly consider purchasing the services of an outside firm, including cheaper overseas firms. But these are cases where the software isn’t the company’s product itself. The more interesting surprise is the rise of SaaS, which I think has played the role that many expected outsourcing to play. Earlier observers were right to do the math and realize that it was too expensive for most firms to develop software that wasn’t core to their business proposition. The solution wasn’t to outsource that cost to cheaper development though. The solution was to spread out prohibitive fixed costs across many firms by way of a centralized company with zero marginal distribution costs. Hence venture funded SaaS.

The insight is that anywhere we expect software to produce value, it needs to be a core competency of a company dedicated to solving that problem. Those wishing to solve a problem cannot outsource the solution to that problem to 3rd party software developers any more than investment banks can outsource the development of financial models to 3rd parties who know Excel. It’s not the technical ability that matters, it’s the ability to marry business context to a technical system.

Use off the shelf options

Another consequence that often bites cross functional product teams is the cost of customizability. A simple example of this is buttons on a webpage. It’s entirely normal for every stakeholder from product, design and engineering to agree to something trivial like a customized button loading state, only to have engineering announce halfway through the project that building that loading state will take an extra week. Out on the metaphorical assembly line, the engineer has realized that your button making machine (e.g. a UI toolkit) doesn’t offer the ability to customize button loading state. To an outside observer like a designer, this is an outrageous explanation when all we need to do is add a coat of paint to the button and we already have a dozen painting machines. But that’s not how assembly lines work of course. If we want the option to add special paint to buttons, we can’t use the off the shelf button making machine anymore and we need to build our own button making machine. Now, we need to pay the full cost for learning how to make a button making machine, hence the extra week of development time.
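The button example above might look something like this in code. ToolkitButton is a made-up stand-in for an off the shelf UI toolkit, not a real library, and CustomButton is one hypothetical way a team might replace it. It is a sketch of the tradeoff, not a recommendation for how to build buttons.

```python
# Hypothetical stand-in for an off the shelf toolkit; not a real library.
class ToolkitButton:
    def __init__(self, label: str) -> None:
        self.label = label
        self.loading = False

    def render(self) -> str:
        # The loading state is baked in; there is no option to customize it.
        text = "Loading..." if self.loading else self.label
        return f"<button>{text}</button>"


# To get a custom loading state, the team now owns the whole "button machine":
# rendering, the disabled state, accessibility attributes, and future upkeep.
class CustomButton:
    def __init__(self, label: str, loading_html: str) -> None:
        self.label = label
        self.loading_html = loading_html
        self.loading = False

    def render(self) -> str:
        if self.loading:
            return f'<button disabled aria-busy="true">{self.loading_html}</button>'
        return f"<button>{self.label}</button>"


stock = ToolkitButton("Save")
stock.loading = True

custom = CustomButton("Save", '<span class="spinner"></span>')
custom.loading = True
```

The custom version is only a few lines longer here, but everything the toolkit used to handle quietly is now the team’s to test and maintain, which is where the extra week tends to go.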

There’s no win-win here. Either design doesn’t get the button styling they already got approved or engineering spends a week doing work they know isn’t valuable and that will cause them operational headaches down the road when they have to perform maintenance on their custom button making machine. I’ve tried to explain this tradeoff to my own teams with an analogy: when you’re picking a route to hike over a mountain pass, you can start from a blank slate and pick what appears to be the optimal route, or you can use an existing trail. The trail might not look like your preferred route, but as anyone who has tried to hike off trail can attest, you’ll end up having to swap out for hardier boots, you’re going to spend a lot of time hacking your way through tangled brush, and you’re going to run into unexpected challenges, so you better be really sure it’s worth it when you veer off the beaten trail. Using software packages is like that. In exchange for limitations to your ability to customize, you get to make fast progress on a paved path.

So if you find yourself in a discussion about a seemingly trivial feature addition or customization, recognize when you’re not asking the team to just add one simple step and instead asking them to eschew the well tested commercially available button machine and instead wire up a custom one themselves. Is the risk to your operations and the cost of maintaining that worth it? For all the reasons we discussed above, you’re going to underestimate the cost of developing your own custom solution for anything. Avoid paying the upfront cost for software whenever you can, and the best way to do that is to use someone else’s software.