Wednesday, January 15, 2025

Navigating cities of code with Norris Numbers


When I was eight, my family moved from Salt Lake City to the Utah/Arizona border. Our new house was a suburban three-story, part of a neighborhood whose streets were curling rays extending outward from a private golf course.

Any kid with a bike knows what to do in a new neighborhood. I wheeled out on evenings and weekends and learned the streets. It wasn’t long before I could call up a mental map from point A to anywhere. All the same, for years to come I would occasionally discover cul-de-sacs or trails I’d never noticed before. There’s nothing quite like chasing a bird across a vacant lot and gradually realizing that you’ve passed through a wormhole and your whole concept of Bloomington Drive has been bent in half.

Getting into a new codebase is like that.

Settling down in a new city (or codebase) is a marathon, not a sprint. There’s an effectively infinite amount of information to absorb. The trick is to recognize that whatever surprises you today will be taken for granted tomorrow, and it will be like that every day for years. Whenever you visit a new location, you’ll look around and take stock of the area. In time you’ll develop an internal compass that can’t be replicated or transferred to anyone else; your intuition will get you where you need to go better than a map ever could.

If a codebase is a city, then each line of code is what real estate agents call a “unit”: a house, an apartment, an office, a power station, a retail space. It has the potential to do a lot of different things, but at any given point, it plays a relatively small role in the machinery of the city. There’s a feedback loop at play, too: the city’s character, layout, and governance influence every person and space within it. Cities and codebases alike are often described as living things, and just like us, they’re colonized by scores of smaller organisms that affect them in complex ways.

Since leaving my childhood home, I’ve visited a hundred cities large and small. No two are the same, though well-designed cities of the same size tend to have strong similarities. I’ve also shepherded codebases along from 10 to 1,000 to 100,000 lines. And I’ve seen how things change as the repo grows—not just quantitatively, but qualitatively. A healthy big-city codebase isn’t just a small-town codebase scaled up. It’s a completely different structure, right down to the foundations.

In 2011, John D. Cook coined the term “Norris’s Number”, a “fundamental constant [describing] the average amount of code an untrained programmer can write before he or she hits a wall.” The number given was 1,500 lines. Lawrence Kesteloot later raised this number to 2,000 as part of a terse theory that categorizes programs by size, each category bearing unique challenges and tradeoffs for the programmers who work within it. 2,000 lines of code isn’t just the upper limit for a novice; it encompasses a fundamentally, philosophically different program from 20,000 lines, which is different from 200,000, which is different from 2,000,000. Someone who’s worked on several 20,000-line programs isn’t necessarily equipped to work on a 200,000-line program. And in my experience, the reverse is true as well; the programming methodology just isn’t transferable. A programmer who insists on building go-to-market software at a startup the same way they built hyperscale apps for millions of concurrent users at Meta is destined to fail.

When you see programmers giving contradictory advice, it might be that they’re both right. They’ve just acclimated to different Norris Numbers.

Of course, everyone knows “total lines” is the very worst way to measure code. What can be implemented in one line of code can also be implemented in 20, and the former isn’t always better than the latter. But Norris’s Number doesn’t need to be universal in order to be useful, nor does it need to account for its own incentives, since it’s not an incentivizing device. It’s merely a way to talk about code at the center of the bell curve. Most code written for pay converges toward a workable median: it’s not so elegant as to make you weep from reading it, but neither is it terribly chaotic or verbose. For run-of-the-mill, line-of-business code, there’s nearly always a consistent, appreciable difference between 2,000 and 20,000 lines of code. And describing the space between them as a “wall” is accurate to my own experience: it’s something you have to climb to get to the other side, and it won’t feel like you’re making progress until you do.

Let’s take a look at several Norris Numbers and what they mean for code across the spectrum, from a code snippet on Stack Overflow to a massive enterprise platform.

Norris Zero: 200 lines

City equivalent: Rabbit Hash, Kentucky (109 households), whose mayor is a dog

Type of application: Impromptu command-line script

Required skills: Basic code syntax

Strategy: Copy and paste, or type fast and don’t look back

The smallest category of code is well below even Norris’s Number. It can be wrangled by a developer of any stripe; it has very little context. None of the software you knowingly use on a daily basis is anywhere near this small. It would be fair to wonder why it’s worth discussing at all.

But it’s important for one reason: nearly all developers learn to code on codebases of 200 lines or fewer. I’m talking about code samples in programming textbooks, interactive Codecademy lessons, Stack Overflow answers, blog posts, and LeetCode challenges. In all likelihood, your first program was a “hello world” of fewer than 10 lines and your next fifty programs weren’t much bigger. This is a practical constraint; using a full-featured application to demonstrate how a variable declaration or `println` works would be ridiculous. But it matters because the problems of a 50-line code sample have nothing to do with the problems of real-world programming.

Think of it this way: the average programmer is trained in a “town” so small it can be managed by a French bulldog. Then the moment they get their first job (or work on their first big school project), they’re sworn in as city councilor of a codebase whose zoning department employs more households than they’ve ever seen in their life. They may know what a line of code looks like, or even how to write one, but that doesn’t prepare them in the least to manage the affairs of a line-of-business application with its own fire department (unit tests) and chamber of commerce (tech debt backlog).

In no way am I saying this to disparage junior developers. In fact, I believe the ability to hire and retain junior developers is one of the most powerful competitive advantages a company can have. What I am saying is that colleges and bootcamps need to do more to expose their students to problems at the scale of the average corporate monolith. And in the meantime, we shouldn’t be surprised when a recent graduate takes six months or more to become fully productive at a new job. They have to learn the streets, just like anyone else.

Norris One: 2,000 lines

City equivalent: Idyllwild, California (1,614 households), whose mayor is also a dog

Type of application: Proof-of-concept, demo, or AI-generated app

Required skills: An elementary understanding of functions, imports/includes, and glue code

Strategy: Do whatever’s easiest in the moment

Most of us have independently built a few things under the 2,000-line limit proposed by Kesteloot. This is where you begin to realize that the challenges of code have nothing to do with mind-bending syntax and stubborn compiler errors, and everything to do with humans’ limited working memory. Computers can track billions of values and operations at once. Humans are limited to five or so.

Still, you can get away with murder in a codebase of this size. There just isn’t room to make a decision that will haunt you—the whole thing is small enough to run on an electric toothbrush. If worst comes to worst, you can throw it away and rewrite it in a week or two.

That means, at this scale, you get very little upside from things like static analysis, unit tests, defensive programming, and code review. You might be tempted to dispense with them altogether, unless they come very cheap. In most cases, I say go for it. Cowboy it up. Code like there’s no tomorrow. Use an LLM if you want. You’re still very much in “governable by a dog” territory.

There are exceptions, of course. If the program is mission critical—code at this scale rarely is, but it’s possible—it’s worth a few extra safety measures. If it’s for your personal portfolio, consistent code style and automated testing could be meaningful to the rare hiring manager who looks. And if you end up doing more than a handful of iterations, unit tests could save you some time. But usually this is throwaway code. There are no best practices for throwaway code, only a rule of thumb: don’t overthink it.

Picturing the program as a small town puts this in perspective. A town of a couple thousand households doesn’t need a team of salaried urban planners. It can’t support a shopping mall or a football stadium. It doesn’t need constant oversight and intervention. And it can reinvent itself fairly easily—Idyllwild, California has at times been known as a summer camp, a faux-German village, a furniture production center, a hippie headquarters, and a Hollywood film set. If it were, say, a military base, then there would be a persistent need for specific facilities and regulations. But in most cases, a town that size can just go with whatever the occasion suggests.

Norris Two: 20,000 lines

City equivalent: Burlington, Vermont (17,448 households)

Type of application: Single-purpose indie app or internal tool

Required skills: Separation of concerns and information hiding

Strategy: Organize around a piece of core functionality; take time to refactor

A programmer’s first feat of ambition—the first time they try to build something meaningful—is where they’ll acquaint themselves with the “wall” described in Cook’s original blog post. What is this wall, specifically? If you look back on the first programs you ever wrote, you’ll see it in the piles of disheveled and unpredictable code, the poorly named variables, the functions as tangled as a Texas freeway. Code like that is self-limiting.

If you can’t modify a program without having the whole thing in your head, you’ll easily top out by the time you hit 2,000 lines. Breaking this barrier isn’t about increasing the capacity of your brain, any more than being the mayor of a growing city is about shaking hands with more people. Rather, it’s about governing more effectively: organizing, separating, and simplifying each piece of functionality so its internal state can be ignored when you’re not immediately working on it.
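Here is a minimal sketch of what that kind of governing can look like in practice. It's written in Python purely for illustration, and `ParkingMeter` and its methods are hypothetical names invented for this example. The point is only that callers ask the component questions through a small interface and never need to know how it keeps score internally.

```python
class ParkingMeter:
    """Hypothetical component: callers use the public methods only."""

    def __init__(self, rate_cents_per_minute: int) -> None:
        # Internal state. The leading underscore marks it as private by convention.
        self._rate = rate_cents_per_minute
        self._paid_cents = 0

    def insert_coins(self, cents: int) -> None:
        if cents <= 0:
            raise ValueError("cents must be positive")
        self._paid_cents += cents

    def minutes_remaining(self) -> int:
        # Callers never read _paid_cents directly; they ask a question instead.
        return self._paid_cents // self._rate


meter = ParkingMeter(rate_cents_per_minute=5)
meter.insert_coins(100)
print(meter.minutes_remaining())  # 20
```

If the meter later switches from counting cents to tracking an expiry timestamp, nothing outside the class has to change, which is what lets you stop holding it in your head.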

A codebase of up to 20,000 lines is still very manageable for a small team or solo developer. But it needs principles. If you treat it like a group of 2,000-line codebases that can talk to each other, you’ll get lost in an exponential cascade of complexity.

Apps of this size typically only do one thing (or variations of one thing), and they often do it extremely well. That’s your core functionality. Everything else should be small, easily understood, and subordinate to it. The app is big enough to have a “downtown” area, and maybe a shopping center across town, but everything else is neighborhoods. The neighborhoods build, staff, and patronize the city center, and the city center gives them purpose and importance.

Granted, as any urban planner will tell you, the ideal city isn’t neatly divided into residential, retail, and business. Each neighborhood should be diverse and self-sufficient, just as each component of an application should be able to invoke the resources it needs without a comprehensive understanding of other components’ shared or internal state—if everyone has to leave their neighborhood every day, you don’t have a city so much as a perpetual traffic jam. But when each component is self-contained, only appearing on Main Street when its moment has arrived, the city is a well-oiled machine.

Norris Three: 200,000 lines

City equivalent: Minneapolis, Minnesota (193,694 households)

Type of application: Mid-stage company’s software product

Required skills: High-level software architecture

Strategy: Develop mature processes; only build what you’ve proven has value

In this category lies the code for Apollo 11, which took us to the moon on about 145,000 lines of AGC4 assembly. Kesteloot says the key to passing 20,000 lines is learning never to write more code than absolutely necessary. Every line of code is a stranger you’ll meet again and again in the years to come; if it doesn’t carry its weight, you’ll come to regret it. I would add that it’s also about thinking in higher dimensions—organizing code not just into functions and groups of functions, but into groups of groups with strong boundaries between them and a very tight lid on each. In a way, a large program becomes its own programming language, with its own syntax (shared utilities and scopes), grammar rules (specifications and patterns), and compilation errors (unit tests). Part of building a corporate-scale program is foreseeing the advantages and pitfalls of the language you’re creating, then deliberately shaping it so it’s easy to do the right thing and hard to do the wrong thing. Enforcement is no substitute for design; a speed limit sign isn’t nearly as effective as a road that feels unsafe to speed on. As programming language designers, metropolitan governments, and transportation engineers alike have learned, if you design a large system only for efficiency, you’ll be less efficient than if you design it to mitigate human error.
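One way to picture "easy to do the right thing and hard to do the wrong thing" is a small value type that refuses to be misused. The sketch below is an assumption-laden illustration in Python, not a pattern taken from any particular codebase: `Money` is a hypothetical class that always carries its currency, so adding mismatched amounts fails immediately rather than producing a plausible-looking total.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Money:
    """Hypothetical value type: amounts are always tagged with a currency."""

    cents: int
    currency: str

    def __add__(self, other: "Money") -> "Money":
        if not isinstance(other, Money):
            # Adding a bare number is not supported; Python raises TypeError.
            return NotImplemented
        if other.currency != self.currency:
            # The wrong thing fails loudly instead of yielding a bad total.
            raise ValueError(f"cannot add {other.currency} to {self.currency}")
        return Money(self.cents + other.cents, self.currency)


subtotal = Money(1999, "USD") + Money(501, "USD")
print(subtotal)  # Money(cents=2500, currency='USD')
```

This is design rather than enforcement: nobody has to remember a rule about currencies, because the road itself doesn't let you speed.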

Outside the code itself, robust software engineering practices are essential to the survival of a corporate codebase. There needs to be QA to keep the application’s quality above water; product management to envision and protect its future; IT to resource, deliver, and secure it; and management to coordinate the team and protect their time. All of these people need to keep a sustainable pace and be fastidious about best practices.

At smaller scales, founders and developers may be able to share these responsibilities for a time. But beyond the inflection point of 20,000 lines, the gap in effectiveness between generalists and specialists grows too wide to ignore. If the mayor of Rabbit Hash is also a volunteer firefighter, that’s admirable. But when the mayor of Minneapolis is getting called out of meetings to put on bunker gear, both the mayor’s office and the fire station have a problem.

Economic constraints also become more urgent at this level of scale. Startups can go a long way by throwing things against the wall and seeing what sticks. But when a codebase grows large enough to carry a mid-stage company, there’s a significant and continuous cost to keeping it alive, and changing its core processes becomes expensive. To stay viable, you either need an extraordinary amount of luck or the ability to experiment and pivot before writing code. This is why UX research is so valuable. You have to limit yourself to building features only after they have evidence of value and usability.

The most successful corporate codebases are selective about what they build, staffed with specialists to keep things running, and intentionally designed to encourage safe behavior. And if they can succeed at this level of scale, they’re poised to thrive when they grow beyond it.

Norris Four: 2,000,000 lines

City equivalent: New York City (3,373,039 households)

Type of application: Enterprise platform, operating system, or Dwarf Fortress

Required skills: Documentation, comprehensive testing, navigating bureaucracy

Strategy: Optimize every part of the development process; invest heavily in governance; standardize gradual rollouts and canary testing

Things get fuzzy in the largest category of codebases. Semantically, it’s not always clear when an extremely large “app” (like Linux or Google) is actually multiple apps in a trenchcoat, or an app with several logically distinct plugins, or a low-level system bundled with high-level virtual apps that run on it. Is it even possible for an app to grow this large as a single, cohesive unit and not crater under its own gravity? It depends on who you ask.

As I mentioned earlier, a big-city codebase can’t just be a small-town codebase tiled over a larger area. That’s how you end up with places like Los Angeles: a colorful and irreplaceable city, without question, but one that doesn’t really work. Ask anyone who lives there. It’s a collage of suburbs velcroed together without a consistent organizing principle, a place where everybody needs to go somewhere else and none of them can get there in time. Big things are happening in L.A. every day, but none of them as efficiently or auspiciously as they should.

For contrast, consider cities like New York, Boston, Portland, and Montreal (plus hundreds of others around the world—these are just the ones I can speak from experience about). These cities work. They’re organized and intentional. They’re full of opportunity. They quickly and reliably move millions of people per day. They have their issues, of course, and too many of the residents are underserved and overlooked, but it’s hard to argue with their raw efficiency.

At massive scale, code tends toward one of these extremes: a capricious logjam, like L.A., or a ruthless machine, like New York. A lot depends on testing, resilience, and separation of concerns—when you have thousands upon thousands of discrete units of code, the vast majority of them need to continue working even when you haven’t touched them in years. A 1% degradation rate over any period of time can become an unstoppable avalanche of failures.
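To see how a small degradation rate compounds, here is a rough back-of-the-envelope sketch in Python. It rests on simplifying assumptions that are mine, not the article's: every unit degrades independently at the same fixed rate per period, and nothing is ever repaired. The function name `share_still_working` is hypothetical.

```python
def share_still_working(rate_per_period: float, periods: int) -> float:
    """Expected fraction of units still working, assuming each unit
    degrades independently at a fixed rate and nothing gets repaired."""
    return (1.0 - rate_per_period) ** periods


for periods in (1, 10, 50, 100):
    share = share_still_working(0.01, periods)
    print(f"after {periods:>3} periods: {share:.1%} of units still working")
```

Under those assumptions, a 1% rate leaves well under half of the units working after 100 periods. Real systems get repaired along the way, so treat this as an illustration of the trend and not a forecast.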

As with the previous category, you also need to limit your scope. One notable characteristic of enterprise platforms, besides the proliferation of nested drawer menus and documentation pages, is the barebones feel of each interaction. A million-line codebase can’t deliver the flourish and friendliness of a smaller tool, at least not for all use cases. Salesforce can’t (and shouldn’t) aspire to have the same cutesy UI and ease-of-use as your favorite notetaking app; Microsoft Excel won’t (and shouldn’t) make your data look glossy and beautiful without any effort, like all the specialized visualization builders you can find online. Big apps can’t say “yes” to everything. They would implode.

The primary feature set of these applications has to grow, as Zawinski’s Law has it. So to keep complexity under control, the secondary feature set—animated transitions, inviting UI, foolproof tutorial flows—has to shrink. And it works because at this level of scale, the competition is either nonexistent or equally spread thin.

Finally, a codebase of this size is where ROI rapidly goes positive for optimizing smaller details of the development process, like the time it takes to run `git status` or the exact reproducibility of builds. In a small city, the occasional late train or washed-out road is an acceptable compromise for cheaper infrastructure; in a metropolis, every small delay can represent an economic loss of millions of dollars. Tech giants like Microsoft, Google, and Meta have spent large sums of money building internal tools to keep their infrastructure fast and lean, resulting in extremely efficient software like Microsoft’s Scalar, Google’s Bazel, and Meta’s Buck2. Every second they’re able to trim from daily processes can add up to thousands of hours at scale, which is worth almost any price.
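The arithmetic behind that last claim is simple enough to sketch. The inputs below are made-up round numbers chosen only for illustration (they are not figures from any of the companies mentioned), and `hours_saved_per_year` is a hypothetical helper.

```python
def hours_saved_per_year(
    engineers: int,
    runs_per_day: int,
    seconds_saved_per_run: float,
    workdays_per_year: int = 250,
) -> float:
    """Total engineer-hours saved per year by shaving time off a common command."""
    total_seconds = engineers * runs_per_day * seconds_saved_per_run * workdays_per_year
    return total_seconds / 3600


# Hypothetical inputs: 10,000 engineers, each running the command 50 times a day,
# with one second trimmed from every run.
print(round(hours_saved_per_year(10_000, 50, 1)))  # 34722
```

With those assumed inputs, a single second per run comes to tens of thousands of engineer-hours a year. The same one-second saving on a five-person team would be about 17 hours.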

The theory of Norris Numbers, as far as you’re willing to buy into it, brings perspective to a few important topics in tech.

Take generative AI, for example. I’ve previously described AI coding tools as firehoses of mid-quality code. But more to the point, they’re firehoses of code that fit perfectly into Norris One: cheap, instant code that usually works but is neither complete nor structurally sound. To someone who doesn’t code for a living, low-Norris code is indistinguishable from high-Norris code, giving generative AI its aura of magic and buzz. And in codebases with well-established patterns and plenty of repetition, AI can even build a whole service without too many faux pas—if your codebase needs another cookie-cutter suburb, AI can spin one up overnight. Just remember that too many suburbs can bankrupt a city.

Or consider the hiring of junior developers. As a default, every developer starts at Norris One. But most of them quickly learn the norms and expectations of whatever Number they’re hired to work on. And if you’re hiring, you should take this into account; given a team that maintains a Norris Two application, a junior developer with experience in that category may outperform a senior developer with several years’ experience in a Norris Four organization.

Again, this isn’t a ranking tool. None of the Norris Numbers (or the developers who work within them) is better than any other. They’re just different, and significantly so. If the way you code at your Silicon Valley job is different from the way you code for a hackathon or side hustle, that’s because it’s supposed to be: you’ve moved into a different category of code. When you see online posts and comments that advocate for “one right way” of developing software, and you know that approach would go over like a lead balloon at work, you should feel free to ignore them. Scale isn’t the only reason, either: there are many ways to categorize software, revealing differences that are just as important as the ones I’ve described here. Learning to understand and describe code in a nuanced way can bring clarity to the way you approach software development and your career.

The structure of a codebase, like that of a city, is rarely “right” or “wrong” in and of itself. It may be more or less well-adapted to the needs of its inhabitants, or more or less able to accommodate changes of a certain magnitude, or more or less cost-effective. But most structures have their place, and one of the great challenges of software architecture is learning to match a budding codebase with the structures that will support it when it ultimately achieves its goals.
