Optimism: The irrationality of discounting human creativity

“(T)here is no fundamental barrier, no law of nature or supernatural decree, preventing progress. Whenever we try to improve things and fail, it is… always because we did not know enough, in time.”

David Deutsch, The Beginning of Infinity (2011, pg. 212)

In today’s news and social media environment, optimism is scarce. We are constantly bombarded with claims that society is heading in the wrong direction, that our children will be worse off, that some problem or trend will ensure our demise. These pessimistic messages poison the conversation around a variety of threats: climate change, artificial intelligence (AI), nuclear apocalypse, extraterrestrial invaders, etc.

It’s true that problems are inevitable. It’s also true that there’s no guarantee we will solve these problems. However, we frequently behave (often unknowingly) as if we are not capable of solving them. The truth is that we humans have the gift of creating new knowledge. The future is never doomed to tragedy nor destined to bliss, because the knowledge that will determine the future has not yet been created.

Pessimism is self-reinforcing. The (implicit) assumption that we can’t or won’t create the knowledge to solve our problems discourages us from even trying. On the other hand, optimism, the belief that we are capable of solving problems, is not merely a more pleasant way to approach life; in fact, it is the more rational approach to the future.

The futility of pessimism and prophecy

Pessimism is an ancient plague. Indeed, in the Biblical book of Ecclesiastes, King Solomon lamented the futility of life and knowledge:

“For with much wisdom comes much sorrow; the more knowledge, the more grief.”

Ecclesiastes 1:18

Many ancient philosophers and modern thinkers have echoed the king’s dread. Sure, ignorance may be blissful for some, but King Solomon had it backwards: the creation of knowledge is the only way to eliminate the world’s sorrows.

The famous population theorist Thomas Malthus offers a great example. In 1798, Malthus built a model of population growth and agricultural growth which led him to prophesy that excessive population growth destined humanity to mass famine, drought, and war in the 19th century. He believed that the poor and uninformed procreated imprudently, that periodic checks on the birth rate were necessary, and that people were unlikely to behave as required to avert these disasters.1 Malthus and other prominent thinkers believed he had discovered the end of human progress.

Malthus’s population growth predictions were actually fairly accurate. However, his prophesied dooms never materialized. Instead, the food supply grew at an unprecedented rate in the 19th century due to remarkable innovations such as plows, tractors, combine harvesters, fertilizers, herbicides, and pesticides.2 Living standards rose considerably. Today, the number of humans is 10x larger than in Malthus’s time, yet famine mortality rates have plummeted.

The principle of optimism

The mistake Malthus made remains common today: failure to account for the potential for humans to create new knowledge, new technology. Whenever we make predictions of the future but ignore this key factor, we inevitably devolve into pessimism—and our predictions become prophecies.

The brilliant physicist David Deutsch offers a prescription, a worldview he calls the “principle of optimism”:

“All evils are caused by insufficient knowledge.”

David Deutsch, The Beginning of Infinity (2011, pgs. 212-213)

By “evils,” Deutsch is not referring to Hitler or Thanos; he is referring to unsolved problems. He argues convincingly that the only permanent barriers to solving problems are the immutable laws of physics. As long as a solution does not require breaking these laws (for example, if it required traveling faster than the speed of light), then humans have the potential to create the knowledge needed to solve any problem, given time.3 Fortunately, creating new knowledge is a distinctive human specialty.

AI doomsday?

Let’s consider one of the most common areas of pessimism today: AI. Over the years, many renowned thinkers—from Alan Turing to Elon Musk—have proposed various doomsday theories that AI will make humans obsolete or, worse, conquer and enslave them.

Aaron Levie, Founder/CEO of Box, offered a thought-provoking counter-argument to the common assertion that AI will simply replace human workers in droves. Consider the claim that if AI could make a company 50% more profitable in a given function, it would also eliminate 50% of workers in that function. Levie rightly criticizes these naively pessimistic arguments:4

  • First, AI-displacement claims assume that companies are already operating with the maximum amount of labor they would or should have, if budget weren’t a constraint. In reality, companies may prefer to utilize the efficiency gains from AI to scale up their operations, which might involve hiring more humans—not fewer. New business growth may promote even more investment.
  • Second, AI-displacement claims ignore the probability that competitors will also act in response to productivity gains, pressuring other firms to increase their own productivity—even at the expense of profits. Competition may demand they retain, or even expand, their workforces.

Levie is right. These AI-displacement arguments often ignore the second-order feedback effects that are characteristic of complex systems. We cannot simply assume that if AI leads to higher productivity, then humans will become obsolete. Rather, the impact of AI productivity gains on the demand for labor is unknowable, because the knowledge that humans will create in the future is unknowable. As with most new technologies, it’s likely that AI reduces the demand for labor in certain domains, increases it in others, and even opens up entirely new domains.

Most AI doomsday claims involve textbook pessimism: they implicitly assume that we will create no new knowledge and simply accept our fate—in other words, that progress is over. Instead of prophesying our imminent obsolescence, we would be better off testing and improving on new technologies, investigating how to deploy and control them responsibly, and learning how to adapt and reconfigure our societies to new circumstances. We have, in fact, made such evolutions countless times throughout history.

If AI does make humans redundant or conquer the world, it won’t be because we were incapable of solving the problem. The ultimate impact that AI has on humans is up to, well, humans.

A rational preference for the unknown

Pessimism also tends to emerge when we face decisions between the familiar and the unfamiliar. It’s quite easy to dismiss new ideas or possibilities simply because we’re more comfortable with the status quo.

A fascinating thought experiment from computer science offers a compelling case for why, under many conditions, we should actually prefer the unknown to the known—that is, why optimism can be rational.

In the “multi-armed bandit problem,” you enter a casino filled with slot machines, each with its own odds of paying off. Your goal: maximize total future profits. To do so, you must constantly balance between trying new machines (exploring) and cashing in on the most promising machines you’ve found (exploiting).

The problem has a fascinating solution, the “Gittins index,” a measure of the potential reward from testing any given machine—even one we know nothing about.5 We should prefer a machine with an index of, say, 0.60 to one with 0.55. The only assumption required is that the next pull is worth some constant fraction of the current pull. Let’s say 90%.

Here’s the fascinating part: an arm with a record of 0-0 (a total mystery) has an “expected value” of 0.50 but a Gittins index of 0.703. The index tells us we should prefer the complete unknown to an arm that we know pays off 70% of the time! The mere possibility of the unknown arm being better boosts its value. In fact, a machine with a record of 0-1 still has a Gittins index over 0.50, suggesting we should give the one-time loser another shot.6

The Gittins index offers a formal rationale for preferring the unknown. Exploration itself has value for the simple reason that new things could be better, even if we don’t expect them to be.
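
Here is a rough numerical sketch (in Python) of how such an index can be approximated using the standard “retirement option” formulation: find the guaranteed payout at which the sure thing and the uncertain machine are equally attractive. The uniform prior, truncation horizon, and function names below are illustrative assumptions rather than anything prescribed here.

```python
# Approximate the Gittins index of a slot machine with a given win/loss record.
# Assumptions (for illustration): Beta(1,1) prior on the payout rate, a 90%
# discount per pull, and a finite horizon to truncate the dynamic program.

def gittins_index(wins, losses, gamma=0.9, horizon=200, tol=1e-4):
    """Bisect on the sure payout at which pulling this arm and 'retiring'
    to a guaranteed machine are equally attractive."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        lam = (lo + hi) / 2
        if _arm_beats_retirement(wins, losses, lam, gamma, horizon):
            lo = lam
        else:
            hi = lam
    return (lo + hi) / 2

def _arm_beats_retirement(w0, l0, lam, gamma, horizon):
    retire = lam / (1 - gamma)              # value of taking the sure payout forever
    values = {}
    for t in range(horizon, -1, -1):        # backward induction over win/loss states
        level = {}
        for extra_wins in range(t + 1):
            w, l = w0 + extra_wins, l0 + (t - extra_wins)
            if t == horizon:
                level[(w, l)] = retire      # at the truncation depth, just retire
            else:
                p = (w + 1) / (w + l + 2)   # posterior mean win rate under Beta(1,1)
                pull = p * (1 + gamma * values[(w + 1, l)]) + (1 - p) * gamma * values[(w, l + 1)]
                level[(w, l)] = max(retire, pull)
        values = level
    return values[(w0, l0)] > retire + 1e-9

print(round(gittins_index(0, 0), 3))   # ~0.70, matching the figure quoted above
print(round(gittins_index(0, 1), 3))   # still above 0.50: the one-time loser
```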

This type of reasoning carries over to all “explore/exploit” decisions. For example, how should companies balance between milking their current profitable products today versus researching and developing new ones? Or, how should societies allocate resources between ensuring energy availability with traditional fossil fuels versus investing in low-carbon alternatives like solar or nuclear? The Gittins index illustrates that the mere potential for progress means that trying new things is often the rational approach to the future.

***

Rational optimism is not naïve: we aren’t guaranteed to solve our problems. We could fail or destroy ourselves. However, we should be endlessly encouraged by the fact that the universe does not preclude us from solving our problems! We have all the tools we need.

Progress emerges from learning and creating the right knowledge to address our problems, current and future. Assuming otherwise, or—equally dangerous—failing to remember that creative problem-solving is a human specialty, leads us inevitably to pessimism, prophecy, and error. As Winston Churchill explained about why he was an optimist, “It does not seem to be much use being anything else.”

Memes: The evolutionary battle of ideas

“In reality, a substantial proportion of all evolution on our planet to date has occurred in human brains. And it has barely begun. The whole of biological evolution was but a preface to the main history of evolution, the evolution of memes.”

David Deutsch, The Beginning of Infinity (2011, pg. 380)

Memes are ideas that act as replicators, the broader term for any entity capable of causing itself to be copied. These aren’t limited to funny internet posts. Memes could include jokes, languages, cultural traditions, artistic movements, business strategies, advertising jingles, scientific theories, conspiracy theories, religions, documents, or recipes. Each of these ideas contains (often inexplicit) knowledge for causing its own propagation across human minds, through an evolutionary process of creation and recreation.1

The leaders who transform political, social, scientific, or corporate systems—for better or worse—share the ability to successfully spread their ideas to others. Indeed, all human progress (and suffering) has relied on the creation and replication of memes—including those underlying democracy and autocracy, science and dogma, morality and evil.

For those striving to become more effective strategists and decision makers, fluency with memetics (the study of memes) better equips us to recognize and cultivate good ideas—and, more importantly, to avoid bad ones.

Genetic vs. memetic evolution

An enlightening way to explore memes is by contrast to the original replicators: genes, the bits of DNA that are copied across generations in living organisms.

Genes are the basis of biological evolution, which occurs through the imperfect copying of genes from parents to offspring, followed by a “selection” process in which nature ruthlessly filters out the gene variants that are less successful at causing their own replication.

[Figure: evolution by replication, variation, and selection]

Through 600m years of evolution, these remarkable molecules have given rise to the complex systems of control and feedback we see throughout nature in plants, animals, and bacteria. Genetic evolution, however, is slow—operating on timescales far too long for us to notice.

Memes—a concept coined by biologist Richard Dawkins and elucidated by physicist David Deutsch—also operate through replication, variation, and selection. First, someone transmits a meme (a joke, theory, recipe) to someone else. The recipient then either reenacts a version of that meme to others, or not. If its new holders recreate the meme, they may introduce further variations. “Selection” takes place as the competing “meme variants” achieve differential success in causing their own replication.

Here, however, the meme-gene metaphor crumbles. Whereas genes mutate randomly, with no regard to what problems they might solve, we can create new memes with conscious foresight. Moreover, memes undergo random and intentional variation not only when we express them to others, but also within our own minds. Our creative faculties enable us to subject our ideas to thousands of cycles of variation and selection before we ever enact a variant! Finally, we can transmit a meme immediately, and to anyone, not just our children. For these reasons, meme evolution is exponentially faster than gene evolution.2

[Figure: internal meme evolution]

We can create, vary, and discard our ideas with remarkable speed. This ability, in fact, helps explain the “ascent of man,” and especially our accelerating technological progress in the ~400 years since the Enlightenment. Our knowledge growth is no longer bound by genetic timescales.

How ideas spread

But why do some memes flourish, while most perish? Unfortunately, as with genes, the memes that survive don’t necessarily need to be “good,” or even beneficial. They simply must be better at displacing competing memes from the population of ideas.

Memes successfully replicate through one of two basic strategies: (1) by helping their holders (rational memes), or (2) by inhibiting their holders’ critical abilities (anti-rational memes).3

1) Rational memes

Memes can help their holders solve problems by conferring useful knowledge, such as the recipe to create bread, the knowledge of how to raise a child, instructions for constructing shelter, or a good scientific theory like Einstein’s theory of relativity. Because rational memes like these undergo countless cycles of creative variation and critical selection, they evolve to become increasingly useful. They prevail over alternative memes not because they have been immune from criticism, but because they withstood criticism and evolved accordingly. Memetics itself is robustly criticized4—as it should be.

[Figure: rational meme replication]

Rational memes thrive in open, dynamic societies and organizations—where people can freely critique, vary, and discard them.

2) Anti-rational memes

Memes that can’t withstand rational scrutiny may still spread by incorporating knowledge to prevent such scrutiny in the first place. These anti-rational memes include a myriad of ideologies and dangerous yet contagious ideas, such as conspiracy theories, autocracy, crackpot religions, and discriminatory or violent cultural beliefs. These memes propagate not by being more useful, but by successfully shielding themselves from criticism.

[Figure: anti-rational meme replication]

For instance, a conspiracy theory shouldn’t survive long once factual evidence inevitably contradicts it, so on its own it would struggle to spread. That is why conspiracy theories almost always include an additional theory: that the conspirators and the news media are “in” on the conspiracy together—so you can’t trust any information from either of them!5 This anti-rational “protective coating” discourages holders of the conspiracy meme from taking any contradictory evidence seriously.

Consider also the case of North Korea, a one-party state led by a totalitarian dictatorship. The legitimacy of the Kim dynasty rests on an intricate system of anti-rational memes and illiberal policies that suppress criticism and dissent, including “divine right,” mass surveillance, arbitrary arrests, torture camps, rigged elections, political punishments, and state-run media.6 The result is a wholly static society that compares miserably to its free and democratic neighbor, South Korea.

Though North and South Korea possess similar natural resources and share the same ethnic makeup and language, the lives of their citizens could not be more different. South Koreans enjoy a robust democracy, strong civil liberties, and average incomes nearly 30x higher than their Northern neighbors. Meanwhile, North Koreans suffer from inexcusably high rates of poverty, corruption, starvation, disease, and infant mortality.

Rational meme organizations

It is not just governments, but all institutions that should be evaluated based on how well they facilitate error-correction—companies included. Creating dynamic, durable organizations requires that rational memes can evolve without anti-rational memes suffocating change.

Such organizations must cultivate three key traits: (1) criticism, (2) experimentation, and (3) information-sharing.

  1. Criticism — When a leader’s authority rests on suppressing criticism, errors will go uncorrected, and stagnancy and decay are inevitable. This is one reason why the modern “board governance” corporate structure evolved: the board of directors acts as an error-correction mechanism that can remove incompetent leaders. Beware any institution in which individuals have unconstrained power or cultivate an environment in which criticizing or questioning them is unsafe. A great positive example is Netflix, which has institutionalized guiding principles for open, frequent feedback and even disagreement—not just with peers and subordinates, but with senior management.7 Netflix’s dynamic “feedback culture” enables employees to promptly address errors and improve themselves.
  2. Experimentation — Every organization must balance efforts to exploit its current businesses and memes with experiments to explore new ones. Unfortunately, when the environment feels stable, many large organizations stop experimenting because it seems costly and inefficient.8 Such complacency is a death sentence in dynamic competitive environments. Consider ChatGPT-maker OpenAI, which promotes experimentation using a unique approach that embeds researchers directly into product teams, shortening the feedback loop between ideation and development.9
  3. Information-sharing — Companies with a culture of transparency and documentation facilitate criticism and improvement of their memes. For rational memes to evolve efficiently, information and ideas should be codified in shareable, collaborative documents. Amazon is the paragon here. All meetings start with a pre-read of a standardized document format that summarizes an idea, an analysis, or a potential solution to a problem.10 This practice facilitates shared understanding, promotes diverse input and criticism of ideas, and creates an easily replicable artifact for every meeting and decision.

***

When we die, we leave behind genes and memes. In a couple generations, our genes will be forgotten. Memes, however, can live on.

Culturally, professionally, and politically, we should ask ourselves what memes we are creating, spreading, and allowing into our minds. Do they foster progress and understanding? Have we adequately considered alternatives? Are they somehow shielded from criticism by ourselves or others?

If we find ways to contribute positive ideas, behaviors, and culture to the world, we may leave a memetic legacy that long outlives our genes.

Hypothesis Testing: To be, or not to be significant

A fundamental objective of exploring data is to unearth the factors that explain something. For example, does a new drug explain an improvement in a patient’s condition? Does the DNA evidence match the suspect’s? Does the new product feature improve user engagement?

The reigning theory of knowledge, Karl Popper’s critical rationalism, tells us how we cultivate good explanations. Simply put, we start by guessing. We then subject our best guesses to critical tests, aiming to disprove them. Progress emerges from the trial-and-error process of eliminating ideas that fail such tests and preserving those that survive.1

These guesses, or hypotheses, are not “ultimate truths.” They are tentative, temporary assumptions—like a list of suspects for an unsolved crime.

But how do we objectively evaluate these guesses? Fortunately, statistics provides a solution.

Hypothesis testing is a formal statistical procedure for evaluating the support for our theories. It is hard to overstate the role this methodology plays in science. It teaches us how to interpret important experimental results and how to avoid deceptive traps, helping us systematically eliminate wrongness.

Feeling insignificant?

When we talk about hypothesis testing, we are typically referring to the null hypothesis significance test, which has been something of a statistical gold standard in science since the early 20th century.

Under this test, a form of “proof by contradiction,” it is not enough that the data be consistent with your theory. The data must be inconsistent with the inverse of your theory—the maligned null hypothesis.

The null is a relentlessly negative theory, which always postulates that there is no effect, relationship, or change (the drug does nothing, the DNA doesn’t match, etc.). It is analogous to the “presumption of innocence” in a criminal trial: a jury ruling of “not guilty” does not prove that the defendant is innocent, just that s/he couldn’t be definitively proven guilty.2

Say we wanted to run a clinical trial to test whether our new drug has a positive effect. We define our null hypothesis as the claim that the drug does nothing, and we plot the expected distribution of those results (the blue bell-shaped curve below). Then, we run an experiment, with a sufficiently large and randomly selected group of subjects, that attempts to refute the null at a certain “significance level”—commonly 5% (see red shaded area below).

The p-value represents the probability of observing a result at least as extreme as ours if the drug actually had no effect—that is, if random variation alone could adequately explain our results. For us to reject the null and claim that our trial provides “statistically significant” support for our drug, the observed improvement in our patients’ conditions must be large enough that the p-value falls below our chosen significance level (here, 5%).

[Figure: significance test decision rule (hypothesis testing)]
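
To see these mechanics in code, here is a minimal sketch with simulated data standing in for a real trial; the effect size, sample sizes, and variable names are invented for illustration.

```python
# Illustrative only: simulated improvement scores standing in for trial data.
# We compare a treatment group against a placebo group and ask whether the
# observed difference is statistically significant at the 5% level.
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
placebo   = rng.normal(loc=0.0, scale=1.0, size=200)   # null world: no effect
treatment = rng.normal(loc=0.3, scale=1.0, size=200)   # assumed modest true effect

t_stat, p_value = stats.ttest_ind(treatment, placebo)
print(f"p-value = {p_value:.4f}")
if p_value < 0.05:
    print("Reject the null: this result would be unlikely if the drug did nothing.")
else:
    print("Fail to reject the null: random variation could explain this result.")
```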

A or B?

We can apply hypothesis testing methods to evaluate experimental results across disciplines, such as whether a government program demonstrated its promised benefits, or whether crime-scene DNA evidence matches the suspect’s.

Essentially all internet-based companies (Twitter, Uber, etc.) use these same principles to conduct “A/B tests” to evaluate new features, designs, or algorithms. Just as with our clinical trial example, the company exposes a random sample of users to the new feature, then compares certain metrics (e.g., click-rate, time spent, purchase amount) to those of a control group that doesn’t receive the feature. Whichever iteration performs best wins. For any app on your phone, its creator likely conducted dozens of A/B tests to determine the combination of features and designs that you see, down to the tempting color of the “Purchase” button!
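
As a sketch of how such a comparison might be scored, assuming made-up click counts and a simple two-proportion z-test (real experimentation platforms add many refinements):

```python
# Hypothetical A/B test: did the new feature's click-rate beat the control's
# by more than chance alone would explain?
import math
from scipy.stats import norm

clicks_a, users_a = 1_200, 20_000   # control group
clicks_b, users_b = 1_330, 20_000   # new-feature group

p_a, p_b = clicks_a / users_a, clicks_b / users_b
p_pool = (clicks_a + clicks_b) / (users_a + users_b)            # pooled rate under the null
se = math.sqrt(p_pool * (1 - p_pool) * (1 / users_a + 1 / users_b))
z = (p_b - p_a) / se
p_value = 2 * norm.sf(abs(z))                                   # two-sided p-value

print(f"control {p_a:.1%}, variant {p_b:.1%}, z = {z:.2f}, p = {p_value:.4f}")
# A p-value below 0.05 is the conventional bar, though real teams also weigh
# effect size, guardrail metrics, and the risks of repeated testing.
```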

Stop, in the name of good statistics

Failing to understand hypothesis testing guarantees that we will be wrong more often. Too many people blindly accept faulty or pseudo-scientific experimental results or catchy headlines. They sometimes choose to ignore the science altogether, preferring to craft simple, tidy narratives about cause and effect (often merely confirming their preexisting beliefs).

Even if we do take the time to consider the science, hypothesis testing is far from perfect. Three main points of caution:

  1. First, achieving statistical significance is not synonymous with finding the “truth.” All it can tell us is whether the results of a given experiment—with all its potential for random error, unconscious bias, or outright manipulation—are consistent with the null hypothesis, at a certain significance level.
  2. Second, significance ≠ importance. To “reject” the null hypothesis is simply to assert that the effect under study is not zero, but the effect could still be very small or not important. For example, our new drug could triple the likelihood of an extreme side effect from, say, 1-in-3 million to 1-in-1 million, but that side effect remains so rare as to be essentially irrelevant.3
  3. Third, a significance test, by definition, has only a certain degree of precision which is never 100%. And improbable things happen all the time. For example, a significance level of 5% literally implies a 1-in-20 chance of declaring a significant result when there is no real effect. This enables the nasty “multiple testing” problem, which occurs when researchers perform many significance tests but only report the most significant results. For example, if we do 10 trials of a useless drug with a 5% significance level, the chance of getting at least one statistically significant result gets as high as 40% (see the short simulation just after this list)! You can see the incentive for ambitious researchers—who seek significant results they can publish in prestigious journals—to tinker obsessively to unearth a new “discovery.”4
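
The 40% figure is straightforward arithmetic, and a quick simulation with invented sample sizes shows the same trap in action:

```python
# The false-positive math from the text, plus a toy simulation of running
# 10 independent trials of a drug that truly does nothing.
import numpy as np
from scipy import stats

alpha, n_tests = 0.05, 10
print(f"Chance of at least one fluke 'discovery': {1 - (1 - alpha) ** n_tests:.1%}")  # ~40.1%

rng = np.random.default_rng(0)
batches_with_false_positive = 0
n_batches = 2_000
for _ in range(n_batches):                      # repeat the whole 10-trial exercise
    hits = 0
    for _ in range(n_tests):
        treated = rng.normal(0, 1, 50)          # the drug has zero true effect
        control = rng.normal(0, 1, 50)
        if stats.ttest_ind(treated, control).pvalue < alpha:
            hits += 1
    batches_with_false_positive += (hits > 0)
print(f"Simulated share of batches with a 'significant' result: {batches_with_false_positive / n_batches:.1%}")
```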

***

Used properly, hypothesis testing is one of the best data-driven methods we have to evaluate hypotheses and quantify our uncertainty. It provides a statistical framework for the “critical tests” that are indispensable to Popper’s trial-and-error process of knowledge creation.

If you have a theory, start by exploring what experiments have been done about it. Review multiple studies if possible to see if the results have been replicated. And if there aren’t any prior research studies, be careful before making strong, unsubstantiated claims. Even better, see if you or your company can perform your own experiment!

Asymmetrical Thinking: Expose and exploit imbalances

The idea that we need to “think outside of the box” to solve problems is frustratingly vague. Fortunately, there are several mental strategies that can promote creativity, including thinking nonlinearly and counterfactually. But one of the most underutilized processes is asymmetrical thinking, which involves flipping around and reshuffling ideas to unearth hidden imbalances.

This mindset is useful across disciplines, including physics, economics, biology, business, and investing. Its origins, however, are from geometry.

The dance of symmetry and asymmetry

The basis of geometry is symmetry, which describes the property of an object being unaffected by undergoing some transformation. Consider an equilateral triangle. If we move the whole shape an inch to the right, the angles and dimensions of the triangle are unchanged; that is, they are symmetrical with respect to this simple translation.

On the flip side, asymmetry is simply the absence of symmetry, indicating that an object is affected by undergoing some transformation. If we take our original triangle and pinch or stretch it, the lengths and angles of the triangle will change—they are asymmetrical with respect to pinching or stretching.

Ancient scholars, particularly Euclid, harnessed the concepts of symmetry and asymmetry in shapes to discover revolutionary geometrical and mathematical principles, setting the foundation for modern physics, engineering, and more.1

Outside of geometry, asymmetry offers an invaluable mental model: it can be extremely insightful to flip things—whether shapes, ideas, relationships, or strategies—backwards and forwards and see whether what is true in one direction is also true in the opposite direction—in other words, whether there is symmetry. In our lives, finding asymmetries presents opportunities for unearthing unique insights, stimulating creative ideas, and creating leverage to drive outsized impact.

Asymmetries all over

Physics

In physics, practically all laws of nature originate in symmetries and asymmetries. Emmy Noether’s groundbreaking 1915 theorem connected symmetries in nature with the universal laws of conservation. Her work showed that for every continuous symmetry we see in the laws of nature, there must also exist a corresponding conserved quantity.2

For example, it’s well-established that the laws of physics are uniform throughout the universe. Regardless of where we are, the laws of nature remain unchanged (symmetrical), just as the angles of our triangle did not change when we moved it. This symmetry gives birth to principles such as the “law of conservation of momentum,” which holds that unless an outside force (such as friction or air resistance) intervenes, the momentum of a system will remain unchanged.3

Biology

Nature abundantly showcases symmetry.

Most vertebrates, from humans to elephants, have two “halves” that are roughly equal. This property, called “bilateral symmetry,” is believed to be advantageous for efficient movement and centralized control of sensory organs in the “head” of the organism. Greater symmetry also correlates with higher rates of reproduction, since many organisms with greater symmetry tend to be preferred as mates (example: facial symmetry in humans).4

However, it’s important to note that asymmetry is also an important and widespread trait, even in humans. Consider the phenomenon of “handedness,” the left lung being smaller than the right, or the left and right brain controlling different cognitive functions.

Statistics

Picture the symmetrical “bell curve” of the normal distribution. For any given random observation, there’s an equal probability that it will fall above or below the central average of the data—such as with human height and weight.

However, many real-world events generate results that are in fact asymmetrical (or “skewed”) rather than normal. Some produce “power-law” distributions in which a few extreme values dominate over the modest majority, such as with the frequency of words in most languages, the magnitude of earthquakes, and city populations.

Business and Investing

In the world of venture capital (“VC”), investors bet on risky, early-stage ventures, which carry both high potential growth and high risk of failure.

Unsurprisingly, the financial returns of early-stage startups are power-law distributed. Most investments generate low or negative returns, but a few bets generate enormous returns. VCs are essentially in pursuit of asymmetrical returns, aiming to own a piece of occasional breakout successes such as Google or OpenAI.

They gladly accept that many (even most) of their investments will generate little or no return, as long as one or two investments become wildly successful. The most they can lose is 1x their money, but the extreme winners could generate 10x, 100x, or more.

Using asymmetry as a strategy

In the competitive realm, one indirect strategy involves using your relative advantage to impose asymmetric costs on the opposition. In military conflicts, for example, weaker insurgent factions may compensate by using asymmetric warfare tactics, such as by attacking the opponent’s electrical grids, roads, or water supply.

In business, the basic concept of competitive advantage is rooted in differences—in the asymmetries among competitors. The task falls to the leaders to identify the asymmetries that can be turned into advantage and exploited, such as a valuable patent, strong network effects, or significant economies of scale.5

A great example is Netflix’s critical insight around 2011 to commit substantial resources towards producing original content. Historically, Netflix and its competitors licensed a portfolio of content produced by other companies. This meant that every new subscriber increased content costs. With Netflix’s landmark pivot to producing its own original content—particularly early shows such as Lilyhammer and House of Cards—Netflix turned content into a fixed cost.6 With original content, more subscribers don’t directly increase content costs.

This strategy created a cost asymmetry between Netflix and its competitors: Netflix’s content costs would become cheaper over time on a per-subscriber basis, as it spread its fixed costs over a larger number of subscribers! Netflix would lead a global revolution in the production of original streaming content.

***

Asymmetry is much more than an academic concept—it offers a powerful new lens through which we can view the world. By identifying and leveraging the hidden asymmetries in our lives, we can open the door to inventive solutions and strategies. As we navigate challenges in our own lives and careers, we should consider how asymmetrical thinking might offer unexpected answers.

Expected Value: Don’t buy lotto tickets, but keep funding startups

The expected value of a process subject to randomness is the average of its outcomes, each weighted by its probability. We might use expected value to evaluate a variety of phenomena, such as the flip of a coin, the price of a stock, the payoff of a lottery ticket, the value of a bet in poker, the cost of a parking ticket, the decision of a business to launch a new product, or the utility of reading a book. The underlying principle of expected value is that the value of a future gain (or loss) should be directly proportional to the chances of getting it.

The concept originated from Pascal and Fermat’s solution in 1654 to the centuries-old “problem of points,” which sought to divide the stakes in a fair way between two players who have ended their game before it’s properly finished. Instead of focusing on what had already happened in the game, they focused on the relevant probabilities if the game were to continue. Their findings laid the foundations for modern probability theory, which has broad applications today, including in statistics (the mean of a normal distribution is the expected value), decision theory (utility-maximization and decision trees), machine learning algorithms, finance and valuation (net present value), physics (the center of mass concept), risk assessment, and quantum mechanics.

Expected value is one of the simplest tools we can use to improve our decision making: multiply the probability of gain times the size of potential gain, then subtract the probability of loss times the size of potential loss.

Simply put, we should bias towards making decisions with positive expected values, while avoiding decisions with negative ones. The idea is that, over the long run, we will be better off if we repeatedly select the alternatives with the highest expected values.

Useful but used wrongly

Lotteries are a classic example. We might genuinely enjoy playing the lottery. But we can’t ignore that lottery tickets are typically really bad bets, even when the jackpot is huge. Because the state takes a cut of the total pot and only pays out the remainder to winners, the expected value for all players must be negative.1
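
With made-up but roughly representative numbers (a $2 ticket, jackpot odds of about 1 in 300 million, smaller prizes and taxes ignored), the arithmetic looks like this:

```python
# A minimal sketch of the expected-value rule. All figures are hypothetical.
def expected_value(outcomes):
    """outcomes: list of (probability, payoff) pairs whose probabilities sum to 1."""
    return sum(p * payoff for p, payoff in outcomes)

# A favorable bet: 60% chance to win $100, 40% chance to lose $120.
bet = [(0.60, 100), (0.40, -120)]
print(expected_value(bet))               # +12.0 per play on average: positive EV

# A lottery ticket: $2 cost, roughly 1-in-300,000,000 shot at a $500m jackpot.
p_win = 1 / 300_000_000
ticket = [(p_win, 500_000_000 - 2), (1 - p_win, -2)]
print(round(expected_value(ticket), 2))  # about -0.33 per ticket: negative EV
```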

The expected value rule of thumb may seem straightforward, but a body of fascinating psychological research—particularly the work of Daniel Kahneman and Amos Tversky on “prospect theory”—shows that the decision weights that people assign to outcomes systematically differ from the actual probabilities of those outcomes.

For one, we tend to overweight extreme (low-probability) outcomes (the “possibility effect”). As a result, we overvalue small possibilities, increasing the attractiveness of lotteries and insurance policies. Second, we give too little weight to outcomes that are almost certain (the “certainty effect”). For example, we weigh the improvement from a 95% to 100% chance much more highly than the improvement from, say, 50% to 55%.2

[Figure: decision weights vs. actual probabilities. Adapted from Thinking, Fast and Slow (Kahneman, D., 2011)3]

Process over outcome

The expected value model reminds us to be more critical of the process (how we value the possibilities) than the outcome (what we actually get). When randomness and uncertainty are involved—as they almost always are in complex systems—even our best predictions will be wrong sometimes. We can’t perfectly anticipate or control such outcomes, but we can be rigorous with our preparation and analysis.

Remember, too, that expected value doesn’t represent what we literally expect to happen, but rather what we might expect to happen on average if the same decision were to be repeated many times (a better name might have been “average value”). Often we don’t and can’t know the exact expected value, but we can say with some confidence whether it’s positive or negative.4

Seeking the “long tail”

In some circumstances, applying expected values may be an entirely misguided approach. We must consider carefully the distribution of the underlying data. For a standard, normally distributed system (e.g., human height, SAT scores), the expected value is also the central tendency of the data, and is therefore a reasonable guess for any individual observation.

However, for power-law distributed systems (e.g., the frequency of words in the English language, the populations of cities), there are a small number of inputs that account for a highly disproportionate share of the output, skewing the distribution of the data. Because power laws are asymmetrical distributions, the expected value is not the central tendency of the data; a few extreme observations wildly skew the results.
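
A short simulation makes the point concrete; the distribution and parameters below are arbitrary stand-ins for any heavy-tailed system:

```python
# In a heavy-tailed world, the average is driven by a handful of outliers,
# so the expected value says little about the typical outcome.
import numpy as np

rng = np.random.default_rng(7)
returns = rng.pareto(a=1.2, size=100_000) + 1   # heavy-tailed multiples, minimum 1x

top_1_percent = np.sort(returns)[-1_000:]
print(f"mean outcome:   {returns.mean():.1f}x")
print(f"median outcome: {np.median(returns):.1f}x")
print(f"share of total value captured by the top 1%: {top_1_percent.sum() / returns.sum():.0%}")
# In a typical run the mean sits far above the median, and a sliver of
# outcomes accounts for a large share of all the value created.
```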

Consider the early-stage venture capital industry, in which investors put money into highly risky startup ventures. Financial returns on venture capital investments are power-law distributed. Most of these startups will fail, but the ones that do succeed can really succeed and generate massive financial returns (think Google or Tesla). Thus, for venture capitalists, the game is not to seek the average return or “expected value,” but rather to search for the “long tail”—the extreme outliers that generate outsized results.

Expected value is a lot less relevant when you don’t care as much about the probability of success as you do about the magnitude of success, if achieved. If one or two “grand slams” can generate massive returns for the fund, then VCs don’t care if 90% of their other investments fail.

Union Square Ventures, for example, invested in Coinbase in 2013 at a share price of about $0.20, and realized a massive return when Coinbase opened its initial public offering at $381 in 2021—a valuation of around $100bn and an increase of over 4,000x from the round that Union Square led eight years earlier.5

***

Expected value helps us evaluate alternatives even when we face substantial uncertainty or risk. Estimate the potential “payoffs” and weight them by their respective probabilities. In general, err towards making bets with positive expected values, while declining bets with negative ones.

While extremely useful as a rule of thumb, applying expected value can involve substantial subjectivity—and is therefore at risk of bias and error. Use ranges instead of single values to avoid false precision. And remember that expected value may be entirely inappropriate when dealing with power-law distributed systems, where it’s not the “middle” outcome that dominates, but the “extreme” ones.

Signal vs. Noise: Finding the drop of truth in an ocean of distraction

Every time that we attempt to transmit information (a “signal”), there is the potential for error (or “noise”), regardless of whether our communication medium is audio, text, photo, video, or raw data. Every layer of transmission or interpretation—for instance, by a power line, radio tower, smartphone, document, or human—introduces some risk of misinterpretation.

The fundamental challenge we face in communication is sending and receiving as much signal as possible without noise obscuring the message. In other words, we want to maximize the signal-to-noise ratio.

While this concept has been instrumental to the fields of information and communication for decades, it is becoming increasingly relevant for everyday life as the quantity and frequency of information to which we are exposed continues to expand… noisily.

A firehose of noise

Our brains are fine-tuned by evolution to detect patterns in all our experiences. This instinct helps us to construct mental “models” of how the world works and to make decisions even amidst high uncertainty and complexity. But this incredible ability can backfire: we sometimes find patterns in random noise. And noise, in fact, is growing.

By 2025, the amount of data in the world is projected to reach 175 “zettabytes,” growing by 28% annually.1 To put this in perspective, at the current median US mobile download speed, it would take one person 81 million years to download it all.2

Furthermore, the average number of our data interactions is expected to grow from one interaction every 4.8 minutes in 2010, to one every 61 seconds in 2020, to one every 18 seconds by 2025.3

So, the corpus of data in the world is enormous and growing exponentially faster than the capacity of the human brain. And, the frequency with which we interact with this data is so high that we hardly have a moment to process one new thing before the next distraction arrives. When incoming information grows faster than our ability to process it, the risk that we mistake noise for signal increases, since there is an endless stream of opportunities for us to “discover” relationships that don’t really exist.4

Sometimes, think less

In statistics, our challenge lies in inferring the relevant patterns or underlying relationships in data, without allowing noise to mislead us.

Let’s assume we collected some data on two variables and observed the graphical relationship, which appears to be an upward-facing curve (see charts below). If we try to fit a linear (single-variable) model to the data, the average error (or noise) between the model’s line and the actual data is too high (left chart). We are “underfitting,” or using too few variables to describe the data. If we then incorporate an additional explanatory variable, we might produce a curved model that does a much better job at representing the true relationship and minimizing noise (middle chart). Next, seeing how successful adding another variable was, we might choose to model even more variables to try to eliminate noise altogether (right chart).

Unfortunately, while adding more factors into a model will always—by definition—make it a closer “fit” with the data we have, this does not guarantee that future predictions will be any more accurate, and they might actually be worse! We call this error “overfitting,” when a model is so precisely adapted to the historical data that it fails to predict future observations reliably.

Overfitting is a critical topic in modeling and algorithm-building. The risk of overfitting exists whenever there is potential noise or error in the data—so, almost always. With imperfect data, we don’t want a perfect fit. We face a tradeoff: overly simplistic models may fail to capture the signal (the underlying pattern), and overly complex algorithms will begin to fit the noise (error) in the data—and thus produce highly erratic solutions.
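
A toy demonstration of the tradeoff, using synthetic data and arbitrary polynomial degrees: fit models of increasing complexity and compare error on the data we trained on against error on fresh data from the same noisy process.

```python
# Underfitting vs. overfitting on synthetic data.
import numpy as np

rng = np.random.default_rng(1)

def sample(n):
    x = rng.uniform(0, 3, n)
    y = x**2 + rng.normal(0, 1.0, n)          # true signal: an upward-facing curve, plus noise
    return x, y

x_train, y_train = sample(20)
x_test,  y_test  = sample(1_000)

for degree in (1, 2, 9):                       # too simple, about right, too flexible
    coeffs = np.polyfit(x_train, y_train, degree)
    train_mse = np.mean((np.polyval(coeffs, x_train) - y_train) ** 2)
    test_mse  = np.mean((np.polyval(coeffs, x_test)  - y_test) ** 2)
    print(f"degree {degree}: train error {train_mse:5.2f}, new-data error {test_mse:5.2f}")
# Training error keeps shrinking as the model grows more flexible; error on new
# data typically bottoms out and then climbs once the model starts fitting noise.
```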

For scientists and statisticians, several techniques exist to mitigate the risk of overfitting, with fancy names like “cross-validation” and “LASSO.” Technical details aside, all of these techniques emphasize simplicity, by essentially penalizing models that are overly complex. One self-explanatory approach is “early stopping,” in which we simply end the modeling process before it has time to become too complex. Early stopping helps prevent “analysis paralysis,” in which excess complexity slows us down and creates an illusion of validity.5

We can apply this valuable lesson to all kinds of decisions, whether we are making business or policy decisions, searching for job candidates, or even looking for a parking spot. We have to balance the benefits of performing additional analyses or searches against the costs of added complexity and time.

“Giving yourself more time to decide about something does not necessarily mean that you’ll make a better decision. But it does guarantee that you’ll end up considering more factors, more hypotheticals, more pros and cons, and thus risk overfitting.”

Brian Christian & Tom Griffiths, Algorithms to Live By (2016, pg. 166)

The more complex and uncertain the decisions we face, the more appropriate it is for us to rely on simpler (but not simplistic) analyses and rationales.

A model of you is better than actual you

In making professional judgments and predictions, we should seek to achieve twin goals of accuracy (being free of systematic error) and precision (not being too scattered).

A series of provocative psychological studies have suggested that simple, mechanical models frequently outperform human judgment. While we feel more confident in our professional judgments when we apply complex rules or models to individual cases, in practice, our human subtlety often just adds noise (random scatter) or bias (systematic error).

For example, research from the 1960s used the actual case decision records of judges to build “models” of those judges, based on a few simple criteria. When they replaced the judge with the model of the judge, the researchers found that predictions did not lose accuracy; in fact, in most cases, the model out-predicted the professional on which the model was built!

Similarly, a study from 2000 reviewed 135 experiments on clinical evaluations and found that basic mechanical predictions were more accurate than human predictions in nearly half of the studies, whereas humans outperformed mechanical rules in only 6% of the experiments!6

The reason: human judgments are inconsistent and noisy, whereas simple models are not. Sometimes, by subtracting some of the nuance of our human intuitions (which can give us delusions of wisdom), simple models actually reduce noise.

***

In summary, we have a few key takeaways with this model:

  1. Above all, we should seek to maximize the signal-to-noise ratio in our communications to the greatest practical extent. Speak and write clearly and concisely. Ask yourself if you can synthesize your ideas more crisply, or if you can remove extraneous detail. Don’t let your message get lost in verbosity.
  2. Second, be aware of the statistical traps of noise:
  • Don’t assume that all new information is signal; the amount of data is growing exponentially, but the amount of fundamental truth is not.
  • When faced with substantial uncertainty, be comfortable relying on simpler, more intuitive analyses—and even consider imposing early stopping to avoid deceptive complexity.
  • Overfitting is a grave statistical sin. Whenever possible, try to emphasize only a few key variables or features so your model retains predictive ability going forward.
  3. Acknowledge that while human judgment is sometimes imperative, it is fallible in ways that simple models are not: humans are noisy.

Scale: The only way to change the world

Complex systems—such as organizations, governments, ecosystems, planets, or galaxies—tend to exhibit different properties and, consequently, to behave differently depending on their current relative size. Put simply, things that happen at a smaller scale may happen very differently—or not at all—at a larger scale.

If scale didn’t matter, then all relationships would be linear. More of one thing would mean more of another—infinitely. For instance, if rain is good, we should want infinity rain. If donuts are good, we should eat infinity donuts. Such reasoning is clearly flawed.

In reality, having more (or less) of something is not always better (or worse); rather, which way you should go depends on where you already are.1 We want more rain during a drought, but not during a flood. The first donut is delicious; the twelfth might cause problems. This is called nonlinear reasoning—a critical tool for any individual hoping to be less wrong.

Understanding the dynamics of nonlinear effects at scale can help us build more successful businesses, programs, and policies—and avoid catastrophe.

The nonlinearities of scale

Infinitely linear (“constant”) returns to scale are rare. More often, growth involves increasing returns or diminishing returns—or likely both, given time.

In business, key examples of increasing returns include economies of scale, whereby the average cost of all units declines as production volume increases (e.g., Costco, Tesla), as well as network economies, where each additional user makes the whole network more valuable to all other users (e.g., TikTok, Gmail).

Critically, most real-world results—from donuts, to rainfall, to business—eventually suffer from diminishing returns to scale. With economies of scale, for instance, a growing company cannot decrease its unit costs forever. Beyond a certain size, diminishing returns will tip into “diseconomies of scale,” when a firm’s unit costs increase. Diseconomies might emerge due to managerial limitations, swelling bureaucracy, or increasing scarcity (and thus higher costs) of key inputs, such as raw materials or specialized talent.
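
A toy cost model, with every number invented, shows how the same growth can cut unit costs at first and raise them later:

```python
# Unit economics at different scales: fixed costs get spread over more units,
# but a congestion/bureaucracy term eventually grows faster.
FIXED_COST = 1_000_000      # plant, tooling, software: paid regardless of volume
VARIABLE_COST = 40.0        # direct cost to make each unit
CONGESTION = 0.00002        # per-unit drag that worsens as the organization grows

def average_cost(units):
    return FIXED_COST / units + VARIABLE_COST + CONGESTION * units

for units in (10_000, 50_000, 200_000, 1_000_000, 5_000_000):
    print(f"{units:>9,} units -> ${average_cost(units):6.2f} per unit")
# Average cost falls at first (economies of scale), bottoms out, then creeps
# back up (diseconomies of scale) as the congestion term dominates.
```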

So, we’ve established that things change at scale. Let’s explore how to use these principles to build our own ventures.

My idea is scalable… right?

If we care at all about the size of our impact, we need to be able to assess the potential scalability of an idea, policy, or program.

“Put simply: you can only change the world at scale.”

John List, The Voltage Effect (2022, pg. 9)

Successfully scaling requires taking an idea from a small group (of customers, units, employees, etc.) to a much larger group, in a healthy and sustainable way.

Fortunately, in his fantastic book on scaling, The Voltage Effect, economist John List proposes five scalability “vital signs” to analyze:

Vital sign #1: False positives

Early evidence may convince us that something is true, when in fact it isn’t—whether due to statistical errors or to human bias.

For example, based on some tentatively encouraging early research, the “D.A.R.E.” program, a zero-tolerance anti-drug campaign championed by the Reagan administration in the 1980s, was scaled to 43 million children over 24 years. Eager politicians jumped aboard to proclaim themselves as pro-kids and pro-cops. The problem: every major study on D.A.R.E. found that the program didn’t actually work, and in some cases actually increased drug use. Unsurprisingly, the program lost federal funding in 1998.2 The false positive signal that convinced leaders to prematurely scale D.A.R.E. nationwide wasted billions of dollars and decades of time and effort by students, administrators, scientists, and legislators.

Because we humans, regardless of intelligence level, often fail to adequately critique our own ideas, we are prone to confirmation bias. This helps explain the occurrence of false positives, as we simply avoid information that challenges our preexisting or preferred beliefs.3 We cannot afford to make this error at scale. We must bring a critical perspective to our own ideas and a healthy skepticism towards whether positive early results are likely to be replicated at scale.

Vital sign #2: Knowing your audience

There is always a risk that the individuals who participate in a pilot study or survey will behave in ways particular to that group. Inevitably, when we scale something to new groups, different people will behave differently.4

If a social media platform such as Twitter, for example, launches an experimental new feature to a small subset (say, 1%) of users to assess the feature’s potential, it needs to be keenly aware of the potential for selection effects—the risk that the sampled users are not representative of the full user population (i.e., that the sample is not truly randomized). For instance, users of the new feature might disproportionately be die-hard Twitter users, or they might be younger and more tech-savvy than the broader user base. Such biases could skew the pilot results to be overly optimistic, giving an illusion of scalability.

We must understand exactly who our idea is for in order to assess its potential for scalability. The characteristics of our early adopters might be very different from our target market at scale.

Vital sign #3: The chef or the ingredients

For an idea to scale successfully, we need to identify our true performance drivers and do everything we can to cultivate and protect them. In particular, is it the “chefs” (indispensable individual humans) or the “ingredients” (the product or idea itself)?

  1. Chefs — Simply put, humans with unique skills don’t scale. Individual-reliant ideas have a ceiling to their scalability. If key people are overloaded, an organization can atrophy. This is why elite chefs focus more on quality and reputation for one restaurant (or a few), versus on expanding to many locations—where their unique talents would be hard to replicate.
  2. Ingredients — We must determine what elements we cannot live without—and whether they are scalable. For instance, if quality is a critical ingredient, quality standards cannot decline as we grow. Or if faithfulness to a particular mission is a must-have, we cannot allow drift from the original intent at scale. This “program drift” plagues governmental and philanthropic programs in particular, often due to multiple funding sources each pushing their own agendas.5

Vital sign #4: Externalities (“spillovers”)

The larger the scale, the greater the potential for unintended consequences, or “externalities.” When many individual decisions accumulate and interact, equilibrium may be disrupted and spillovers are inevitable—potentially working for or against our intended outcome.

As a negative example of spillovers, the flagship 1968 federal law requiring seatbelts to be installed in cars actually caused drivers, feeling safer, to take more risks while driving—wiping out the safety gains from wearing seatbelts.

In contrast, a positive example is achieving “herd immunity” from a disease through mass vaccinations. At scale, when a critical portion of the population receives a vaccine, positive spillovers emerge as unvaccinated people still benefit indirectly since a substantial share of the people around them are now immune.6

Positive spillovers scale incredibly well. As with herd immunity, achieving spillovers may even become the objective. Equally powerful, negative side effects can be worse than the original symptoms. As best we can, we should think about what second- and third-order consequences might emerge if we scale our idea, and be prepared to deal with unintended consequences.

Vital sign #5: The cost trap

Regardless of how good an idea is, if the returns on your product don’t exceed the cost, or if the benefits delivered by your program don’t support the expenses, the idea is not scalable. As we learned above with economies and diseconomies of scale, our cost profile can change dramatically at scale—for better or for worse.7

***

The overarching lesson is that when we are dealing with complex systems, we should always establish the scale at which we are analyzing the system—in “orders of magnitude,” at least (hundreds, thousands, millions, etc.). Any insight that applies at one scale may be different or even opposite at another scale. And before we invest substantial time and resources to scaling our ideas, we should critically evaluate whether the idea has the vital signs of success.

Regression to the Mean: Heard of it? Well, you probably have it slightly wrong

Regression to the mean is the statistical rule that in any complex process that involves some amount of randomness, extreme observations will tend to be followed by more “mediocre” observations.

Although regression to the mean is not a natural law but a statistical tendency, it is an extremely useful mental model, because we have a problematic tendency to get regression wrong. For one, we fail to appreciate its power to explain many apparent phenomena that are really just mirages of randomness. We also often foolishly “predict” regression when what we’ve been observing recently seems extreme.

Innate or random?

It is not some mediocrity-loving law that causes regression to the mean; rather, regression is the natural tendency when inherent characteristics are intermingled with chance. While we should expect inherent traits to show up repeatedly, chance is fleeting.

Consider a clinical trial where we use random sampling to test a new dieting method on overweight folks. Because our body weight fluctuates daily, there is some randomness involved. At initial weigh-ins, the individuals in the heaviest segment are certainly more likely to have a consistent weight problem (an inherent characteristic), but they are also more likely to have been at the top of their weight range on the day you happen to weigh them (a random fluctuation). Therefore, we should expect our heaviest participants to lose some weight on average during the study, regardless of the effectiveness of the diet!1
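
A quick simulation, with invented weights and noise levels, shows the effect with no diet at all:

```python
# Each person has a stable "true" weight plus day-to-day fluctuation,
# and nobody receives any treatment between the two weigh-ins.
import numpy as np

rng = np.random.default_rng(3)
n = 10_000
true_weight = rng.normal(200, 25, n)              # stable individual trait (lbs)
weigh_in_1  = true_weight + rng.normal(0, 6, n)   # first measurement, with daily noise
weigh_in_2  = true_weight + rng.normal(0, 6, n)   # second measurement, fresh noise

heaviest = np.argsort(weigh_in_1)[-1_000:]        # the top 10% at the first weigh-in
print(f"heaviest group, weigh-in 1: {weigh_in_1[heaviest].mean():.1f} lbs")
print(f"heaviest group, weigh-in 2: {weigh_in_2[heaviest].mean():.1f} lbs")
# The second average comes out a few pounds lower with no intervention at all:
# the group was selected partly for being on the high side of its own noise.
```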

Causal mirages

The same logic can be applied to outperforming businesses, artistic success, or sports achievement: all of these success cases are more likely to possess superior talent, but also to have had some luck—and luck, by definition, tends to be transitory.

In assessing cause and effect, we commonly attribute causality to a particular policy or treatment when the change in the extreme groups would have occurred even without it. Regression does not have a causal explanation. It inevitably occurs whenever the correlation between two measurements (such as a participant’s weight at the first and second weigh-ins) is less than perfect; in other words, whenever some amount of randomness is involved.2
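
In symbols (the standard result for a linear relationship, not anything specific to dieting): if $z_1$ and $z_2$ are the standardized first and second measurements and $\rho$ is the correlation between them, the best prediction of the second given the first is

$$\mathbb{E}[z_2 \mid z_1] = \rho \, z_1$$

Whenever randomness is involved, $|\rho| < 1$, so the expected second measurement sits closer to the mean than the first one did; only a perfect correlation ($\rho = 1$) produces no regression at all.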

In statistical science, the prescription for this causality error lies in introducing a “control group,” which should experience regression effects regardless of treatment. In our dieting study, we would need to compare the results of the dieting group with those of a group who knows nothing of the diet. We then assess whether the outcomes between the control and treatment groups are more different than regression alone can explain.

In everyday life, we must be prudent before assigning causality to some factor when we observe more moderate outcomes following an extreme one. It’s far more tempting to come up with a coherent narrative about what caused a change than to say, “It’s just statistics.” If we believe strings of good or bad results represent a persistent state of affairs, then we will incorrectly label the reversion to normal as the consequence of some other change we made or observed.3

For example, we could come up with stories such as:

  • The saleswoman who generated record sales last year but did worse this year must have become less motivated after she got a big bonus.
  • The stock market rebound after last year’s recession means the President’s economic policies must be working.
  • When I gave my daughter ice cream after she earned an “A+” on a test, she did worse the next time. But when I sternly criticized her after she got a “C,” she did better the next time. Therefore, I should be more forceful.

In all of these examples, it’s possible that the moderation in behavior we observed could be entirely explained by the basic statistical workings of regression to the mean, regardless of the “causal” story we came up with.

A particularly absurd example is the purported “discovery,” published in the British Medical Journal in 1976, that bran had an extraordinary balancing effect on digestion. Subjects with speedy digestion tended to slow down, those with typical digestion speed were unchanged, and those with slow digestion tended to accelerate. The crazy thing is: due to regression to the mean, these are exactly the results we should expect to see if the bran had no effect whatsoever!4

***

People tend to prophesy “regression!” after anything extreme happens, without properly understanding why and how it works. Nothing is ever “due” for regression (not the stock market, not your football team, etc.). Extreme behavior simply tends not to persist over large samples. Once we understand that the tendency towards mediocrity is inevitable whenever randomness is involved, we can avoid the delusions of causality that plague so many others, whether in business, sports, the stock market, or our weight-loss regimen.

Local vs. Global Peaks: Balancing exploration and exploitation to reach our pinnacle

A local optimum is a solution that is optimal within a neighboring set of candidate solutions—a point from which no small change can generate improvement. However, this local peak may still be far from the global optimum—the optimal solution among all possible solutions, not just among nearby alternatives.

This valuable model can teach us about the inherent tradeoff between capitalizing on our current opportunities and pursuing new ones—whether in biological ecosystems, businesses, or machine learning. We can use it to better understand the complex environments we operate in, and to design more effective strategies to achieve our goals.

Getting stuck

Picture a rugged landscape composed of many peaks and valleys of various elevations, with numerous individuals or groups competing to reach the highest peaks. Nearby points tend to have similar levels of “fitness.” The landscape itself may shift dynamically, altering the peaks and valleys and transforming the available paths to reach them. This model is known as a “fitness landscape,” an extremely useful metaphor for thinking about optimization amidst local and global peaks in a variety of applications, including biology, computer science, and business.1

In complex systems (such as an industry or an ecosystem), it is easy to get stuck on local peaks as the ground shifts beneath our feet (undermining our position), especially if we fail to survey new territory. We won’t know precisely how the landscape will shift, so the only way to sustain progress over the long term is, simply, to explore.

Sometimes, we may even have to go down (temporarily worsen our situation) in order to ascend a higher peak, and that takes real courage. For example, Netflix’s stock fell by almost 80% from its peak after CEO Reed Hastings announced in 2011 that the company would split off its DVD-by-mail business and bet its future on streaming. Ten years later, Netflix had pioneered the video streaming industry, and its stock price had grown by nearly 1,300%!

Evolutionary searches can never relax. We must constantly experiment with new ideas and strategies to find better solutions and adapt as the landscape shifts.

Faster than the speed of evolution

In biological evolution, we can visualize the competition for genetic dominance as a rugged fitness landscape in which the peaks and valleys represent the highs and lows of evolutionary fitness across an ecosystem. Higher peaks represent species or organisms that are better adapted to their environment—that is, ones that are more successful than their nearby competitors at causing their own replication.

Evolution is capable of creating remarkably complex and useful features, such as the human body’s ability to repair itself or the peacock’s brilliant tail. However, because it optimizes only for the ability of genes to spread through the population, evolution will inevitably reach only local peaks of fitness within a given environment.2 It can favor genes that are useless (the human appendix), suboptimal (women’s narrow birth canals), or even destructive to the species. For instance, the peacock’s large, colorful tail that helps it find mates also makes it more vulnerable to predators.3

When the landscape shifts, even a highly adapted species cannot pass through a temporarily worse (less fit) state in order to begin ascending a new, higher evolutionary peak. If the environment shifts faster than the species can adapt to it, mass extinctions can occur.4

Fortunately, we humans don’t need to be bound by evolutionary timescales. Often, we can find better hills to climb.

Let’s look to computer science and business to see why.

Getting un-stuck

Algorithms provide useful insights into optimization and into overcoming local peaks.

The simplest optimization algorithm is known as “gradient ascent,” in which the program just keeps going “up.” For instance, a video site such as YouTube might be programmed to continue recommending videos that resemble your past content consumption. But “dumb” algorithms like this one maximize only short-term advantage, leading us to local peaks but not to global ones. What if the user’s content preferences change? What if the viewer gets bored by stale recommendations? What if repetitive videos trap the user in a filter bubble?

Randomness and experimentation can help us “pogo-jump” to higher peaks that simple gradient ascent would never reach. For example, a “jitter” involves making a few small random changes (even if they seem counterproductive) when it looks like we are stuck on a local peak, then resuming hill-climbing. A “random-restart” involves completely scrambling our solution whenever we reach a local peak, which is particularly useful when there are lots of local peaks.5
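
To make the contrast concrete, here is a minimal Python sketch on a toy one-dimensional landscape (a made-up function of my own choosing, not a real recommendation system): plain hill-climbing stalls on the nearest local peak, while random restarts usually find a higher one.

```python
import math
import random

random.seed(42)

# A toy "fitness landscape": many local peaks, with higher peaks further to the right.
def fitness(x: float) -> float:
    return math.sin(x) + 0.1 * x

LOW, HIGH = 0.0, 30.0

def hill_climb(start: float, step: float = 0.01) -> float:
    """Plain 'gradient ascent': keep taking whichever small step improves fitness.
    Stops at the nearest local peak, because no small change helps from there."""
    x = start
    while True:
        neighbors = [max(LOW, x - step), min(HIGH, x + step)]
        best = max(neighbors, key=fitness)
        if fitness(best) <= fitness(x):
            return x            # stuck on a local peak
        x = best

def random_restart(n_restarts: int = 20) -> float:
    """Random restarts: scramble the starting point, climb, keep the best peak found."""
    peaks = [hill_climb(random.uniform(LOW, HIGH)) for _ in range(n_restarts)]
    return max(peaks, key=fitness)

stuck = hill_climb(start=1.0)
best = random_restart()
print(f"Plain hill climbing from x=1.0 stalls at x={stuck:.1f} (fitness {fitness(stuck):.2f})")
print(f"With random restarts we reach   x={best:.1f} (fitness {fitness(best):.2f})")
```

The same logic carries over to far more complicated landscapes; the only thing that changes is how expensive each step, and each restart, becomes.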

Perhaps our video site should recommend random pieces of viral content even if the viewer hasn’t watched similar clips previously. Or show clips that contrast sharply with past viewing habits (for nuance or contrarian content). Only experimentation can reveal whether we are climbing the best hill.

The explore/exploit tradeoff

In business, it is useful to picture the strategic environment as a rugged landscape, with each “local peak” representing a coherent bundle of mutually reinforcing choices.

Every organization needs to balance exploiting its current businesses with exploring future innovations. In the short term, simple “gradient ascent” strategies (keep going up) help ensure the company is exploiting its current strengths and opportunities. Over the long term, however, companies must make occasional medium- or long-distance “pogo jumps” to prevent getting stuck on local peaks and, sometimes, to make drastic improvements. The key problem with many organizations is that when the environment seems stable, they stop experimenting, because it seems costly and inefficient, and because it sometimes creates internal competition.6
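
This tradeoff has a classic formalization in computer science known as the multi-armed bandit problem. Here is a minimal epsilon-greedy sketch (the payoff numbers are invented): most effort goes to the best-known option, while a small, fixed fraction is always reserved for exploring the alternatives.

```python
import random

random.seed(7)

# Toy multi-armed bandit: three "business lines" with unknown average payoffs.
true_payoffs = [1.0, 1.5, 3.0]      # hidden from the decision maker
estimates = [0.0, 0.0, 0.0]         # our running estimates of each line's payoff
pulls = [0, 0, 0]                   # how much effort we've allocated to each
EPSILON = 0.1                       # fraction of effort permanently reserved for exploration

for _ in range(10_000):
    if random.random() < EPSILON:
        arm = random.randrange(3)                          # explore: try something at random
    else:
        arm = max(range(3), key=lambda i: estimates[i])    # exploit: back the current leader
    reward = random.gauss(true_payoffs[arm], 1.0)          # noisy payoff from that choice
    pulls[arm] += 1
    estimates[arm] += (reward - estimates[arm]) / pulls[arm]   # update the running average

print("Estimated payoffs:", [round(e, 2) for e in estimates])
print("Effort allocated: ", pulls)
```

With these toy numbers, even a modest, permanent exploration budget is enough to discover that the third option pays off best and to shift most of the effort there.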

This was Reed Hastings’s revelation about Netflix in 2011: its wildly successful DVD-by-mail business was merely a local peak. The landscape had shifted. The new global peak, he believed (correctly), was streaming.

***

The overall lesson is that because the environment is uncertain and always changing, good strategy requires individuals and organizations to carefully cultivate and protect a portfolio of strategic experiments, creating valuable options for the future.

Even when it seems we are at a “peak,” there may be even higher peaks that we cannot yet see, and the peaks themselves are constantly shifting! In such an environment, complacency is a death sentence.

Leverage Points: The secrets to systemic change

Leverage points are places within a complex system (such as a corporation, an economy, a city, or an ecosystem) where a small shift in input force can produce amplified changes in output force.1 When we concentrate our efforts on the most critical points, even small shifts can cause large, durable changes.

If we care about maximizing our impact in our companies, communities, or personal lives, leverage is one of the most valuable models to understand.

Taking the systems perspective

Unfortunately, we live in an event-oriented society, in which we focus mainly on our day-to-day experiences. Things happen, we react, then we repeat tomorrow. Reacting to events is the lowest-leverage way to instigate change. It oversimplifies the world into a series of linear, cause-effect events, while ignoring the deeper causes underlying our experiences. In reality, our world is a web of interwoven relationships and feedback loops.

Consider complex societal problems such as homelessness or income inequality. Seemingly plausible solutions to these issues can produce highly unpredictable results and unwanted side effects.

Adopting a systems perspective, on the other hand, allows us to unearth the highest-leverage places for intervention. By looking beyond individual events and studying the higher-level patterns of behavior of the system, we can identify and act on the system structures creating those patterns.2

Adapted from Introduction to Systems Thinking (Kim, 1999, pg. 4)

Tackling homelessness

Consider the problem of homelessness. In order for individuals to qualify for permanent housing support, traditional policies have required homeless people to address certain issues that may have led to the episode of homelessness—for example, by requiring negative drug tests or mental health treatments.

These contingencies often lead to affected individuals getting entangled in bureaucracy and trapped in a vicious cycle where becoming homeless actually exacerbates the issues that caused the episode in the first place, such as unemployment, substance abuse, or medical issues—making it harder and harder to overcome. Simply put, these are low-leverage solutions.

However, modern policies have begun to adopt so-called “Housing First” principles, which enact a simple rule: offer basic permanent housing as quickly as possible to homeless people, followed by additional supportive services. Early research suggests that Housing First policies have helped improve housing stability, reduce criminal re-convictions, and decrease ER visits—all while saving money.3 That is leverage.

The leverage points checklist

Leverage points are not always intuitive, and are ripe for misuse. The checklist below provides a useful aid to identifying them (ordered from least to most impactful):4,5

9. 🔢 Number & Parameters

Sadly, the majority of our attention goes to numbers, such as tax rates, revenue goals, subsidies, budget deficits, or the minimum wage. Typically, they don’t provide much leverage on their own, because they rarely change behavior. Truly high-leverage parameters, such as interest rates, are much rarer than they seem.

8. 🏗 Physical Interventions

Physical structures that are poorly designed or outdated can be potential leverage points, but changing them tends to be difficult and slow relative to other, “intangible” interventions.

7. 🔁 Balancing (Negative) Feedback Loops

Negative feedback mechanisms can help systems adjust towards equilibrium, such as how the Federal Reserve adjusts interest rates to promote economic stability, or how companies use employee performance reviews to highlight growth areas and take corrective action if needed. Affecting the strength of these balancing forces can create leverage by reining in other forces.

6. 📈 Reinforcing (Positive) Feedback Loops

Positive feedback mechanisms can fuel virtuous growth cycles (as with compound interest on an investment portfolio), but left unchecked they will create problems (such as exploding inequality).

Weakening reinforcing loops usually provides more leverage than strengthening balancing loops. For example, the “Housing First” policies discussed above aim to break the self-perpetuating cycle of homelessness before the loop can take hold, by simply giving people housing immediately.
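
To see how differently the two kinds of loops behave, here is a minimal sketch with made-up numbers: the reinforcing loop compounds on itself, while the balancing loop keeps closing part of the gap to a target.

```python
def reinforcing_loop(balance: float = 1_000.0, rate: float = 0.07, years: int = 30) -> float:
    """Reinforcing loop: compound interest, where growth feeds on itself."""
    for _ in range(years):
        balance += balance * rate
    return balance

def balancing_loop(temperature: float = 30.0, target: float = 21.0,
                   gain: float = 0.3, steps: int = 20) -> float:
    """Balancing loop: a thermostat that keeps closing part of the gap to its target."""
    for _ in range(steps):
        temperature += gain * (target - temperature)   # correction shrinks as the gap shrinks
    return temperature

print(f"Reinforcing loop: $1,000 at 7% grows to ${reinforcing_loop():,.0f} after 30 years")
print(f"Balancing loop: the room settles at {balancing_loop():.1f}°C, near the 21°C target")
```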

5. 📲 Information Flows

Simply by improving the efficiency with which useful information can feed back to decision makers, we can often create tons of leverage by enabling better and more timely decisions. Poor quality or slow information can undermine individuals and organizations of all kinds.

4. 📜 Rules

Whoever controls the “carrots and sticks” in a system can exert massive influence. A country’s constitution, a legal system, and a company’s employee compensation plan are all potential high-leverage rules.

For example, salespeople traditionally receive a commission for every deal they close. These schemes provide strong incentives to close as many deals as possible, and to make each one as big as possible. Recognizing that these incentives can fail to promote good post-sale customer support and can breed unhealthy internal competition, some executives are experimenting with “vested commissions,” which are paid out over time to sales and customer success employees alike. Such rules can help foster a culture based on relationships and long-term customer success, rather than on transactions.6

3. 👩🏽‍🔬 Experimentation

Evolutionary forces are remarkably powerful. In nature, genetic variation and selection enable species to develop incredible adaptations; similarly, companies or countries that are able to innovate and evolve will become more adaptable and resilient.

A culture of experimentation helps explain the success of Google, Amazon, and many others. Many firms fail to innovate simply due to an intolerance for failure. The vast majority of experiments will flop; the challenge is in viewing those failures as opportunities for improvement and learning instead of as wasteful side projects.7

2. 🎯 Goals

Every system has a goal, though not always an obvious one. All companies, viruses, and populations share the goal to grow—a goal which becomes dangerous if left unchecked. Change the goal, and you change potentially everything about the system.

1. 💡 Ideas

Our shared ideas (or “mental models”) provide the foundation upon which we interpret all our experiences. These ideas may be deeply embedded, but there is nothing inherently expensive or slow about changing them. For high-leverage impact, we must be willing to challenge existing ideas. Step back, take an outsider perspective, expose the hidden assumptions in our ideas, and reveal their weaknesses or contradictions.

Remember: all knowledge is tentative and conjectural. As with Einstein supplanting Newton, our best theories can be toppled at any time by newer, better ones. Since the world is always capable of surprising us, the best approach is to keep our minds open!

Private equity, lords of leverage

The business world offers great real-world applications of leverage, notably in the private equity (“PE”) industry, whose whole operating model revolves around applying principles of leverage.

PE firms seek to buy out underperforming companies, turn them around, and sell them for a large profit. They immediately identify and act on key system intervention points. For instance, they promptly change the goal of the companies they acquire: generate more cash flow. By stressing 2-3 success levers and concentrating efforts towards them, they instigate reinforcing feedback loops. They use debt, which not only amplifies financial returns but also acts as a balancing loop by imposing cash-flow discipline on management. Finally, they often change shared ideas by replacing prominent executives.8
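
As a toy illustration of why debt is such a powerful amplifier (the figures below are hypothetical, and interest, fees, and taxes are ignored):

```python
# Toy buyout arithmetic: a 30% gain on the business becomes a 100% gain on the equity.
purchase_price = 100.0        # buy the company for $100M...
debt, equity = 70.0, 30.0     # ...funding 70% of the price with debt
sale_price = 130.0            # later sell the business for $130M

equity_proceeds = sale_price - debt                            # repay the debt, keep the rest
asset_return = (sale_price - purchase_price) / purchase_price  # 30% without leverage
equity_return = (equity_proceeds - equity) / equity            # 100% with leverage
print(f"Unlevered return on the asset: {asset_return:.0%}")
print(f"Levered return on equity:      {equity_return:.0%}")
```

The debt holders are owed a fixed amount, so the equity keeps all the upside; the same arithmetic works in reverse, which is exactly why debt also enforces discipline.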

The success of this systematic approach is well-established. A 2020 study found that US buyout funds have outperformed the public stock market in nearly every year since 1991.9

***

While PE is just one example, the overarching lesson across economic, social, and political systems is that we can drastically increase our impact if we pause, adopt a systems-level view, and assess what points of intervention will most effectively push us towards our goals.