MIT Technology Review

The inevitable weakness of metrics

There are plenty of useful things a metric can reveal. There are even more it can obscure or corrupt. It took me well over a decade of tracking my own life in ever greater detail to fully appreciate this duality, which probably reveals something about both me and the nature of measurement.

Quantifying our lives is easier than it's ever been. But a philosopher of games warns that external metrics and data can never capture what's truly important.

Like a lot of people bitten by the self-quantifying bug, I started gathering personal data to pursue a nebulous collection of goals and desires. As a sedentary technology journalist, I wanted to feel better physically and emotionally, to get outside more, and, where possible, to bring order to some of the messiness and uncertainty of my daily existence. These all seemed to be things that could be improved with the cool clarity of numbers.

Self-quantifiers often get stereotyped as obsessive self-optimizers (and many of them are), but my reasons for producing and collecting personal data were less about life-maxxing and more about life meaning, at least at first. As most people who know me will attest, I do not have now, nor have I ever possessed, a "productivity mindset." I'm also not all that interested in life hacks, shortcuts, or new ways to compare myself with other people. Instead, what I wanted out of metrics (what I hoped I could divine from a never-ending stream of numbers about my health, work, and social life) was something more elusive: self-knowledge. This was my first mistake.

The quantified self

The idea that the more we know, the better is so profoundly embedded in our culture that it feels weird to even point it out. Since at least as far back as the Enlightenment, the primary way we've all agreed to go about knowing more has been through measurement and quantification. After all, more knowledge (more data) leads to better decisions, which leads to happier, more fulfilled people. Or so we're told, and with increasing frequency in the era of AI.

When two Wired magazine editors, Gary Wolf and Kevin Kelly, coined the term "quantified self" in 2007 and helped launch the movement we are all now helplessly a part of, they were essentially selling this very idea. "Unless something can be measured, it cannot be improved," wrote Kelly in an early blog post, doing his best impression of Lord Kelvin. "So we are on a quest to collect as many personal tools that will assist us in quantifiable measurement of ourselves."

Almost 20 years later, that quest is easier than ever thanks to a flood of devices, apps, and websites all designed to help us build our self-knowledge through numbers.

My first tool was a small, plastic clip-on Fitbit I started using in 2011. It did one thing: count the number of steps I took in a day. As a lifelong video game player, I was already well acquainted with the motivational power of simple scoring systems, and I hoped my new gadget would offer the gentle numerical nudge I thought I needed to step away from my Twitter feed and, if not touch grass, at least walk next to some. Walking also seemed to be one of the few times I had what could charitably be called intelligent ideas, which seemed like another promising by-product of doing more of it.

Alas, that was short-lived. I can't tell you precisely when "getting out into nature more" or "thinking smarter thoughts" stopped mattering to me as goals, but I suspect it took no more than a few weeks. What I can say with certainty is that my initial goal of 6,000 daily steps quickly turned into 10,000, which then jumped to 15,000 and eventually settled at 20,000 for years.

Stories about becoming a "steps guy" are clichéd at this point, and they've earned that status for a reason. It didn't take long for me to trade in pedometers for heart-rate monitors (I also started running), smartwatches, sleep-tracking rings, and an embarrassing number of macronutrient-tabulating apps.

Outside the health and fitness realm, my early career as a journalist also happened to coincide with the rise of social media and web analytics tools like Chartbeat, which promised to further quantify difficult-to-measure aspects of my life, like "job success" and "impact," by tracking things like page views, followers, retweets, likes, and all sorts of other attentional metrics that now carry great weight.

The trap of measurement

Ultimately, during the 10-plus years I diligently tracked my heart rate, steps, active calories, sleep, story engagement time, stress levels, and other metrics, I gained virtually nothing in terms of greater self-knowledge. (I suppose I did learn that I liked to make numbers go up and down, but who doesn't?) The swirl of data that followed me everywhere did not lend additional meaning or insight to the way I relate to myself, my work, or the important people in my life. In fact, the more I used numerical proxies, the worse I felt about pretty much everything.

What I did learn were two important lessons about what happens when you try to quantify the minutiae of your life:

First and foremost, whatever the amount of data you're currently collecting about yourself, it will never feel sufficient. There's always a new metric around the corner, a better way for a tracker to remix its readings and more accurately measure what's "important": heart rate variability, daily stress, exercise "readiness," cardiovascular or "fitness" ages. Measurement begets more measurement. You can count on it.

The second lesson was less obvious but no less significant. The more personal or nuanced your goals are when you set off on your self-quantifying journey, the more likely it is you will ultimately replace them with some simplified metric or ranking. Want to become a better journalist? Why not use page views and leaderboards as a proxy for success? Enjoy cooking and want to improve? Foodie metrics dictate that more complicated recipes with longer ingredient lists are the answer.

Even when we know that the value of good journalism isn't reflected in how many people read a given story or that the joys of cooking are as much about improvisation and experimentation as about successfully following some complex recipe, it's hard to resist the allure of a simple score or stat. Metrics inevitably redefine your core sense of what's important, whether you're aware of the trap or not.

Value capture

Over the years, people have invented various terms to describe this phenomenon. In his recent book The Score: How to Stop Playing Somebody Else's Game, the philosopher C. Thi Nguyen calls it "value capture." Value capture happens, he says, when you adopt external sources of measurement and then let them rule you without adapting them to suit your life.

"In value capture, you're essentially outsourcing your values," Nguyen writes. "You're letting an external metric or ranking set what's important for you."

Crucially, you're also outsourcing the process of figuring out your own sense of meaning. It's why my walks quickly shifted from feeling meditative to prioritizing miles.

Individuals, institutions, and indeed entire societies can fall prey to value capture. In fact, once you start noticing it, you start seeing it everywhere: in journalism, education, and business, but also in our food, our hobbies, and, yes, the way we measure our health and happiness.

Here's how Nguyen puts it:

Value capture happens when a restaurant stops caring about making good food and starts caring about maximizing its Yelp ratings. It happens when students stop caring about education and start caring about their GPA. It happens when scientists stop caring about finding truth and start caring about getting the biggest grants. It even happens in religion. A pastor recently told me that his church had become completely obsessed with baptism rates. The higher-ups had established an internal leaderboard in which the pastors competed on monthly baptism rates, and it was starting to dominate everybody's attention. He'd found himself caring less about the long-term spiritual development of his flock and focusing more on trying to deliver popular sermons that would up his baptism rates and move him up that leaderboard.

Games versus real-world metrics

At its core, The Score is trying to untangle a mystery that Nguyen, a specialist in the philosophy of games at the University of Utah, has been thinking about for a long time: Why is it that numbers and scoring systems in games can be the source of so much joy and fluidity and play, but public measures and institutional metrics (i.e., scores that apply to the real world) seem to drain the life out of everything and thrust us all into a bleak mindset of grinding optimization?

To begin to answer this question, he turns to one of the foundational inquiries into the limits of data and quantification, Theodore M. Porter's 1995 book Trust in Numbers: The Pursuit of Objectivity in Science and Public Life. Porter, a historian of science who specializes in the social power of numbers, has spent his career looking at why quantification has become so dominant, not just in political and bureaucratic life but everywhere.

One of his key insights about the inherent attractiveness of quantification, which he calls "a technology of distance," is that it "minimizes the need for intimate knowledge and personal trust." Put another way, metrics travel extremely well between different contexts and are easy to grasp and aggregate. Whether it's a student's GPA or a country's GDP, these measures are understood by pretty much everyone.

But that understanding comes at a price, Porter reminds us: To arrive at a clear metric, you inevitably need to simplify what you're attempting to measure, often jettisoning heaps of nuanced, qualitative, or open-ended information so that others can find the resulting number legible. No one (hopefully) believes that a GPA captures in any meaningful way a student's entire educational experience or aptitude for learning, but we've agreed to use it because more qualitative assessments are onerous to wade through and require expertise to decipher and compare. Ditto for the economic metric of GDP, which politicians and societies are now compelled to drive higher and higher because a group of economists once concluded that this figure correlates with general economic well-being.

This is the essential tension at the heart of all data, argues Nguyen. Any institutional quantification, he says, requires that the evaluation procedure and its product be comprehensible across contexts. That profoundly limits what the metric can actually measure.

"In value capture, you're ultimately taking that decontextualized nugget and internalizing it," he writes. "You're guiding your life using an evaluative technology that has been engineered to travel between contexts, by stripping it of nuance."

Goodhart's Law and its limits

Every so often I'll find myself in friendly debate with a "numbers person": a statistician, an economist, or a friend who's still a committed self-quantifier. After patiently listening to my measurement-gone-awry examples (the disastrous attempt to quantify pain as "the fifth vital sign" in the mid-1990s, which exacerbated the opioid epidemic, or any of the countless examples of the McNamara fallacy, where decisions in academia, medicine, and politics are based solely on what's easily measured), many will insist that I'm misunderstanding or misinterpreting the whole point of measuring.

Metrics, they'll say, are simply a means, and the important questions concern the ends for which they are used. In other words, these unfortunate outcomes amount to user error, not something inherently dangerous or misleading about the nature of measurement.

At some point during these conversations, Goodhart's Law will invariably come up, usually as an explanation the metrics-minded deploy for why the ends get all mucked up. The principle, which is attributed to the British economist Charles Goodhart, is often expressed as the following: "When a measure becomes a target, it ceases to be a good measure."

I have a profound dislike for Goodhart's Law, not because I think it's untrue, but rather for the way it gets interpreted. As Nguyen notes, Goodhart's Law says very little about why metrics fail to capture what's important, or what to do about it. Find better measures, some will conclude. Don't let metrics become targets, others will insist. These are not helpful takeaways.

All measurements, I would argue, are in fact targets, whether you intend them to be or not. Metrics inevitably present one direction or option as better, Nguyen writes in The Score: "longer lifespans, faster student graduation rates, more page views, higher customer satisfaction scores." What people are talking about when they bring up Goodhart's Law isn't human error; it's actually a fundamental problem with measurement itself.

The value of measurement

I want to be clear here: Measurement can and does serve a number of vital functions. It has in a very literal sense made the modern world possible, with all its life-saving, suffering-reducing, and awe-inspiring scientific breakthroughs. When used with care and diligence, metrics can make our progress (or lack of it) clearer and more transparent. Are we decreasing carbon dioxide emissions or not? They can also introduce accountability into formerly opaque systems, such as by measuring whether a company is complying with state and federal regulations. They can even make us more objective, reduce biases, and galvanize us to act.

But as Nguyen points out throughout The Score, the fundamental weakness of metrics comes when we use them to pursue subtler, more p
