Big Trouble in Little Data

10 Nov

(Nicu Buculei / Flickr)
Last week, The New York Post reported a 250 percent increase in New York City’s murder rate: the city saw seven murders last week, compared to two the same week a year earlier. Does that mean the city’s in the grip of a crime wave?

The trouble with the Post analysis is that one week of data tells us very little about long-term trends. The city’s murder count varies far too much from week to week.

As Gawker was quick to point out, if we look at the data from two weeks ago instead, it seems like the murder rate has dropped 50 percent since last year. So which is it: is the murder rate skyrocketing or is it plummeting? Without a lot more data, we should be careful about claiming either.
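To make the arithmetic concrete, here is a rough sketch of the Post’s calculation, plus a look at how much weekly counts could swing even if nothing changed. The weekly average of 5 and the Poisson model are assumptions made purely for illustration; they are not figures from the Post, Gawker, or the city.

```python
import math

def percent_change(old, new):
    """Percent change going from old to new."""
    return (new - old) / old * 100.0

def poisson_pmf(k, mean):
    """Probability of exactly k events when the long-run average is `mean`."""
    return math.exp(-mean) * mean ** k / math.factorial(k)

def prob_at_most(k, mean):
    """Probability of k or fewer events in one week."""
    return sum(poisson_pmf(i, mean) for i in range(k + 1))

def prob_at_least(k, mean):
    """Probability of k or more events in one week."""
    return 1.0 - prob_at_most(k - 1, mean)

# The Post's comparison: two murders one week, seven in the same week a year later.
post_headline = percent_change(2, 7)

# ASSUMPTION: a hypothetical, unchanging average of 5 murders per week,
# with weekly counts following a Poisson distribution.
HYPOTHETICAL_WEEKLY_MEAN = 5.0
quiet_week = prob_at_most(2, HYPOTHETICAL_WEEKLY_MEAN)
bad_week = prob_at_least(7, HYPOTHETICAL_WEEKLY_MEAN)

print(f"Headline change: {post_headline:.0f} percent")
print(f"Chance of a week with 2 or fewer: {quiet_week:.3f}")
print(f"Chance of a week with 7 or more: {bad_week:.3f}")
```

Under that assumed steady rate, a week as quiet as two murders turns up about one week in eight and a week as bad as seven turns up nearly one week in four, so comparing one against the other says little about the trend.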

It’s not just the Post. We all have a tendency to see stories in short-term fluctuations.

When stock prices move in the morning, does it mean something big or is it just noise that will even out? For an illustration of just how hard it is to time the market and identify the start of a real trend, try playing the stock trading game from Bloomberg Business or Quartz’s historical S&P 500 game.

Sports commentators sometimes read into randomness, too. Part of the problem is that we imagine randomness to be orderly and evenly spaced out, when real randomness generates plenty of streaks and clusters. With lots of athletes playing lots of games, a sport will generate some random outcomes that feel irresistibly like patterns.

If you flip a coin enough times, it would be strange not to have any long runs of one outcome. But last week a CBS Sports headline suggested that the New England Patriots’ winning 19 of their last 25 pre-game coin tosses was “impossible.” Once you consider that streak in context, it looks far from impossible. In fact, with 32 teams in the NFL each playing 16 games a season, you’d expect some team to have a run that “impossible” every few years.
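For readers who want to check the coin toss claim, here is a back-of-the-envelope sketch. It assumes fair tosses, treats the 32 teams as independent of one another, and looks at a single fixed 25-game window per team. All three are simplifications (two teams share every toss, and a real search would scan many overlapping windows), so take the result as a rough guide rather than an exact answer.

```python
from math import comb

def prob_at_least_wins(wins, tosses, p=0.5):
    """Binomial tail: chance of winning at least `wins` of `tosses` coin flips."""
    return sum(
        comb(tosses, k) * p ** k * (1 - p) ** (tosses - k)
        for k in range(wins, tosses + 1)
    )

# Chance that one particular team wins 19 or more of 25 fair tosses.
single_team = prob_at_least_wins(19, 25)

# Chance that at least one of 32 teams does so in the same window,
# ASSUMING the teams' results are independent (a simplification).
NFL_TEAMS = 32
any_team = 1 - (1 - single_team) ** NFL_TEAMS

print(f"One given team: {single_team:.4f}")
print(f"At least one of {NFL_TEAMS} teams: {any_team:.2f}")
```

For any one team the chance comes out under 1 percent, but across the whole league it is roughly one in five for a single window. Slide that window across several seasons and a streak like this stops looking surprising.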

Randomness can fool us when our dataset is small, and it can also lead us astray when the numbers are imprecise. Last Friday’s strong jobs report triggered a flood of reactions (including our coverage from the AP), but the Bureau of Labor Statistics estimates job growth with a survey, and the results get revised over time. As The Upshot illustrated last year, that means there’s a margin of error: the same actual number of jobs could produce a range of different headlines purely through statistical noise.

With a presidential election approaching, we’re about to get another big dose of statistical noise: stories based on poll results. With so many polls being taken every week and some randomness in each, surprising results are inevitable. If you focus only on the latest results from individual polls, you’ll see a lot more ups and downs than the longer-term trend shows. Keep that in mind next time you see the outliers in the news.
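Here is a small sketch of why outlier polls are inevitable. It uses the textbook margin-of-error formula for a simple random sample and assumes each poll independently lands inside its 95 percent margin. The sample size of 1,000 and the count of 20 polls are hypothetical numbers chosen for illustration, and real polls carry other sources of error that this ignores.

```python
import math

def margin_of_error(sample_size, share=0.5, z=1.96):
    """Approximate 95 percent margin of error for a polled share (simple random sample)."""
    return z * math.sqrt(share * (1 - share) / sample_size)

def prob_any_outlier(num_polls, coverage=0.95):
    """Chance that at least one of several independent polls misses by more than its margin."""
    return 1 - coverage ** num_polls

# ASSUMPTION: hypothetical polls of 1,000 respondents each.
one_poll = margin_of_error(1000)

# Pooling ten such polls behaves roughly like one poll of 10,000 people.
ten_polls_pooled = margin_of_error(10 * 1000)

# ASSUMPTION: 20 independent polls, each with 95 percent coverage.
surprise = prob_any_outlier(20)

print(f"One poll: +/- {one_poll * 100:.1f} points")
print(f"Ten polls pooled: +/- {ten_polls_pooled * 100:.1f} points")
print(f"Chance at least one of 20 polls is an outlier: {surprise:.2f}")
```

A single poll of that size is good to about 3 points either way, while the pooled figure tightens to about 1 point. And with 20 polls in play, the odds are better than even that at least one lands outside its own margin by chance alone.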

Posted on November 10, 2015 in African American News

