Serial Correlation: Patriots Win

The Patriots (that’s my team) beat the Falcons, Tom Brady is the GOAT, Bill Belichick is the GOAT.

How could this have happened?

ESPN’s model had the Patriots at a 0.3% chance of winning at one point, and below 1% at more than 20 points.

However, ESPN’s model and many others aren’t properly factoring in serial correlation. Or at least I doubt they are; I don’t actually know the inner workings of the model, other than that it uses other games in similar situations to project the results.

Serial correlation is essentially the momentum effect. The application in football is thus: it’s unusual for equally matched teams to be separated by a lot of points early in a game, but because it’s a high variance sport, that can happen whether the teams are equally matched or whether one team or the other is superior.

Once one of the teams begins a comeback, if they are actually better than their opponent, they’ll be much more likely to finish the comeback than statistics based on other similar games would show. And there’s a snowball effect: if a team comes back from down 10, they’re likely to be better than their opponent, and so more likely to be able to come back from down 14.

That’s because all of the actions the Patriots had to take to come back are related to each other: they’re all team A vs team B, whereas a sample of similar games played by various teams is going to include more closely matched teams or outmatched teams. Once team A has come back vs team B, it’s more likely that team A is actually the better team, and will continue to beat team B. The probabilities have fat tails.
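To put rough numbers on the snowball effect, here is a toy sketch. Everything in it is an assumption made for illustration: the game is reduced to a string of "exchanges" that team A must win in a row, team A's true win rate per exchange is either 65% or 35% with equal odds beforehand, and the function names are made up. It has nothing to do with how ESPN's model actually works.

```python
def naive_completion(pooled_rate, remaining):
    """Chance of winning `remaining` exchanges in a row if each one is
    an independent draw at the pooled rate from other teams' games."""
    return pooled_rate ** remaining


def updated_completion(strengths, priors, observed, remaining):
    """Same chance after seeing `observed` straight wins, when team A's
    true per-exchange rate is unknown but fixed for the whole game."""
    # Weight each candidate strength by how well it explains the run so far.
    weights = [prior * s ** observed for s, prior in zip(strengths, priors)]
    total = sum(weights)
    # Average the completion chance over the updated beliefs.
    return sum(w / total * s ** remaining for w, s in zip(weights, strengths))


# Hypothetical numbers: team A wins an exchange 65% or 35% of the time,
# equally likely beforehand, so the pooled rate is 50%.
strengths, priors = [0.65, 0.35], [0.5, 0.5]
naive = naive_completion(0.5, remaining=4)
updated = updated_completion(strengths, priors, observed=2, remaining=4)
print(f"naive: {naive:.3f}, after two straight wins: {updated:.3f}")
```

With these made-up inputs, the estimate that accounts for what the comeback has already revealed is more than double the pooled one.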

You Have a Higher Net Worth Than the Bottom 37%

The very interesting-sounding statistic that launched a thousand retweets is once again making the rounds because Oxfam has updated their annual report.

That’s right, it’s time to find out about wealth inequality. Statistic wording of choice this time: “Eight richest men are worth the same as HALF the rest of the world.”

Now, this statistic sounds interesting, but is actually highly misleading. The actual interesting statistic, courtesy of Jacob Falkovic, is that the total net worth of the bottom 37% is a tiny smidge above zero. I imagine this will need updating with the latest report, but I doubt it changed much.

That’s right, if you have a positive net worth, you are wealthier than the bottom 37% combined. I suspect this is related to the fact that China and India have about 36% of the world population and not a huge amount of its wealth, and when you add in a few dozen million of the people who have the largest negative net worths (new doctors, lawyers, etc.), you get to a negative figure.
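For anyone who wants to see how a "bottom X% combined" figure falls out of the arithmetic, here is a minimal sketch. The ten net worths are invented for illustration and are not taken from the Oxfam report or any other source; the function name is also made up.

```python
from itertools import accumulate


def share_with_nonpositive_total(net_worths):
    """Largest bottom share of the population whose combined net worth
    is zero or less, counting from the poorest person upward."""
    ordered = sorted(net_worths)
    count = 0
    # Running total of net worth as we add people from poorest to richest.
    for i, total in enumerate(accumulate(ordered), start=1):
        if total <= 0:
            count = i
    return count / len(ordered)


# Hypothetical ten-person population (arbitrary units).
population = [-8, -2, 1, 3, 7, 10, 20, 40, 80, 200]
bottom_share = share_with_nonpositive_total(population)
print(f"bottom {bottom_share:.0%} have a combined net worth of zero or less")
```

Two people in debt are enough to cancel out the two smallest positive balances here, which is the same mechanism that drags the real-world bottom slice toward zero.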

Try not to be too hard on anyone who shares the stupid fact; they are smart enough to know there is something interesting there.

Predictions for 2017

SlateStarCodex puts out predictions for 2017 with confidences. I love the idea, so I’ve hijacked the first few sections and replaced them with my own estimates. I’ve put periods at the end of guesses different from Scott’s.

1. US will not get involved in any new major war with death toll of > 100 US soldiers: 70%.
2. North Korea’s government will survive the year without large civil war/revolt: 95%
3. No terrorist attack in the USA will kill > 100 people: 95%.
4. …in any First World country: 90%.
5. Assad will remain President of Syria: 80%
6. Israel will not get in a large-scale war (ie >100 Israeli deaths) with any Arab state: 90%
7. No major intifada in Israel this year (ie > 250 Israeli deaths, but not in Cast Lead style war): 80%
8. No interesting progress with Gaza or peace negotiations in general this year: 80%.
9. No Cast Lead style bombing/invasion of Gaza this year: 90%
10. Situation in Israel looks more worse than better: 50%.
11. Syria’s civil war will not end this year: 60%
12. ISIS will control less territory than it does right now: 95%.
13. ISIS will not continue to exist as a state entity in Iraq/Syria: 60%.
14. No major civil war in Middle Eastern country not currently experiencing a major civil war: 90%
15. Libya to remain a mess: 80%
16. Ukraine will neither break into all-out war nor get neatly resolved: 80%
17. No major revolt (greater than or equal to Tiananmen Square) against Chinese Communist Party: 95%
18. No major war in Asia (with >100 Chinese, Japanese, South Korean, and American deaths combined) over tiny stupid islands: 99%
19. No exchange of fire over tiny stupid islands: 90%
20. No announcement of genetically engineered human baby or credible plan for such: 95%.
21. EMDrive is launched into space and testing is successfully begun: 50%.
22. A significant number of skeptics will not become convinced EMDrive works: 90%.
23. A significant number of believers will not become convinced EMDrive doesn’t work: 70%.
24. No major earthquake (>100 deaths) in US: 99%
25. No major earthquake (>10000 deaths) in the world: 70%.
26. Keith Ellison chosen as new DNC chair: 70%

27. No country currently in Euro or EU announces new plan to leave: 90%.
28. France does not declare plan to leave EU: 99%.
29. Germany does not declare plan to leave EU: 99%
30. No agreement reached on “two-speed EU”: 80%
31. The UK triggers Article 50: 70%.
32. Marine Le Pen is not elected President of France: 80%
33. Angela Merkel is re-elected Chancellor of Germany: 70%
34. Theresa May remains PM of Britain: 80%
35. Fewer refugees admitted 2017 than 2016: 90%.

36. Bitcoin will end the year higher than $1000: 40%.
37. Oil will end the year higher than $50 a barrel: 60%
38. …but lower than $60 a barrel: 50%.
39. Dow Jones will not fall > 10% this year: 80%. (assumed for this and below that he means price indexes as measured from Jan 1-Dec 31)
40. Shanghai index will not fall > 10% this year: 70%.

41. Donald Trump remains President at the end of 2017: 95%.
42. No serious impeachment proceedings are active against Trump: 80%.
43. Construction on Mexican border wall (beyond existing barriers) begins: 70%.
44. Trump administration does not initiate extra prosecution of Hillary Clinton: 95%.
45. US GDP growth lower than in 2016: 30%.
46. US unemployment to be higher at end of year than beginning: 50%.
47. US does not withdraw from large trade org like WTO or NAFTA: 90%
48. US does not publicly and explicitly disavow One China policy: 95%
49. No race riot killing > 5 people: 95%
50. US lifts at least half of existing sanctions on Russia: 60%.
51. Donald Trump’s approval rating at the end of 2017 is lower than fifty percent: 90%.
52. …lower than forty percent: 60%
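
One common way to grade a list like this at year end is the Brier score, the mean squared gap between each stated confidence and what actually happened. The sketch below is only an illustration of that arithmetic: the three forecasts and their outcomes are hypothetical placeholders, not results for any prediction above, and neither I nor Scott have committed to this particular scoring rule.

```python
def brier_score(forecasts):
    """Mean squared gap between stated confidence and the outcome
    (1 if the prediction came true, 0 if not). Lower is better."""
    return sum((conf - float(hit)) ** 2 for conf, hit in forecasts) / len(forecasts)


# Hypothetical (confidence, came_true) pairs, NOT real 2017 outcomes.
example = [(0.9, True), (0.7, False), (0.6, True)]
score = brier_score(example)
print(round(score, 3))
```

For reference, someone who says 50% on everything scores 0.25, so anything below that is at least beating a coin flip.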

Statistical Pet Peeves

I am fairly well known around my office for skepticism around “fun facts”. I usually need to see the figurative receipt before I believe one. When it comes to more complicated topics, I’m going to want to see the study.

Not all studies are created equal, however, and there are a million ways to use results to mislead the target audience (often journalists or the public).

There are three kinds of lies: lies, damned lies, and statistics.

Benjamin Disraeli

There are two things I do when I spot statistic abuse. The first is to close my browser tab, shake my head briefly for having been tricked into wasting a click, and forget it ever happened. The second is, if the subject matter is interesting enough, to look into the original study and see what the authors actually said. If the authors themselves are the ones making things purposefully unclear, I usually just assume the study is biased too, and don’t update my Bayesian priors at all.

Without further ado, two fast ways to make me think the author is a clown in the best case, or purposefully misleading in the worst case.

1.) Stating a change of a percentage as a percentage:

For example: Reporting a change in income taxes from 10% to 11% that looks like:

Income taxes are slated to rise by 10% starting in 2017.

The average reader is going to have absolutely no clue what this means, and is likely to conclude that taxes (which they probably know are already close to 10%) are going to double.

This sort of thing happens all the time, and is most often seen in headlines, which is one of my unforgivable sins. A faster way to my ‘ignore list’ does not exist. That is why my adblocker also blocks the entire news section of yahoo finance.

Now, there is a perfectly acceptable way to report the percent on a percent change, but using the same example above, it looks like:

Income taxes are slated to rise from 10% to 11% starting in 2017, a 10% increase.

It’s so easy to make things clear that I can’t help but have a harsh interpretation when it’s not done.
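
The arithmetic behind the clear version is two lines. This is just a sketch with a made-up helper name, and it assumes the old rate is not zero:

```python
def describe_rate_change(old_pct, new_pct):
    """Report a change in a percentage both ways: in percentage points
    and as a relative change. Assumes old_pct is not zero."""
    points = new_pct - old_pct
    relative = points * 100 / old_pct
    return (f"from {old_pct:g}% to {new_pct:g}%: "
            f"{points:+g} percentage points, {relative:+g}% relative change")


print(describe_rate_change(10, 11))
```

The headline that just says "10%" is reporting only the second number and leaving the reader to guess it was the first.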

2.) Dual Y-axis Graphs

What is a dual Y-axis graph and why do I have beef?

Here’s the first sample I found on Google.

Note the two y-axes. This is the hallmark of a two y-axis graph.

So what is the problem? The problem is that you can make a two y-axis graph “say” virtually anything you want. The relative values of units are completely out the window (and don’t get me started if the units are different). The only thing you can’t abuse is the direction of relationship from start to finish, assuming the data is a time series (meaning the x-axis represents dates/times).

If a graph has two y-axes and doesn’t have zero showing at the bottom of both axes, it’s a clear sign that the grapher has an agenda.
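
To see how much the axis range alone changes the picture, here is a small sketch with no plotting library involved. The series and both axis ranges are invented for illustration; the only thing computed is how far up the chart each value would be drawn.

```python
def plot_height(value, axis_min, axis_max):
    """Where a value lands on the chart: 0 is the bottom, 1 is the top."""
    return (value - axis_min) / (axis_max - axis_min)


def apparent_rise(series, axis_min, axis_max):
    """How much of the chart's height the series climbs from start to finish."""
    return (plot_height(series[-1], axis_min, axis_max)
            - plot_height(series[0], axis_min, axis_max))


# Hypothetical series plotted against the second y-axis.
series = [5.0, 5.2, 5.6, 6.0]
from_zero = apparent_rise(series, 0.0, 12.0)  # axis starts at zero
zoomed = apparent_rise(series, 4.5, 6.5)      # axis cropped around the data
print(f"axis from zero: {from_zero:.0%} of chart height, cropped: {zoomed:.0%}")
```

Same data, same 20% increase, but the cropped axis makes the line climb half the chart. With two independent y-axes you get to make that choice twice, once for each series.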

Now, sometimes you want to make as strong a case as possible. If you have a time series (which, in my opinion, is the most proper time to use a two-axis graph, if one exists) rather than a bar chart comparing discrete variables or something, and the point of the graph is to reinforce the relationship within the series (upticks and downticks on the same days in the series, for instance), then a two-axis graph is a powerful visual tool.

For me, this means the person publishing the graph needs to have already made the point they are trying to prove clear (rather than using the graph as the smoking gun), and needs to have an already unimpeachable standing as far as statistical integrity. Needless to say, there aren’t many people whose graphs pass this test.

Dual axis graphs are a staple of finance presentations (investment bankers, sell-side firms, etc.) and publications with agendas, and it takes too much time to unravel the actual relationships shown in the data for them to be worth much.
