George Rebane
In the months leading up to this election year these pages have recorded a lot of predictions, some more vehement than others. Predicting, estimating, forecasting, …, all of them are hard and involve some level of risk depending on how well these efforts are carried out. We spend almost every waking moment taking some kind of peek into the future, and depending on how certain we are of what we see there, we take an action which may involve some kind of hedging if we’re not too confident about what we see.
I’ve always been interested in how well we can prognosticate. Recently Nobel laureate Daniel Kahneman of Kahneman/Tversky fame wrote what instantly became a very famous book – Thinking, Fast and Slow (2011) – on the findings of research into things such as prediction, estimation, risk aversion, reasoning and so on. Bottom line – research to date has shown that humans are mostly not very good predictors, and they don’t always do the reasonable things. Yet we do have a brain bone that has allowed us to survive over the millennia and evolve our thinking capacity to some pretty commendable levels. After all, we did discover relativity, sequence the human genome, put a man on the Moon, and are about to devise an AI that may make us second-class citizens on our own planet.
So how well can we predict? Here I propose a fun experiment on the topic that invites RR readers to go on record and compete with each other in predicting anything they want and, hopefully, get someone else interested enough to also render a prediction. I offer an easy, intuitive, and enjoyable way to do this using the MAB distribution, which asks you to specify four numbers that characterize your prediction. The method is spelled out in a previous post ‘Predicting with Expressed Beliefs – a formal approach’.
Since this is election season, say you want to predict what percent of the Democrat vote Bernie Sanders will get in this Tuesday’s New Hampshire primary. Today that percentage is a random variable, but next Wednesday it will be known and no longer random. All such future values are random variables, and the best we can do to characterize our belief in the value of such a variable is to describe what is called its probability distribution. And as the above referenced post details, this can be done by simply writing down your subjective belief in terms of the Low (L) and High (H) values which bound the range of Bernie’s percentage, the most likely percentage (M) Bernie will get, and your confidence (C) – zero to one – that Bernie’s actual percentage will be in the neighborhood of your best guess or most likely value. In short, your prediction will be a 4-tuple that might look something like – [L, H, M, C] = [51%, 60%, 54%, 0.7].
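For readers who like to see such things written down, here is a minimal Python sketch of how the 4-tuple might be recorded and sanity-checked before it goes into any comparison. The class name `Prediction` and its field names are my own illustrative choices and not part of the MAB method; the checks simply restate the definitions above (L below H, M between them, C between zero and one).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Prediction:
    """One reader's expressed belief as the 4-tuple [L, H, M, C].

    The class and field names are hypothetical, chosen for illustration.
    """
    low: float          # L: lowest value you believe possible
    high: float         # H: highest value you believe possible
    most_likely: float  # M: your best guess
    confidence: float   # C: zero to one, belief that the result lands near M

    def __post_init__(self):
        # Reject tuples that contradict the definitions of L, H, M and C.
        if not self.low < self.high:
            raise ValueError("L must be below H")
        if not self.low <= self.most_likely <= self.high:
            raise ValueError("M must lie between L and H")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("C must be between 0 and 1")


# The example 4-tuple from the text: [51%, 60%, 54%, 0.7]
example = Prediction(low=51.0, high=60.0, most_likely=54.0, confidence=0.7)
```

A tuple such as [60, 51, 54, 0.7], with Low and High swapped, would be rejected with a `ValueError` rather than quietly producing a nonsense distribution.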
As described in the cited post, this 4-tuple defines a MAB distribution, which can be compared with other readers’ similarly expressed distributions, and the winner determined in short order by seeing whose distribution yielded the highest likelihood at the actual value realized on Wednesday. (There are also other ways of comparing, but we’ll begin with this simple yet powerful method.)
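To make the "highest likelihood wins" rule concrete, here is a rough Python sketch. Be warned that it does not use the MAB formula, which lives in the cited post and is not reproduced here. As a stand-in I assume a simple blend of a flat density over [L, H] and a triangular density peaked at M, with C setting the weight on the triangle. The function name `standin_density`, the reader names, the second 4-tuple, and the "actual" value of 55 are all made up for illustration.

```python
def standin_density(x, low, high, most_likely, confidence):
    """Hypothetical stand-in for the MAB density (NOT the MAB formula).

    Blends a flat density on [low, high] with a triangular density peaked
    at most_likely; confidence is the weight placed on the triangle.
    """
    if x < low or x > high:
        return 0.0  # the actual value fell outside the predicted range
    width = high - low
    flat = 1.0 / width
    if x <= most_likely:
        rise = most_likely - low
        # If M sits exactly on L, the triangle's peak height is 2 / width.
        tri = 2.0 / width if rise == 0 else 2.0 * (x - low) / (width * rise)
    else:
        fall = high - most_likely
        tri = 2.0 * (high - x) / (width * fall)
    return (1.0 - confidence) * flat + confidence * tri


# Made-up predictions as (L, H, M, C) and a made-up realized value.
predictions = {
    "reader_a": (51.0, 60.0, 54.0, 0.7),
    "reader_b": (45.0, 58.0, 52.0, 0.5),
}
actual = 55.0

# Score each reader by the density their belief assigns to the actual value.
scores = {name: standin_density(actual, *p) for name, p in predictions.items()}
winner = max(scores, key=scores.get)
print(scores, winner)
```

Whatever density is used, the comparison step stays the same: evaluate each reader's distribution at the realized value and take the largest. A reader whose range misses the actual value scores zero.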
So I invite everyone to start off by submitting their predictions in this post’s comment stream on all or some of the candidates competing in next Tuesday’s primary. I’ll put them into a spreadsheet I’ve generated that will compare and calculate the results which I will publish in an update to this post. If you want to do your own comparisons, I’ll gladly email you the spreadsheet into which you can enter the competing 4-tuples. From there on, as the weeks and months pass, everyone can offer predictions on anything – future primary results, polls, when the FBI will submit its ‘Comey vote’, and so on.
Finally, I realize that I’m taking a risk on whether a sufficient number of RR readers, or any at all, will give a warm bucket of spit about actually putting their predictions on record and having them compared to those of other readers. It may not happen, but for those who are interested in how their ‘prognosticator’ is working, here is an easy and correct way of doing it that you may want to use in other undertakings that involve future uncertainty. In future posts, depending on interest, I’ll publish how to use MABs for budgeting, calculating investment returns (and risk), and estimating costs and/or revenues.
[10feb16 update] Well boys and girls, the results are in and it looks like Jo Ann, Russ Steele, and I were not all that good at predicting yesterday’s NH primary results. Of course, we didn’t do much worse than the talking heads on TV, but we did reveal more about our predictions than those pundits ever do. I’d like to see their MABs compared to actual results published as you see ours below.
Actually, we weren’t all that bad when it came to the Democrats, but our efforts on the gaggle of Republicans need a little work. Hopefully when the field winnows a bit going forward, we will do better.
I have added two additional prediction metrics to the MAB likelihood values. Higher likelihood values indicate you did better, and zero values mean that the actual result fell outside your predicted MAB range. The other two metrics are the percent error of your most likely (best guess) call, and a normalized error measure that counts how many of your MAB’s standard deviations, or sigmas, your absolute error (the difference between best guess and actual result) spanned. The smaller that value, the better your MABs were. So here’s the spreadsheet of the results for Jo Ann, Russ, and me (click on image and then CTRL+ to see a larger version).
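For those who want to compute the two error metrics themselves, here is a short Python sketch under some stated assumptions. I assume the percent error is taken relative to the actual result, which the description above does not pin down. And since the MAB's standard deviation formula is not given in this post, the helper `triangular_sigma` substitutes the standard deviation of a plain triangular distribution on [L, H] with peak M. The function names and the "actual" value of 55 are illustrative only.

```python
import math


def triangular_sigma(low, high, most_likely):
    """Standard deviation of a triangular distribution.

    Used here only as a stand-in for the MAB's sigma (an assumption).
    """
    variance = (low**2 + high**2 + most_likely**2
                - low * high - low * most_likely - high * most_likely) / 18.0
    return math.sqrt(variance)


def percent_error(most_likely, actual):
    """Absolute error of the best guess, as a percent of the actual result
    (the choice of denominator is an assumption)."""
    return 100.0 * abs(most_likely - actual) / abs(actual)


def sigma_error(most_likely, actual, sigma):
    """How many standard deviations the absolute error spans."""
    return abs(most_likely - actual) / sigma


# The example 4-tuple's L, H, M from the post and a made-up actual result.
low, high, most_likely = 51.0, 60.0, 54.0
actual = 55.0

sigma = triangular_sigma(low, high, most_likely)
print(percent_error(most_likely, actual))
print(sigma_error(most_likely, actual, sigma))
```

For both metrics smaller is better. The sigma-normalized version rewards a forecaster whose stated spread was honest about the miss, while the percent error looks only at the best guess.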


