Metaculus AI Forecasting Track Record
One of the core ideas of Metaculus is that predictors should be held accountable for their predictions. In that spirit, we present here a track record of the Metaculus system.
The first of these graphs shows every resolved question, the time it resolved, the resolution, and the score (Brier or Log) for that question. Note that:
- You can select the score for the community prediction, the Metaculus prediction, or the Metaculus postdiction (what our current algorithm would have predicted if it and its calibration data were available at the question's close).
- Lower is better for Brier scores and higher is better for Log scores; in both cases, better scores appear higher on the plot.
- The bright line provides a moving average of the scores over time.
- The dashed line shows the expected score for a maximally uncertain prediction, i.e. for 50% on binary questions, or a uniform distribution for continuous predictions.
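As a rough illustration of the two scores and the dashed baseline for binary questions, here is a minimal sketch. It assumes the Brier score is the squared error between the predicted probability and the outcome, and that the Log score is normalized as log2(p) + 1 so that a 50% prediction scores 0. Both choices match the baselines quoted on this page (0.25 and 0), but the exact formulas are an assumption rather than something this page states. The function names are hypothetical.

```python
import math


def brier_score(p, resolved_yes):
    """Squared error between the forecast p and the outcome (lower is better)."""
    outcome = 1.0 if resolved_yes else 0.0
    return (p - outcome) ** 2


def log_score(p, resolved_yes):
    """Assumed normalization: log2 of the probability given to the actual
    outcome, plus 1, so that a 50% forecast scores exactly 0 (higher is better)."""
    p_outcome = p if resolved_yes else 1.0 - p
    return math.log2(p_outcome) + 1.0


if __name__ == "__main__":
    # A maximally uncertain forecast lands on the dashed baseline for both scores.
    print("baseline:", brier_score(0.5, True), log_score(0.5, True))
    # A confident forecast beats the baseline if right and falls below it if wrong.
    for resolved_yes in (True, False):
        print(resolved_yes, brier_score(0.9, resolved_yes), log_score(0.9, resolved_yes))
```

Under these assumptions, a forecast beats the baseline when its Brier score is below 0.25 or its Log score is above 0.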
The second graph is a histogram that shows the distribution of scores more clearly. Bins to the right of 0 (for Log scores) or 0.25 (for Brier scores) contain predictions that scored better than complete uncertainty.
The third graph breaks the predictions into bins and shows how well-calibrated each bin is. For example, if a perfectly calibrated predictor (represented by the dashed line) predicts 80% on 10 different questions, then we expect about 8 of those 10 to resolve positively. (An ideal predictor would only predict with absolute certainty and would be correct every time, and so wouldn't have any data at all for the center of the graph.) The thick gray line in each bin shows the proportion of that bin's questions that resolved positively. The vertical range of each bin corresponds to the 50% Jeffreys confidence interval.
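The calibration bins and their intervals can be sketched as follows. This is an illustration under stated assumptions, not the code behind the graph: the bin count, the function names, and the numerical method are all hypothetical. The Jeffreys interval for k positive resolutions out of n is taken from the quantiles of a Beta(k + 1/2, n - k + 1/2) distribution. To stay within the standard library, the sketch substitutes x = sin²(θ), which turns the Beta density into the smooth integrand sin^(2k)(θ)·cos^(2(n-k))(θ), then integrates with Simpson's rule and finds the quantiles by bisection.

```python
import math


def calibration_bins(forecasts, n_bins=10):
    """Group (probability, resolved_positively) pairs into equal-width bins.

    Returns a list of [positives, total] counts, one entry per bin.
    """
    counts = [[0, 0] for _ in range(n_bins)]
    for p, resolved_yes in forecasts:
        i = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in the last bin
        counts[i][0] += int(resolved_yes)
        counts[i][1] += 1
    return counts


def _partial_integral(k, n, upper, steps=1000):
    """Simpson's rule for sin^(2k) * cos^(2(n-k)) on [0, upper]; steps must be even."""
    def f(t):
        return math.sin(t) ** (2 * k) * math.cos(t) ** (2 * (n - k))

    h = upper / steps
    total = f(0.0) + f(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(i * h)
    return total * h / 3


def jeffreys_interval(k, n, level=0.5):
    """Equal-tailed interval from Beta(k + 1/2, n - k + 1/2), via x = sin^2(theta).

    The common convention of pinning the lower bound to 0 when k == 0 (and the
    upper bound to 1 when k == n) is omitted here for brevity.
    """
    whole = _partial_integral(k, n, math.pi / 2)

    def quantile(q):
        lo, hi = 0.0, math.pi / 2
        for _ in range(40):  # bisection on theta
            mid = (lo + hi) / 2
            if _partial_integral(k, n, mid) / whole < q:
                lo = mid
            else:
                hi = mid
        return math.sin((lo + hi) / 2) ** 2

    tail = (1 - level) / 2
    return quantile(tail), quantile(1 - tail)


if __name__ == "__main__":
    # Ten hypothetical forecasts of 85%, eight of which resolved positively.
    sample = [(0.85, True)] * 8 + [(0.85, False)] * 2
    positives, total = calibration_bins(sample)[8]
    low, high = jeffreys_interval(positives, total)
    print(f"observed {positives / total:.2f}, 50% interval [{low:.3f}, {high:.3f}]")
```

A bin looks well calibrated when the dashed line passes through its interval. With a 50% interval, though, even a perfectly calibrated predictor should miss in roughly half of the bins.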
These graphs are interactive. You can click on individual data points to see which question they refer to, and you can click on the different calibration bins to highlight the data points. You can also filter by date and category to see the track record for a subset of questions.
Note: The Metaculus prediction, as distinct from the community prediction, wasn't developed until June 2017. At that time, all closed but unresolved questions were given a Metaculus prediction. After that, the Metaculus prediction was only updated for open questions.
For each question, the Metaculus postdiction uses data from all other questions to calibrate its result, even questions that resolved later.