Metaculus Question approval checklist

The guidelines below are general rules that should be followed in the large majority of cases. There are always exceptions to the rules, but if your question breaks a guideline below then it is worth checking with another moderator or admin to get a second or third opinion that it is the best path forward.



Explanations/Rationales

Formatting and style

Does the question have at least 2 paragraphs / ~100 words of context? If not, is the question self-explanatory or high value and urgent?

Not everyone reading or predicting on a question is a domain expert, or has even heard of the question’s subject. Good background text should draw the reader’s interest and provide some glimpse into our current understanding of the subject. Inevitably some of this information may become out of date or irrelevant, but this is fine. Metaculus is as much a record and evaluation of past predictions as it is a tracker of current information on a subject.

Exceptions to this rule can be made in some scenarios, such as tracking COVID-19 deaths while we’re 1 year into the pandemic (self-explanatory), or if a question needs to be written/opened quickly, such as an outbreak of political violence (high value and urgent).

Are dates written in "Month day, year" format?

Our forecasters, readers, and question authors come from all over the world, and we should avoid the confusion of seeing different date formats in different questions and contexts. The Month day, year format (for example, “September 26, 2021” or “Sep 26, 2021”) is aligned with Metaculus’ design elements, and is easy for people to read and parse.
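As an illustration only, the format above can be produced mechanically. The sketch below is a minimal Python example; the function name `format_question_date` and the `abbreviate` flag are hypothetical and are not part of any Metaculus tooling. Month names are listed explicitly so the result does not depend on the system locale.

```python
from datetime import date

# English month names are spelled out so the output is locale-independent.
MONTHS = [
    "January", "February", "March", "April", "May", "June",
    "July", "August", "September", "October", "November", "December",
]


def format_question_date(d: date, abbreviate: bool = False) -> str:
    """Return d in 'Month day, year' form (hypothetical helper)."""
    month = MONTHS[d.month - 1]
    if abbreviate:
        month = month[:3]  # "September" -> "Sep"
    return f"{month} {d.day}, {d.year}"


print(format_question_date(date(2021, 9, 26)))                   # September 26, 2021
print(format_question_date(date(2021, 9, 26), abbreviate=True))  # Sep 26, 2021
```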

Are all dates written as absolute dates (not relatively, such as “in this century” or “after the opening of this question”)?

People come to our questions in many contexts, and may not be familiar with terms like “resolution date” or “opening date” or our site’s interface. Many of our questions are meant to remain open for a long time; 10 years from now, it may be hard to parse the meaning of “in the next 100 years” or “when this question was written”.
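A rough way to spot relative dates in a draft is a simple pattern scan. The sketch below is a heuristic only: the pattern list and the function name `find_relative_dates` are assumptions made for illustration, the list is not exhaustive, and a human review is still needed.

```python
import re
from typing import List

# Hypothetical, non-exhaustive patterns for relative date phrases.
RELATIVE_PATTERNS = [
    r"\bin the next \d+ (?:days?|weeks?|months?|years?)\b",
    r"\b(?:this|next|last) (?:week|month|year|decade|century)\b",
    r"\b(?:after|since|before) the opening of this question\b",
    r"\bwhen this question was written\b",
]


def find_relative_dates(text: str) -> List[str]:
    """Return every phrase in text that matches a relative-date pattern."""
    hits: List[str] = []
    for pattern in RELATIVE_PATTERNS:
        for match in re.finditer(pattern, text, flags=re.IGNORECASE):
            hits.append(match.group(0))
    return hits


print(find_relative_dates("Will X happen in the next 100 years?"))
print(find_relative_dates("Will X happen before January 1, 2030?"))
```

The first call flags "in the next 100 years"; the second returns an empty list because the date is absolute.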


Resolution and Scoring Details

Is the question resolution solely under the authority of Metaculus Admins or an external third party?

Resolution should almost always come from either (1) publication by third parties or (2) the discretion of Metaculus Admins. Metaculus moderators are valued members of our team, but they do not have authority to resolve questions. Avoid resolution text such as “polls from users in comments” or “if at least 1 moderator says…”. A possible exception: “Will at least one Metaculus user report a positive test result for novel coronavirus by the end of 2020?”, where users can submit verifiable claims for Admins to approve (though alternative resolutions are generally preferred).

If the question involves currencies or prices, is the resolution inflation-indexed?

Though there are often exceptions to this rule, Metaculus has several far-future forecasts where inflation can significantly change the terms. When considering resolution dates 10 years out or more, inflation indexing is generally preferred.
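To make the effect of indexing concrete, the sketch below converts a nominal amount at resolution into base-year terms using a price index. The function name and the index values are made up for illustration; a real question would name a specific published index (for example a consumer price index) and a base date in its resolution criteria.

```python
def to_base_year_value(nominal: float, index_at_resolution: float,
                       index_at_base: float) -> float:
    """Express a nominal amount in base-year terms (hypothetical helper).

    real = nominal * (index at base date) / (index at resolution date)
    """
    return nominal * index_at_base / index_at_resolution


# Illustrative numbers only: the price index rises from 100 to 125,
# so 1000 nominal units at resolution are worth 800 in base-year terms.
print(to_base_year_value(1000.0, index_at_resolution=125.0, index_at_base=100.0))  # 800.0
```

Under these made-up numbers, a threshold of "1000" written without indexing would be about 20% easier to reach in real terms by the resolution date.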

When naming a resolution source, are we trying to forecast that particular source more than the “true answer”? If we prefer the “true answer”, are any fallback sources listed?

Sometimes we care about the behavior of a source, as in “When will the WHO announce that the COVID-19 pandemic has ended?” Other times we care about the actual answer, as in “Will 2021 be the hottest year on record according to NASA?” Our default policy is to assume the question is tracking the “true answer” unless the author specifically states otherwise.

Does the resolution avoid linking/dependence on other Metaculus questions?

For example, a question might say “If this linked question on Metaculus resolves true, then how many X by Y date?”. This may seem harmless, and it makes the question briefer and simpler, but it hides important complexity. Every term in the resolution criteria is significant, and linking questions can lead to cascades of simplified and misunderstood criteria. Even when criteria are copied and pasted, the redundancy encourages predictors to re-review the terms, potentially discovering ambiguities.

Does the resolution avoid dependence on any individual or group saying particular words or phrases? If not, are they formal or well-defined terms?

Examples of poor questions: “In the 2020 US Presidential election, when will the losing candidate concede?” and “Will any body of the US federal government conclude that COVID-19 originated in a lab in Hubei before June 1st 2022?” In cases like these, a public figure might be under great pressure to make a certain statement, and their reluctance to do so will often lead them to vague or softened word choices. No matter how broadly or inclusively we define the resolution criteria, these situations frequently lead to polarizing debates and unsatisfying resolutions. If a question can’t be defined in terms of concrete actions or information, that is a sign it should be avoided (with an exception for formal or well-defined terms).


Sensitive Subjects

Does the question avoid predicting the mortality of an individual or specific small group?

Example of a poor question: Will the number of living humans who have walked on another world fall to zero? This question can be easily rewritten to focus on future space missions (a matter of public interest), rather than the health and longevity of retired astronauts (not appropriate).

A good question: When will the next US Supreme Court vacancy arise? Though an individual’s death is a component of resolution, it is arguably not the most likely component; regardless, the transition is highly important to the public interest. Public interest can outweigh this rule, for instance in a question like “When will Kim Jong Un no longer be Dictator-For-Life?”

Does the headline question avoid making a stronger/weaker claim than the resolution criteria?

This is naturally somewhat more of an art than a science. Although every detail in the resolution criteria is relevant, the more the headline question matches the resolution criteria, the stronger the question and forecasts will be. If a complicating detail does not make a question more insightful, remove it.

Does the topic avoid highly controversial or potentially controversial subjects? If not, has it been reviewed carefully by 2 mods/admins?

In controversial areas, Metaculus can sometimes offer a public service by gathering high-quality information and giving falsifiable predictions. If there is value or public interest in a controversial subject, it can make for a good question. However, the stakes are generally higher, and such questions will attract more attention, so more care is necessary in these cases.