Assessing Regulations: Red Team versus Blue Team, and Related Methods

Decision making is improved by avoiding unaided expert judgment and instead using structured judgmental procedures. The Red Team-Blue Team approach is one such procedure. It recognizes that it is difficult for people, including scientists and public officials, to remain objective about the consequences of public policies and regulations. The solution the approach provides is akin to the adversarial system used in our courts.

One of the teams must be genuinely skeptical of the benefits of government action and of any studies offered in support of it. While such skepticism should be the natural attitude of public officials and scientists (see Appendix D), it cannot be taken for granted.

If one team is genuinely advocating for government intervention and the other is genuinely skeptical, they are unlikely to reach an agreement. For that reason, a panel of independent judges will likely be needed to weigh the relative merits of the scientific evidence presented by the Red and Blue teams.

As in our court system, there should be a presumption that citizens are innocent of the need to be disciplined by regulation. In other words, the pro-regulation team would need to prove beyond a reasonable doubt that regulation would lead to a substantial net benefit, relative to no regulation, over the long term. To maintain transparency and facilitate challenges to their findings, the judges should be required to publish their reasoning along with their recommendation for regulation or deregulation.

To guarantee the integrity of the process, regulatory agencies should commit to implementing the judges’ recommendations, so long as the recommendations meet pre-specified scientific criteria. That commitment is critical: consider the harm caused by inaugural EPA Administrator Ruckelshaus’s decision to ignore Judge Sweeney’s conclusion, reached after seven months of testimony, that DDT was not dangerous to people and had substantial net benefits (Zubrin, 2013).

We describe our suggestions for Red Team-Blue Team procedures under four headings. The first suggestion is critical.

  1. Accept only scientific studies as evidence. Not all studies are equal. We recommend considering as evidence only findings from studies that comply with the criteria for a scientific study, as described in the checklist available on our website (also attached as Appendix A to this memo). We created that operational checklist so that users can determine whether a study meets the eight criteria for science, using definitions of science provided by pioneering scientists. Details and evidence are provided in Armstrong and Green (2018a).
  2. Conduct meta-analyses. To decide whether a regulation is justified, consider all available relevant scientific evidence—including uncertainty, existing knowledge gaps, and scientific forecasts based on knowledge of the situation. For example, to be able to recommend regulations in response to possible climate changes, evidence-based forecasts are needed on (1) whether a persistent and substantive long-term global temperature trend will occur, (2) whether such a trend would be harmful, and (3) whether regulations could be devised that would reduce any forecasted harm in a cost-effective way.
    1. A general call should be made for the submission of relevant scientific papers. To reduce the burden imposed by the submission of unscientific papers, we suggest levying a submission fee. The fee would be refunded for papers that meet the criteria for being a relevant and useful scientific paper.
    2. At least three raters should independently rate each paper using the Guidelines for Science checklist (Appendix A). The process takes less than one hour per paper. In our studies, raters found it easy to rate papers because the criteria are expressed operationally; for example, “Did the study test alternative reasonable hypotheses?” (A sketch of one way such ratings might be tallied follows this list.)
    3. The Red and Blue teams would independently prepare meta-analyses of the relevant papers that met the criteria for useful science, and identify important gaps in knowledge about the situation and about the effects of the regulation and no-regulation alternatives. (One conventional pooling method is sketched after this list.)
  3. Use Multiple Anonymous Authentic Dissent (MAAD). For regulation problems in which more conclusive evidence is needed than can be obtained from competing meta-analyses, use the Multiple Anonymous Authentic Dissent approach to design and conduct new studies comparing the scientific evidence on competing hypotheses. In MAAD, researchers act as dissenters at key points in the study, beginning with the design. As the project develops, each researcher is asked to independently record all potential defects in the research. An administrator collates and edits the anonymous submissions and circulates the compiled list to the team. Each researcher on the team then assumes that each of the objections has merit, describes ways to deal with them, and sends their anonymous suggestions to the administrator. The administrator then summarizes the suggestions for the team to make anonymous written revisions. (A minimal sketch of this round-trip follows this list.)
    1. Fund researchers with diverse viewpoints and backgrounds—in particular, different stakeholders relevant to the regulations—to work in teams, with each team using the method of multiple reasonable hypotheses. Funding for their research should be contingent upon complying with scientific criteria.
    2. To ensure independent thinking, the individual researchers in each group should not communicate face-to-face. Instead, they should work in virtual groups, a process that has the additional benefits of reducing travel and meeting costs and of providing a written record of the reasons for the group’s decisions. The MAAD process should be conducted via an administrator to help ensure that suggestions for improvements remain anonymous.
    3. The findings from each team would be submitted to a “Science Court,” which would operate like a law court. Each team would present its case in writing to the judges. The names of the judges should be published with their decision, along with full disclosure of their reasoning and access to the relevant scientific studies. MAAD is likely to be efficient because the teams’ efforts and time go toward improving the research programs rather than toward travel, meetings, or defending their viewpoints. Related research on group processes indicates that MAAD should be effective at providing evidence for judging the merits of regulation alternatives.
  4. Ask researchers to use evidence-based checklists. To determine whether a regulation would be beneficial, researchers should complete a checklist to assess whether the regulation meets the logical precursors for success. To do so, they will also need to complete checklists for scientific forecasting to identify whether forecasts of the effects of regulation and deregulation are valid.
    1. We have developed a checklist for evaluating regulations based on logic and on findings from studies by leading scientists on the effects of regulation. The “Conditions Necessary for Successful Regulation” checklist is available on our website and is attached as Appendix B. (We developed the site to provide a clearinghouse for studies providing evidence on the effects of regulations.)
    2. We suggest the use of evidence-based forecasting methods to predict whether a regulation will achieve the desired effects while avoiding unintended consequences. The checklists are described in “Forecasting methods and principles: Evidence-based checklists” (Armstrong and Green, 2018b); the abstract of the paper is attached as Appendix C.
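
To make the screening step in item 2.2 concrete, the sketch below (in Python) tallies three raters’ independent yes/no checklist answers. The acceptance rule shown, a majority of raters on every criterion, and all names and data structures are our illustrative assumptions; the memo specifies only that at least three raters assess each paper against the eight criteria.

```python
from dataclasses import dataclass
from typing import List

# Eight criteria; the authoritative wording is in the Guidelines for
# Science checklist (Appendix A).
N_CRITERIA = 8

@dataclass
class Rating:
    rater_id: str
    answers: List[bool]  # answers[i] is True if criterion i is judged met

def meets_criteria(ratings: List[Rating]) -> bool:
    """Accept a paper only if a majority of raters judges every criterion met.

    The majority rule is an assumption for this sketch; the memo does not
    prescribe how the three or more independent ratings are combined.
    """
    if len(ratings) < 3:
        raise ValueError("at least three independent raters are required")
    for i in range(N_CRITERIA):
        yes_votes = sum(r.answers[i] for r in ratings)
        if yes_votes * 2 <= len(ratings):  # no majority for criterion i
            return False
    return True

# Example: one rater judges the last criterion unmet, but the other two
# judge it met, so the paper passes under the illustrative majority rule.
ratings = [
    Rating("rater_1", [True] * 8),
    Rating("rater_2", [True] * 7 + [False]),
    Rating("rater_3", [True] * 8),
]
print(meets_criteria(ratings))  # True
```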
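
For the meta-analyses in item 2.3, the memo does not prescribe a pooling technique. As one conventional possibility, a fixed-effect inverse-variance weighted average of study effect estimates could be computed as below; the study numbers are invented, and the Red and Blue teams would choose and defend their own methods.

```python
import math

def fixed_effect_meta(effects, variances):
    """Fixed-effect inverse-variance pooling of study effect estimates.

    Returns the pooled estimate and an approximate 95% confidence interval.
    Shown for illustration only; this is one standard meta-analytic method,
    not a method mandated by the memo.
    """
    weights = [1.0 / v for v in variances]
    pooled = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    se = math.sqrt(1.0 / sum(weights))
    return pooled, (pooled - 1.96 * se, pooled + 1.96 * se)

# Three hypothetical studies: effect estimates and their variances
effects = [0.30, 0.10, 0.22]
variances = [0.04, 0.01, 0.02]
pooled, ci = fixed_effect_meta(effects, variances)
print(f"pooled effect: {pooled:.3f}, 95% CI: ({ci[0]:.3f}, {ci[1]:.3f})")
```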
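
Finally, to make the MAAD round-trip in item 3 concrete, here is a minimal sketch of the administrator’s bookkeeping: objections and remedies are collected with no record of authorship. The class and method names are hypothetical; the memo specifies only the process.

```python
class MAADAdministrator:
    """Collates anonymous submissions so that dissent stays unattributed.

    Sketch of the process in item 3: researchers submit potential defects,
    the administrator circulates the compiled list, and each researcher
    returns suggested remedies for every objection.
    """

    def __init__(self):
        self.objections = []  # defect texts only; authorship is never stored
        self.remedies = {}    # objection index -> list of remedy texts

    def submit_objection(self, text):
        self.objections.append(text)

    def circulate(self):
        # The compiled, attribution-free list sent to the whole team.
        return list(enumerate(self.objections))

    def submit_remedy(self, index, text):
        self.remedies.setdefault(index, []).append(text)

    def summary(self):
        # Basis for the team's anonymous written revisions.
        return {self.objections[i]: fixes
                for i, fixes in sorted(self.remedies.items())}

# One round, with invented objections and remedies from unnamed researchers
admin = MAADAdministrator()
admin.submit_objection("The sample excludes small firms affected by the rule.")
admin.submit_objection("No reasonable alternative hypothesis is specified.")
admin.submit_remedy(0, "Add a stratified sample of small firms.")
admin.submit_remedy(1, "Pre-register a no-effect hypothesis before analysis.")
print(admin.summary())
```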


Armstrong, J.S. & Green, K.C. (2018a). Guidelines for Science: Evidence and Checklists. Working paper available from ResearchGate.

Armstrong, J.S. & Green, K.C. (2018b). Forecasting methods and principles: Evidence-based checklists. Working paper available from ResearchGate.

Zubrin, R. (2013). Merchants of Despair. New York: Encounter Books.

Appendix A: Checklist of Required Criteria for Useful Scientific Research

[Two-page checklist reproduced here; see Armstrong and Green (2018a) for details and evidence.]

Appendix B: Conditions Necessary for Successful Regulation Checklist

[Checklist reproduced here; see item 4.1 above.]

Appendix C: Forecasting Methods and Principles: Evidence-Based Checklists


Problem: How to help practitioners, academics, and decision makers use experimental research findings to substantially reduce forecast errors for all types of forecasting problems.

Methods: Findings from our review of forecasting experiments were used to identify methods and principles that lead to accurate forecasts. Cited authors were contacted to verify that summaries of their research were correct. Checklists were developed to help forecasters and their clients practice and commission studies that adhere to principles and use valid methods. Leading researchers were asked to identify errors of omission or commission in the analyses and summaries of research findings.

Findings: Forecast accuracy can be improved by using one of 15 relatively simple evidence-based forecasting methods. One of those methods, knowledge models, provides substantial improvements in accuracy when causal knowledge is good. On the other hand, data models—developed using multiple regression, data mining, neural nets, and “big data analytics”—are unsuited for forecasting.

Originality: Three new checklists for choosing validated methods, developing knowledge models, and assessing uncertainty are presented. A fourth checklist, based on the Golden Rule of Forecasting, was improved.

Usefulness: Combining forecasts within individual methods and across different methods can reduce forecast errors by as much as 50%. Forecast errors from currently used methods can be reduced by increasing their compliance with the principles of conservatism (Golden Rule of Forecasting) and simplicity (Occam’s Razor). Clients and other interested parties can use the checklists to determine whether forecasts were derived using evidence-based procedures and can, therefore, be trusted for making decisions. Scientists can use the checklists to devise tests of the predictive validity of their findings.

Key words: big data, combining forecasts, decision-making, decomposition, equal weights, expectations, extrapolation, index method, intentions, Occam’s razor, prediction intervals, regression analysis, scenarios, uncertainty
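
To illustrate the arithmetic behind the combining claim above, the following sketch applies equal-weights combining to three invented forecasts (the method names are taken from the abstract). When the individual errors differ in sign, the combined forecast’s error is necessarily smaller than the mean individual error.

```python
# Equal-weights combining of forecasts from different methods.
# All numbers are invented for illustration.
forecasts = {"extrapolation": 104.0, "knowledge model": 96.0, "index method": 99.0}
outcome = 100.0

combined = sum(forecasts.values()) / len(forecasts)  # simple average
mean_individual_error = sum(abs(f - outcome) for f in forecasts.values()) / len(forecasts)
combined_error = abs(combined - outcome)

print(f"mean individual error: {mean_individual_error:.2f}")  # 3.00
print(f"combined error:        {combined_error:.2f}")         # 0.33
```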

Appendix D: Scientific Advances Depend on Dissent

In 1620, Francis Bacon advised researchers to consider “any contrary hypotheses that may be imagined.” In the third edition of his Philosophiae Naturalis Principia Mathematica, published in 1726, Newton described four “Rules of Reasoning in Philosophy.” The fourth rule reads, “In experimental philosophy we are to look upon propositions collected by general induction from phænomena as accurately or very nearly true, notwithstanding any contrary hypotheses that may be imagined, till such time as other phænomena occur, by which they may either be made more accurate, or liable to exceptions.”

The need for dissent is ingrained in the scientific method (Armstrong, 1980). Few scientists challenge this.

In 1890, Chamberlin expressed the need for dissent in operational terms when he stated that scientific papers should use the “method of multiple working hypotheses.” He observed that the fields of science that made the most progress were those that tested all reasonable hypotheses. Platt (1964) argued for more attention to Chamberlin’s conclusions. A review of natural experiments supports Chamberlin’s conclusions about the importance of multiple hypotheses. For example, agriculture showed little progress for centuries. That changed in the early 1700s, when English landowners began to conduct experiments to compare the effects of alternative ways of growing crops (Kealey, 1996, pp. 47-89).

There are numerous ways to use dissent. Early research suggested the Devil’s Advocate (DA) approach, in which someone is arbitrarily selected to find everything wrong with a study. Schwenk (1990) concluded from non-experimental studies that the approach was successful. However, Nemeth, Connell, Rogers, and Brown (2001), in their experimental studies, found that authentic dissent is more effective than the DA approach. This occurs, first, because people with authentic beliefs appear to do a better job of arguing their position and, second, because the other group members realize that the arguments are not pulled out of thin air.

One problem with authentic dissent is that dissenters become unpopular with the group. (Witness that many climate change skeptics were retired from their university positions.) The DA is randomly selected in order to protect dissenters. Unfortunately, Nemeth, Connell, Brown, and Rogers (2001) found little support for the notion that the DA role protects individuals: antagonism was almost as serious with a DA as it was with authentic dissent. Another problem with both the DA approach and authentic dissent is that groups tend to bolster their own arguments rather than attempt to overcome objections.


Armstrong, J. S. (1980). Advocacy as a scientific strategy: The Mitroff myth. Academy of Management Review, 5, 509-511.

Bacon, F. (1620). The New Organon: Or the true directions concerning the interpretation of nature.

Chamberlin, T. C. (1890). The method of multiple working hypotheses. Reprinted in 1965 in Science, 148, 754-759.

Kealey, T. (1996). The economic laws of scientific research. New York, NY: St Martin’s Press.

Nemeth, C., Connell, J., Brown, K., & Rogers, J. (2001). Devil’s advocate versus authentic dissent: Stimulating quantity and quality. European Journal of Social Psychology, 31, 707-720.

Nemeth, C.J. (2003). Minority dissent and its “hidden” benefits. New Review of Social Psychology, 2, 21-28.

Nemeth, C.J., & Nemeth-Brown, B. (2003). Better than individuals? The potential benefits of dissent and diversity for group creativity. In P. Paulus & B. Nijstad (Eds.), Group Creativity: Innovation through Collaboration (pp. 63-84). Oxford: Oxford University Press.

Nemeth, C.J., Personnaz, M., Personnaz, B., & Goncalo, J. (2004). The liberating role of conflict in group creativity: A cross-national study. European Journal of Social Psychology, 34, 365-374.

Nemeth, C., Connell, J., Rogers, J., & Brown, K. (2001). Improving decision making by means of dissent. Journal of Applied Social Psychology, 31, 48-58.


Platt, J. R. (1964). Strong inference. Science, 146, 347-353.

Schwenk, C. R. (1990). Effects of devil’s advocacy and dialectical inquiry on decision making: A meta-analysis. Organizational Behavior and Human Decision Processes, 47, 161-176.