class: center, middle name: qrcode <img src="qr_slides.svg" width="30%" /> github.com/aaronpeikert/bayes-prereg .pull-right[![CC0](https://mirrors.creativecommons.org/presskit/buttons/88x31/svg/cc-zero.svg)] --- class: inverse, center, middle # Trust in Science --- class: center, middle .enormous[?] --- ## Is there a replication crisis?* .center[ <img src="presentation_files/figure-html/unnamed-chunk-4-1.svg" width="70%" /> ] .small.right[Baker, M. 1,500 scientists lift the lid on reproducibility. *Nature* 533, 452–454 (2016). https://doi.org/10.1038/533452a] -- .small.right[***They call it "reproducibility" and fail to publish the raw data.**] --- class: center, middle # Transparency in Science --- class: center, middle, inverse # Ideal Preregistration --- class: center, middle # Preregistration as Code ### version control + dynamic documents --- class: center .pull-left[ ### Standard Preregistration hunches ↓ preregistration ↓ data ↓ article draft ] .pull-right[ ### Preregistration as Code simulated data ↓ article draft with mock results ↓ data ↓ article draft with real results ] --- class: inverse, center, middle # Traditional view --- class: center # Preregistration separates: .pull-left.center[ # confirmatory ] .pull-right.center[ # exploratory ] .pull-left.center[ # ≈ # preregistered ] .pull-right.center[ # ≈ # not preregistered ] --- class: center, middle # Consider three scenarios: --- layout: true class: center, middle --- .huge[1.] .large[You predict the outcome of a horse race correctly.] -- .large[Great! You know something about horses.] --- .huge[2.] .large[You tell me the outcome of a horse race after the race.] -- .large[Great! You know something about reading a chart.] --- .huge[3.] .large[You predict that you can tell me the outcome after the race.] -- .large[Great! You know that you can preregister HARKing.] --- layout: false class: center # Preregistration separates? 
.pull-left.center[ # confirmatory ] .pull-right.center[ # exploratory ] .pull-left.center[ # ≈ # preregistered ] .pull-right.center[ # ≈ # not preregistered ] --- class: center, middle ## Obvious problems: 1. not a principled rationale 2. vagueness of prediction is not accounted for 3. changes after the preregistration are not addressed --- class: center, middle .left[I propose that reduction in ] .large.blurry2[Uncertainty] .right[warrants preregistration.] --- class: center, middle .pull-left[ It makes a difference what you ## Preregister: ] -- .pull-right[ ### Results (= confirmatory) ### ≠ ### Inductive Process (≠ confirmatory) ] --- class: center, middle # Can I explore the data? -- # Yes. --- class: center, middle # Can I deviate? -- # Sometimes. --- class: center, middle # How much detail? -- # Very detailed. --- class: center, middle # World → Theory --- class: center, middle # World ← Theory --- class: center, middle # World ↔ Theory -- .curly.white.large[It's a match!] --- class: center, middle, inverse # Principled Approach --- class: center, middle # `\(L(\text{Theory}, \text{Data})\)` --- class: center, middle # ѴαЯi€𝜏y ### `\(L(\text{Theory}, \text{Data})\)` --- class: center, middle `$$L_1(\text{Theory}, \text{Data})$$` `$$\vdots$$` `$$L_\infty(\text{Theory}, \text{Data})$$` --- class: center, middle, inverse # Statistical Models -- = automated induction --- class: center, middle # Theory --- class: center, middle # T█e█ry --- class: middle `$$Model(\text{Theory})$$` --- class: middle `$$Model(\text{Theory}, \text{Data})$$` --- ### How statisticians judge a model: Sample the world and compare: $$ L(\text{Model}(\text{Data}), \text{Data}) + \mathcal{C}(\text{Model}) $$ e.g.: Adjusted R², Stein's Unbiased Risk Estimator, Information Criteria, etc. 
.center[ **We account for peeking at the data** ] --- ### Researchers have a harder job: Sample the world and compare: $$ L(\text{Model}(\text{Data}), \text{Data}) + \mathcal{C}(\text{Model}) + \mathcal{C}(\text{Human}) $$ .center[ **We must account for the model's and the human's ability to make sense of any data.** ] --- class: center, middle .left[What happens if] .center[ `\(\mathcal{C}(\text{Model})\)` or `\(\mathcal{C}(\text{Human})\)` ] .right[is unknown?] --- class: center, middle, inverse # .blurry[Uncertainty] --- The goal: .center[ `\(L(\text{Model}(\text{Data}), \text{Data}) + \mathcal{C}(\text{Model}) + \mathcal{C}(\text{Human})\)` ] -- The enemy: .pull-left[ .blurry2[$$\mathcal{C}(\text{Model})$$] ] .pull-right[ .blurry2[$$\mathcal{C}(\text{Human})$$] ] -- The solution: .pull-left.center[ ### Computational Reproducibility ] .pull-right.center[ ### Preregistration ] --- class: center, middle .large[**Preregistration**] .large[~~**Computational Reproducibility**~~] .tiny[Maybe not today.] --- class: inverse, center, middle # Traditional view --- class: center # Preregistration separates: .pull-left.center[ # confirmatory ] .pull-right.center[ # exploratory ] .pull-left.center[ # ≈ # preregistered ] .pull-right.center[ # ≈ # not preregistered ] --- class: center, middle # Consider three scenarios: --- layout: true class: center, middle --- .huge[1.] .large[You predict the outcome of a horse race correctly.] -- .large[Great! You know something about horses.] --- .huge[2.] .large[You tell me the outcome of a horse race after the race.] -- .large[Great! You know something about reading a chart.] --- .huge[3.] .large[You predict that you can tell me the outcome after the race.] -- .large[Great! You know that you can preregister HARKing.] --- layout: false class: center # Preregistration separates? 
.pull-left.center[ # confirmatory ] .pull-right.center[ # exploratory ] .pull-left.center[ # ≈ # preregistered ] .pull-right.center[ # ≈ # not preregistered ] --- class: center, middle ## Obvious problems: 1. not a principled rationale 2. vagueness of prediction is not accounted for 3. changes after the preregistration are not addressed --- class: center, middle .left[I propose that reduction in ] .large.blurry2[Uncertainty] .right[warrants preregistration.] --- class: center, middle ## Let's be more specific --- class: center, middle #### My favorite `\(L\)`: $$ P(H|E) = \frac{P(H)P(E|H)}{P(H)P(E|H) + P(\neg H)P(E|\neg H)} $$ -- But this holds for any `\(L\)` that satisfies the "statistical relevancy condition", i.e., one that decreases if `\(P(E|H) \gg P(E|\neg H)\)`, ceteris paribus. --- class: center, middle name: theoretical-risk-plot2 <img src="presentation_files/figure-html/unnamed-chunk-6-1.svg" width="100%" /> --- class: center, middle layout: true --- .large[**Preregistrations**] reduce .large[**uncertainty**] about .large[**theoretical risk**]. --- class: center Different goals: .pull-left[ ## Preregistration ] .pull-right[ ## Confirmation ] --- class: center, middle .pull-left[ It makes a difference what you ## Preregister: ] -- .pull-right[ ### Results (= confirmatory) ### ≠ ### Inductive Process (≠ confirmatory) ] --- # Can I explore the data? -- # Yes. --- template: theoretical-risk-plot2 --- # Can I deviate? -- # Sometimes. --- template: theoretical-risk-plot2 --- # How much detail? -- # Very detailed. --- template: theoretical-risk-plot2 --- class: center, middle, inverse # Ideal Preregistration --- class: center, middle # Preregistration as Code ### version control + dynamic documents --- class: center .pull-left[ ### Standard Preregistration hunches ↓ preregistration ↓ data ↓ article draft ] .pull-right[ ### Preregistration as Code simulated data ↓ article draft with mock results ↓ data ↓ article draft with real results ] --- template: qrcode
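---

class: middle

### Appendix: the update rule, sketched

The posterior update from the "My favorite `\(L\)`" slide can be sketched in a few lines. This appendix slide is an illustration only; the prior and likelihood values below are made up, not taken from any study.

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    # Bayes' rule, as on the slide:
    # P(H|E) = P(H)P(E|H) / (P(H)P(E|H) + P(not H)P(E|not H))
    numerator = prior * p_e_given_h
    return numerator / (numerator + (1.0 - prior) * p_e_given_not_h)

# Statistical relevancy: when P(E|H) >> P(E|not H), the evidence
# moves the posterior far away from the prior (hypothetical numbers).
print(round(posterior(0.5, 0.9, 0.1), 6))  # → 0.9
print(round(posterior(0.5, 0.5, 0.5), 6))  # uninformative evidence → 0.5
```

A risky prediction (high `\(P(E|H)\)`, low `\(P(E|\neg H)\)`) shifts belief strongly; evidence that any hypothesis explains equally well leaves the prior untouched.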