## Joseph Cooprider and Brigham Frandsen, Department of Economics

When an economic model suffers from endogeneity, basic ordinary least squares regression analysis breaks down. The assumptions of the model fail, so we cannot infer causality without bias in our estimates. An instrumental variable is therefore needed. However, if the instruments are weak, the sample is small, or the assumptions about the error terms are invalid, the instrumental variables analysis is biased as well. We developed an exact, finite-sample approach to instrumental variables estimation and inference that remains valid for weak instruments, small samples, and other settings where large-sample approximations are poor. The approach imposes no parametric model for causal effects and makes no distributional assumptions on the outcome variable. It estimates the effects of a possibly endogenous binary regressor using an instrumental variable.

This method makes inferences about the effects of a binary treatment on a scalar outcome or response variable. Individuals can be sorted into three groups: never-takers, always-takers, and compliers. Never-takers are those who do not take the treatment no matter how they are assigned. Always-takers are those who take the treatment no matter how they are assigned. Compliers take the treatment if they are assigned to, but not otherwise.
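The three groups can be written down as a mapping from an individual's two potential treatment statuses to a label. The sketch below is only illustrative: the function name and arguments are hypothetical, and it assumes that the remaining pattern (taking the treatment only when not assigned) is ruled out, since the sorting above has no group for it.

```python
def compliance_type(d_if_unassigned: int, d_if_assigned: int) -> str:
    """Label an individual by potential treatment status (0 = untreated, 1 = treated).

    Hypothetical helper for illustration; not part of the authors' code.
    """
    types = {
        (0, 0): "never-taker",   # untreated however assigned
        (1, 1): "always-taker",  # treated however assigned
        (0, 1): "complier",      # treated only if assigned
    }
    key = (d_if_unassigned, d_if_assigned)
    if key not in types:
        # Assumption: the pattern (1, 0) does not occur under the three-group sorting.
        raise ValueError(f"pattern {key} is outside the three groups")
    return types[key]


for pattern in [(0, 0), (1, 1), (0, 1)]:
    print(pattern, "->", compliance_type(*pattern))
```

In real data only one of the two potential statuses is observed for each individual, so the group label is never seen directly and must be inferred.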

In this framework, we can calculate the distribution of compliers’ potential outcomes. By enumerating the entire hypergeometric distribution of outcomes, we can calculate the exact probability that a complier’s scalar outcome falls below a specific threshold, which we call x. From this, we can calculate the narrowest possible confidence interval for this probability at a given level alpha.
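To make the idea of enumerating a hypergeometric distribution concrete, the following minimal sketch computes exact hypergeometric probabilities and inverts an exact test to obtain a confidence set. It is a generic textbook construction under simplifying assumptions, not our procedure for compliers: the setup (a population of `N` units, `K` of which have an outcome below x, and a sample of `n` units containing `k_obs` such outcomes) and all function names are hypothetical.

```python
from math import comb


def hypergeom_pmf(k: int, N: int, K: int, n: int) -> float:
    """P(k of the n sampled units have outcome below x), sampling without replacement."""
    if k < max(0, n - (N - K)) or k > min(n, K):
        return 0.0
    return comb(K, k) * comb(N - K, n - k) / comb(N, n)


def exact_confidence_set(k_obs: int, N: int, n: int, alpha: float = 0.05) -> list:
    """Values of K not rejected at level alpha by an exact test (assumed setup)."""
    kept = []
    for K in range(N + 1):
        p_obs = hypergeom_pmf(k_obs, N, K, n)
        if p_obs == 0.0:
            continue  # the observed count is impossible under this K
        # Enumerate the full support; the p-value sums the probabilities of
        # all counts that are no more likely than the observed one.
        support = range(max(0, n - (N - K)), min(n, K) + 1)
        p_value = sum(
            p for p in (hypergeom_pmf(k, N, K, n) for k in support)
            if p <= p_obs * (1 + 1e-9)
        )
        if p_value > alpha:
            kept.append(K)
    return kept


cs = exact_confidence_set(k_obs=4, N=20, n=10)
print([K / 20 for K in cs])  # candidate shares of outcomes below x
```

Because every probability is computed exactly, the coverage guarantee of such a set does not depend on a large-sample approximation.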

Enumerating every potential outcome leads to long computation times, so for sufficiently large samples we instead take random draws from the hypergeometric distribution. The resulting confidence interval may not be exact, but it still does not rely on a parametric model.
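The trade-off between enumeration and random draws can be illustrated with a small simulation. The sketch below estimates a hypergeometric tail probability by Monte Carlo instead of summing over the full support. It is an assumption-laden illustration, not our implementation: the names `draw_hypergeom` and `mc_tail_prob`, the number of draws, and the seed are all hypothetical choices.

```python
import random


def draw_hypergeom(rng: random.Random, N: int, K: int, n: int) -> int:
    """One hypergeometric draw: sample n of N units without replacement
    and count how many fall among the first K (the 'below x' units)."""
    return sum(1 for i in rng.sample(range(N), n) if i < K)


def mc_tail_prob(k_obs: int, N: int, K: int, n: int,
                 draws: int = 10_000, seed: int = 0) -> float:
    """Monte Carlo estimate of P(count <= k_obs) under the assumed setup."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    hits = sum(draw_hypergeom(rng, N, K, n) <= k_obs for _ in range(draws))
    return hits / draws


estimate = mc_tail_prob(k_obs=4, N=20, K=8, n=10)
print(round(estimate, 3))
```

The estimate carries simulation error that shrinks as the number of draws grows, which is why an interval built this way is approximate rather than exact.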

Because computation time is long, we used the supercomputer to create a directory of every possible confidence interval for sample sizes less than 100. This is possible because the parameters needed to calculate our confidence interval can take only a finite number of values at these sample sizes. I uploaded this directory to a website I made for this project, nonparametriciv.com. Users can go to this website and, using parameters calculated from their own data set, immediately obtain a confidence interval from our directory. This allows people to use our method without the long computing time that is normally required.

Further research is needed to determine whether confidence intervals calculated using our nonparametric methods are narrower than those calculated using traditional parametric methods. However, we showed mathematically why our methods can be relied upon when sample sizes are small and in other cases where parametric methods cannot be.