Breunig, Christoph (Emory University)
Haan, Peter (DIW Berlin and FU Berlin)
We consider the problem of regression with selectively observed covariates in a nonparametric framework. Our approach relies on instrumental variables that explain variation in the latent covariates but have no direct effect on selection. The regression function of interest is shown to be a weighted version of the observed conditional expectation, where the weighting function is a fraction of selection probabilities. Nonparametric identification of the fractional probability weight (FPW) function is achieved via a partial completeness assumption. We provide primitive functional form assumptions under which partial completeness holds. The identification result is constructive for the FPW series estimator. We derive its rate of convergence and its pointwise asymptotic distribution. In both cases, the asymptotic performance of the FPW series estimator does not suffer from the inverse problem that arises from the nonparametric instrumental variable approach. In a Monte Carlo study, we analyze the finite sample properties of our estimator and demonstrate the usefulness of our method in analyses based on survey data. We also compare our approach to inverse probability weighting, which can be used as an alternative for unconditional moment estimation. We present two empirical applications. In the first, we estimate the association between income and health using SHARE survey data linked with administrative pension information, with pension entitlements as an instrument. In the second, we revisit the question of how income affects the demand for housing, based on data from the Socio-Economic Panel Study, using regional income information at the residential block level as an instrument. In both applications we show that income is selectively missing, and we demonstrate that standard methods that do not account for the nonrandom selection process lead to significantly biased estimates for individuals with low income.
selection model; instrumental variables; fractional probability weighting; nonparametric identification; partial completeness; incomplete data; series estimation; income distribution; health
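The abstract's statement that the regression function is a weighted version of the observed conditional expectation can be illustrated with a short Bayes-rule sketch. The notation here is an assumption and is not taken from the paper: Y is the outcome, X^* the latent covariate, and \Delta = 1 indicates that X^* is observed. The selection probabilities are assumed to be strictly positive.

```latex
% Illustrative notation (assumed, not the authors'): Y outcome, X^* latent
% covariate, \Delta selection indicator with \Delta = 1 when X^* is observed.
\begin{align*}
f_{Y \mid X^*}(y \mid x)
  &= f_{Y \mid X^*, \Delta = 1}(y \mid x)\,
     \frac{P(\Delta = 1 \mid X^* = x)}{P(\Delta = 1 \mid Y = y,\, X^* = x)}, \\
E[Y \mid X^* = x]
  &= E\!\left[\, Y\,
     \frac{P(\Delta = 1 \mid X^*)}{P(\Delta = 1 \mid Y,\, X^*)}
     \;\Big|\; X^* = x,\ \Delta = 1 \right].
\end{align*}
```

Under this reading, the ratio of the two selection probabilities plays the role of the fractional probability weight. It cannot be estimated directly because X^* is unobserved whenever \Delta = 0, which is presumably where the instruments and the partial completeness assumption come in.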