It is shown that whenever g is Lipschitz, though not necessarily differentiable, the posterior distribution of g(theta) and the bootstrap distribution of theta_n coincide asymptotically. One implication is that Bayesians can interpret bootstrap inference for g(theta) as approximately valid posterior inference in a large sample. Another implication, built on known results about bootstrap inconsistency, is that credible sets for a nondifferentiable parameter g(theta) cannot be presumed to be approximately valid confidence sets (even when this relation holds true for theta).

This working paper is an updated version of W16/17.

This forestalling hampered an attempt by HMRC (the UK’s tax authority) to estimate the revenue effects of the tax rise (HMRC, 2012) using incomplete data from the first year of the higher tax rate (2010–11). This analysis used an aggregate difference-in-difference approach. In this paper we update this analysis, using more complete data on the first year following the reform (2010–11) and an additional year of data (2011–12) that was unavailable when HMRC conducted their analysis. Using a similar method to HMRC (2012), we estimate an elasticity of around 0.31 based on the response in 2010–11, and 0.83 based on the response in 2011–12.

We next refine HMRC (2012)’s methodology for estimating how much of the forestalled income came from 2010–11 and how much from subsequent years. We find that, all else equal, HMRC's method for estimating from which years forestalled income came – which suggests that around 70% came from 2010–11 – is likely to lead to overestimates of how much came from these initial post-reform years, and hence to underestimates of the underlying taxable income elasticity. An alternative method that better accounts for these issues suggests around 45% was unwound in 2010–11, and around one-sixth unwound in 2011–12, implying an elasticity of 0.58 based on the response in 2010–11 and 0.95 based on the response in 2011–12. These would both imply negative revenues from the increase in the top tax rate to 50%.

Finally, we show the sensitivity of HMRC (2012)’s estimates to changes in the specification of the model used to estimate the counterfactual incomes of the group affected by the 50% tax rate. We find that relatively small changes to the specification yield very different results, with higher taxable income elasticity estimates frequently in excess of unity. The range of reasonable central estimates that the UK’s Office for Budget Responsibility could use to estimate the revenue effects of changes to the UK’s top income tax rate is therefore wide.

However, it is important to sound three notes of caution here. First, if individuals anticipated (correctly) the 50% rate being reduced in later years (or were able to respond to the announcement made towards the end of the 2011–12 tax year that it would be reduced to 45% in 2013–14), they may also have delayed receiving income. We still obtain higher taxable income elasticity estimates than HMRC (2012) when we assume that individuals were able to delay as much income from 2011–12 to 2013–14 as they were able to bring forward from 2011–12 to 2009–10, but it may be the case that delaying income is easier than bringing it forward. If this were the case, more of the overall response to the 50% tax rate may represent temporary timing effects as opposed to underlying response, which would imply that the estimates of the underlying taxable income elasticity may be overestimates. Second, some behavioural responses, such as additional occupational pension contributions, or retention of income in businesses, while reducing income tax revenues in the short term, generate at least some revenue in the longer term. Third, the counterfactual is very imprecisely estimated, meaning that the estimates from the different specifications are not statistically significantly different from each other, or indeed from zero. The central estimates of HMRC (2012) are therefore still very much within the margin of error of our estimates. There is therefore still significant uncertainty in both directions around HMRC’s estimates of the taxable income elasticity of high earners, and hence the revenue effects of the 50% rate.

In this paper we use panel data methods in an attempt to strip out the impact of forestalling, and estimate the underlying taxable income elasticity of those affected by the 50% tax rate, and thus the revenue effect of the reform. In particular, we develop a new method of correcting for forestalling by averaging income over the (three-year) period during which forestalling is likely to have taken place. This approach yields an estimate of the taxable income elasticity of 0.31, lower than earlier estimates by HMRC (2012) based on the same reform (but a different method), and consistent with the 50% rate raising around £1 billion a year (relative to the current 45% rate).

Three things are worth noting, however. First, estimated elasticities are very sensitive to changes in specification, and to the inclusion or exclusion of a small number of individuals with extremely high (and volatile) incomes. Second, at the same time the 50% rate was introduced, restrictions were placed on the amount of pension contributions some taxpayers could deduct from their taxable incomes (in advance of more general restrictions in place from 2011–12). Those forced to reduce their pension contributions (or unable to increase them) would have higher taxable income than they would have if these restrictions were not put in place: this may downwardly bias our estimate of the taxable income elasticity. Indeed, our estimates of the elasticity of broad income (before personal pension contributions are deducted) are higher – 0.71 using the same method. Finally, it is worth noting that the panel approach adopted here, by focusing on individuals who are observed both pre- and post-reform, excludes some forms of response (such as migration). Taken together, these three issues imply that higher figures for the taxable income elasticity (including those in HMRC, 2012) are plausible. Thus it is also plausible that the re-introduction of the 50% rate could reduce revenues somewhat: an elasticity of 0.71 would imply a reduction of around £1.75 billion if none of the lost income tax or NICs revenues were recouped from other tax bases or in other time periods.

We also explore in more detail the nature of the response to the 50% tax rate. Two findings stand out. First, when we restrict our sample to those just around the £150,000 threshold, we consistently estimate the taxable income elasticity to be between 0.1 and 0.2, implying that behavioural response to the higher tax rate is concentrated among those with the very highest incomes. Second, we find little evidence that individuals responded to the higher tax rate by increasing use of tax deductions. However, this must be a tentative conclusion as not all deductible items are recorded on the tax return data available. Particularly relevant in this context is the possibility that owners of closely-held incorporated businesses chose to respond to the 50% tax rate by retaining income in their business, for extraction at a later date (perhaps in the form of capital gains rather than dividends). Analysis of such responses would require the linking of personal and corporate income tax returns, which is a subject for future research.

intergenerational income persistence.

may improve life expectancy, but also impose serious short-term risks; reducing class sizes may improve performance of good students, but not help weaker ones, or vice versa. Quantile regression methods can help to explore these heterogeneous effects. Some recent developments in quantile regression methods are surveyed below.


One of the main objectives of empirical analysis of experiments and quasi-experiments is to inform policy decisions that determine the allocation of treatments to individuals with different observable covariates. We study the properties and implementation of the Empirical Welfare Maximization (EWM) method, which estimates a treatment assignment policy by maximizing the sample analog of average social welfare over a class of candidate treatment policies. The EWM approach is attractive in terms of both statistical performance and practical implementation in realistic settings of policy design. Common features of these settings include: (i) feasible treatment assignment rules are constrained exogenously for ethical, legislative, or political reasons, (ii) a policy maker wants a simple treatment assignment rule based on one or more eligibility scores in order to reduce the dimensionality of individual observable characteristics, and/or (iii) the proportion of individuals who can receive the treatment is a priori limited due to a budget or a capacity constraint. We show that when the propensity score is known, the average social welfare attained by EWM rules converges at least at n^(-1/2) rate to the maximum obtainable welfare uniformly over a minimally constrained class of data distributions, and this uniform convergence rate is minimax optimal. We examine how the uniform convergence rate depends on the richness of the class of candidate decision rules, the distribution of conditional treatment effects, and the lack of knowledge of the propensity score. We offer easily implementable algorithms for computing the EWM rule and an application using experimental data from the National JTPA Study.
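The core estimation step is easy to sketch. Below is a minimal, hypothetical Python illustration of the EWM idea for the simplest candidate class, one-dimensional threshold rules with a known propensity score; the function and variable names are ours, not the paper's, and a real implementation would handle richer rule classes and capacity constraints.

```python
import numpy as np

def ewm_threshold(X, T, Y, e):
    """Empirical Welfare Maximization over threshold rules 'treat iff X >= t',
    assuming a known, constant propensity score e.
    (Illustrative sketch of the EWM idea, not the paper's implementation.)"""
    # Horvitz-Thompson contribution of each unit to welfare if it were treated
    g = T * Y / e - (1 - T) * Y / (1 - e)
    # Candidate thresholds: every observed covariate value, plus "treat nobody"
    candidates = np.concatenate([np.unique(X), [np.inf]])
    # Sample-analog welfare of each candidate rule
    welfare = [np.mean(g * (X >= t)) for t in candidates]
    best = int(np.argmax(welfare))
    return candidates[best], welfare[best]
```

For example, with a known propensity score of 0.5 and a treatment that helps only units with X at or above some cutoff, the maximizer of the empirical welfare recovers that cutoff.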


This April 2017 version is an updated version of the January 2017 version. The original version of the working paper is available here.

only under dense graph sequences.



agents present choice options based on quality, but as agents of health authorities also consider their financial implications.

an incomplete model of English auctions, improving on the pointwise bounds available until now. Application of many of the results of the paper requires no familiarity with random set theory.


We then apply this result to derive a Gaussian multiplier bootstrap procedure for constructing honest confidence bands for nonparametric density estimators (this result can be applied in other nonparametric problems as well). An essential advantage of our approach is that it applies generically even in those cases where the limit distribution of the supremum of the studentized empirical process does not exist (or is unknown). This is of particular importance in problems where resolution levels or other tuning parameters have been chosen in a data-driven fashion, which is needed for adaptive constructions of the confidence bands. Finally, of independent interest is our introduction of a new, practical version of Lepski’s method, which computes the optimal, non-conservative resolution levels via a Gaussian multiplier bootstrap method.

This is an updated version of W15/22 New joints: private providers and rising demand in the English National Health Service.

We apply our method to analyze the distributional impact of insurance coverage on health care utilization and to provide a distributional decomposition of the racial test score gap. Our analysis generates interesting new findings and complements previous analyses that focused on mean effects only. In both applications, the outcomes of interest are discrete, rendering standard inference methods invalid for obtaining uniform confidence bands for quantile and quantile effect functions.

moment conditions continue to hold when one first step component is incorrect. Locally robust moment conditions also have smaller bias that is flatter as a function of first step smoothing, leading to improved small sample properties. Series first step estimators confer local robustness on any moment conditions and are doubly robust for affine moments, in the direction of the series approximation. Many new locally and doubly robust estimators are given here, including for economic structural models. We give simple asymptotic theory for estimators that use cross-fitting in the first step, including machine learning.

the young.

cycle, using a survey dataset from rural Tanzania. We find that adverse shocks during teenage years increase the probability of early marriages and early fertility among women.



The original version of the working paper, posted on 01 April, 2016, is available here.

Both the Smith Commission Agreement and the UK Government’s subsequent Command Paper, ‘An Enduring Settlement’, recognised that the devolution of fiscal powers has to be accompanied by the development of a new Fiscal Framework for Scotland.

Without such a framework there could be no fiscal devolution. It is essential in order to set out rules such as: how the Scottish Government’s block grant will be calculated in light of its new fiscal powers; what level of borrowing powers Scotland will have to enable it to deal with the additional economic risks and revenue volatility that it will face; the extent and scope of fiscal rules governing Scottish Government deficits and debt; arrangements for independent fiscal scrutiny, including fiscal forecasting; and arrangements for governing the increasingly complex interactions between Scottish and UK fiscal policy, including dispute resolution.

The Fiscal Framework is not part of the Scotland Bill: it is instead an agreement between the UK and Scottish governments (and therefore does not have the same legal standing as the Bill). It was finally published on 25 February 2016 after many months of negotiations between the two governments. The process of reaching agreement was protracted, and there were a number of contentious areas. But it seems the most significant area of disagreement was how the Scottish Government’s block grant should be adjusted to reflect its new powers.

The Smith Commission Agreement established that Scotland’s underlying block grant funding would continue to be determined by the Barnett Formula. But the Barnett-determined block grant would then have to be adjusted to reflect the new powers. On the one hand, the grant would have to be reduced to reflect the transfer of tax revenues from the UK to the Scottish Government, while on the other, an addition would need to be made to reflect the transfer of new welfare spending responsibilities to the Scottish Government.

The Smith Commission Agreement also established a number of high-level principles which it felt the Fiscal Framework should adhere to, and which were expected to govern the development of a proposal to adjust Scotland’s block grant. But, as we showed in our previous report, it is not possible to design a method for adjusting Scotland’s block grant that meets all of the Smith Commission principles simultaneously.

This inconsistency between the Smith principles was the main cause of the protracted negotiations between the two governments, and for several months it seemed likely to undermine the progress of the Scotland Bill. Each government interpreted the principles somewhat differently and chose to prioritise them differently, with the result that each favoured an alternative approach to adjusting Scotland’s block grant. Compromise was finally reached in February 2016, with an agreement on how to adjust the block grant for the next five years. While the mechanism chosen is complex and seems to blend elements of the UK and Scottish governments’ preferred approaches, ultimately it is the Scottish Government’s approach that will determine the block grant available to Scotland during this period. After five years, an independent assessment will be carried out and negotiations will take place on how to adjust the block grant in the years beyond 2022.

This report reviews and appraises the Fiscal Framework Agreement, with a particular focus on this issue of block grant adjustment.

*The work was carried out jointly with authors at the ESRC Centre on Constitutional Change, the hub for research on the UK’s changing constitutional relationships. Its fellows examine how the evolving relationships between governments and parliaments in London, Edinburgh, Cardiff, Belfast and Brussels impact on the polity, economy and society of the UK and its component nations.*

generous federal aid.

Moreover, the data suggest that the wife and the husband retire at the same time for a nonnegligible fraction of couples. Our approach takes as a starting point a stylized economic model that leads to a univariate generalized accelerated failure time model. The covariates of that generalized accelerated failure time model act as utility-flow shifters in the economic model. We introduce simultaneity by allowing the utility flow in retirement to depend on the retirement status of the spouse. The econometric model is then completed by assuming that the observed outcome is the Nash bargaining solution in that simple economic model. The advantage of this approach is that it includes independent realizations from the generalized accelerated failure time model as a special case, and deviations from this special case can be given an economic interpretation. We illustrate the model by studying the joint retirement decisions in married couples using the Health and Retirement Study. We provide a discussion of relevant identifying variation and estimate our model using indirect inference. The main empirical finding is that the simultaneity seems economically important. In our preferred specification the indirect utility associated with being retired increases by approximately 5% when one's spouse retires. The estimated model also predicts that the marginal effect of a change in the husbands' pension plan on wives' retirement dates is about 3.3% of the direct effect on the husbands'.

less entry into technologies regardless of a firm’s size.

The critical level is by construction smaller (in finite sample) than the one used when projecting confidence regions designed to cover the entire parameter vector. Hence, our confidence interval is weakly shorter than the projection of established confidence sets (Andrews and Soares, 2010), if one holds the choice of tuning parameters constant. We provide simple conditions under which the comparison is strict. Our inference method controls asymptotic coverage uniformly over a large class of data-generating processes. Our assumptions and those used in the leading alternative approach (a profiling-based method) are not nested. We explain why we employ some restrictions that are not required by other methods and provide examples of models for which our method is uniformly valid but profiling-based methods are not.

satisfy standard norm bounds, and (3) functions with unbounded domains. In all three cases we provide two kinds of results, compact embedding and closedness, which together allow one to show that parameter spaces defined by a ||·||

Also available: Executive Summary

Using differential geometry and functional delta methods, we establish that the estimated sorted effects are consistent for the true sorted effects, and derive asymptotic normality and bootstrap approximation results, enabling construction of pointwise confidence bands (pointwise with respect to percentile indices). We also derive functional central limit theorems and bootstrap approximation results, enabling construction of simultaneous confidence bands (simultaneous with respect to percentile indices). The derived statistical results in turn rely on establishing Hadamard differentiability of the multivariate sorting operator, a result of independent mathematical interest.
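Pointwise, the sorted-effects object itself is simple: the sorted effect at percentile index u is the u-quantile of the estimated partial effects across the population, with the paper's contribution lying in the inference theory. A hypothetical one-function sketch (names are ours):

```python
import numpy as np

def sorted_effects(delta, u):
    """Sorted partial effects: return the u-quantile(s) of the estimated
    partial effects `delta` evaluated across the population.
    (Illustrative sketch only; confidence bands need the bootstrap theory.)"""
    return np.quantile(np.asarray(delta, dtype=float), u)
```

With estimated effects [3, 1, 2] and percentile indices 0, 0.5 and 1, this returns 1, 2 and 3, i.e. the effects sorted in increasing order.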

Click here to view accompanying sample size calculators for this paper.

Leading important special cases encompassed by the framework we study include: (i) Tests of shape restrictions for infinite dimensional parameters; (ii) Confidence regions for functionals that impose shape restrictions on the underlying parameter; (iii) Inference for functionals in semiparametric and nonparametric models defined by conditional moment (in)equalities; and (iv) Uniform inference in possibly nonlinear and severely ill-posed problems.

Supplementary material for this paper is available here.

Supplementary material for this paper is available here.

We also analyze the properties of fixed effects estimators of functions of the data, parameters and individual and time effects including average partial effects. Here, we uncover that the incidental parameter bias is asymptotically of second order, because the rate of the convergence of the fixed effects estimators is slower for average partial effects than for model parameters. The bias corrections are still effective to improve finite-sample properties.

View the supplementary document for this paper here.

For the case of discretely-valued covariates we present analog estimators and characterize their large sample properties. When the number of time periods (*T*) exceeds the number of random coefficients (*P*), identification is regular, and our estimates are *√N*-consistent. When *T* = *P*, our identification results make special use of the subpopulation of stayers - units whose regressor values change little over time - in a way which builds on the approach of Graham and Powell (2012). In this just-identified case we study asymptotic sequences which allow the frequency of stayers in the population to shrink with the sample size. One purpose of these “discrete bandwidth asymptotics” is to approximate settings where covariates are continuously-valued and, as such, there is only an infinitesimal fraction of exact stayers, while keeping the convenience of an analysis based on discrete covariates. When the mass of stayers shrinks with *N*, identification is irregular and our estimates converge at a slower than *√N* rate, but continue to have limiting normal distributions.

We apply our methods to study the effects of collective bargaining coverage on earnings using the National Longitudinal Survey of Youth 1979 (NLSY79). Consistent with prior work (e.g., Chamberlain, 1982; Vella and Verbeek, 1998), we find that using panel data to control for unobserved worker heterogeneity results in sharply lower estimates of union wage premia. We estimate a median union wage premium of about 9 percent but, in a more novel finding, with substantial heterogeneity across workers. The 0.1 quantile of union effects is insignificantly different from zero, whereas the 0.9 quantile effect is over 30 percent. Our empirical analysis further suggests that, on net, unions have an equalizing effect on the distribution of wages.

Supporting material is available in a supplementary appendix here.

The amendments to the initial proposed reforms were made to make the tax change more ‘progressive’. We find that, measured as a proportion of income or expenditure, poorer households did gain most from the amendments, but that the cash-terms gains were much larger for households with high levels of income and expenditure. In other words, the reduction in tax take from the amendments was weakly targeted at poorer households; even simple universal cash transfers would have been much more beneficial to poor households. This shows the distributional case for zero rates of VAT on goods like food is weak – especially given the growing sophistication of cash transfer programmes, particularly in middle-income countries.

We then examine the efficiency implications of Mexico’s VAT rate structure. We find that deviations from uniformity have a notable effect on spending patterns, but very little effect on aggregate welfare and economic efficiency as estimated by a standard QUAIDS model of consumer demand. We then argue that economic informality may actually provide an efficiency reason for lower rates of tax on goods like food for which informal production and transactions seem to be much more prevalent. This may turn the typical arguments about differential VAT rates on their head. Rather than being justifiable on distributional grounds, but entailing an efficiency cost, the reverse may actually be true.

A version of this paper appeared in Spanish in the December 2014 issue of *Panorama Social*, available here.

Technical supporting material is available in a supplementary appendix here.


Supplementary material for this paper is available here.


This paper will be presented at the 'Are you prepared for retirement?' conference this afternoon.

Supplementary material for this paper is available here.


These findings will be presented at a briefing on 9 September, alongside several other pieces of work which shed light on how financial preparedness for retirement differs across cohorts and important differences within cohorts.

*This working paper was updated in May 2015.*

In this note, we point to a simple explanation that is fully consistent with rational behaviour on the part of Indian farmers. In computing the return on cows and buffaloes, the authors used data from a single year. Cows are assets whose return varies through time. In drought years, when fodder is scarce and expensive, milk production is lower and profits are low. In non-drought years, when fodder is abundant and cheaper, milk production is higher and profits can be considerably higher. The return on cows and buffaloes, like that of many stocks traded on Wall Street, is positive in some years and negative in others. We report evidence from three years of data on the return on cows and buffaloes in the district of Anantapur and show that in one of the three years returns are very high, while in drought years they are similar to the figures obtained by Anagol, Etang and Karlan (2013).

This paper is also published as part of the NBER Working Paper series, no. 20304.

We also analyze the properties of fixed effects estimators of functions of the data, parameters and individual and time effects including average partial effects. Here, we uncover that the incidental parameter bias is asymptotically of second order, because the rate of the convergence of the fixed effects estimators is slower for average partial effects than for model parameters. The bias corrections are still useful to improve finite-sample properties.


This paper sets out the methodology, assumptions, and modelling specifications used to produce the report *The changing face of retirement* by Emmerson, Heald and Hood (2014), which aims to shed some light on how the demographic and financial circumstances of this group will change.

This paper is also published as part of the Inter-American Development Bank Working Paper series No. IDB-WP-527.

The commands clrbound, clr2bound, and clr3bound provide bound estimates that can be used directly for estimation or to construct asymptotically valid confidence sets. clrtest performs an intersection bound test of the hypothesis that a collection of lower intersection bounds is no greater than zero. The command clrbound provides bound estimates for one-sided lower or upper intersection bounds on a parameter, while clr2bound and clr3bound provide two-sided bound estimates based on both lower and upper intersection bounds. clr2bound uses Bonferroni’s inequality to construct two-sided bounds that can be used to perform asymptotically valid inference on the identified set or the parameter of interest, whereas clr3bound provides a generally tighter confidence interval for the parameter by inverting the hypothesis test performed by clrtest. More broadly, inversion of this test can also be used to construct confidence sets based on conditional moment inequalities as described in Chernozhukov et al. (2013). The commands include parametric, series, and local linear estimation procedures, and can be installed from within Stata by typing “ssc install clrbound”.

the first stage and then the preference parameters in the second stage based on Manski’s (1975, 1985) maximum score estimator using the choice data and first stage estimates. This setting can be extended to maximum score estimation with nonparametrically generated regressors. The paper establishes consistency and derives the rate of convergence of the two-stage maximum score estimator. Moreover, the paper also provides sufficient conditions under which the two-stage estimator is asymptotically equivalent in distribution to the corresponding single-stage estimator that assumes the first stage input is known. The paper also presents some Monte Carlo simulation results for the finite-sample behavior of the two-stage estimator.

We propose two new specification tests, denoted Tests RS and RC, that achieve uniform asymptotic size control and dominate Test BP in terms of power in any finite sample and in the asymptotic limit. Test RC is particularly convenient to implement because it requires little additional work beyond the confidence set construction. Test RS requires a separate procedure to compute, but has the best power. The separate procedure is computationally easier than confidence set construction in typical cases.

In the second part of the paper, we present a generalization of the treatment effect framework to a much richer setting, where possibly a continuum of target parameters is of interest and the Lasso-type or post-Lasso type methods are used to estimate a continuum of high-dimensional nuisance functions. This framework encompasses the analysis of local treatment effects as a leading special case and also covers a wide variety of classical and modern moment-condition problems in econometrics. We establish a functional central limit theorem for the continuum of the target parameters, and also show that it holds uniformly in a wide range of data-generating processes *P*, with continua of approximately sparse nuisance functions. We also establish validity of the multiplier bootstrap for resampling the first order approximations to the standardized continuum of the estimators, and also establish uniform validity in *P*. We propose a notion of the functional delta method for finding limit distribution and multiplier bootstrap of the smooth functionals of the target parameters that is valid uniformly in *P*. Finally, we establish rate and consistency results for continua of Lasso or post-Lasso type methods for estimating continua of the (nuisance) regression functions, also providing practical, theoretically justified penalty choices. Each of these results is new and could be of independent interest.

A supplement to this paper can be downloaded here.

These technical tools allow us to contribute to the series literature, specifically the seminal work of Newey (1997), as follows. First, we weaken considerably the condition on the number k of approximating functions used in series estimation from the typical k²/n → 0 to k/n → 0, up to log factors, which was available only for spline and local polynomial partition series before. Second, under the same weak conditions we derive L2 rates and pointwise central limit theorem results when the approximation error vanishes. Under an incorrectly specified model, i.e. when the approximation error does not vanish, analogous results are also shown. Third, under stronger conditions we derive uniform rates and functional central limit theorems that hold whether or not the approximation error vanishes. That is, we derive the strong approximation for the entire estimate of the nonparametric function. Finally, we derive uniform rates and inference results for linear functionals of interest of the conditional expectation function such as its partial derivative or conditional average partial derivative.

We then apply this result to derive a Gaussian multiplier bootstrap procedure for constructing honest confidence bands for nonparametric density estimators (this result can be applied in other nonparametric problems as well). An essential advantage of our approach is that it applies generically even in those cases where the limit distribution of the supremum of the studentized empirical process does not exist (or is unknown). This is of particular importance in problems where resolution levels or other tuning parameters have been chosen in a data-driven fashion, which is needed for adaptive constructions of the confidence bands. Furthermore, our approach is asymptotically honest at a polynomial rate - namely, the error in coverage level converges to zero at a fast, polynomial speed (with respect to the sample size). In sharp contrast, the approach based on extreme value theory is asymptotically honest only at a logarithmic rate - the error converges to zero at a slow, logarithmic speed. Finally, of independent interest is our introduction of a new, practical version of Lepski's method, which computes the optimal, non-conservative resolution levels via a Gaussian multiplier bootstrap method.
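As a rough illustration of the multiplier bootstrap step (not the paper's exact construction; the grid, studentization, and data-driven resolution levels are all elided), the critical value for the supremum of the process can be simulated as in the following sketch, where all names are ours:

```python
import numpy as np

def multiplier_bootstrap_sup(F, level, n_boot=2000, rng=None):
    """Gaussian multiplier bootstrap for the supremum of an empirical process.

    F is an (n, m) array: row i holds the (centred, studentized) influence
    terms of observation i evaluated on an m-point grid. The sup of the
    process is resampled as sup_x | n^{-1/2} * sum_i xi_i F[i, x] | with
    i.i.d. standard normal multipliers xi_i, and the bootstrap critical
    value is the `level` quantile of the resampled suprema.
    (Generic sketch of the procedure described above.)"""
    rng = np.random.default_rng(0) if rng is None else rng
    n = F.shape[0]
    xi = rng.standard_normal((n_boot, n))      # one row of multipliers per draw
    sups = np.max(np.abs(xi @ F) / np.sqrt(n), axis=1)
    return float(np.quantile(sups, level))
```

A confidence band is then formed by inflating the estimator by this critical value times the pointwise standard error; by construction the critical value increases with the nominal level.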
http://zippy.ifs.org.uk/publications/7031

*(A typo on page 27 that erroneously resulted in the OLS estimator instead of the 2SLS estimator was corrected in July 2015).*

Supplementary material for this paper is available here.

Supplementary material relating to this working paper can be viewed here.

This paper is forthcoming in The Journal of Multivariate Analysis.

An online appendix to accompany this publication is available here.

The main attractive feature of our method is that it allows for imperfect selection of the controls and provides confidence intervals that are valid uniformly across a large class of models. In contrast, standard post-model selection estimators fail to provide uniform inference even in simple cases with a small, fixed number of controls. Thus our method resolves the problem of uniform inference after model selection for a large, interesting class of models. We also present a simple generalisation of our method to a fully heterogeneous model with a binary treatment variable. We illustrate the use of the developed methods with numerical simulations and an application that considers the effect of abortion on crime rates.

A supplement to this article, which outlines theoretical properties underpinning the methodology and provides a proof of the theorem, can be viewed here.

This article is accompanied by a web appendix in which we present omitted discussions, an algorithm to implement the proposed method for the sharp RSS and proofs for the main results.

This research was funded by the Nuffield Foundation.

As part of developing the main results, we introduce distribution regression as a comprehensive and flexible tool for modelling and estimating the *entire* conditional distribution. We show that distribution regression encompasses the Cox duration regression and represents a useful alternative to quantile regression. We establish functional central limit theorems and bootstrap validity results for the empirical distribution regression process and various related functionals.
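A minimal sketch of the distribution regression idea, under illustrative assumptions: for each threshold t on a grid, run a logit of the indicator 1{Y ≤ t} on the covariates, giving threshold-specific coefficients β(t) that trace out the entire conditional distribution F(t|x) = Λ(x'β(t)). The bare-bones Newton-Raphson logit fitter below stands in for any standard binary regression routine.

```python
import numpy as np

def logit_fit(X, z, n_iter=25):
    """Logistic regression of a binary outcome z on X via Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        W = p * (1 - p)
        # Newton step: (X'WX)^{-1} X'(z - p), with a tiny ridge for stability.
        beta += np.linalg.solve(X.T @ (W[:, None] * X) + 1e-8 * np.eye(X.shape[1]),
                                X.T @ (z - p))
    return beta

def distribution_regression(y, X, grid):
    """Estimate F(t|x) = Lambda(x'beta(t)) at each threshold t in `grid`
    by a separate logit of 1{y <= t} on X."""
    return np.array([logit_fit(X, (y <= t).astype(float)) for t in grid])
```

If y = x + ε with logistic ε, then F(t|x) = Λ(t − x), so with design [1, x] the fitted coefficients at threshold t should be close to (t, −1).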

This is a revision of CWP05/12 and CWP09/09

This paper is supplemented by an online appendix which can be viewed **here**.


This working paper is supplemented by an online appendix which can be viewed **here**.

2SLS has the advantage of providing an easy-to-compute point estimator of a slope coefficient which can be interpreted as a local average treatment effect (LATE). However, the 2SLS estimator does not measure the value of other useful treatment effect parameters without invoking untenable restrictions.

The nonparametric instrumental variable (IV) model has the advantage of being weakly restrictive, so more generally applicable, but it usually delivers set identification. Nonetheless it can be used to consistently estimate bounds on many parameters of interest including, for example, average treatment effects. We illustrate using data from Angrist & Evans (1998) and study the effect of family size on female employment.
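To fix ideas on bounding average treatment effects under set identification, the sketch below computes the classic worst-case (no-assumption) bounds for an outcome known to lie in a bounded interval. This is the textbook Manski construction, shown as the simplest member of the family of bounds that instrumental variable restrictions like those in the paper then tighten; it is not the paper's IV bounds.

```python
import numpy as np

def ate_worst_case_bounds(y, d, y_lo, y_hi):
    """Worst-case bounds on the average treatment effect E[Y(1) - Y(0)]
    when the outcome is known to lie in [y_lo, y_hi] and no other
    assumptions are imposed (classic Manski bounds)."""
    p1 = d.mean()                      # P(D = 1)
    e1 = y[d == 1].mean()              # E[Y | D = 1]
    e0 = y[d == 0].mean()              # E[Y | D = 0]
    # Bounds on E[Y(1)] and E[Y(0)]: fill the unobserved arm with y_lo / y_hi.
    ey1 = (e1 * p1 + y_lo * (1 - p1), e1 * p1 + y_hi * (1 - p1))
    ey0 = (e0 * (1 - p1) + y_lo * p1, e0 * (1 - p1) + y_hi * p1)
    return ey1[0] - ey0[1], ey1[1] - ey0[0]
```

For a binary outcome these bounds always have width one, which is exactly why additional restrictions (instruments, monotonicity) are needed to obtain informative sets.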

This October 2015 version corrects an error in the paper, as explained in footnote 1. The original version of the working paper is available here.

The current version of this working paper was published in January 2014 and replaces an earlier version originally published in March 2013.

We propose two hypothesis tests that use the infimum of the sample criterion function over the parameter space as the test statistic, together with two different critical values. We obtain two main results. First, we show that the two tests we propose are asymptotically size-correct in a uniform sense. Second, we show our tests are more powerful than the test that checks whether the confidence set for the parameters of interest is empty or not.

This report provides findings from a series of focus groups investigating how people think about household expenditure and what issues people may have in reporting household expenditure in a social survey context. The information collected in the focus groups will be used as a starting point for designing new questions on household spending for use in future social surveys. Subsequent stages of work will include cognitively testing any new questions produced and consulting a panel of experts over the proposed questions.

This project was funded by the Nuffield Foundation.

This paper is a revised version of CWP13/09.

The main attractive feature of our method is that it allows for imperfect selection of the controls and provides confidence intervals that are valid uniformly across a large class of models. In contrast, standard post-model selection estimators fail to provide uniform inference even in simple cases with a small, fixed number of controls. Thus our method resolves the problem of uniform inference after model selection for a large, interesting class of models. We illustrate the use of the developed methods with numerical simulations and an application to the effect of abortion on crime rates.

This paper is a revision of CWP42/11.

The paper focuses on household ‘tax planning’ in the context of tax reliefs for retirement saving in the United Kingdom. It examines whether take-up of retirement saving instruments increases at the higher-rate threshold for income tax, since tax relief is given at the marginal tax rate and should be more attractive to those just above this threshold than to those just below it. It then examines a more complex case where the tax system provides an incentive for pension saving to be done by one member of a couple. Econometric results on these two tests of household responses to complex incentives are obtained from the Family Resources Survey.

This is a revision of CWP09/09.

Various methods have been used to overcome the point identification problem inherent in the linear age-period-cohort model. This paper presents a set-identification result for the model and then considers the use of the maximum-entropy principle as a vehicle for achieving point identification. We present two substantive applications (US female mortality data and UK female labor force participation) and compare the results from our approach to some of the solutions in the literature.

We show that the model delivers set identification of the latent utility functions and we characterize sharp bounds on those functions. We develop easy-to-compute outer regions which in parametric models require little more calculation than what is involved in a conventional maximum likelihood analysis. The results are illustrated using a model which is essentially the parametric conditional logit model of McFadden (1974) but with potentially endogenous explanatory variables and instrumental variable restrictions. The method employed has wide applicability and for the first time brings instrumental variable methods to bear on structural models in which there are multiple unobservables in a structural equation.


This paper presents the findings from two experiments designed to test the hypothesis that individuals' notions of distributive justice are associated with their economic status relative to others within their own society. In the experiments, each participant played a specially designed distribution game. This game allowed us to establish whether and to what extent the participants perceived inequalities owing to differences in productivity rather than luck as just and, hence, not in need of redress. A type of participant that distinguished between inequalities owing to productivity and luck, redressing the latter but not, or to a lesser extent, the former, is said to be subject to an earned endowment effect. Drawing on previous work in both economics and psychology, we hypothesised that the richer members of any society would be more likely to be subject to an earned endowment effect, while the poorer members would be more inclined towards redistribution irrespective of whether the inequality was owing to productivity or luck.

We conducted our first experiment in the UK. We selected unemployed residents of one city to represent low economic status individuals and student and employed residents of the same city to represent relatively high economic status individuals. We found a statistically significant earned endowment effect among the students and employed and no effect among the unemployed. The difference between the unemployed and the others was also statistically significant.

Our second experiment was designed to test the generalizability of the findings from our first. It was conducted in Cape Town, South Africa. Exploiting the fact that Cape Town is home to one of the continent's best universities, we built a participant sample that was highly comparable to the UK sample in many regards. However, the states of employment and unemployment are less distinct in South Africa than in the UK, and a number of interventions are in place to ensure that the student body of the University of Cape Town includes young people from not only rich and middle income but also poorer households. So, in South Africa we chose to rely on responses to a survey question to distinguish between high and low economic status individuals. The findings from this second experiment also supported the hypothesis: among individuals who classified their households as rich, high income or middle income there was a statistically significant earned endowment effect; among individuals who classified their households as poor or low income there was not; and the difference between the two participant types was significant.

We conclude that individuals' notions of distributive justice are associated with their relative economic status within society and that this is a generalizable result.

The paper studies the partial identifying power of structural single equation threshold crossing models for binary responses when explanatory variables may be endogenous. The paper derives the sharp identified set of threshold functions for the case in which explanatory variables are discrete and provides a constructive proof of sharpness. Special attention is given to a widely employed semiparametric shape restriction which requires the threshold crossing function to be a monotone function of a linear index involving the observable explanatory variables. It is shown that the restriction brings great computational benefits, allowing direct calculation of the identified set of index coefficients without calculating the nonparametrically specified threshold function. With the restriction in place, the methods of the paper can be applied to produce identified sets in a class of binary response models with mis-measured explanatory variables.

This is a further revised version (Oct 7th 2011) of CWP23/09 "Single equation endogenous binary response models"

Part of the success of China has been to attract the investment of foreign multinationals. This is also true for a number of other Emerging Economies. Europe's largest multinational firms increasingly file patent applications that are based on inventor activities located in emerging economies, often working alongside inventors from the firm's home country.

The mechanism that generates the high degree of wealth inequality in the model is the dynamic of the "wage ladder" resulting from the search process. There is an important asymmetry between the incremental wage increases generated by on-the-job search (climbing the ladder) and the drop in income associated with job loss (falling off the ladder). The behavior of workers in low-paying jobs is primarily governed by the expectation of wage growth, while the behavior of workers near the top of the distribution is driven by the possibility of job loss.

school system, in particular for individuals originating from homes with low-educated fathers. This study estimates the impact of the reform on criminal behavior, both within the generation directly affected by the reform and among their children. We use census data on everyone born in Sweden between 1945 and 1955 and all their children, merged with individual register data on all convictions between 1981 and 2008. We find a significant inverse effect of the reform on the criminal behavior of men and of sons of fathers who went through the new school system.

Quantile regression (QR) is a principal regression method for analyzing the impact of covariates on outcomes. The impact is described by the conditional quantile function and its functionals. In this paper we develop the nonparametric QR series framework, covering many regressors as a special case, for performing inference on the entire conditional quantile function and its linear functionals. In this framework, we approximate the entire conditional quantile function by a linear combination of series terms with quantile-specific coefficients and estimate the function-valued coefficients from the data. We develop large sample theory for the empirical QR coefficient process, namely we obtain uniform strong approximations to the empirical QR coefficient process by conditionally pivotal and Gaussian processes, as well as by gradient and weighted bootstrap processes.

We apply these results to obtain estimation and inference methods for linear functionals of the conditional quantile function, such as the conditional quantile function itself, its partial derivatives, average partial derivatives, and conditional average partial derivatives. Specifically, we obtain uniform rates of convergence, large sample distributions, and inference methods based on strong pivotal and Gaussian approximations and on gradient and weighted bootstraps. All of the above results are for function-valued parameters, holding uniformly in both the quantile index and in the covariate value, and covering the pointwise case as a by-product. If the function of interest is monotone, we show how to use monotonization procedures to improve estimation and inference. We demonstrate the practical utility of these results with an empirical example, where we estimate the price elasticity function of the individual demand for gasoline, as indexed by the individual unobserved propensity for gasoline consumption.
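The estimation principle underlying QR is that the τ-quantile minimizes the expected Koenker-Bassett check function. The sketch below illustrates this in the simplest scalar case by minimizing the empirical check loss over a grid; in the series framework above, the scalar is replaced by a linear combination of basis functions with quantile-specific coefficients. The grid-search device is purely illustrative, not the estimator used in the paper.

```python
import numpy as np

def check_loss(u, tau):
    """Koenker-Bassett check function rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (u < 0).astype(float))

def quantile_by_check_loss(y, tau, grid):
    """The tau-quantile solves min_b E[rho_tau(Y - b)]; here we minimize
    the empirical check loss over a grid of candidate values b."""
    losses = np.array([check_loss(y - b, tau).mean() for b in grid])
    return grid[losses.argmin()]
```

For standard normal data and τ = 0.75, the minimizer should be close to the theoretical quantile 0.674.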

We examine the "home bias" of knowledge spillovers (the idea that knowledge spreads more slowly over international boundaries than within them) as measured by the speed of patent citations. We present econometric evidence that the geographical localization of knowledge spillovers has fallen over time, as we would expect from the dramatic fall in communication and travel costs. Our proposed estimator controls for correlated fixed effects and censoring in duration models and we apply it to data on over two million patent citations between 1975 and 1999. Home bias is exaggerated in models that do not control for fixed effects. The fall in home bias over time is weaker for the pharmaceuticals and information/communication technology sectors where agglomeration externalities may remain strong.

When consumption goods are indivisible, individuals have to hold enough resources to cross a purchasing threshold. If individuals are liquidity constrained, they are unable to borrow to cross that threshold. Instead, we show that such individuals, even if risk averse, may choose to gamble through playing lotteries to have a chance of crossing the threshold. One implication of this model is that income effects for individuals who choose to play lotteries are likely to be larger than for the general population. This in turn implies that estimates of income effects obtained through the random allocation of lottery winnings are likely to be biased for the broader population who chose not to gamble. Using UK data on lottery wins, other windfalls and durable good purchases, we show that lottery players display higher income effects than non-players, but only amongst those likely to be credit constrained. This is consistent with credit constrained, risk-averse agents gambling in order to cross a purchase threshold and to convexify their budget set.
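A tiny numerical illustration of the convexification argument, with all functional forms assumed for illustration only (square-root utility over divisible consumption, an indivisible good of price 1 yielding a fixed utility bonus of 2): a risk-averse but liquidity-constrained agent below the purchase threshold prefers an actuarially fair lottery to simply holding her wealth.

```python
import numpy as np

def indirect_utility(w, price=1.0, bonus=2.0):
    """Utility over wealth with an indivisible good: sqrt of residual
    consumption, plus a fixed utility bonus if the good can be afforded.
    The jump at w = price makes indirect utility locally convex even
    though sqrt is concave."""
    buy = w >= price
    return np.sqrt(w - price * buy) + bonus * buy

# A liquidity-constrained agent with wealth 0.5 cannot buy the good.
w = 0.5
no_gamble = indirect_utility(w)
# Actuarially fair lottery: stake all of w, win 2w with probability 1/2.
fair_lottery = 0.5 * indirect_utility(2 * w) + 0.5 * indirect_utility(0.0)
```

Here the fair lottery is strictly preferred because winning pushes wealth across the purchase threshold, which is exactly the mechanism the paper exploits.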


We show that the model delivers set, not point, identification of the latent utility functions and we characterize sharp bounds on those functions. We develop easy-to-compute outer regions which in parametric models require little more calculation than what is involved in a conventional maximum likelihood analysis. The results are illustrated using a model which is essentially the parametric conditional logit model of McFadden (1974) but with potentially endogenous explanatory variables and instrumental variable restrictions.

The method employed has wide applicability and for the first time brings instrumental variable methods to bear on structural models in which there are multiple unobservables in a structural equation.

This paper has now been revised and the new version is available as CWP39/11.

This paper is a revised version of cemmap working paper CWP33/07.

I present an application to the study of segregation in school friendship networks, using data from Add Health containing the actual social networks of students in a representative sample of US schools. My results suggest that for white students, the value of a same-race friend decreases with the fraction of whites in the school. The opposite is true for African American students.

The model is used to study how different desegregation policies may affect the structure of the network in equilibrium. I find an inverted u-shaped relationship between the fraction of students belonging to a racial group and the expected equilibrium segregation levels. These results suggest that desegregation programs may decrease the degree of interracial interaction within schools.

Optimal instruments are conditional expectations, and in developing the IV results we also establish a series of new results for LASSO and Post-LASSO estimators of non-parametric conditional expectation functions which are of independent theoretical and practical interest. Specifically, we develop the asymptotic theory for these estimators that allows for non-Gaussian, heteroscedastic disturbances, which is important for econometric applications. By innovatively using moderate deviation theory for self-normalized sums, we provide convergence rates for these estimators that are as sharp as in the homoscedastic Gaussian case under the weak condition that log p = o(n^{1/3}). Moreover, as a practical innovation, we provide a fully data-driven method for choosing the penalty that must be specified to obtain LASSO and Post-LASSO estimates, and establish its asymptotic validity under non-Gaussian, heteroscedastic disturbances.
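For concreteness, the sketch below computes a penalty level of the form λ = 2c√n Φ⁻¹(1 − γ/(2p)) that is standard in this line of work. Note this is only the common penalty level: the fully data-driven rule described above additionally estimates per-regressor penalty loadings from residuals, which we omit, and the constants c and γ here are conventional illustrative choices, not prescriptions from the paper.

```python
import numpy as np
from statistics import NormalDist

def lasso_penalty_level(n, p, gamma=0.05, c=1.1):
    """Rate-optimal LASSO penalty level lambda = 2c sqrt(n) Phi^{-1}(1 - gamma/(2p)).

    The 2p in the quantile unions over the signs of all p score components,
    so the penalty dominates the noise simultaneously across regressors with
    probability roughly 1 - gamma.
    """
    return 2 * c * np.sqrt(n) * NormalDist().inv_cdf(1 - gamma / (2 * p))
```

The level grows only logarithmically in p, which is what permits p much larger than n.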

In this paper we use English school level data from 1993 to 2008 aggregated up to small neighbourhood areas to look at the determinants of the demand for private education in England from the ages of 7 until 15 (the last year of compulsory schooling). We focus on the relative importance of price and quality of schooling. However, there are likely to be unobservable factors that are correlated with private school prices and/or the quality of state schools that also impact on the demand for private schooling, which could bias our estimates. Our long regional and local authority panel data allows us to employ a number of strategies to deal with this potential endogeneity. Because of the likely presence of incidental trends in our unobservables, we employ a double difference system GMM approach to remove both fixed effects and incidental trends. We find that the demand for private schooling is inversely related to private school fees as well as the quality of state schooling in the local area at the time families were making key schooling choice decisions at the ages of 7, 11 and 13. We estimate that a one standard deviation increase in the private school day fee when parents/students are making these key decisions reduces the proportion attending private schools by around 0.33 percentage points, which equates to an elasticity of around -0.26. This estimate is only significant for choices at age 7 (but the point estimates are very similar at the ages of 11 and 13). At age 11 and age 13, an increase in the quality of local state secondary schools reduces the probability of attending private schools. At age 11, a one standard deviation increase in state school quality reduces participation in private schools by 0.31 percentage points, which equates to an elasticity of -0.21. The effect at age 13 is slightly smaller, but still significant. Demand for private schooling at the ages of 8, 9, 10 and 12, 14 and 15 is almost entirely determined by private school demand in the previous year for the same cohort, and price and quality do not impact significantly on this decision other than through their initial influence on the key participation decisions at the ages of 7, 11 and 13.

Childcare costs are often viewed as one of the biggest barriers to work, particularly among lone parents on low incomes. Children in England are eligible to attend free part-time nursery classes (equivalent to pre-kindergarten) from the academic term after they turn 3, and are typically eligible to start free full-time public education on 1 September after they turn four. These rules mean that children born one day apart may start nursery classes up to four months apart, and may start school up to one year apart. We exploit these discontinuities to investigate the impact of a youngest child being eligible for part-time nursery education and full-time primary education on welfare receipt and employment patterns amongst lone parents receiving welfare. In contrast to previous studies, we are able to estimate the precise timing (relative to the date on which part-time or full-time education begins) of any impact on labour supply, by using rich administrative data. Amongst those receiving welfare when their youngest child is aged approximately three and a half, we find a small but significant effect of free full-time public education on both employment and welfare receipt (of around 2 percentage points, or 10-15 per cent), which peaks eight to nine months after the child becomes eligible (aged approximately 4 years and 9 months). We find weaker evidence of an even smaller effect of eligibility for part-time nursery education. This suggests that the expansion of public education programmes to younger disadvantaged children may only encourage a small number of low income lone parents to return to work (although, of course, this is not the primary aim of such programmes).

We examine the effect of large cash transfers on the consumption of food by poor households in rural Mexico. The transfers represent 20% of household income on average, and yet, the budget share of food is unchanged following receipt of this money. This is an important puzzle to solve, particularly so in the context of a social welfare programme designed in part to improve nutrition of individuals in the poorest households. We estimate an Engel curve for food. We rule out price increases, changes in the quality of food consumed and homotheticity of preferences as explanations for this puzzle. We also show that food is a necessity, with a strong negative effect of income on the food budget share. The decrease in food budget share caused by the large increase in income is cancelled by some other relevant aspect of the programme so that the net effect is nil. We argue that the programme has not changed preferences and that there is no labelling of money. We propose that the key to the puzzle resides in the fact that the transfer is put in the hands of women and that the change in control over household resources is what leads to the observed changes in behaviour.

We provide a tractable characterization of the sharp identification region of the parameters θ in a broad class of incomplete econometric models. Models in this class have set-valued predictions that yield a convex set of conditional or unconditional moments for the observable model variables. In short, we call these 'models with convex moment predictions'. Examples include static, simultaneous-move finite games of complete and incomplete information in the presence of multiple equilibria; best linear predictors with interval outcome and covariate data; and random utility models of multinomial choice in the presence of interval regressor data. Given a candidate value for θ, we establish that the convex set of moments yielded by the model predictions can be represented as the Aumann expectation of a properly defined random set. The sharp identification region of θ, denoted Θ_I, can then be obtained as the set of minimizers of the distance from a properly specified vector of moments of random variables to this Aumann expectation. Algorithms in convex programming can be exploited to efficiently verify whether a candidate θ is in Θ_I. We use examples analyzed in the literature to illustrate the gains in identification and computational tractability afforded by our method.
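The simplest member of this class is the mean of an interval-observed outcome, for which the Aumann expectation of the random interval reduces to an ordinary interval of sample means. The sketch below is an assumption-laden toy, not the paper's general convex-programming algorithm: it computes this region and checks whether a candidate value has zero distance to it.

```python
import numpy as np

def mean_identification_region(y_lo, y_hi):
    """Sharp identification region for E[Y] when only an interval
    [y_lo_i, y_hi_i] containing each Y_i is observed: the Aumann
    expectation of the random interval, i.e. [mean(y_lo), mean(y_hi)]."""
    return y_lo.mean(), y_hi.mean()

def in_region(theta, y_lo, y_hi):
    """A candidate theta belongs to the region iff its distance to the
    Aumann expectation is zero."""
    lo, hi = mean_identification_region(y_lo, y_hi)
    return lo <= theta <= hi
```

In richer models (games with multiple equilibria, interval-regressor choice models) the same membership check becomes a convex program over the support function of the random set rather than a simple interval comparison.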

This paper is a revised version of CWP27/09.

We evaluate the German apprenticeship system, which combines on-the-job training with classroom teaching, by modelling individual careers from the choice to join such a scheme through subsequent employment, job-to-job transitions and wages over the lifecycle. Our data are drawn from administrative records that accurately report job transitions and pay. We find that apprenticeships increase wages and change wage profiles, with more growth upfront, while wages in the non-apprenticeship sector grow at a lower rate but for longer. Non-apprentices face a much higher variance in the shocks to their match-specific effects and a substantially larger variance in the initial level of offered wages. We find no evidence that qualified apprentices are harder to reallocate following job loss. The average life-cycle return to an apprenticeship career is about 14%, and the return is mainly driven by the differences in the wage profile.

We illustrate the approach using scanner data on food purchases to estimate bounds on willingness to pay for the organic characteristic. We combine these estimates with information on households' stated preferences and beliefs to show that on average quality is the most important factor affecting bounds on household willingness to pay for organic, with health concerns coming second, and environmental concerns lagging far behind.

Social experiments are powerful sources of information about the effectiveness of interventions. In practice, initial randomization plans are almost always compromised. Multiple hypotheses are frequently tested. "Significant" effects are often reported with p-values that do not account for preliminary screening from a large candidate pool of possible effects. This paper develops tools for analyzing data from experiments as they are actually implemented.

We apply these tools to analyze the influential HighScope Perry Preschool Program. The Perry program was a social experiment that provided preschool education and home visits to disadvantaged children during their preschool years. It was evaluated by the method of random assignment. Both treatments and controls have been followed from age 3 through age 40.

Previous analyses of the Perry data assume that the planned randomization protocol was implemented. In fact, as in many social experiments, the intended randomization protocol was compromised. Accounting for compromised randomization, multiple-hypothesis testing, and small sample sizes, we find statistically significant and economically important program effects for both males and females. We also examine the representativeness of the Perry study.

This paper is a revised version of CWP18/09.

In this paper we study post-penalized estimators which apply ordinary, unpenalized linear regression to the model selected by first-step penalized estimators, typically LASSO. It is well known that LASSO can estimate the regression function at nearly the oracle rate, and is thus hard to improve upon. We show that post-LASSO performs at least as well as LASSO in terms of the rate of convergence, and has the advantage of a smaller bias. Remarkably, this performance occurs even if the LASSO-based model selection 'fails' in the sense of missing some components of the 'true' regression model. By the 'true' model we mean here the best s-dimensional approximation to the regression function chosen by the oracle. Furthermore, post-LASSO can perform strictly better than LASSO, in the sense of a strictly faster rate of convergence, if the LASSO-based model selection correctly includes all components of the 'true' model as a subset and also achieves a sufficient sparsity. In the extreme case, when LASSO perfectly selects the 'true' model, the post-LASSO estimator becomes the oracle estimator. An important ingredient in our analysis is a new sparsity bound on the dimension of the model selected by LASSO which guarantees that this dimension is at most of the same order as the dimension of the 'true' model. Our rate results are non-asymptotic and hold in both parametric and nonparametric models. Moreover, our analysis is not limited to the LASSO estimator in the first step, but also applies to other estimators, for example, the trimmed LASSO, Dantzig selector, or any other estimator with good rates and good sparsity. Our analysis covers both traditional trimming and a new practical, completely data-driven trimming scheme that induces maximal sparsity subject to maintaining a certain goodness-of-fit. The latter scheme has theoretical guarantees similar to those of LASSO or post-LASSO, but it dominates these procedures as well as traditional trimming in a wide variety of experiments.
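A self-contained sketch of the two-step idea, under simplifying assumptions (standardized Gaussian design, a hand-picked penalty, plain cyclic coordinate descent for the first step): fit LASSO, then rerun unpenalized least squares on the selected support to remove the shrinkage bias.

```python
import numpy as np

def lasso_cd(X, y, lam, n_iter=200):
    """LASSO by cyclic coordinate descent, minimizing
    (1/2n)||y - X b||^2 + lam ||b||_1 (columns of X roughly standardized)."""
    n, p = X.shape
    beta = np.zeros(p)
    r = y - X @ beta
    col_sq = (X ** 2).sum(axis=0) / n
    for _ in range(n_iter):
        for j in range(p):
            r = r + X[:, j] * beta[j]          # add back j's contribution
            rho = X[:, j] @ r / n
            # Soft-thresholding update for coordinate j.
            beta[j] = np.sign(rho) * max(abs(rho) - lam, 0.0) / col_sq[j]
            r = r - X[:, j] * beta[j]
    return beta

def post_lasso(X, y, lam):
    """Post-LASSO: unpenalized OLS refit on the LASSO-selected support."""
    support = np.flatnonzero(lasso_cd(X, y, lam) != 0.0)
    beta = np.zeros(X.shape[1])
    if support.size:
        beta[support], *_ = np.linalg.lstsq(X[:, support], y, rcond=None)
    return beta
```

The refit step is what removes the first-step shrinkage: the LASSO coefficients are biased toward zero by roughly the penalty level, while the OLS refit on the selected support is not.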


This paper has three aims:

- We provide a framework for weighing up the insurance value of disability benefits against the incentive cost of inducing healthy individuals to stop work at different points of their life-cycle.
- We estimate the risks to health that may lead to work-limiting disabilities and the risk to wages that may lead to individuals choosing not to work. We also estimate the extent of false awards made through the DI program alongside the proportion of awards to those in genuine need.
- We use our model and estimates to characterize the economic effects of the disability insurance and to consider how policy reforms would affect behaviour and standard measures of household welfare.

We differentiate disability status by its severity, and show that a severe disability shock leads to a decline in wages of 40%, as well as a substantial rise in the fixed cost of going to work. In terms of the effectiveness of the DI program, we estimate high levels of rejections of genuine applicants. In our counterfactual simulations, this means that household welfare increases as the program becomes less strict, despite the worsening incentives for false applications that this implies. On the other hand, incentives for false applications are reduced by reducing generosity and increasing reassessments, and these policies increase household welfare despite the reduced insurance they imply.

This paper develops a formal language for study of treatment response with social interactions, and uses it to obtain new findings on identification of potential outcome distributions. Defining a person's treatment response to be a function of the entire vector of treatments received by the population, I study identification when shape restrictions and distributional assumptions are placed on response functions. An early key result is that the traditional assumption of individualistic treatment response (ITR) is a polar case within the broad class of constant treatment response (CTR) assumptions, the other pole being unrestricted interactions. Important non-polar cases are interactions within reference groups and distributional interactions. I show that established findings on identification under assumption ITR extend to assumption CTR. These include identification with assumption CTR alone and when this shape restriction is strengthened to semi-monotone response. I next study distributional assumptions using instrumental variables. Findings obtained previously under assumption ITR extend when assumptions of statistical independence (SI) are posed in settings with social interactions. However, I find that random assignment of realized treatments generically has no identifying power when some persons are leaders who may affect outcomes throughout the population. Finally, I consider use of models of endogenous social interactions to derive restrictions on response functions. I emphasize that identification of potential outcome distributions differs from the longstanding econometric concern with identification of structural functions.

This paper is a revised version of CWP01/10

We develop a general class of nonparametric tests for treatment effects conditional on covariates. We consider a wide spectrum of null and alternative hypotheses regarding conditional treatment effects, including (i) the null hypothesis of the conditional stochastic dominance between treatment and control groups; (ii) the null hypothesis that the conditional average treatment effect is positive for each value of covariates; and (iii) the null hypothesis of no distributional (or average) treatment effect conditional on covariates against a one-sided (or two-sided) alternative hypothesis. The test statistics are based on L1-type functionals of uniformly consistent nonparametric kernel estimators of conditional expectations that characterize the null hypotheses. Using the Poissonization technique of Giné et al. (2003), we show that suitably studentized versions of our test statistics are asymptotically standard normal under the null hypotheses and also show that the proposed nonparametric tests are consistent against general fixed alternatives. Furthermore, it turns out that our tests have non-negligible powers against some local alternatives that are n^(-1/2) different from the null hypotheses, where n is the sample size. We provide a more powerful test for the case when the null hypothesis may be binding only on a strict subset of the support and also consider an extension to testing for quantile treatment effects. We illustrate the usefulness of our tests by applying them to data from a randomized job training program (LaLonde, 1986) and by carrying out Monte Carlo experiments based on this dataset.

In this paper we consider endogenous regressors in the binary choice model under a weak median exclusion restriction, but without further specification of the distribution of the unobserved random components. Our reduced form specification with heteroscedastic residuals covers various heterogeneous structural binary choice models. As a particularly relevant example of a structural model where no semiparametric estimator has yet been analyzed, we consider the binary random utility model with endogenous regressors and heterogeneous parameters. We employ a control function IV assumption to establish identification of a slope parameter β by the mean ratio of derivatives of two functions of the instruments. We propose an estimator based on direct sample counterparts, and discuss the large sample behavior of this estimator. In particular, we show √n consistency and derive the asymptotic distribution. In the same framework, we propose tests for heteroscedasticity, overidentification and endogeneity. We analyze the small sample performance through a simulation study. An application of the model to discrete choice demand data concludes this paper.

]]>This paper gives identification and estimation results for quantile and average effects in nonseparable panel models, when the distribution of period specific disturbances does not vary over time. Bounds are given for interesting effects with discrete regressors that are strictly exogenous or predetermined. We allow for location and scale time effects and show how monotonicity can be used to shrink the bounds. We derive rates at which the bounds tighten as the number T of time series observations grows and give an empirical illustration.

This paper is a revised version of cemmap working paper CWP15/08


In this paper we analyse the findings from a series of 'public good' games that were conducted between November 2005 and February 2007 in 104 municipalities in rural and urban Colombia with mainly poor participants. The data covers municipalities both with ('treatment') and without ('control') a PRDP in place, and within the 'treatment' municipalities, both beneficiaries and non-beneficiaries of the PRDP initiative. The data for 'control' municipalities was collected as part of the evaluation of Familias en Accion (FeA), Colombia's conditional cash transfer programme.

The game is structured as a typical free-rider problem with the act of contributing to the 'public good' (a collective money pot) being always dominated by non-contribution. We interpret contribution as an act consistent with a high degree of social capital.

Potentially endogenous selection into the programme makes identifying programme effects difficult, but we find strongly suggestive evidence that exposure to PRDPs improves social capital and that this extends beyond direct beneficiaries of the programme. In particular, the duration of programme operation and the proportion of programme beneficiaries in a game session increase contribution to the public good, suggesting that in order to have a major impact the programme must be sufficiently 'intensive'.

This paper examines the impact of *in utero* exposure to the Asian influenza pandemic of 1957 upon physical and cognitive development in childhood. Outcome data is provided by the National Child Development Study (NCDS), a panel study of a cohort of British children who were all potentially exposed in the womb. Epidemic effects are identified using geographic variation in a surrogate measure of the epidemic. Results indicate significant detrimental effects of the epidemic upon birth weight and height at ages 7 and 11, but only for the offspring of mothers with certain health characteristics. By contrast, the impact of the epidemic on childhood cognitive test scores is more general: test scores are reduced at the mean, and effects remain constant across maternal health and socioeconomic indicators. Taken together, our results point to multiple channels linking foetal health shocks to childhood outcomes.

Updated version available as CWP31/11

We study the identification of panel models with linear individual-specific coefficients, when T is fixed. We show identification of the variance of the effects under conditional uncorrelatedness. Identification requires restricted dependence of errors, reflecting a trade-off between heterogeneity and error dynamics. We show identification of the density of individual effects when errors follow an ARMA process under conditional independence. We discuss GMM estimation of moments of effects and errors, and introduce a simple density estimator of a slope effect in a special case. As an application we estimate the effect of a mother's smoking during pregnancy on her child's birth weight.

This paper considers semiparametric efficient estimation of conditional moment models with possibly nonsmooth residuals in unknown parametric components (θ) and unknown functions (h) of endogenous variables. We show that: (1) the penalized sieve minimum distance (PSMD) estimator (θ̂, ĥ) can simultaneously achieve root-n asymptotic normality of θ̂ and the nonparametric optimal convergence rate of ĥ, allowing for noncompact function parameter spaces; (2) a simple weighted bootstrap procedure consistently estimates the limiting distribution of the PSMD θ̂; (3) the semiparametric efficiency bound formula of Ai and Chen (2003) remains valid for conditional models with nonsmooth residuals, and the optimally weighted PSMD estimator achieves the bound; (4) the centered, profiled optimally weighted PSMD criterion is asymptotically chi-square distributed. We illustrate our theories using a partially linear quantile instrumental variables (IV) regression, a Monte Carlo study, and an empirical estimation of the shape-invariant quantile IV Engel curves.

This is an updated version of CWP09/08.

This paper analyzes the effects of a ban on smoking in public places upon firms and consumers. It presents a theoretical model and tests its predictions using unique data from before and after the introduction of smoking bans in the UK. Cigarette smoke is a public bad, and smokers and non-smokers differ in their valuation of smoke-free amenities. Consumer heterogeneity implies that the market equilibrium may result in too much uniformity, whereas social optimality requires a mix of smoking and non-smoking pubs (which can be operationalized via licensing). If the market equilibrium has almost all pubs permitting smoking (as is the case in the data) then a blanket ban reduces pub sales, profits, and consumer welfare. We collect survey data from public houses and find that the Scottish smoking ban (introduced in March 2006) reduced pub sales and harmed medium-run profitability. An event study analysis of the stock market performance of pub-holding companies corroborates the negative effects of the smoking ban on firm performance.

This paper develops methodology for nonparametric estimation of a polarization measure due to Anderson (2004) and Anderson, Ge, and Leo (2006) based on kernel estimation techniques. We give the asymptotic distribution theory of our estimator, which in some cases is nonstandard due to a boundary value problem. We also propose a method for conducting inference based on estimation of unknown quantities in the limiting distribution and show that our method yields consistent inference in all cases we consider. We investigate the finite sample properties of our methods by simulation. We give an application to the study of polarization within China in recent years.

This is a substantial revision of "Semiparametric identification of structural dynamic optimal stopping time models", CWP06/07.

]]>We investigate a method for extracting nonlinear principal components. These principal components maximize variation subject to smoothness and orthogonality constraints; but we allow for a general class of constraints and densities, including densities without compact support and even densities with algebraic tails. We provide primitive sufficient conditions for the existence of these principal components. We also characterize the limiting behavior of the associated eigenvalues, the objects used to quantify the incremental importance of the principal components. By exploiting the theory of continuous-time, reversible Markov processes, we give a different interpretation of the principal components and the smoothness constraints. When the diffusion matrix is used to enforce smoothness, the principal components maximize long-run variation relative to the overall variation subject to orthogonality constraints. Moreover, the principal components behave as scalar autoregressions with heteroskedastic innovations. Finally, we explore implications for a more general class of stationary, multivariate diffusion processes.


This paper seeks to address this, by an in-depth examination of scanner data from one company, Taylor Nelson Sofres (TNS), on grocery purchases over a five-year period. We assess how far the ongoing demands of participation inherent in this kind of survey lead to 'fatigue' in respondents' recording of their spending and compare the demographic representativeness of the data to the well-established Expenditure and Food Survey (EFS), constructing weights for the TNS that account for observed demographic differences. We also look at demographic transitions, comparing the panel aspect of the TNS to the British Household Panel Study (BHPS). We examine in detail the expenditure data in the TNS and EFS surveys and discuss the implications of this method of data collection for survey attrition. Broadly, we suggest that problems of fatigue and attrition may not be as severe as might be expected, though there are some differences in expenditure levels (and to some extent patterns of spending) that cannot be attributed to demographic or time differences in the two surveys alone and may be suggestive of survey mode effects. Demographic transitions appear to occur less frequently than we might expect, which may limit the usefulness of the panel aspect of the data for some applications.

]]>As private sector employers have moved away from providing final salary defined benefit (DB) pensions to their employees, attention has increasingly focused on the public sector's continued provision of such pensions and the value of these pension promises to public sector employees. The estimated underlying liabilities of such plans have increased sharply in recent years, at least in part due to unanticipated increases in longevity. This has led to reforms of all the major public sector pension schemes, the net result of which has been to reduce the level of benefits offered by the schemes (predominantly to new, rather than existing members).

This paper examines, in the context of the Teachers' Pension Scheme (TPS), how much the pension promises are worth and what effect the change in scheme rules has had on them. This paper also addresses a number of other issues that are important when valuing DB pension rights and their relation to overall remuneration. First, how increases in current pay feed through into pension values. Second, how the age profile of earnings affects the profile of pension accrual. Finally, how the value of pension rights in DB schemes compares to that in a stylised defined contribution (DC) scheme.

The figures presented in this paper relate specifically to the composition of members and the specific scheme rules of the TPS. However, the issues raised apply equally to other DB schemes, both public and private sector.

We estimate a model of training choice, employment and wage growth. The model allows for returns to experience and tenure, match specific effects, job mobility and search frictions. We show how apprenticeship training affects labour market careers and we quantify its benefits, relative to the overall costs. We then use our model to show how two welfare reforms change life-cycle decisions and human capital accumulation: one is the introduction of an Earned Income Tax Credit in Germany, and the other is a reform to Unemployment Insurance. In both reforms we find very significant impacts of the policy on training choices and on the value of realized matches, demonstrating the importance of considering such longer term implications.


Our findings suggest that, on average, 13 mothers start to work for every 100 youngest children in the household that start preschool (though, in our preferred specification, this estimate is not statistically significant at conventional levels). Furthermore, mothers are 19.1 percentage points more likely to work for more than 20 hours a week (i.e., more time than their children spend in school) and they work, on average, 7.8 more hours per week as a consequence of their youngest offspring attending preschool. We find no effect on maternal labor outcomes when a child that is not the youngest in the household attends preschool. Finally, we find that at the point of transition from kindergarten to primary school some employment effects persist.

Our preferred estimates condition on mother's schooling and other exogenous covariates, given evidence that mothers' schooling is unbalanced in the vicinity of the July 1 cutoff in the sample of 4-year-olds. Using a large set of natality records, we found no evidence that this is due to precise birth date manipulation by parents. Other explanations, like sample selection, are also not fully consistent with the data, and we must remain agnostic on this point. Despite this shortcoming, the credibility of the estimates is partly enhanced by the consistency of point estimates with Argentine research using a different EPH sample and sources of variation in preschool attendance (Berlinski and Galiani 2007).

A growing body of research suggests that pre-primary school can improve educational outcomes for children in the short and long run (Blau and Currie 2006; Schady 2006). This paper provides further evidence that, *ceteris paribus*, an expansion in preschool education may enhance the employment prospects of mothers of children of preschool age.

This paper extends the method of local instrumental variables developed by Heckman and Vytlacil (1999, 2001, 2005) to the estimation of not only means, but also distributions of potential outcomes. The newly developed method is illustrated by applying it to changes in college enrollment and wage inequality using data from the National Longitudinal Survey of Youth of 1979. Increases in college enrollment cause changes in the distribution of ability among college and high school graduates. This paper estimates a semiparametric selection model of schooling and wages to show that, for fixed skill prices, a 14% increase in college participation (analogous to the increase observed in the 1980s) reduces the college premium by 12% and increases the 90-10 percentile ratio among college graduates by 2%.


2. There are two mechanisms through which the temporary VAT cut might affect spending. First, it will increase spending power, making households feel as if they have more income. This mechanism is likely to be small, partly because the tax cut increases income only for one year, and so the increase in total lifetime resources is very small, and partly because the lost revenue will have to be paid back.

3. However, the second (often ignored) mechanism is likely to be much more important. This second mechanism is the effect that the tax cut will have through changing the price of goods bought in 2009 compared to 2010: the cost of goods bought in 2009 has fallen compared to goods bought in 2010 and this change in prices gives an incentive to bring forward consumer spending to this year, rather than waiting until next.

4. Economic evidence on households' willingness to move spending from one year into an earlier (or later) year suggests that a 1% fall in the price today will translate into a 1% increase in spending. Since only roughly half of goods purchased are subject to VAT, the cut in the rate by 2.5 percentage points is like a cut in prices today of 1.25%, and we would expect this to boost spending by about 1.25% over what it would otherwise be.

5. Of course, this issue of what the spending would otherwise be is crucial: we will not now know what spending in 2009 would have been without the cut in VAT and even with the VAT cut, spending is likely to decline. Our point is simply that economic analysis shows that the cut in VAT will make the situation significantly less bad than it might otherwise have been.

6. A natural comparison to the fiscal stimulus of a cut in VAT is a monetary stimulus through a cut in the interest rate: both make the price of spending today low compared to next year - an interest rate cut makes saving less attractive than current spending, as does the cut in VAT. The 1.25% fall in prices due to the cut in VAT reduces the price of spending today by more than a 1 percentage point cut in the interest rate. It is surprising that some commentators have labelled the former as "small", while the latter would typically be considered a large cut.

7. There is however a difference between cutting interest rates and cutting VAT: a cut in interest rates penalises savers, whose spending power falls, and rewards borrowers. By contrast, the cut in VAT increases the spending power of savers (as well as borrowers) and this seems a fairer way to stimulate the economy.
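The arithmetic in points 4 to 6 can be sketched in a few lines. This is a minimal illustration only: the 50% VATable share and the unit intertemporal elasticity are the briefing's own stylised assumptions, not estimates.

```python
# Stylised arithmetic behind the VAT-cut briefing (points 4-6).
# Assumptions taken from the text: roughly half of spending is subject
# to VAT, and a 1% fall in today's price raises spending today by ~1%
# (an intertemporal elasticity of about one).

vat_rate_cut = 2.5            # percentage-point cut in the VAT rate
vatable_share = 0.5           # share of goods subject to VAT
intertemporal_elasticity = 1.0

# Approximate fall in the overall price level today (in %)
price_fall = vat_rate_cut * vatable_share

# Implied boost to current spending (in %)
spending_boost = price_fall * intertemporal_elasticity

print(f"Price fall today: {price_fall:.2f}%")   # 1.25%
print(f"Spending boost:   {spending_boost:.2f}%")  # 1.25%
```

The same calculation makes the comparison in point 6 transparent: the 1.25% effective price fall exceeds the roughly 1% relative-price change delivered by a 1 percentage point interest rate cut.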

1. Survey responses are always subject to measurement error. In general surveys (and especially longitudinal surveys), there are severe constraints on the time that can be spent eliciting a less noisy response for any target variable. In this paper we consider when it may be better to collect multiple noisy measures of the target variable rather than improving the reliability of a single measure.

2. The Kotlarski result states that if the measurement errors in two measures of the same target variable are mutually independent and independent of the true value, then we can recover the entire distribution of the quantity of interest, up to location.

3. We consider designing surveys to deliver measurement error with desirable properties. This shifts the emphasis from reliability (the signal to noise ratio for any given measure) to the joint properties of the multiple measures.

4. To illustrate our ideas, we consider a concrete example: the measurement of consumption inequality. A small simulation study suggests that the approach we propose has promise. The next step in this research agenda is experiments in survey data collection.
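The value of joint properties of multiple measures (point 3) is visible even at the level of second moments. The simulation below is a sketch under assumed distributions (Gaussian target and errors, variances chosen for illustration): with two measures Y1 = X + e1 and Y2 = X + e2 whose errors are mutually independent and independent of X, the covariance of the two measures identifies Var(X), however noisy each individual measure is. Kotlarski's result extends this logic from the variance to the whole distribution via characteristic functions.

```python
# Two noisy measures of the same target variable X:
#   Y1 = X + e1,  Y2 = X + e2,
# with e1, e2 mutually independent and independent of X.
# Then Cov(Y1, Y2) = Var(X): the error variances drop out because the
# errors are uncorrelated with each other and with X. All distributions
# below are illustrative assumptions, not calibrated to any survey.
import random

random.seed(0)
n = 200_000
x = [random.gauss(0.0, 2.0) for _ in range(n)]    # true variable, Var(X) = 4
y1 = [xi + random.gauss(0.0, 1.0) for xi in x]    # first noisy measure
y2 = [xi + random.gauss(0.0, 1.5) for xi in x]    # second, noisier measure

m1 = sum(y1) / n
m2 = sum(y2) / n
cov_y1_y2 = sum((a - m1) * (b - m2) for a, b in zip(y1, y2)) / n

print(f"Cov(Y1, Y2) = {cov_y1_y2:.2f}  (true Var(X) = 4.00)")
```

Neither measure alone identifies Var(X) without knowing its error variance; the design choice of collecting two measures with independent errors is what delivers identification.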

]]>We study noncooperative household models with two agents and several voluntarily contributed public goods, deriving the counterpart to the Slutsky matrix and demonstrating the nature of the deviation of its properties from those of a true Slutsky matrix in the unitary model. We provide results characterising both cases in which there are and are not jointly contributed public goods. Demand properties are contrasted with those for collective models and conclusions drawn regarding the possibility of empirically testing the collective model against noncooperative alternatives and the noncooperative model against a general alternative.

Single equation instrumental variable models for discrete outcomes are shown to be set identifying, not point identifying, for the structural functions that deliver the values of the discrete outcome. Identified sets are derived for a general nonparametric model and sharp set identification is demonstrated. Point identification is typically not achieved by imposing parametric restrictions. The extent of an identified set varies with the strength and support of instruments and typically shrinks as the support of a discrete outcome grows. The paper extends the analysis of structural quantile functions with endogenous arguments to cases in which there are discrete outcomes.

This paper is a revised version of the original issued in December 2008.

Several decades of conflict, rebellion and unrest severely weakened civil society in parts of Colombia. Paz y Desarrollo is the umbrella term used to describe the set of locally-led initiatives that aim to address this problem by promoting sustainable economic development and community cohesion and action.

This project analyses the findings from a series of "public goods" games that were conducted in the spring and winter of 2006 in 103 municipalities in rural and urban Colombia with predominantly poor participants. These municipalities included both those with and without Paz y Desarrollo in place; within the municipalities where it was in place ("treatment" municipalities), the sample included both individuals who participate in the programme and those who do not. The municipalities where PYD is not in place ("control" municipalities) were surveyed as part of the evaluation of another programme, Familias en Accion (FEA), and this project also analyses the impact of this programme on game-play. The game is structured as a typical free-rider problem with the act of contributing to the "public good" (a collective money pot) being always dominated by non-contribution. We interpret contribution as an act consistent with a high degree of social capital.

We find weak evidence that the programme acts at the group level: game sessions involving programme participants have higher levels of contribution than those not involving participants. In addition, there is some evidence that intensity of the programme matters: the more participants, the larger the impact. However, there is no evidence that the programme impacts at the individual level with participants no more likely to contribute than non-participants in treatment areas.

]]>In this paper we introduce a new flexible mixed model for multinomial discrete choice where the key individual- and alternative-specific parameters of interest are allowed to follow an assumption-free nonparametric density specification while other alternative-specific coefficients are assumed to be drawn from a multivariate normal distribution which eliminates the independence of irrelevant alternatives assumption at the individual level. A hierarchical specification of our model allows us to break down a complex data structure into a set of submodels with the desired features that are naturally assembled in the original system. We estimate the model using a Bayesian Markov Chain Monte Carlo technique with a multivariate Dirichlet Process (DP) prior on the coefficients with nonparametrically estimated density. We employ a "latent class" sampling algorithm which is applicable to a general class of models including non-conjugate DP base priors. The model is applied to supermarket choices of a panel of Houston households whose shopping behavior was observed over a 24-month period in years 2004-2005. We estimate the nonparametric density of two key variables of interest: the price of a basket of goods based on scanner data, and driving distance to the supermarket based on their respective locations. Our semi-parametric approach allows us to identify a complex multi-modal preference distribution which distinguishes between inframarginal consumers and consumers who strongly value either lower prices or shopping convenience.

**Please note:** This paper is a revised version of cemmap working paper CWP09/07.

This paper studies nonparametric estimation of conditional moment models in which the residual functions could be nonsmooth with respect to the unknown functions of endogenous variables. It is a problem of nonparametric nonlinear instrumental variables (IV) estimation, and a difficult nonlinear ill-posed inverse problem with an unknown operator. We first propose a penalized sieve minimum distance (SMD) estimator of the unknown functions that are identified via the conditional moment models. We then establish its consistency and convergence rate (in strong metric), allowing for possibly non-compact function parameter spaces, possibly non-compact finite or infinite dimensional sieves with flexible lower semicompact or convex penalty, or finite dimensional linear sieves without penalty. Under relatively low-level sufficient conditions, and for both mildly and severely ill-posed problems, we show that the convergence rates for the nonlinear ill-posed inverse problems coincide with the known minimax optimal rates for the nonparametric mean IV regression. We illustrate the theory by two important applications: root-n asymptotic normality of the plug-in penalized SMD estimator of a weighted average derivative of a nonparametric nonlinear IV regression, and the convergence rate of a nonparametric additive quantile IV regression. We also present a simulation study and an empirical estimation of a system of nonparametric quantile IV Engel curves.

]]>This paper develops a broad theme about policy choice under ambiguity through study of a particular decision criterion. The broad theme is that, where feasible, choice between a status quo policy and an innovation is better framed as selection of a treatment allocation than as a binary decision. Study of the static minimax-regret criterion and its adaptive extension substantiate the theme. When the optimal policy is ambiguous, the static minimax-regret allocation always is fractional absent large fixed costs or deontological considerations. In dynamic choice problems, the adaptive minimax-regret criterion treats each cohort as well as possible, given the knowledge available at the time, and maximizes intertemporal learning about treatment response.

We propose a new method of testing stochastic dominance which improves on existing tests based on bootstrap or subsampling. Our tests require estimation of the contact sets between the marginal distributions. Our tests have asymptotic sizes that are exactly equal to the nominal level uniformly over the boundary points of the null hypothesis and are therefore valid over the whole null hypothesis. We also allow the prospects to be indexed by infinite as well as finite dimensional unknown parameters, so that the variables may be residuals from nonparametric and semiparametric models. Our simulation results show that our tests are indeed more powerful than the existing subsampling and recentered bootstrap tests.

]]>This paper provides the first firm-level econometric evidence on the skill-bias of ICT in developing countries using a unique new dataset of manufacturing firms in Brazil and India. I use detailed information on firms' adoption of ICT and the educational composition of their workforce to estimate skill-share equations in levels and long differences. The results are strongly suggestive of skill-biased ICT adoption, with ICT able to explain up to a third of the average increase in the share of skilled workers in Brazil and up to one half in India. I then use variation in the relative supply of skilled workers across states within each country to identify the skill-bias of ICT. The results are again consistent with skill-bias in both countries, and are mainly robust to various methods of controlling for unobserved heterogeneity across states. The magnitudes of the estimated effects from both approaches are surprisingly similar for the two countries. Overall, the results suggest that new developments in ICT are diffusing rapidly through the manufacturing sectors of both Brazil and India, with similar implications for the demand for skills in two very different and geographically distant countries. This evidence is consistent with ongoing pervasive skill-biased technological change associated with ICT throughout much of the developed and developing world. The implications for future developments in inequality both within and between countries are potentially far-reaching.

Consider an observed binary regressor D and an unobserved binary variable D*, both of which affect some other variable Y. This paper considers nonparametric identification and estimation of the effect of D on Y, conditioning on D* = 0. For example, suppose Y is a person's wage, the unobserved D* indicates if the person has been to college, and the observed D indicates whether the individual claims to have been to college. This paper then identifies and estimates the difference in average wages between those who falsely claim college experience versus those who tell the truth about not having college. We estimate this average return to lying to be about 7% to 20%. Nonparametric identification without observing D* is obtained either by observing a variable V that is roughly analogous to an instrument for ordinary measurement error, or by imposing restrictions on model error moments.
