I mention that because, although using a lagged DV among the IVs may be theoretically important and methodologically necessary, it may also introduce a risky amount of endogeneity into the model, depending on the substantive relation between the variables and time units, and also on the AR order that may exist in the model.
Unless we have more details on the variables and on the estimation, I would not feel comfortable recommending lagging the DV, except perhaps in combination with an instrumental variable technique or something like Arellano-Bond estimation.

Inclusion of lagged dependent variable in regression
Maximizing R-squared is rarely a good model-selection criterion. It is true that although the R-squared values differ a lot, the predicted values are actually the same whether Y or the change in Y is used. However, given the low R-squared when the change in Y is the DV, does that mean the current set of IVs cannot explain the change very well and that there must be some omitted variables?
I obtained data by digitising a plot, which meant the data were sorted. This sorting and the non-linear relationship caused autocorrelation in the residuals.
Please give us more details, so we can know better what kind of model we are talking about.

It is well known that OLS estimation of AR model coefficients is biased in finite samples; that it nevertheless gives accurate estimates for a range of sample sizes has practical consequences that are less widely appreciated.
We describe this behavior further in the section Dynamic and Correlation Effects. The principal difference between the two sets of simulations above, in terms of OLS estimation, is whether or not there is a delay in the interaction between the innovations and the predictor.
In the AR(1) process with NID innovations, the predictor y_{t-1} is contemporaneously uncorrelated with e_t, but correlated with all of the previous innovations, as described earlier. In the AR(1) process with AR(1) innovations, the predictor y_{t-1} becomes correlated with e_t as well, through the autocorrelation between e_t and e_{t-1}.
To see these relationships, we compute the correlation coefficients between y_{t-1} and both e_t and e_{t-1}, respectively, for each process. The plots show correlation between y_{t-1} and e_{t-1} in both cases. Contemporaneous correlation between y_{t-1} and e_t, however, persists asymptotically only in the case of AR(1) innovations.
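A rough Monte Carlo sketch of these correlations is given below. It is not the original example code, and the coefficient values phi = 0.7 and gamma = 0.5 are placeholder assumptions:

    % Correlations of the predictor y_{t-1} with e_t and e_{t-1}, estimated by simulation.
    rng(1);                               % reproducibility
    T = 100;  numPaths = 2000;
    phi = 0.7;                            % assumed AR(1) coefficient of the response
    gamma = 0.5;                          % assumed AR(1) coefficient of the innovations
    rhoNID = zeros(numPaths,2);           % columns: corr(y_{t-1},e_t), corr(y_{t-1},e_{t-1})
    rhoAR  = zeros(numPaths,2);
    for k = 1:numPaths
        e = randn(T,1);                   % NID innovations
        y = filter(1,[1 -phi],e);         % y_t = phi*y_{t-1} + e_t
        c1 = corrcoef(y(1:T-1),e(2:T));   c2 = corrcoef(y(1:T-1),e(1:T-1));
        rhoNID(k,:) = [c1(1,2) c2(1,2)];
        u = filter(1,[1 -gamma],randn(T,1));   % AR(1) innovations
        y = filter(1,[1 -phi],u);
        c1 = corrcoef(y(1:T-1),u(2:T));   c2 = corrcoef(y(1:T-1),u(1:T-1));
        rhoAR(k,:) = [c1(1,2) c2(1,2)];
    end
    disp(mean(rhoNID))   % contemporaneous correlation near zero, lagged correlation positive
    disp(mean(rhoAR))    % both correlations positive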
The correlation coefficient is the basis for standard measures of autocorrelation. The plots above highlight the bias and variance of the correlation coefficient in finite samples, which complicates the practical evaluation of autocorrelations in model residuals. Correlation measures were examined extensively by Fisher [3], [4], [5], who suggested a number of alternatives. When a lagged dependent variable is among the regressors, autocorrelated innovations are partially absorbed into its coefficient, so the OLS residuals understate the autocorrelation and the usual Durbin-Watson test is biased toward detecting none. This produces a distorted sense of goodness of fit, and a misrepresentation of the significance of dynamic terms.
Durbin's h test is similarly ineffective in this context [7]. Durbin's m test, or the equivalent Breusch-Godfrey test, is often preferred [1]. In practice, the process that produces a time series must be discovered from the available data, and this analysis is ultimately limited by the loss of confidence that comes with estimator bias and variance.
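As an illustration, a Breusch-Godfrey style (Durbin's m) test with a single residual lag can be sketched in a few lines of MATLAB. The response vector y and regressor matrix X (including any lagged DV and a constant column) are assumed to exist already; this is a minimal sketch, not a full implementation:

    % Breusch-Godfrey / Durbin's m sketch with one residual lag.
    T = numel(y);
    beta = X \ y;                          % OLS fit of the original model
    ehat = y - X*beta;                     % residuals
    elag = [0; ehat(1:end-1)];             % lagged residuals (initial value padded with 0)
    Z = [X elag];                          % auxiliary regressors
    r = ehat - Z*(Z \ ehat);               % auxiliary regression residuals
    R2 = 1 - sum(r.^2)/sum((ehat - mean(ehat)).^2);
    LM = T*R2;                             % compare with a chi-square(1) critical value
    pval = 1 - chi2cdf(LM,1);              % chi2cdf requires Statistics and Machine Learning Toolbox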
Sample sizes for economic data are often at the lower end of those considered in the simulations above, so inaccuracies can be significant. Effects on the forecast performance of autoregressive models can be severe. For simple AR models with simple innovations structures, approximations of the OLS estimator bias are obtained theoretically.
These formulas are useful when assessing the reliability of AR model coefficients derived from a single data sample. In the case of NID innovations, we can compare the simulation bias with the widely used approximation given in [11], [13]. Asymptotically, the bias is approximated by the expression in [6]. Here we see the bias move from negative to positive values as the sample size increases, then eventually approach the asymptotic bound.
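For reference, two standard results of this kind, stated here in generic form (these are the textbook approximations and need not match the exact expressions used in the cited sources), are:

    E[\hat{\phi}] - \phi \approx -\frac{2\phi}{T} \quad \text{(AR(1) without intercept)}, \qquad
    E[\hat{\phi}] - \phi \approx -\frac{1+3\phi}{T} \quad \text{(AR(1) with intercept)},

and, when the innovations are themselves AR(1) with coefficient gamma,

    \operatorname{plim}\hat{\phi} - \phi = \frac{\gamma\,(1-\phi^{2})}{1+\gamma\phi}.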
There is a range of sample sizes, beginning at about 25, where the absolute value of the bias remains small. In such a "sweet spot," the OLS estimator may outperform alternative estimators designed specifically to account for the presence of autocorrelation. Of course, the bias may be considerably less in finite samples. Two violations of the classical assumptions are critical, and we discuss their effects here in more detail.
The first is the dynamic effect, caused by the correlation between the predictor y_{t-1} and all of the previous innovations e_{t-k}, k >= 1.
In the absence of other violations, OLS nevertheless remains consistent, and the bias disappears in large samples. The second is the correlation effect, caused by the contemporaneous correlation between the predictor y_{t-1} and the innovation e_t. This occurs when the innovations process is autocorrelated, and results in the OLS coefficient of the predictor receiving too much, or too little, credit for contemporaneous variations in the response, depending on the sign of the correlation.
That is, the correlation effect produces a persistent bias that does not disappear in large samples. Thus, in the first set of simulations there is a negative bias across sample sizes. In the second set of simulations, however, there is a competition between the two effects, with the dynamic effect dominating in small samples and the correlation effect dominating in large samples. Positive AR coefficients are common in econometric models, so it is typical for the two effects to offset each other, creating a range of sample sizes for which the OLS bias is significantly reduced.
Some of the factors affecting the size of the dynamic and correlation effects are summarized in [9]. The influence of these factors can be tested by changing the coefficients in the simulations above.
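For example, the competition between the two effects can be observed with a simulation along the following lines (a sketch with assumed coefficients phi = 0.7 and gamma = 0.5, not the original example code):

    % OLS bias of the AR(1) coefficient versus sample size, AR(1) innovations.
    rng(1);
    phi = 0.7;  gamma = 0.5;               % assumed coefficients
    sampleSizes = [10 25 50 100 250 500 1000];
    numPaths = 2000;
    bias = zeros(size(sampleSizes));
    for j = 1:numel(sampleSizes)
        T = sampleSizes(j);
        est = zeros(numPaths,1);
        for k = 1:numPaths
            u = filter(1,[1 -gamma],randn(T,1));   % AR(1) innovations
            y = filter(1,[1 -phi],u);              % AR(1) response
            est(k) = y(1:T-1) \ y(2:T);            % OLS of y_t on y_{t-1}, no intercept
        end
        bias(j) = mean(est) - phi;
    end
    disp([sampleSizes' bias'])   % negative in small samples, positive in large samples,
                                 % approaching gamma*(1-phi^2)/(1+gamma*phi)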
In general, the larger the dynamic effect and the smaller the correlation effect, the wider the OLS-superior range. The jackknife procedure is a cross-validation technique commonly used to reduce the bias of sample statistics. Jackknife estimators of model coefficients are relatively easy to compute, without the need for large simulations or resampling.
The basic idea is to compute an estimate from the full sample and from a sequence of subsamples, then combine the estimates in a manner that eliminates the leading term in an asymptotic expansion of the bias. Whether the bias is actually reduced depends on the size of the remaining terms in the expansion, but jackknife estimators have performed well in practice.
In particular, the technique is robust with respect to nonnormal innovations, ARCH effects, and various model misspecifications [2]. For time series, deleting observations alters the autocorrelation structure. To maintain the dependence structure in a time series, a jackknife procedure must use nonoverlapping subsamples, such as partitions or moving blocks. We compare the performance before and after jackknifing on the simulated data with either NID or AR(1) innovations:
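The original comparison code is not reproduced here, but a minimal sketch of the partition jackknife itself, for the NID-innovations case and with assumed values phi = 0.7 and m = 4 nonoverlapping subsamples, looks like this:

    % Partition-jackknife sketch for the OLS AR(1) coefficient (NID innovations only).
    rng(1);
    T = 100;  phi = 0.7;  m = 4;  numPaths = 2000;
    olsAR1 = @(z) z(1:end-1) \ z(2:end);       % OLS of z_t on z_{t-1}, no intercept
    blockLen = floor(T/m);
    estFull = zeros(numPaths,1);  estJack = zeros(numPaths,1);
    for k = 1:numPaths
        y = filter(1,[1 -phi],randn(T,1));     % AR(1) series with NID innovations
        phiFull = olsAR1(y);                   % full-sample estimate
        phiSub = zeros(m,1);
        for i = 1:m
            phiSub(i) = olsAR1(y((i-1)*blockLen+1 : i*blockLen));   % subsample estimates
        end
        estFull(k) = phiFull;
        % the weights m/(m-1) and 1/(m-1) remove the leading O(1/T) bias term
        estJack(k) = (m/(m-1))*phiFull - (1/(m-1))*mean(phiSub);
    end
    disp([mean(estFull)-phi  mean(estJack)-phi])   % OLS bias vs. jackknife bias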
Larger m may improve performance in larger samples, but there is no accepted heuristic for choosing the subsample sizes, so some experimentation is necessary. The code is easily adapted to use alternative subsampling methods, such as moving blocks. The results show a uniform reduction in bias for the case of NID innovations.
In the case of AR(1) innovations, the procedure seems to push the estimate more quickly through the OLS-superior range. This example shows a simple AR model, together with a few simple innovations structures, as a way of illustrating some general issues related to the estimation of dynamic models. The code here is easily modified to observe the effects of changing parameter values, adjusting the innovations variance, using different lag structures, and so on. Explanatory distributed lag (DL) terms can also be added to the models.
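For instance, one explanatory DL term could be added along the following lines; the exogenous series x, its single lag, and the coefficient values are hypothetical, introduced only for illustration:

    % Adding one distributed lag (DL) regressor x_{t-1} to the AR(1) model.
    rng(1);
    T = 200;  phi = 0.5;  betaX = 0.8;         % assumed coefficients
    x = randn(T,1);                            % hypothetical exogenous series
    e = randn(T,1);
    y = zeros(T,1);
    for t = 2:T
        y(t) = phi*y(t-1) + betaX*x(t-1) + e(t);   % ADL(1,1)-type data-generating process
    end
    Xreg = [y(1:T-1) x(1:T-1)];                % predictors: y_{t-1} and x_{t-1}
    coef = Xreg \ y(2:T);                      % OLS estimates of [phi; betaX]
    disp(coef')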
The general set-up here allows for a great deal of experimentation, as is often required when evaluating models in practice. When considering the trade-offs presented by the bias and variance of any estimator, it is important to remember that biased estimators with reduced variance may have superior mean-squared error characteristics when compared to higher-variance unbiased estimators.
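The relevant decomposition, for any estimator \hat{\theta} of a parameter \theta, is

    \operatorname{MSE}(\hat{\theta}) = E\big[(\hat{\theta}-\theta)^{2}\big] = \operatorname{Bias}(\hat{\theta})^{2} + \operatorname{Var}(\hat{\theta}),

so a modest bias can be an acceptable price for a substantial reduction in variance.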
A strong point of the OLS estimator, beyond its computational simplicity, is the relatively rapid reduction of its variance with increasing sample size. This is often reason enough to adopt OLS as the estimator of choice, even for dynamic models.
Another strong point, as this example has shown, is the presence of an OLS-superior range, where OLS may outperform other estimators, even under what are generally regarded as adverse conditions. The weakest point of the OLS estimator is its performance in small samples, where the bias and variance may be unacceptable.
The estimation issues raised in this example suggest the need for new indicators of autocorrelation, and for more robust estimation methods to be used in its presence. However, as we have seen, the inconsistency of the OLS estimator for AR models with autocorrelated innovations is not enough to rule it out, in general, as a viable competitor to more complicated, consistent estimators such as maximum likelihood, feasible generalized least squares, and instrumental variables, which attempt to eliminate the correlation effect but do not alter the dynamic effect.
The best choice will depend on the sample size, the lag structure, the presence of exogenous variables, and so on, and often requires the kinds of simulations presented in this example.
References:
[1] Breusch, T. S., and L. G. Godfrey. "A Review of Recent Work on Testing for Autocorrelation in Dynamic Simultaneous Models." In D. Currie, R. Nobay, and D. Peel (Eds.), Macroeconomic Analysis. London: Croom Helm, 1981.
[6] Hibbs, D. "Problems of Statistical Estimation and Causal Inference in Dynamic Time Series Models." In H. Costner (Ed.), Sociological Methodology. San Francisco: Jossey-Bass, 1974.
[8] Johnston, J. Econometric Methods. New York: McGraw-Hill.
[12] Malinvaud, E. Statistical Methods of Econometrics. Amsterdam: North-Holland.
[13] White, J. S. "Asymptotic Expansions for the Mean and Variance of the Serial Correlation Coefficient." Biometrika, Vol. 48, 1961, pp. 85-94.