How do I build a vector autoregression (VAR) model?

I need to estimate how one variable responds to a shock in another, using a vector autoregression. Given the specifics of my series, the following questions have come up:

  1. When testing one of the series for a unit root (ADF test), the Akaike information criterion selects 4 lags (the series is stationary; the p-value for the null hypothesis of a unit root is about 3%), while the Schwarz criterion selects 3 lags (the series is non-stationary; the p-value is about 50%). How should I interpret the result when it changes this much with the number of lags in the test regression? Can the series be considered stationary at the 5% significance level? If a VAR with 3 lags is used later, should the ADF test also be run with 3 lags, regardless of the information criteria?
  2. The second series is non-stationary in levels, and its first differences are stationary, which indicates that it is integrated of order 1. Do I understand correctly that the order of integration is the minimum number of differences needed to reach stationarity, so the stationarity of the second differences does not matter?
  3. Must both series in a VAR model be stationary? I have not seen this stated as a strict requirement anywhere, but I am confused because the model is estimated by OLS, which requires all series to be stationary in order to avoid spurious regression.
  4. If I choose the number of lags so that both variables are integrated of order 1, the Johansen test finds 1 cointegrating vector (2 in some specifications). Under these conditions, can I still use a VAR model if the series are short and the main focus is on the short-run response? A few more points about the methodology of the cointegration test that I, as a beginner, find unclear:
  • Is the test run on the untransformed (raw) series, that is, in levels?
  • If the number of lags in the VAR or VECM is n, is the number of lags in the test n-1? For example, if the VAR is specified with lags (1 5), do I enter (1 4) for the test?
  5. If a vector error correction model (VECM) cannot be avoided in this case and I have no a priori assumptions, do I need to impose any restrictions on the coefficients? How do I construct the vector of these restrictions? Does the interpretation of the impulse response function change in this case (here I am analyzing short-run deviations), and how can short-run dynamics be separated from long-run dynamics in this model? Is the VECM built on levels (raw series) or on differences of the variables?
  6. When building a VAR, are there any restrictions on including one series in levels and the other in differences? This obviously complicates the interpretation of the results, but is it strictly prohibited mathematically? The same question applies to using series integrated of different non-zero orders.
  7. It is customary to choose models based on information criteria. Under what conditions can this be set aside (for example, when one of the models looks more logical and more sensible, but I cannot yet formally justify choosing it)?
  8. The Granger test results also differ substantially depending on the number of lags, and in most cases causality is found in both directions. Does this mean I need to find a third factor that affects the first two, or can it be taken as evidence of cointegration and of the need to build a VECM?
  9. What is the fundamental difference between the impulse response function and the accumulated impulse response? In which cases is each one preferable?
  10. What is the minimum set of tests needed to show that a VAR model is adequate (in addition to the simplest ones: the significance of the coefficients in each equation and the coefficient of determination, information criteria, standard residual diagnostics for heteroskedasticity and autocorrelation, the stability condition, i.e. whether the multivariate time series is a weakly stationary process, and the joint significance of the lagged endogenous variables)?

I would be grateful if someone could describe the estimation procedure step by step in plain language, or point me to where I can find answers to these questions. So far I have used the EViews Help and various tutorials for it, as well as the econometrics textbooks by V. P. Nosko and Dougherty.
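To make the question about impulse responses versus accumulated impulse responses concrete, here is a minimal sketch in plain Python. It assumes a bivariate VAR(1) with made-up coefficients (the matrix `A` and the helper names are hypothetical, not from any package) and uses non-orthogonalized unit shocks. The response at horizon h is then A^h, and the accumulated response is the running sum of those matrices.

```python
# Hypothetical bivariate VAR(1): y_t = A y_{t-1} + u_t (coefficients are made up).
A = [[0.5, 0.1],
     [0.2, 0.3]]


def matmul(X, Y):
    """Multiply two matrices stored as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))]
            for i in range(len(X))]


def impulse_responses(A, horizon):
    """Return [Phi_0, ..., Phi_horizon] with Phi_h = A**h.

    Phi_h[i][j] is the response of variable i, h periods after a
    one-unit (non-orthogonalized) shock to variable j.
    """
    n = len(A)
    phi = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    out = [phi]
    for _ in range(horizon):
        phi = matmul(A, phi)
        out.append(phi)
    return out


def accumulate(irfs):
    """Running sum of the impulse responses over horizons 0..h."""
    n = len(irfs[0])
    total = [[0.0] * n for _ in range(n)]
    out = []
    for phi in irfs:
        total = [[total[i][j] + phi[i][j] for j in range(n)] for i in range(n)]
        out.append(total)
    return out


irf = impulse_responses(A, horizon=8)
cum = accumulate(irf)
for h in range(len(irf)):
    # Horizon, response of variable 0 to a shock in variable 1, and its running sum.
    print(h, round(irf[h][0][1], 4), round(cum[h][0][1], 4))
```

For a stable VAR the plain responses die out, while the accumulated ones settle at a finite long-run value. One common reading, stated here as an assumption about the use case: when a variable enters the model in first differences, the accumulated response describes the effect on its level.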
Author: Meow-Meow, 2020-07-05
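
On the lag count asked about for the cointegration test, the following is a sketch of the standard algebra, assuming a VAR(p) in levels with no deterministic terms. The symbols A_i, Π and Γ_i are notation introduced here for illustration and do not come from the post.

```latex
\begin{aligned}
y_t &= A_1 y_{t-1} + \dots + A_p y_{t-p} + u_t \\
\Delta y_t &= \Pi\, y_{t-1} + \sum_{i=1}^{p-1} \Gamma_i\, \Delta y_{t-i} + u_t \\
\Pi &= -\Bigl(I - \sum_{i=1}^{p} A_i\Bigr), \qquad
\Gamma_i = -\sum_{j=i+1}^{p} A_j
\end{aligned}
```

Under this rewriting, a levels VAR with p lags corresponds to p-1 lagged differences in the error correction form. The rank of Π is the number of cointegrating vectors; the term Π y_{t-1} carries the long-run part, and the Γ_i terms carry the short-run dynamics.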