Applied ARDL model step by step.

Hello friends,

In this post, I will describe how to apply the full ARDL methodology for free. By free, I mean that we will not use paid software to perform the ARDL methodology. More precisely, we will combine R with Microfit 5.5. As I always mention, if you are not proficient in R, I have commented all the code so that you can understand everything. To perform the ARDL methodology, we will use the ARDL package which is, in my opinion, the most complete R package for applying ARDL. It is important to stress that we will not go through the details and/or arguments of all the functions in the ARDL package. Our main purpose is to describe all the steps you can follow for a complete ARDL analysis. If you need details about the ARDL package, you can read its help file. Finally, I assume that you have basic previous knowledge of the theory and the estimation of ARDL models.

Here we will use economic time series for Cuba going from 1920 to 1957. We can download the data here. Concretely, we will use gross domestic product (gdp, in the csv file), energy consumption per capita (energy) and length of the train lines (train). We will include a dummy for the Great Depression (crisis29 in the csv).

So, in short, we want to determine whether infrastructure (proxied by the train lines), energy consumption and Cuban economic growth follow a common trend in the long run. In addition, we want to know if there is long-run causality running from “energy” and “train” to “gdp”. Here we will not show or comment on any results. We do that in two YouTube videos, one in English and the other in Spanish. We can advance that we investigated the same question for the period 1959-2000 in a preliminary analysis and did not obtain cointegration. Therefore, energy consumption and the length of the train lines did not have any long-run nexus with the economic growth of Cuba during the communist period.

Before starting any econometric analysis, you should ensure that there is no feedback from the dependent variable to any regressor. This means that if theory indicates a strong bidirectional causality between your dependent variable and any regressor, you are better off choosing another methodology, because this no-feedback requirement is crucial for the ARDL methodology. For example, energy consumption can boost economic growth, but it is very possible that economic growth also drives energy consumption. Therefore, ARDL may not be a good choice in our case, but here we are just demonstrating the ARDL methodology.

Once the no-feedback condition is verified, the ARDL methodology starts by applying unit root tests to ensure that the dependent variable is I(1) and that the regressors are at most I(1), because they may be I(0) as well. I have posted about unit root testing with breaks here, about how to ensure the no-autocorrelation assumption in the ADF test here, and about the Elder and Kennedy strategy for applying unit root tests here.
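
To make the I(1) check concrete, here is a minimal base R sketch of the regression behind an ADF test with a constant and one lagged difference. It uses a simulated random walk, not the Cuban data, and the helper name adf_stat is made up for this illustration. In practice you would use a dedicated unit root package and proper critical values.

```r
# Simulated random walk standing in for a real series (illustration only)
set.seed(123)
y <- cumsum(rnorm(100))

# Hypothetical helper: t-statistic on the lagged level in
#   dy_t = c + rho * y_{t-1} + gamma * dy_{t-1} + e_t
adf_stat <- function(series) {
  d <- diff(series)
  n <- length(d)
  reg <- data.frame(
    d_y     = d[2:n],        # dy_t
    y_lag   = series[2:n],   # y_{t-1}
    d_y_lag = d[1:(n - 1)]   # dy_{t-1}
  )
  fit <- lm(d_y ~ y_lag + d_y_lag, data = reg)
  coef(summary(fit))["y_lag", "t value"]
}

level_stat <- adf_stat(y)        # series in levels
diff_stat  <- adf_stat(diff(y))  # series in first differences

print(c(level = level_stat, first_difference = diff_stat))
```

A series behaves as I(1) when the statistic in levels does not reject the unit root and the statistic in first differences does. Remember that these statistics must be compared with Dickey-Fuller critical values (roughly -2.9 at 5% with a constant and about 100 observations), not with the usual t table.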

When applying the ARDL methodology, the first step is the optimal lag selection, which gives us the best combination of lags of the variables in the ARDL model. Because we have relatively short series, we will consider up to 5 lags for the best model selection.

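library(ARDL)  # auto_ardl(), ardl(), bounds_f_test(), bounds_t_test(), multipliers(), recm()
# cuba_ardl holds the variables from the csv file (gdp, energy, train, crisis29)
# Evaluate every lag combination up to 5 lags and keep the one with the lowest AIC
AIC_selection <- auto_ardl(data = cuba_ardl, max_order = 5, selection = "AIC",
                           formula = gdp ~ energy + train | crisis29,
                           selection_minmax = "min", search_type = "horizontal",
                           start = 1920, end = 1957, grid = TRUE)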
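
If you are curious about what a grid search over lags does, the following base R sketch reproduces the idea with simulated data: it fits every combination of lags on the same sample and keeps the one with the lowest AIC. The series, the maximum lag of 3 and all object names are assumptions made for this illustration. It is not the internal code of auto_ardl.

```r
# Simulated data (illustration only): y depends on its own lag and on x
set.seed(42)
n <- 60
x <- cumsum(rnorm(n))
y <- numeric(n)
for (i in 2:n) y[i] <- 0.5 * y[i - 1] + 0.3 * x[i] + rnorm(1, sd = 0.5)

max_lag <- 3
# embed() returns columns v_t, v_{t-1}, ..., v_{t-max_lag} on a common sample,
# so every candidate model is estimated with the same observations
Y <- embed(y, max_lag + 1)
X <- embed(x, max_lag + 1)

# p = lags of y (at least 1), q = lags of x (0 means only x_t)
grid <- expand.grid(p = 1:max_lag, q = 0:max_lag)
grid$aic <- NA_real_

for (i in seq_len(nrow(grid))) {
  p <- grid$p[i]
  q <- grid$q[i]
  regressors <- cbind(Y[, 2:(p + 1), drop = FALSE],
                      X[, 1:(q + 1), drop = FALSE])
  fit <- lm(Y[, 1] ~ regressors)
  grid$aic[i] <- AIC(fit)
}

best <- grid[which.min(grid$aic), ]
print(best)
```

Keeping the estimation sample fixed across candidates matters, because information criteria are only comparable between models fitted to the same observations.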

After selecting the best combination of lags, we estimate our first ARDL model with it (AIC_selection$best_order in the following lines of code):

ardl_model <- ardl(data = cuba_ardl, order = AIC_selection$best_order,
                   formula = gdp ~ energy + train | crisis29,
                   start = 1920, end = 1957)
summary(ardl_model)

Now we analyze the residuals of this model. The following lines of code apply the Breusch-Godfrey autocorrelation test. The “n” is the highest order of autocorrelation you want to analyze. This means that if n = 2 (as in the next lines of code), we are testing for autocorrelation of first order and of up to second order.

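library(lmtest)  # bgtest(), bptest(), resettest()
n <- 2
for (i in 1:n) {
  # Breusch-Godfrey test for autocorrelation up to order i
  godfrey <- bgtest(ardl_model, order = i, type = "Chisq")
  print(godfrey)
}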

The following line applies the Breusch-Pagan heteroskedasticity test:

print(bptest(ardl_model, studentize = TRUE))

In the case of normality tests, we can apply three of them. You can choose whichever you prefer.

library(normtest)  # jb.norm.test(), ajb.norm.test()
shapiro.test(ardl_model$residuals)   # Shapiro-Wilk test
jb.norm.test(ardl_model$residuals)   # Jarque-Bera test
ajb.norm.test(ardl_model$residuals)  # adjusted Jarque-Bera test (Urzua, 1996)
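
In case you want to see what the Jarque-Bera statistic measures, here is a minimal base R sketch of the textbook formula, JB = n/6 * (S^2 + (K - 3)^2 / 4), where S is the skewness and K the kurtosis of the residuals. The function name jarque_bera is made up for this illustration, and the p-value below is the asymptotic chi-square one, so it may differ from the p-value reported by the package function.

```r
# Hypothetical helper: textbook Jarque-Bera statistic with asymptotic p-value
jarque_bera <- function(e) {
  e <- e - mean(e)
  n <- length(e)
  s2 <- mean(e^2)              # second central moment
  skew <- mean(e^3) / s2^1.5   # sample skewness
  kurt <- mean(e^4) / s2^2     # sample kurtosis (3 for a normal distribution)
  stat <- n / 6 * (skew^2 + (kurt - 3)^2 / 4)
  c(statistic = stat,
    p_value = pchisq(stat, df = 2, lower.tail = FALSE))
}

# Simulated residuals (illustration only)
set.seed(1)
print(jarque_bera(rnorm(38)))
```

With the model from this post you would call it as jarque_bera(ardl_model$residuals). A small p-value points to non-normal residuals.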

You may see persistent residual problems of autocorrelation, non-normality and/or heteroskedasticity. Therefore, in general, it is very important to plot the residuals. There you can spot potentially important events, like crises, that you may have failed to consider. We can use the following code to plot the residuals:

plot(ardl_model$residuals)

After the residual tests, we can perform stability tests. The following lines of code apply Ramsey’s RESET test.

for (i in 2:3) {
  # RESET test with powers 2..i of the fitted values (resettest() is in lmtest)
  reset_result <- resettest(ardl_model, power = 2:i, type = "fitted")
  print(reset_result)
}

To perform the CUSUM and the CUSUM of squares tests, we need to use Microfit 5.5, which is very straightforward to use. However, I will explain the steps with Microfit 5.5 at the end of this post. I proceed like this because I think it is better to go through all the R code first and then move to Microfit 5.5.
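
If you only want a rough idea of what the CUSUM chart shows before moving to Microfit, the following base R sketch computes recursive residuals by hand for a simple simulated regression and plots the CUSUM with the usual 5% bounds of Brown, Durbin and Evans (the constant 0.948). The data and all object names are assumptions for this illustration, and it is not a replacement for the Microfit output. The CUSUM of squares path is computed too, but without significance bounds, because those require tabulated values.

```r
# Simulated stable regression (illustration only)
set.seed(2024)
n <- 40
x <- rnorm(n)
y <- 1 + 0.5 * x + rnorm(n)
X <- cbind(1, x)   # regressor matrix with intercept
k <- ncol(X)

# Recursive residuals: standardized one-step-ahead prediction errors
w <- numeric(n - k)
for (i in (k + 1):n) {
  X_past  <- X[1:(i - 1), , drop = FALSE]
  XtX_inv <- solve(crossprod(X_past))
  b_past  <- XtX_inv %*% crossprod(X_past, y[1:(i - 1)])
  x_i     <- X[i, ]
  w[i - k] <- (y[i] - sum(x_i * b_past)) /
    sqrt(1 + sum(x_i * (XtX_inv %*% x_i)))
}

sigma_hat <- sqrt(sum(w^2) / (n - k))   # equals the full-sample OLS sigma
cusum     <- cumsum(w) / sigma_hat
cusumsq   <- cumsum(w^2) / sum(w^2)     # path only, no bounds here

# 5% significance lines for the CUSUM
a     <- 0.948
steps <- seq_len(n - k)
bound <- a * sqrt(n - k) + 2 * a * steps / sqrt(n - k)

plot(steps, cusum, type = "l", ylim = range(c(-bound, bound, cusum)),
     xlab = "Observation", ylab = "CUSUM")
lines(steps, bound, lty = 2)
lines(steps, -bound, lty = 2)
```

If the CUSUM line stays inside the dashed bounds, there is no evidence of parameter instability at the 5% level.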

If all the residual and stability tests are satisfactory, we can be pretty confident that this is a good choice of unrestricted ARDL model to test for the existence of cointegration. Therefore, we proceed to apply the F-bounds and t-bounds tests as follows:

f_bound <- bounds_f_test(ardl_model, case = 3, exact = TRUE, R = 40000, test = "F")
f_bound
t_bound <- bounds_t_test(ardl_model, case = 3, exact = TRUE, R = 40000)
t_bound
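
To see what these two tests are computing, here is a minimal base R sketch with simulated data that are cointegrated by construction. It estimates the unrestricted error correction form, obtains the F statistic for the joint exclusion of the lagged levels, and the t statistic on the lagged dependent variable. All names and data are assumptions for this illustration, and the statistics must be compared with the bounds critical values of Pesaran, Shin and Smith, not with the standard F and t tables.

```r
# Simulated cointegrated pair (illustration only)
set.seed(7)
n <- 80
x <- cumsum(rnorm(n))        # I(1) regressor
y <- 1 + 0.8 * x + rnorm(n)  # y and x share a common trend

dy    <- diff(y)
dx    <- diff(x)
y_lag <- y[-n]               # y_{t-1}
x_lag <- x[-n]               # x_{t-1}

# Unrestricted error correction model with intercept (case 3 style)
uecm       <- lm(dy ~ y_lag + x_lag + dx)
# Same model without the lagged levels (null of no level relationship)
no_levels  <- lm(dy ~ dx)

f_stat <- anova(no_levels, uecm)$F[2]                  # F-bounds statistic
t_stat <- coef(summary(uecm))["y_lag", "t value"]      # t-bounds statistic

print(c(F = f_stat, t = t_stat))
```

Evidence of cointegration requires an F statistic above the upper bound and a t statistic more negative than its upper bound.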

Before proceeding to the next step, I just want to explain that “case = 4” indicates that we are applying the bounds tests to an ARDL model with a restricted trend, and “case = 5” to one with an unrestricted trend. If you want to include a trend, you need to add it when obtaining both “AIC_selection” and “ardl_model” (by adding “trend(gdp)”, for example). In that case, we should set “formula = gdp ~ energy + train + trend(gdp) | crisis29”. As a trend was never significant in the preliminary analysis, we did not include it, and that is why we are in “case = 3”.

Even in some published articles, we can see that only the F-bounds test is reported and the rejection of its null hypothesis is considered sufficient to favor cointegration. However, the t-bounds statistic should also exceed its critical value in absolute terms for cointegration to exist. If cointegration is found, you can obtain the long-run coefficients:

long_run <- multipliers(ardl_model, type = "lr", vcov_matrix = NULL)
print(long_run)
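
As a reminder of where a long-run coefficient comes from, the long-run multiplier of a regressor is the sum of its coefficients (current and lagged) divided by one minus the sum of the coefficients on the lags of the dependent variable. The following base R sketch shows the arithmetic with made-up coefficients. The function name and the numbers are hypothetical and are not results from the Cuban data.

```r
# Hypothetical helper: long-run multiplier of one regressor in an ARDL model
#   ar_coefs: coefficients on the lags of the dependent variable
#   x_coefs:  coefficients on the regressor (current value and its lags)
long_run_multiplier <- function(ar_coefs, x_coefs) {
  sum(x_coefs) / (1 - sum(ar_coefs))
}

# Made-up ARDL(1,1): y_t = 0.5 * y_{t-1} + 0.3 * x_t + 0.2 * x_{t-1} + e_t
lr <- long_run_multiplier(ar_coefs = 0.5, x_coefs = c(0.3, 0.2))
print(lr)
```

In this made-up case a permanent one-unit increase in x raises y by (0.3 + 0.2) / (1 - 0.5) units in the long run.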

Unfortunately, the ARDL package only permits the estimation of the restricted ARDL model if none of the variables has zero lags in the unrestricted ARDL model. As our ARDL model has two lags in “gdp”, three in “energy” and one in “train”, we can obtain the restricted model with the “recm” function of the ARDL package:

restricted <- recm(ardl_model, case = 3)
summary(restricted)

However, if any of the variables has zero lags, we will need to use Microfit 5.5 to obtain the restricted ARDL model. In addition, we will need Microfit 5.5 to obtain the CUSUM charts, as the ARDL package cannot produce them. It is quite probable that in the near future the ARDL package will be able to produce the charts and the restricted error correction model (regardless of the lags in the variables). It is important to point out that there may be discrepancies in the estimates of the ARDL models between the ARDL package and Microfit 5.5. To make them coincide exactly, we should adjust the “start” argument in “AIC_selection” and in “ardl_model”.

In Microfit 5.5 you should follow these steps (it is important to do them IN THIS ORDER):

File > Open File
Univariate > ARDL approach to cointegration
Edit > Constant (Intercept) term

Then, in “Name”, you can write “drift”.
In “Order of ARDL” you write 5, and where you input the variables you write “gdp energy train & crisis29 drift”, then click “Run” and then “Yes”. As we used the AIC criterion in R, we choose “Akaike Information Criterion” and then option 3 (“Display Error Correction Model”). We will then obtain the restricted ARDL model.
Then, we close the output window and also close the window that appears next.
Then, you go to “1. Display the estimates of the selected ARDL regression” and click “OK”.
After closing the output window, you choose “2. Move to Hypothesis Testing Menu” and click “OK”.
You choose “4. CUSUM and CUSUMSQ tests (OLS).” and click “OK”.

You can download a txt file with the R code that we just explained in this post here. You only need to open the txt file with RStudio and save it as an R file. I don’t want to finish this demonstration without mentioning that you can do a lot more with Microfit 5.5. For example, you can perform Wald tests on any of the coefficients. I really hope you found this post useful. If you have any comment, correction, or question, feel free to reach me on Twitter (click here to visit my account), as you will very probably get a very fast response. Spanish or English is ok.