Application of ARIMA-GARCH Model for Prediction of Indonesian Crude Oil Prices

Crude oil is one of the most important energy commodities for various sectors. Changes in crude oil prices affect oil-related sectors and even the stock price index. Therefore, crude oil prices need to be predicted so that sharp increases in the future price of this non-renewable natural resource can be anticipated. In this paper, the prediction of crude oil prices is carried out using the Auto-Regressive Integrated Moving Average (ARIMA) and Generalized Auto-Regressive Conditional Heteroscedasticity (GARCH) models. The data used for forecasting are Indonesian Crude Price (ICP) data for the period January 2005 to November 2012. The results show that the data analyzed follow the ARIMA(1,2,1)-GARCH(0,3) model, and the crude oil price forecast for December 2012 is 105.5528 USD per barrel. The prediction results are expected to be important information for all sectors related to crude oil.


Introduction
The price of Indonesian crude oil fluctuates with the development of world crude oil prices. A significant increase in crude oil prices is unwelcome to many governments in the world, including those of crude oil-producing countries (Kulkarni & Haidar, 2009). For importing countries, such an increase disrupts economic growth through high inflation. For exporting countries, rising oil prices trigger a decline in future demand. Therefore, a model is needed to predict future prices (Rosch & Schmidbauer, 2011; Yu et al., 2008).
Time series analysis and forecasting (prediction) have become research material in various fields (Abledu & Kobina, 2012). Several time series models can be used to estimate and predict crude oil prices. This paper discusses the use of the Auto-Regressive Integrated Moving Average (ARIMA) model and the Generalized Auto-Regressive Conditional Heteroscedasticity (GARCH) model (Makiel, 2012; Bosler, 2010) to estimate an appropriate model from past data and to predict future prices from Indonesia Crude Price data.
The ARIMA model is a linear model that can represent both stationary and non-stationary time series data (Sukono et al., 2011). Because crude oil prices exhibit volatility over time, the data series must be tested for heteroscedasticity. The GARCH model is used because it can capture volatility clustering in the crude oil price series (Rosadi, 2012; Tsay, 2005). These two models are used to estimate a suitable model for Indonesian crude oil price data.
The Indonesian crude oil price data used in this paper are Indonesia Crude Price (ICP) data published by the Indonesian Ministry of Energy and Mineral Resources (ESDM) for January 2005 to November 2012. The purpose of this paper is to estimate the mean model with ARIMA modeling and the variance model with GARCH modeling, and to use the resulting model to predict the Indonesian crude oil price (ICP) for the coming month.

Research Methods
This section discusses previous research studies used as references, the differencing process for obtaining stationary time series data, ARIMA models, GARCH models, and prediction using ARIMA-GARCH models.

Previous Research Studies
Lee (2009) studied the forecasting of crude oil prices in Malaysia using the Box-Jenkins method and the Generalized Autoregressive Conditional Heteroscedasticity (GARCH) approach. In Lee's research, an Autoregressive Integrated Moving Average (ARIMA) model was established as a benchmark model. Lee found ARIMA(1,2,1) and GARCH(1,1) to be suitable models after model identification, parameter estimation, diagnostic testing, and future price forecasting. The analysis was conducted with EViews software, whose potential for predicting daily crude oil price time series was also explored. Finally, the performance of the ARIMA(1,2,1) and GARCH(1,1) models was compared using several measures, and GARCH(1,1) was found to be the better model. Based on this research, Lee concluded that the ARIMA(1,2,1) model can produce accurate estimates based on the patterns in the history of crude oil prices. However, GARCH(1,1) is a better model for daily crude oil prices because of its ability to capture volatility through a non-constant conditional variance.
In this paper, the prediction of crude oil prices in Indonesia is studied using the method used by Lee. The research was carried out with the following steps.

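Differencing Process
Time series modeling requires that the data analyzed be stationary. When time series data are not stationary, differencing is needed to obtain stationary data (Sukono et al., 2011). Suppose the time series $\{Z_t\}$ is not stationary. A stationary series can then be obtained through the following differencing steps:

$$\nabla Z_t = Z_t - Z_{t-1}, \qquad (1)$$
$$\nabla^2 Z_t = \nabla Z_t - \nabla Z_{t-1} = Z_t - 2Z_{t-1} + Z_{t-2}, \qquad (2)$$
$$\nabla^3 Z_t = \nabla^2 Z_t - \nabla^2 Z_{t-1}, \qquad (3)$$

and so on, until the differenced series is stationary.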
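As a minimal sketch of repeated differencing, the helper below applies the operation $Z_t - Z_{t-1}$ a chosen number of times. The function name `difference` and the price values are hypothetical and serve only as an illustration; they are not the ICP data.

```python
def difference(series, d=1):
    """Difference `series` d times; each pass maps Z_t to Z_t - Z_{t-1}."""
    values = list(series)
    for _ in range(d):
        # Each pass shortens the series by one observation.
        values = [curr - prev for prev, curr in zip(values, values[1:])]
    return values


# Hypothetical monthly prices (USD per barrel), for illustration only.
prices = [100.0, 103.0, 108.0, 115.0, 124.0]

first_diff = difference(prices, d=1)
second_diff = difference(prices, d=2)
print(first_diff)   # [3.0, 5.0, 7.0, 9.0]
print(second_diff)  # [2.0, 2.0, 2.0]
```

In this toy series the first difference still trends upward, while the second difference is constant, which is the kind of behavior that motivates differencing more than once.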

ARIMA Model
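The general form of the Autoregressive Integrated Moving Average model of order p, d, q, denoted ARIMA(p,d,q), is stated as follows (Jhohura & Rayhan, 2012):

$$\phi(B)(1 - B)^d Z_t = \theta_0 + \theta(B) a_t,$$

where $\phi(B) = 1 - \phi_1 B - \dots - \phi_p B^p$ is the stationary AR operator, $\theta(B) = 1 - \theta_1 B - \dots - \theta_q B^q$ is the MA operator, the two operators share no common factors, $B$ is the backshift operator ($B Z_t = Z_{t-1}$), and $a_t$ is the residual. The parameter $\theta_0$ plays a different role for d = 0 and d > 0. When d = 0, the process is stationary and $\theta_0$ is related to the mean $\mu$ of the process, i.e. $\theta_0 = \mu(1 - \phi_1 - \dots - \phi_p)$. When d > 0, $\theta_0$ is called a deterministic trend term and is often omitted from the model unless it is really needed (Tsay, 2005).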
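As a worked illustration, assume the usual backshift-operator form $\phi(B)(1-B)^d Z_t = \theta_0 + \theta(B) a_t$ with $\phi(B) = 1 - \phi_1 B$ and $\theta(B) = 1 - \theta_1 B$ (the minus sign on the MA term is an assumed convention; some software reports the MA coefficient with the opposite sign). Under that assumption, the ARIMA(1,2,1) case expands into an explicit recursion in the levels $Z_t$:

```latex
\begin{align*}
(1 - \phi_1 B)(1 - B)^2 Z_t &= \theta_0 + (1 - \theta_1 B)\, a_t \\
\bigl(1 - (2 + \phi_1) B + (1 + 2\phi_1) B^2 - \phi_1 B^3\bigr) Z_t &= \theta_0 + a_t - \theta_1 a_{t-1} \\
Z_t &= (2 + \phi_1) Z_{t-1} - (1 + 2\phi_1) Z_{t-2} + \phi_1 Z_{t-3}
       + \theta_0 + a_t - \theta_1 a_{t-1}
\end{align*}
```

The second line follows from multiplying $(1 - \phi_1 B)$ by $(1 - 2B + B^2)$ and collecting powers of $B$.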
ARIMA modeling stages. The ARIMA modeling process is as follows. (i) Model identification: assign tentative values of p and q using correlograms. (ii) Parameter estimation: estimate the ARIMA model using the least-squares method or the maximum likelihood method. (iii) Diagnostic test: test whether the residuals of the mean model are random, i.e. whether the residuals are white noise. (iv) Prediction: use the chosen mean model to predict l steps ahead (Tsay, 2005).

GARCH Model
The GARCH model was developed by Bollerslev in 1986 and 1994 to improve the ARCH model. It is a time series model with a non-constant variance $\sigma_t^2$ (Rosadi, 2012). A non-constant variance means that there is heteroscedasticity: the OLS assumptions are not met, the parameter estimates remain unbiased, but the standard error estimates and confidence intervals are too narrow. Specifically, a GARCH(r,s) model can be written as (Chand et al., 2012)

$$a_t = \sigma_t \varepsilon_t, \qquad \sigma_t^2 = \alpha_0 + \sum_{i=1}^{r} \alpha_i a_{t-i}^2 + \sum_{j=1}^{s} \beta_j \sigma_{t-j}^2,$$

where $\alpha_0 > 0$, $\alpha_i \ge 0$ for $i = 1, \dots, r$, $\beta_j \ge 0$ for $j = 1, \dots, s$, and $\{\varepsilon_t\}$ is a sequence of independent and identically distributed normal random variables with mean 0 and variance 1. Under these constraints the conditional variance is positive (Tsay, 2005).
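The conditional variance recursion can be sketched as a short simulation. This is an illustrative sketch only: the function name `simulate_garch`, the parameter values, and the extra requirement that the coefficients sum to less than 1 (so that an unconditional variance exists for initialisation) are assumptions, not part of the estimation carried out in this paper.

```python
import random


def simulate_garch(n, alpha0, alphas, betas, seed=0):
    """Simulate a_t = sigma_t * eps_t, where
    sigma_t^2 = alpha0 + sum_i alphas[i] * a_{t-1-i}^2
                       + sum_j betas[j] * sigma_{t-1-j}^2.
    """
    alphas, betas = list(alphas), list(betas)
    persistence = sum(alphas) + sum(betas)
    if alpha0 <= 0 or any(c < 0 for c in alphas + betas) or persistence >= 1:
        raise ValueError("need alpha0 > 0, non-negative coefficients, sum < 1")

    rng = random.Random(seed)
    # Assumed start-up: every lag begins at the unconditional variance.
    uncond_var = alpha0 / (1.0 - persistence)
    past_a_sq = [uncond_var] * len(alphas)     # most recent squared shock first
    past_sig_sq = [uncond_var] * len(betas)    # most recent variance first

    shocks, variances = [], []
    for _ in range(n):
        var_t = alpha0
        var_t += sum(a * x for a, x in zip(alphas, past_a_sq))
        var_t += sum(b * s for b, s in zip(betas, past_sig_sq))
        a_t = (var_t ** 0.5) * rng.gauss(0.0, 1.0)  # eps_t ~ N(0, 1)
        shocks.append(a_t)
        variances.append(var_t)
        # Shift the lag buffers so index 0 is always the latest value.
        past_a_sq = ([a_t * a_t] + past_a_sq)[:len(alphas)]
        past_sig_sq = ([var_t] + past_sig_sq)[:len(betas)]
    return shocks, variances


# Hypothetical GARCH(1,1) parameters, for illustration only.
shocks, variances = simulate_garch(200, alpha0=0.1, alphas=[0.2], betas=[0.7])
print(min(variances) > 0)  # the constraints keep the conditional variance positive
```

Plotting `shocks` from such a simulation typically shows calm and turbulent stretches alternating, which is the volatility clustering the GARCH model is meant to describe.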
GARCH modeling stages. The GARCH modeling process is carried out as follows. (i) Mean model estimation: estimate and choose a good mean model, as in the mean modeling described above. (ii) ARCH effect test: test for an ARCH effect in the residuals of the mean model with the ARCH-LM test. (iii) Model identification: if the ARCH effect is statistically significant, determine the values of r and s with the help of a correlogram. (iv) Model estimation: estimate the mean model and the variance model simultaneously, using the least-squares method or the maximum likelihood method to estimate the GARCH(r,s) model. (v) Diagnostic test: test whether the residuals of the variance model are white noise. (vi) Prediction: use the chosen mean and variance models to predict the mean and the variance l steps ahead (Tsay, 2005).

Results and Discussion
This section covers the crude oil price data used, the modeling of the mean with the ARIMA model, the simultaneous modeling of the mean and variance with the GARCH model, and then prediction (forecasting).

Data
The data used in this paper are Indonesian Crude Price (ICP) data published on the website of the Indonesian Ministry of Energy and Mineral Resources (www.esdm.or.id). The observations are monthly Indonesian crude oil prices for 95 months (January 2005 to November 2012).

Average/Mean Modeling Process with ARIMA Model
The following is an analysis of ARIMA modeling on ICP data using Eviews-6 software. Stationarity test. To ascertain whether the data are stationary, a unit root test, namely the Augmented Dickey-Fuller (ADF) test, is carried out. The ADF test results show that the absolute value of the t statistic is smaller than the absolute critical value in the table at the 5% significance level. The same conclusion follows from the probability value, which is greater than the significance level of 0.05 (5%). So it can be concluded that the ICP data are not stationary.
Because the data are not stationary, they are transformed by differencing. In this paper, differencing is carried out twice to obtain stationary time series data. This means that the model used later has order d = 2, i.e. ARIMA(p,2,q).

Model identification.
After stationary data are obtained, the model can be identified by looking at the autocorrelation function (ACF) and partial autocorrelation function (PACF) of the data. In the correlogram of the second difference, the ACF plot cuts off significantly after lag 1 and the PACF plot cuts off significantly after lag 1. This indicates that the twice-differenced data can be modeled with an ARMA(1,1) model.
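The correlogram reading above can be sketched numerically. The snippet computes the sample ACF and flags lags outside the approximate 95% band $\pm 1.96/\sqrt{n}$; the function names and the toy series are hypothetical, the PACF is not computed here, and Eviews may use slightly different band conventions.

```python
import math


def sample_acf(series, max_lag):
    """Sample autocorrelations r_1, ..., r_max_lag of `series`."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


def significant_lags(series, max_lag):
    """Lags whose sample autocorrelation lies outside +/- 1.96 / sqrt(n)."""
    bound = 1.96 / math.sqrt(len(series))
    acf = sample_acf(series, max_lag)
    return [k for k, r in enumerate(acf, start=1) if abs(r) > bound]


# Hypothetical series that alternates in sign, for illustration only.
toy_series = [1.0, -1.0] * 10
print(sample_acf(toy_series, 3))
print(significant_lags(toy_series, 3))
```

For a real identification step, the same check would be applied to the twice-differenced ICP series and complemented by the PACF.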

Model estimation.
After model identification, the candidate model for the twice-differenced data is the ARMA(1,1) model. Next, the ARMA(1,1) model parameters were estimated. Estimation with Eviews-6 software shows that the probability values of the AR and MA parameters are smaller than the 5% significance level. So the ARIMA(1,2,1) model is good enough for the data, with the parameter estimates reported in Table 2.
Significance test of the model.
After the best mean model is obtained, a t-statistic test is performed to determine the significance of each independent variable in its effect on the dependent variable.
Test of the constant $\phi_0$, with the hypotheses $H_0: \phi_0 = 0$ and $H_1: \phi_0 \neq 0$, where the test statistic is $t_{ratio} = \phi_0 / SE(\phi_0)$ or the value of prob($t_{ratio}$). The test criterion is to reject $H_0$ if $t_{ratio} > t_\alpha$ or prob($t_{ratio}$) < $\alpha$. Based on Table 2, the obtained $t_{ratio}$ = -0.000088/0.000586 = -0.150171 is smaller than $t_\alpha$ = -0.150020 (for $\alpha$ = 0.05) and the probability value is 0.8811 > 0.05. This shows that $H_0$ is not rejected, so the constant $\phi_0$ has no significant effect on the dependent variable ($Z_t$).
Test of the coefficient $\phi_1$, with the hypotheses $H_0: \phi_1 = 0$ and $H_1: \phi_1 \neq 0$, where the test statistic is $t_{ratio} = \phi_1 / SE(\phi_1)$ or the value of prob($t_{ratio}$). The test criterion is to reject $H_0$ if $t_{ratio} > t_\alpha$ or prob($t_{ratio}$) < $\alpha$. Based on Table 2, $t_{ratio}$ = 0.426421/0.094884 = 4.494130 is greater than $t_\alpha$ = 4.494115 and the probability value is 0.0000 < 0.05. This shows that $H_0$ is rejected, so the coefficient $\phi_1$ affects the dependent variable ($Z_t$).
Test of the coefficient $\theta_1$, with the hypotheses $H_0: \theta_1 = 0$ and $H_1: \theta_1 \neq 0$, where the test statistic is $t_{ratio} = \theta_1 / SE(\theta_1)$ or the value of prob($t_{ratio}$). The test criterion is to reject $H_0$ if $t_{ratio} > t_\alpha$ or prob($t_{ratio}$) < $\alpha$. Based on Table 2, the obtained $t_{ratio}$ = -0.988759/0.010765 = -91.84941 is greater than $t_\alpha$ = -91.85157 and the probability value is 0.0000 < 0.05. This shows that $H_0$ is rejected, so the coefficient $\theta_1$ affects the dependent variable ($Z_t$).
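The t ratios quoted above can be re-derived from the reported estimates and standard errors. The sketch below does this arithmetic and attaches a two-sided p-value from the standard normal distribution, which is an assumed approximation; Eviews reports p-values from its own reference distribution, so small differences from Table 2 are to be expected. The dictionary and function names are hypothetical.

```python
import math

# (estimate, standard error) pairs as quoted in the text from Table 2.
estimates = {
    "phi0": (-0.000088, 0.000586),
    "phi1": (0.426421, 0.094884),
    "theta1": (-0.988759, 0.010765),
}


def t_ratio(coef, std_err):
    """t ratio = estimate / standard error."""
    return coef / std_err


def approx_p_value(t):
    """Two-sided p-value under a standard normal approximation (assumption)."""
    return math.erfc(abs(t) / math.sqrt(2.0))


for name, (coef, std_err) in estimates.items():
    t = t_ratio(coef, std_err)
    p = approx_p_value(t)
    decision = "reject H0" if p < 0.05 else "do not reject H0"
    print(f"{name}: t = {t:.6f}, approx. p = {p:.4f}, {decision}")
```

Under this approximation the constant is not significant while both the AR and MA coefficients are, in line with the conclusions drawn from Table 2.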
In addition, an F-test is performed to determine the joint significance of all independent variables, i.e. to measure the effect of the independent variables together.
The partial tests show that the constant c (the constant $\phi_0$ tested above) in the ARIMA(1,2,1) model is not significant, so to get the best mean model the constant is excluded from the ARIMA(1,2,1) model equation. The ARIMA(1,2,1) model is therefore used without the constant term.
Diagnostic test. The analysis uses the Ljung-Box Q statistic and ACF/PACF plots to check whether there is serial correlation in the residuals $a_t$ and whether they are normally distributed. Processing with Eviews-6 software shows that the residuals are white noise, as indicated by probability values greater than the significance level of 0.05 (5%). In addition, the histogram of the residuals $a_t$ shows that they are normally distributed. So, from the diagnostic test, it can be concluded that the residuals of the ARIMA(1,2,1) model are white noise, normally distributed with a mean of 0 and a variance of 0.006132. The ARIMA(1,2,1) model is therefore appropriate for ICP data.
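The Ljung-Box statistic used in this diagnostic step can be sketched as $Q(m) = n(n+2)\sum_{k=1}^{m} r_k^2/(n-k)$, where $r_k$ is the lag-k sample autocorrelation of the residuals. The code below is a minimal illustration with hypothetical residuals and function names; the degrees-of-freedom adjustment that software applies for fitted ARMA terms is not reproduced here.

```python
def sample_acf(series, max_lag):
    """Sample autocorrelations r_1, ..., r_max_lag of `series`."""
    n = len(series)
    mean = sum(series) / n
    dev = [x - mean for x in series]
    denom = sum(d * d for d in dev)
    return [
        sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
        for k in range(1, max_lag + 1)
    ]


def ljung_box_q(residuals, max_lag):
    """Ljung-Box Q statistic over lags 1..max_lag."""
    n = len(residuals)
    acf = sample_acf(residuals, max_lag)
    return n * (n + 2) * sum(r * r / (n - k) for k, r in enumerate(acf, start=1))


# Hypothetical residuals that alternate in sign, i.e. clearly not white noise.
toy_residuals = [1.0, -1.0] * 10
q_stat = ljung_box_q(toy_residuals, max_lag=1)
# Compare with the 5% critical value of chi-square with 1 degree of freedom (3.841).
print(q_stat, q_stat > 3.841)
```

A Q statistic below the critical value (equivalently, a probability value above 0.05) is what supports the white-noise conclusion for the fitted model.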

The Process of Simultaneous Mean Modeling with the GARCH Model
The following is an analysis of GARCH modeling on ICP data using Eviews-6 software.
ARCH-LM test.
Once a good enough mean model has been obtained, the model is tested for a heteroscedasticity effect. The ARCH-LM test is carried out with the hypotheses $H_0$: there is no ARCH effect, and $H_1$: there is an ARCH effect; the test criterion is to reject $H_0$ if the F statistic > F table or prob(Obs*R-squared) < 0.05. Based on Table 3, the probability value of Obs*R-squared is smaller than the significance level of 0.05 (5%). So $H_0$ is rejected, and it can be concluded that there is an ARCH effect in the residuals of the ARIMA(1,2,1) model.
GARCH model identification.
The GARCH model can be identified by looking at the autocorrelation function (ACF) and the partial autocorrelation function (PACF) in the correlogram of the squared residuals. In the squared residual correlogram, the ACF plot crosses the Bartlett line significantly at lag 3 and the PACF plot crosses the Bartlett line significantly at lag 1, so the identified candidate models for the volatility of ICP data are GARCH(1,3) and GARCH(0,3).
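The ARCH-LM test described above can be sketched for a single lag: the squared residuals are regressed on their own first lag, and the statistic Obs*R-squared is compared with a chi-square critical value with 1 degree of freedom. This is a simplified illustration with hypothetical residuals and a hypothetical function name; the test reported in Table 3 may use a different number of lags.

```python
def arch_lm_lag1(residuals):
    """Obs * R-squared from regressing e_t^2 on a constant and e_{t-1}^2."""
    sq = [e * e for e in residuals]
    y, x = sq[1:], sq[:-1]          # current and lagged squared residuals
    n = len(y)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("squared residuals are constant; R-squared is undefined")
    r_squared = (sxy * sxy) / (sxx * syy)   # R^2 of a simple linear regression
    return n * r_squared


# Hypothetical residuals: a calm stretch followed by a turbulent stretch.
toy_residuals = [1.0, -1.0] * 5 + [3.0, -3.0] * 5
lm_stat = arch_lm_lag1(toy_residuals)
# 3.841 is the 5% critical value of chi-square with 1 degree of freedom.
print(lm_stat, "ARCH effect" if lm_stat > 3.841 else "no ARCH effect")
```

With more lags the regression gains additional lagged squared residuals as regressors and the degrees of freedom increase accordingly.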

GARCH model estimation.
After model identification, the candidate models for ICP data are GARCH(1,3) and GARCH(0,3). Next, the parameters of the ARIMA(1,2,1)-GARCH(1,3) and ARIMA(1,2,1)-GARCH(0,3) models are estimated. The estimation results with Eviews-6 software are given in Table 4. The results in Table 4 show that the coefficients of GARCH(0,3) are significant, as seen from probability values smaller than the 0.05 (5%) significance level.
So, based on the estimation of the two models, the best model for ICP data is ARIMA(1,2,1)-GARCH(0,3), with the parameter estimates given in Table 4.
Significance test of the GARCH model. After the best variance model is obtained, a t-statistic test is performed to determine the significance of the constant and of each coefficient of the independent variables in its effect on the dependent variable. Using the same method as the significance test in the mean modeling, it was found that the constant and the coefficients of the three independent variables in the variance equation are significant.
After modeling the variance, the residuals are tested again for an ARCH effect using the ARCH-LM test. The results show that the probability of Obs*R-squared is greater than the significance level of 0.05 (5%), so it can be concluded that there is no remaining ARCH effect in the residuals of the ARIMA(1,2,1)-GARCH(0,3) model.
Diagnostic test. After the model is confirmed to have no ARCH effect, a diagnostic test is performed in the same way as in the mean modeling. The diagnostic test shows that the residuals $\varepsilon_t$ are white noise. So it can be concluded that the ARIMA(1,2,1)-GARCH(0,3) model represents ICP data quite well.

Prediction (Forecasting)
After the mean modeling stage and the variance modeling stage, crude oil prices are predicted for the next month. From the modeling above, the model that represents ICP crude oil price data well enough is the ARIMA(1,2,1)-GARCH(0,3) model. The plot of the actual ICP data and the forecasts from the ARIMA(1,2,1)-GARCH(0,3) model, produced with Eviews software, can be seen in Figure 1.
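A one-step-ahead mean forecast from an ARIMA(1,2,1) model can be sketched as follows. The sketch assumes the model is written as $(1 - \phi_1 B)(1 - B)^2 Z_t = \theta_0 + (1 - \theta_1 B) a_t$, so the sign convention of the MA term, the function name, and the input values are all assumptions for illustration; it does not reproduce the Eviews forecast or any transformation applied to the ICP data.

```python
def forecast_arima_121(z_t, z_tm1, z_tm2, a_t, phi1, theta1, theta0=0.0):
    """One-step-ahead forecast of Z_{t+1} for ARIMA(1,2,1).

    Uses Z_{t+1} = (2 + phi1) Z_t - (1 + 2 phi1) Z_{t-1} + phi1 Z_{t-2}
                   + theta0 + a_{t+1} - theta1 a_t,
    with the unknown future shock a_{t+1} replaced by its mean of zero.
    """
    return (
        (2.0 + phi1) * z_t
        - (1.0 + 2.0 * phi1) * z_tm1
        + phi1 * z_tm2
        + theta0
        - theta1 * a_t
    )


# Hypothetical inputs: a series rising by exactly 1 each period, zero last residual.
next_value = forecast_arima_121(
    z_t=3.0, z_tm1=2.0, z_tm2=1.0, a_t=0.0, phi1=0.5, theta1=0.3
)
print(next_value)  # 4.0: the linear trend is continued
```

In the ARIMA-GARCH setting, the variance equation supplies the conditional variance of the forecast error, which determines the width of the interval around this point forecast.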

Conclusion
In this paper, the ARIMA-GARCH model has been applied to predict crude oil prices in Indonesia. Based on the results of diagnostic tests in modeling the mean and variance equations, the time series model that is good enough to represent ICP data over the last 95 months (January 2005 to November 2012) is the ARIMA(1,2,1)-GARCH(0,3) model. Using this model, the ICP price prediction for the coming month, December 2012, is 106.28760 USD per barrel with a 95% confidence level.