[Time Series] General Linear Process Models: ARMA / ARIMA (5)

2019-2-4 Mon 15:58

Math

General linear process model

3) ARMA(p, q)

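Cannot be distinguished from AR (cf. AR models of different order p cannot be distinguished from one another either)
MA and AR can be distinguished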

$Y_t = -\phi_1 Y_{t-1} - \phi_2 Y_{t-2} - \cdots - \phi_p Y_{t-p} + \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \cdots + \theta_q \epsilon_{t-q}$

A model that has the properties of both AR(p) and MA(q)
- $AR(p): Y_t = -\phi_1 Y_{t-1} - \phi_2 Y_{t-2} - \cdots - \phi_p Y_{t-p} + \epsilon_t$
- $MA(q): Y_t = \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \cdots + \theta_q \epsilon_{t-q}$
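The ARMA(p, q) recursion above can be sketched directly with NumPy. This is only an illustration under the sign convention of these notes (AR coefficients enter with a minus sign); the function name `simulate_arma` and the zero pre-sample values are assumptions made for the example, not part of the original post.

```python
import numpy as np

def simulate_arma(phi, theta, eps):
    """Y_t = -sum_i phi[i]*Y_{t-1-i} + eps_t + sum_j theta[j]*eps_{t-1-j}.

    Values before t = 0 are treated as zero (an assumption of this sketch).
    """
    eps = np.asarray(eps, dtype=float)
    y = np.zeros_like(eps)
    for t in range(len(eps)):
        ar_part = sum(-phi[i] * y[t - 1 - i]
                      for i in range(len(phi)) if t - 1 - i >= 0)
        ma_part = sum(theta[j] * eps[t - 1 - j]
                      for j in range(len(theta)) if t - 1 - j >= 0)
        y[t] = ar_part + eps[t] + ma_part
    return y

rng = np.random.default_rng(0)
eps = rng.standard_normal(120)
# phi_1 = -0.7 in this convention means Y_t = 0.7*Y_{t-1} + eps_t - 0.4*eps_{t-1}
y = simulate_arma(phi=[-0.7], theta=[-0.4], eps=eps)
print(y[:5])
```

Setting `theta=[]` gives a pure AR(p) path and `phi=[]` a pure MA(q) path, which mirrors the two special cases listed above.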

ARMA(1, 1)
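# Sampling
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm
np.random.seed(0)
phi = 0.7; theta = -0.4
# Lag-polynomial coefficients, zero lag included: Y_t - 0.7*Y_{t-1} = e_t - 0.4*e_{t-1}
# (in the sign convention of the equations above this is phi_1 = -0.7)
ar = [1, -phi]; ma = [1, theta]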
p1 = sm.tsa.ArmaProcess(ar, ma)
y1 = p1.generate_sample(120)

plt.figure(figsize=(10, 10))
plt.subplot(411)
plt.plot(y1, 'o-')

plt.subplot(412)
plt.stem(p1.acf(100))

ax = plt.subplot(413)
sm.graphics.tsa.plot_acf(y1, lags=100, ax=ax)

ax = plt.subplot(414)
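# PACF lags kept below half the sample size (120); recent statsmodels versions may raise an error otherwise
sm.graphics.tsa.plot_pacf(y1, lags=50, ax=ax, method='ywm')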

plt.tight_layout()
plt.show()
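For comparison with the theoretical ACF plotted in the second panel, the ARMA(1, 1) autocorrelation has a well-known closed form. The sketch below uses the usual textbook convention $Y_t = \phi Y_{t-1} + \epsilon_t + \theta \epsilon_{t-1}$ (the same `phi` and `theta` values as the code, not the minus-sign convention of the equations); the helper name `arma11_acf` is hypothetical.

```python
def arma11_acf(phi, theta, nlags):
    """Theoretical ACF of Y_t = phi*Y_{t-1} + e_t + theta*e_{t-1}, assuming |phi| < 1."""
    rho = [1.0]
    if nlags >= 1:
        # lag-1 autocorrelation mixes the AR and MA parts
        rho.append((1 + phi * theta) * (phi + theta)
                   / (1 + 2 * phi * theta + theta ** 2))
    for _ in range(2, nlags + 1):
        # from lag 2 onward the ACF decays geometrically, as in an AR(1)
        rho.append(phi * rho[-1])
    return rho

rho = arma11_acf(0.7, -0.4, 5)
print([round(r, 4) for r in rho])
```

With phi = 0.7 and theta = -0.4 the lag-1 value works out to about 0.36, after which each lag is 0.7 times the previous one.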

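4) ARIMA model: nonstationary process

The differenced series $\nabla Y_t = Y_t - Y_{t-1}$ follows an ARMA model
ARIMA(p, d, q): $\nabla^d Y_t \sim ARMA(p, q)$, i.e. ARMA after differencing d times
Characteristic: the autocorrelation does not decay quickly
ARMA vs ARIMA
- Whether the characteristic equation has the root x = 1
- ARMA reflects past data; ARIMA reflects past data plus their trend (reference)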
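The slow decay of the autocorrelation, and the effect of differencing, can be illustrated with a random walk, the simplest ARIMA(0, 1, 0) process. This is a minimal sketch; the helper `sample_acf` and the choice of 500 points and lag 10 are assumptions for the example.

```python
import numpy as np

def sample_acf(x, lag):
    """Sample autocorrelation at a single positive lag."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    return float(np.dot(x[:-lag], x[lag:]) / np.dot(x, x))

rng = np.random.default_rng(0)
eps = rng.standard_normal(500)
walk = np.cumsum(eps)           # integrated once: has a unit root
diffed = np.diff(walk, n=1)     # differencing once recovers eps[1:]

print("lag-10 ACF, level series:      ", sample_acf(walk, 10))
print("lag-10 ACF, differenced series:", sample_acf(diffed, 10))
```

The level series typically keeps a lag-10 autocorrelation close to 1, while the differenced series is back to white noise with autocorrelation near 0.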
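4.1) Unit root property
$(Y_t - Y_{t-1}) + \phi_1(Y_{t-1} - Y_{t-2}) + \cdots + \phi_p (Y_{t-p} - Y_{t-p-1}) = \epsilon_t + \theta_1 \epsilon_{t-1} + \cdots + \theta_q \epsilon_{t-q}$
The characteristic equation of an ARIMA(p, 1, q) model has the unit root x = 1.
Replacing each lag $Y_{t-k}$ on the left-hand side with $x^k$ factors it into the differencing term $(1-x)$ times the AR polynomial.
Characteristic equation: $(1-x)(1 + \phi_1 x + \cdots + \phi_p x^p)=0$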
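A small numerical check of the characteristic equation may make the unit root more concrete. The sketch assumes a hypothetical ARIMA(1, 1, 0) with $\phi_1 = -0.5$ in the sign convention used here, and uses `numpy.polynomial.polynomial`, which stores coefficients in ascending powers of x.

```python
import numpy as np
from numpy.polynomial import polynomial as P

phi = [-0.5]                        # hypothetical AR coefficient: 1 + phi_1*x = 1 - 0.5x
ar_poly = [1.0] + phi               # ascending powers of x
diff_poly = [1.0, -1.0]             # (1 - x), the differencing factor

char_poly = P.polymul(diff_poly, ar_poly)    # (1 - x)(1 - 0.5x) = 1 - 1.5x + 0.5x^2
roots = P.polyroots(char_poly)

has_unit_root = bool(np.any(np.isclose(roots, 1.0)))
print("coefficients:", char_poly)
print("roots:", roots, "unit root present:", has_unit_root)
```

The root x = 1 comes from the differencing factor; the remaining root x = 2 lies outside the unit circle, which is what makes the differenced series stationary.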

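4.2) Unit root test: ADF (Augmented Dickey-Fuller) test

A generalization of the DF (Dickey-Fuller) test
H0: the order of integration is 1 or higher (a unit root is present)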
- sm.tsa.adfuller
  - adf: test statistic
  - pvalue: float

Example) IMA(1, 1)

$Y_t - Y_{t-1} = \epsilon_t - \theta \epsilon_{t-1}$
$Y_t = \epsilon_t + (1-\theta) \epsilon_{t-1} + (1 - \theta) \epsilon_{t-2} + (1-\theta) \epsilon_{t-3} + \cdots$
An accumulation (cumulative sum) of white noise
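The cumulative-sum form of IMA(1, 1) can be checked numerically. This is a sketch under the assumptions that theta = 0.4 (an arbitrary value) and that the process starts from zero, so the infinite sum is truncated at the first sample.

```python
import numpy as np

theta = 0.4                                    # hypothetical value
rng = np.random.default_rng(0)
eps = rng.standard_normal(200)

# w_t = eps_t - theta*eps_{t-1}, taking the pre-sample shock as 0
w = eps - theta * np.concatenate(([0.0], eps[:-1]))
y = np.cumsum(w)                               # Y_t = Y_{t-1} + w_t

# Closed form: Y_t = eps_t + (1 - theta) * (eps_{t-1} + eps_{t-2} + ...)
past_sum = np.concatenate(([0.0], np.cumsum(eps)[:-1]))
y_closed = eps + (1 - theta) * past_sum

print(np.allclose(y, y_closed))
```

Every past shock keeps the permanent weight (1 - theta), which is why the series wanders instead of reverting to a mean.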

Example) IMA(2, 2)

$(Y_t - Y_{t-1}) - (Y_{t-1} - Y_{t-2}) = \epsilon_t - \theta_1 \epsilon_{t-1} - \theta_2 \epsilon_{t-2}$

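# IMA(2,2): theta1 = 1, theta2 = -0.6
# Note: ArmaProcess builds e_t + theta1*e_{t-1} + theta2*e_{t-2},
# i.e. the opposite sign of the thetas in the equation above.
np.random.seed(0)
theta1 = 1; theta2 = -0.6
ar = [1]; ma = [1, theta1, theta2]
p = sm.tsa.ArmaProcess(ar, ma)
sample = p.generate_sample(100)   # the stationary MA(2) part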

y2 = sample.cumsum().cumsum()
y1 = np.diff(y2)
y0 = np.diff(y1)

# y0 = sample
# y1 = y0.cumsum()
# y2 = y1.cumsum()

plt.figure(figsize=(10, 10))
plt.subplot(313)
plt.title('IMA(2,2): diff(2)')
plt.plot(y0, 'o-')

plt.subplot(312)
plt.title('IMA(2,2): diff(1)')
plt.plot(y1, 'o-')

plt.subplot(311)
plt.title('IMA(2,2): diff(0)')
plt.plot(y2, 'o-')

plt.tight_layout()
plt.show()

plt.figure(figsize=(10, 10))
ax1 = plt.subplot(313)
sm.graphics.tsa.plot_acf(y0, lags=90, ax=ax1)
ax1.set_title('ACF: IMA(2,2): diff(2)')

ax2 = plt.subplot(312)
sm.graphics.tsa.plot_acf(y1, lags=90, ax=ax2)
ax2.set_title('ACF: IMA(2,2): diff(1)')

ax3 = plt.subplot(311)
sm.graphics.tsa.plot_acf(y2, lags=90, ax=ax3)
ax3.set_title('ACF: IMA(2,2): diff(0)')

plt.tight_layout()
plt.grid(False)
plt.show()
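The relationship between y2, y1 and y0 in the code above can be checked on its own: each `np.diff` undoes one `cumsum` but drops the first element. In this sketch plain white noise stands in for the MA(2) sample, which is an assumption made to keep the example independent of statsmodels.

```python
import numpy as np

rng = np.random.default_rng(0)
sample = rng.standard_normal(100)    # stand-in for the MA(2) sample

y2 = sample.cumsum().cumsum()        # integrated twice
y1 = np.diff(y2)                     # one difference: cumsum(sample) without its first value
y0 = np.diff(y1)                     # two differences: sample without its first two values

print(len(y2), len(y1), len(y0))
print(np.allclose(y1, sample.cumsum()[1:]), np.allclose(y0, sample[2:]))
```

This is why the three series have 100, 99 and 98 points, and why y0 is the original stationary sample apart from the two lost observations.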

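ADF test example: the twice-differenced series y0 is the one expected to reject the null hypothesis. In the run shown here, however, even y0 has a p-value of about 0.107, so the unit-root null is not rejected at the 10% level for any of the three series.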

sm.tsa.adfuller(y2)

(-2.251162211352469,
 0.1882029374890175,
 8,
 91,
 {'1%': -3.50434289821397,
  '5%': -2.8938659630479413,
  '10%': -2.5840147047458037},
 307.88958150914243)

sm.tsa.adfuller(y1)

(-1.7069994687237129,
 0.42750979630229596,
 12,
 86,
 {'1%': -3.5087828609430614,
  '5%': -2.895783561573195,
  '10%': -2.5850381719848565},
 307.69019010591194)

sm.tsa.adfuller(y0)

(-2.5363882098874595,
 0.10686979989276602,
 6,
 91,
 {'1%': -3.50434289821397,
  '5%': -2.8938659630479413,
  '10%': -2.5840147047458037},
 305.82342887519786)
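One way to read the three result tuples above is to compare each test statistic with its critical values: the unit-root null is rejected at a given level only when the statistic is below (more negative than) that critical value. The numbers in this sketch are copied from the outputs above; the helper name `rejected_levels` is hypothetical.

```python
# (test statistic, critical values) copied from the adfuller outputs above
results = {
    "y2": (-2.251162211352469,
           {"1%": -3.50434289821397, "5%": -2.8938659630479413, "10%": -2.5840147047458037}),
    "y1": (-1.7069994687237129,
           {"1%": -3.5087828609430614, "5%": -2.895783561573195, "10%": -2.5850381719848565}),
    "y0": (-2.5363882098874595,
           {"1%": -3.50434289821397, "5%": -2.8938659630479413, "10%": -2.5840147047458037}),
}

def rejected_levels(stat, crit):
    """Significance levels at which the unit-root null is rejected."""
    return [level for level, value in crit.items() if stat < value]

for name, (stat, crit) in results.items():
    levels = rejected_levels(stat, crit)
    print(name, levels if levels else "unit-root null not rejected")
```

For these particular outputs none of the statistics falls below its 10% critical value, which agrees with the p-values of roughly 0.19, 0.43 and 0.11.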

Question

Characteristic equation: not yet fully understood

Henry's blog

Step by step
