[Time Series] General Linear Stochastic Processes: ARMA / ARIMA (5)

General linear process model

3) ARMA(p, q)

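  • Hard to tell apart from a pure AR process by the ACF, which tails off gradually in both cases (cf. AR processes of different orders p are also hard to tell apart this way)
  • MA and AR processes can be told apart
  • A model that has the properties of both AR(p) and MA(q)

    • $AR(p): Y_t = -\phi_1 Y_{t-1} - \phi_2 Y_{t-2} - \cdots - \phi_p Y_{t-p} + \epsilon_t$
    • $MA(q): Y_t = \epsilon_t + \theta_1 \epsilon_{t-1} + \theta_2 \epsilon_{t-2} + \cdots + \theta_q \epsilon_{t-q}$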
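Combining the two definitions above gives the ARMA(p, q) equation. The form below is a sketch in the sign convention these notes use for AR(p) and MA(q); other texts and libraries may flip the signs of $\phi_i$ or $\theta_j$.

$$
Y_t = -\phi_1 Y_{t-1} - \cdots - \phi_p Y_{t-p} + \epsilon_t + \theta_1 \epsilon_{t-1} + \cdots + \theta_q \epsilon_{t-q}
$$

Moving the lagged $Y$ terms to the left-hand side gives the same model with the AR part on one side and the MA part on the other:

$$
Y_t + \phi_1 Y_{t-1} + \cdots + \phi_p Y_{t-p} = \epsilon_t + \theta_1 \epsilon_{t-1} + \cdots + \theta_q \epsilon_{t-q}
$$

In this form the coefficient lists $[1, \phi_1, \ldots, \phi_p]$ and $[1, \theta_1, \ldots, \theta_q]$ are the lag-polynomial arrays that `sm.tsa.ArmaProcess(ar, ma)` expects.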
ARMA(1, 1)
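# Sampling
import numpy as np
import matplotlib.pyplot as plt
import statsmodels.api as sm

np.random.seed(0)
phi = 0.7; theta = -0.4
# Lag-polynomial coefficients, zero lag included: (1 - phi L) Y_t = (1 + theta L) eps_t
ar = [1, -phi]; ma = [1, theta]
p1 = sm.tsa.ArmaProcess(ar, ma)
y1 = p1.generate_sample(120)

plt.figure(figsize=(10, 10))
plt.subplot(411)
plt.plot(y1, 'o-')  # simulated series

plt.subplot(412)
plt.stem(p1.acf(100))  # theoretical ACF

ax = plt.subplot(413)
sm.graphics.tsa.plot_acf(y1, lags=100, ax=ax)  # sample ACF

ax = plt.subplot(414)
# Sample PACF; recent statsmodels versions require lags < nobs // 2 (60 here)
sm.graphics.tsa.plot_pacf(y1, lags=50, ax=ax, method='ywm')

plt.tight_layout()
plt.show()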


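4) ARIMA model: a non-stationary process

  • The differenced series $\nabla Y_t = Y_t - Y_{t-1}$ follows an ARMA model
  • ARIMA(p, d, q): $\nabla^d Y_t = ARMA(p, q) \rightarrow$ ARMA after differencing d times
  • Characteristic: the autocorrelation coefficients do not decay quickly
  • ARMA vs. ARIMA
    • Whether the characteristic equation has the root x = 1
    • ARMA reflects past data; ARIMA reflects past data plus its trend (reference)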

4.1) Unit root property

  • $(Y_t - Y_{t-1}) + \phi_1(Y_{t-1} - Y_{t-2}) + \cdots + \phi_p (Y_{t-p} - Y_{t-p-1}) =\\
    \epsilon_t + \theta_1 \epsilon_{t-1} + \cdots + \theta_q \epsilon_{t-q}$
  • The characteristic equation of an ARIMA(p, 1, q) model has the unit root x = 1
  • Characteristic equation: $(1-x)(1 + \phi_1 x + \cdots + \phi_p x^p)=0$
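The factorisation above can be checked numerically. The following is a minimal sketch, assuming hypothetical coefficients `phi = [0.5, -0.3]` (they are not from these notes): it multiplies the differencing factor $(1 - x)$ by the ARMA part $(1 + \phi_1 x + \phi_2 x^2)$ and confirms that x = 1 is among the roots.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Hypothetical AR coefficients, in the notes' convention 1 + phi_1 x + phi_2 x^2
phi = [0.5, -0.3]

# Coefficients in ascending powers of x
arma_part = np.array([1.0, *phi])    # 1 + phi_1 x + phi_2 x^2
diff_part = np.array([1.0, -1.0])    # 1 - x

char_poly = P.polymul(diff_part, arma_part)   # (1 - x)(1 + phi_1 x + phi_2 x^2)
roots = P.polyroots(char_poly)

has_unit_root = bool(np.any(np.isclose(roots, 1.0)))
print("roots:", roots)
print("unit root present:", has_unit_root)
```

The remaining roots come from the ARMA part alone; differencing once removes the factor $(1 - x)$ and leaves exactly that part.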

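4.2) Unit root test: ADF (Augmented Dickey-Fuller) test

  • A generalization of the DF (Dickey-Fuller) test
  • H0: the order of integration is 1 or higher (the series has a unit root)
    • sm.tsa.adfuller
      • adf: test statistic
      • pvalue: float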

Example) IMA(1, 1)

  • $Y_t - Y_{t-1} = \epsilon_t - \theta \epsilon_{t-1}$
  • $Y_t = \epsilon_t + (1-\theta) \epsilon_{t-1} + (1 - \theta) \epsilon_{t-2} + (1-\theta) \epsilon_{t-3} + \cdots$
  • An accumulation (cumulative sum) of white noise
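The expansion above can be verified numerically. The sketch below assumes a hypothetical `theta = 0.4` and takes all values before t = 0 to be zero; it builds the series once from the recursion and once from the cumulative-sum form and compares the two.

```python
import numpy as np

rng = np.random.default_rng(0)
theta = 0.4                      # hypothetical value, for illustration only
eps = rng.standard_normal(200)   # white noise; eps before t = 0 is taken as 0

# Recursive form: Y_t = Y_{t-1} + eps_t - theta * eps_{t-1}, starting from Y_{-1} = 0
y_recursive = np.zeros_like(eps)
prev_y, prev_eps = 0.0, 0.0
for t, e in enumerate(eps):
    y_recursive[t] = prev_y + e - theta * prev_eps
    prev_y, prev_eps = y_recursive[t], e

# Expanded form: Y_t = eps_t + (1 - theta) * (eps_{t-1} + eps_{t-2} + ...)
past_sum = np.cumsum(eps) - eps  # sum of all shocks strictly before t
y_expanded = eps + (1 - theta) * past_sum

print("forms agree:", np.allclose(y_recursive, y_expanded))
```

Every past shock keeps the same weight $(1 - \theta)$ forever, which is why the series does not revert to a fixed mean.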

Example) IMA(2, 2)

  • $(Y_t - Y_{t-1}) - (Y_{t-1} - Y_{t-2}) = \epsilon_t - \theta_1 \epsilon_{t-1} - \theta_2 \epsilon_{t-2}$

# IMA(2,2): theta1 = 1, theta2 = -0.6
# statsmodels uses + signs in the MA polynomial, so the simulated MA part is
# eps_t + theta1 * eps_{t-1} + theta2 * eps_{t-2}
np.random.seed(0)
theta1 = 1; theta2 = -0.6
ar = [1]; ma = [1, theta1, theta2]
p = sm.tsa.ArmaProcess(ar, ma)
sample = p.generate_sample(100)

y2 = sample.cumsum().cumsum()  # integrate twice: the IMA(2,2) series
y1 = np.diff(y2)               # first difference (one sample shorter)
y0 = np.diff(y1)               # second difference (two samples shorter)

# y0 = sample
# y1 = y0.cumsum()
# y2 = y1.cumsum()

plt.figure(figsize=(10, 10))
plt.subplot(313)
plt.title('IMA(2,2): diff(2)')
plt.plot(y0, 'o-')

plt.subplot(312)
plt.title('IMA(2,2): diff(1)')
plt.plot(y1, 'o-')

plt.subplot(311)
plt.title('IMA(2,2): diff(0)')
plt.plot(y2, 'o-')

plt.tight_layout()
plt.show()
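The three series in the code above have different lengths because each `np.diff` drops one point. A small sketch of that bookkeeping, using plain Gaussian noise as a stand-in for the simulated MA(2) sample (an assumption made only to keep the example free of statsmodels):

```python
import numpy as np

rng = np.random.default_rng(0)
sample = rng.standard_normal(100)  # stand-in for the simulated MA(2) sample

y2 = sample.cumsum().cumsum()  # integrated twice, length 100
y1 = np.diff(y2)               # length 99
y0 = np.diff(y1)               # length 98

print(len(y2), len(y1), len(y0))
# Differencing twice undoes the two cumulative sums, except for the first two points
print("y0 equals sample[2:]:", np.allclose(y0, sample[2:]))
```

So `y0` is the original stationary sample with its first two observations removed, which is why the ADF outputs report slightly different numbers of observations for each series.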

# Sample ACF of the series at each differencing level
plt.figure(figsize=(10, 10))
ax1 = plt.subplot(313)
sm.tsa.graphics.plot_acf(y0, lags=90, ax=ax1)
ax1.set_title('ACF: IMA(2, 2): diff(2)')

ax2 = plt.subplot(312)
sm.tsa.graphics.plot_acf(y1, lags=90, ax=ax2)
ax2.set_title('ACF: IMA(2, 2): diff(1)')

ax3 = plt.subplot(311)
sm.tsa.graphics.plot_acf(y2, lags=90, ax=ax3)
ax3.set_title('ACF: IMA(2, 2): diff(0)')

plt.tight_layout()
plt.grid(False)
plt.show()


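  • ADF test example: y0 is stationary by construction, so the null hypothesis would be expected to be rejected. In the output below, however, the p-value for y0 is about 0.107, so the null is not rejected even at the 10% level for this sample.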
sm.tsa.adfuller(y2)
(-2.251162211352469,
 0.1882029374890175,
 8,
 91,
 {'1%': -3.50434289821397,
  '5%': -2.8938659630479413,
  '10%': -2.5840147047458037},
 307.88958150914243)
sm.tsa.adfuller(y1)
(-1.7069994687237129,
 0.42750979630229596,
 12,
 86,
 {'1%': -3.5087828609430614,
  '5%': -2.895783561573195,
  '10%': -2.5850381719848565},
 307.69019010591194)
sm.tsa.adfuller(y0)
(-2.5363882098874595,
 0.10686979989276602,
 6,
 91,
 {'1%': -3.50434289821397,
  '5%': -2.8938659630479413,
  '10%': -2.5840147047458037},
 305.82342887519786)
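The tuples above are easier to read with names attached. The helper below is a hypothetical convenience function (`summarize_adf` is not part of statsmodels); it assumes the default six-element return value of `sm.tsa.adfuller` and compares the statistic with a chosen critical value. The input is the y0 result copied from the output above.

```python
def summarize_adf(result, level="5%"):
    """Unpack the default adfuller tuple and compare the statistic with a critical value."""
    stat, pvalue, usedlag, nobs, critical_values, icbest = result
    return {
        "stat": stat,
        "pvalue": pvalue,
        "usedlag": usedlag,
        "nobs": nobs,
        # H0 (unit root) is rejected when the statistic is below the critical value
        "reject_h0": stat < critical_values[level],
    }


# Result for y0, copied from the output above
y0_result = (
    -2.5363882098874595,
    0.10686979989276602,
    6,
    91,
    {"1%": -3.50434289821397, "5%": -2.8938659630479413, "10%": -2.5840147047458037},
    305.82342887519786,
)

summary_5 = summarize_adf(y0_result, level="5%")
summary_10 = summarize_adf(y0_result, level="10%")
print(summary_5)
print("reject at 10%:", summary_10["reject_h0"])
```

For y0 the statistic (about -2.54) lies above both the 5% and 10% critical values, so the unit-root null is not rejected at either level.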

Questions

  • Characteristic equation: not yet fully understood