Time Series Forecasting in Python

Justin

Administrator
June 25, 2021 at 11:30 am

Time series forecasting is the process of using a statistical model to predict future values of a time series from its past values.

Some Use Cases:

To predict the number of incoming or churning customers.
To explain seasonal patterns in sales.
To detect unusual events and estimate the magnitude of their effect.
To estimate the effect of a newly launched product on the number of units sold.

ARIMA Model Python Example — Time Series Forecasting

https://towardsdatascience.com/machine-learning-part-19-time-series-and-autoregressive-integrated-moving-average-model-arima-c1005347b0d7

Python | ARIMA Model for Time Series Forecasting

https://www.geeksforgeeks.org/python-arima-model-for-time-series-forecasting/

https://towardsdatascience.com/time-series-forecasting-using-auto-arima-in-python-bb83e49210cd

In machine learning, there’s a specific collection of methods and techniques particularly well suited to predicting the value of a dependent variable as a function of time.

A time series is a series of data points indexed (or graphed) in time order. It can be broken down into three components.

Trend: Long-run upward or downward movement of the data (e.g. house price appreciation)

Seasonality: Variation that repeats on a regular cycle (e.g. an increase in demand for ice cream during summer)

Noise: Spikes & troughs at random intervals
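
The three components can be illustrated with a small synthetic series. The sketch below is a toy under stated assumptions: the slope, the 12-step seasonal cycle, and the noise level are made up for illustration. It builds a series as trend + seasonality + noise, then recovers a rough trend with a moving average over one full cycle, which averages the seasonal term away.

```python
import math
import random

random.seed(0)                       # reproducible noise
PERIOD = 12                          # assumed: 12 observations per seasonal cycle
N = 48                               # assumed: four years of monthly data

trend = [100 + 2.0 * t for t in range(N)]                               # steady upward movement
seasonal = [10 * math.sin(2 * math.pi * t / PERIOD) for t in range(N)]  # repeats every 12 steps
noise = [random.gauss(0, 1) for _ in range(N)]                          # random spikes and troughs

series = [tr + s + e for tr, s, e in zip(trend, seasonal, noise)]

def moving_average(values, window):
    """Average of each run of `window` consecutive values."""
    return [sum(values[i:i + window]) / window
            for i in range(len(values) - window + 1)]

# Averaging over exactly one seasonal cycle cancels the seasonal term,
# leaving a smoothed estimate of the trend (plus a little averaged noise).
trend_estimate = moving_average(series, PERIOD)
seasonal_smoothed = moving_average(seasonal, PERIOD)

print(max(abs(s) for s in seasonal_smoothed))      # effectively zero
print(trend_estimate[0], trend_estimate[-1])       # rises over time
```

Library routines such as seasonal_decompose in statsmodels perform a more careful version of the same idea.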

Autoregressive (AR) models operate under the premise that past values have an effect on current values. They are commonly used to analyze natural, economic, and other time-varying processes. As long as that premise holds, we can build a linear regression model that predicts the value of a dependent variable today from the values it had on previous days.
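
As a rough illustration of the regression idea, the sketch below simulates an AR(1) process and estimates its coefficient by least squares on the lagged values. The coefficient 0.7, the series length, and the helper name `fit_ar1` are assumptions chosen for the example, not part of any library.

```python
import random

random.seed(42)
TRUE_PHI = 0.7                       # assumed coefficient used to simulate the data

# Simulate an AR(1) process: y[t] = phi * y[t-1] + noise
y = [0.0]
for _ in range(2000):
    y.append(TRUE_PHI * y[-1] + random.gauss(0, 1))

def fit_ar1(values):
    """Least-squares slope of values[t] regressed on values[t-1] (no intercept)."""
    prev, curr = values[:-1], values[1:]
    return sum(p * c for p, c in zip(prev, curr)) / sum(p * p for p in prev)

phi_hat = fit_ar1(y)                 # should land near TRUE_PHI
next_value = phi_hat * y[-1]         # one-step-ahead forecast
print(round(phi_hat, 3), round(next_value, 3))
```

Higher-order AR(p) models regress on the previous p values instead of just one.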

AutoRegressive Integrated Moving Average Model (ARIMA)

The ARIMA (aka Box-Jenkins) model adds differencing to an ARMA model. Differencing subtracts the previous value from the current one and can be used to transform a time series into one that’s stationary. For example, first-order differencing addresses linear trends and employs the transformation $z_i = y_i - y_{i-1}$. Second-order differencing addresses quadratic trends and employs a first-order difference of a first-order difference, namely $z_i = (y_i - y_{i-1}) - (y_{i-1} - y_{i-2})$, and so on.
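
The effect of differencing is easy to check by hand. The following minimal sketch (the helper name `difference` is just for illustration) shows that one difference flattens a linear trend, while a quadratic trend needs two.

```python
def difference(values, order=1):
    """Apply first-order differencing `order` times: z[i] = y[i] - y[i-1]."""
    for _ in range(order):
        values = [curr - prev for prev, curr in zip(values, values[1:])]
    return values

linear = [3 + 2 * t for t in range(6)]        # 3, 5, 7, 9, 11, 13
quadratic = [t * t for t in range(6)]         # 0, 1, 4, 9, 16, 25

print(difference(linear))                     # [2, 2, 2, 2, 2]
print(difference(quadratic))                  # [1, 3, 5, 7, 9]  (still trending)
print(difference(quadratic, order=2))         # [2, 2, 2, 2]
```

Each round of differencing shortens the series by one observation.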

Three integers (p, d, q) are typically used to parametrize ARIMA models.

p: number of autoregressive terms (AR order)

d: number of nonseasonal differences (differencing order)

q: number of moving-average terms (MA order)
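
To make the roles of p and d concrete, here is a hand-rolled forecast for an ARIMA(1, 1, 0) model (p = 1, d = 1, q = 0). This is only a sketch: the coefficient `phi` is an assumed value rather than one estimated from data, the observations are made up, and the helper name `forecast_arima_110` is hypothetical. The series is differenced once (d = 1), the next difference is predicted from the last one (p = 1), and the predicted differences are added back to return to the original scale.

```python
def forecast_arima_110(values, phi, steps):
    """Forecast `steps` values ahead with ARIMA(1,1,0) and no constant term.

    Model on the differenced series: z[t] = phi * z[t-1], where z[t] = y[t] - y[t-1].
    """
    last_value = values[-1]
    last_diff = values[-1] - values[-2]       # d = 1: work with first differences
    forecasts = []
    for _ in range(steps):
        last_diff = phi * last_diff           # p = 1: AR(1) step on the differences
        last_value = last_value + last_diff   # undo the differencing (integrate)
        forecasts.append(last_value)
    return forecasts

history = [100.0, 104.0, 110.0, 118.0]        # made-up observations
print(forecast_arima_110(history, phi=0.5, steps=3))   # [122.0, 124.0, 125.0]
```

With q = 0 there are no moving-average terms; adding them would also require tracking past forecast errors, which libraries such as statsmodels handle internally.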
Justin

Administrator
July 7, 2022 at 9:37 am

### 3) Deactivation forecasting

######### Install the pmdarima library to perform time series analysis.
!pip install pmdarima
!pip install matplotlib
!pip install statsmodels

# !pip uninstall statsmodels -y
# !pip install statsmodels==0.11.0

# Import the library
from pmdarima import auto_arima
from pmdarima.arima import ADFTest
from statsmodels.tsa.seasonal import seasonal_decompose

# Ignore harmless warnings
import warnings
warnings.filterwarnings("ignore")

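# Build the monthly deactivation (churn) series: count unique deactivated accounts per month.
# `impute` is assumed to be an account-level DataFrame defined earlier.

churn = impute[impute['active'] == 0].groupby(['deact_yymm']).agg(churn=('acctno', 'nunique'))
churn.info()

# Append a day component ('01') to the month key
churn['yymm_day'] = churn.index + '01'
churn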

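# ETS Decomposition
# period is the number of observations per seasonal cycle: 12 for monthly data
# (period=1 would leave no seasonal component to estimate).
result = seasonal_decompose(churn['churn'],
                            model='multiplicative', period=12)

# ETS plot
result.plot()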

### Test for Stationarity: H0: unit root is present (non-stationary). H1: the series is stationary. ###########################
# In statistics and econometrics, an augmented Dickey–Fuller (ADF) test tests the null hypothesis
# that a unit root is present in a time series sample. The alternative hypothesis depends
# on which version of the test is used, but is usually stationarity or trend-stationarity.
# It is an augmented version of the Dickey–Fuller test for a larger and more complicated set of time series models.

### should_diff(x): Test whether the time series is stationary or needs differencing.

ADF_Test = ADFTest(alpha=0.05)
pval, should_diff = ADF_Test.should_diff(churn['churn'])
print('p-value={}, should differencing = {}'.format(pval, should_diff))

# Test results: p-value=0.9770785716280405, should differencing = True.
# It is non stationary because p-value>0.05. True means that it needs differencing.

stepwise_fit = auto_arima(churn['churn'],
                          start_p=1, start_q=1,
                          max_p=3, max_q=3, m=12,
                          start_P=0, seasonal=True,
                          d=None, D=1, trace=True,
                          error_action='ignore',    # we don't want to know if an order does not work
                          suppress_warnings=True,   # we don't want convergence warnings
                          stepwise=True)            # set to stepwise

# To print the summary
stepwise_fit.summary()

# Best model: ARIMA(2,0,1)(0,1,0)[12] intercept
# Total fit time: 6.703 seconds
# ARIMA(2,0,1)(0,1,0)[12]: AIC=197.856, Time=0.13 sec

# Fit the best model
from statsmodels.tsa.statespace.sarimax import SARIMAX

model = SARIMAX(churn[‘churn’],
order = (2,0,1),
seasonal_order =(0,1,0,12) )

result = model.fit()
result.summary()

##### Use the time series model to forecast for next 6 months. ###############
### predict() function: with start=len(churn), it generates out-of-sample forecasts from the fitted SARIMAX model.
# No typ argument: SARIMAX returns forecasts on the original (level) scale.

forecast = result.predict(start=len(churn),          #### start=25, len(churn)=25
                          end=(len(churn) - 1) + 6   #### end=25-1+6=30
                          ).rename('Forecast')

# Plot the forecast values
churn[‘churn’].plot(figsize = (12, 5), legend = True)
forecast.plot(legend = True)
