Time Series Forecasting in Python
Time series forecasting is the process of using a statistical model to predict future values of a time series based on past observations.
Some use cases:
To predict the number of incoming or churning customers.
To explain seasonal patterns in sales.
To detect unusual events and estimate the magnitude of their effect.
To estimate the effect of a newly launched product on the number of units sold.
Further reading:
ARIMA Model Python Example - Time Series Forecasting
Python | ARIMA Model for Time Series Forecasting
https://www.geeksforgeeks.org/python-arima-model-for-time-series-forecasting/
https://towardsdatascience.com/time-series-forecasting-using-auto-arima-in-python-bb83e49210cd
In the domain of machine learning, there is a specific collection of methods and techniques particularly well suited for predicting the value of a dependent variable as a function of time.
We refer to a series of data points indexed (or graphed) in time order as a time series. A time series can be broken down into three components.
Trend: upward or downward movement of the data over a long period of time (e.g. house price appreciation)
Seasonality: seasonal variance (e.g. an increase in demand for ice cream during summer)
Noise: spikes and troughs at random intervals
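To make the three components concrete, here is a minimal sketch in plain Python. The series is synthetic (a made-up linear trend plus a 12-month sine pattern, with noise left out so the result is exact), and the helper name centered_moving_average is hypothetical, not part of any library. A centered moving average over one full seasonal period averages the seasonality away and leaves an estimate of the trend.

```python
import math

PERIOD = 12  # assumed monthly data with a yearly cycle

# Synthetic series: linear trend + seasonal pattern (no noise, for clarity).
trend = [0.5 * t for t in range(48)]
seasonal = [10 * math.sin(2 * math.pi * t / PERIOD) for t in range(48)]
series = [tr + se for tr, se in zip(trend, seasonal)]


def centered_moving_average(values, period):
    """Estimate the trend with a centered moving average (even period).

    The two end points of each window get half weight, so the window is
    symmetric around t and spans exactly one full period.
    Positions without a full window are returned as None.
    """
    half = period // 2
    out = [None] * len(values)
    for t in range(half, len(values) - half):
        window = values[t - half : t + half + 1]
        total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
        out[t] = total / period
    return out


trend_estimate = centered_moving_average(series, PERIOD)

# What is left after removing the trend is seasonality (plus noise, if any).
detrended = [
    y - m if m is not None else None for y, m in zip(series, trend_estimate)
]

print(trend_estimate[6:9])
print(trend[6:9])
```

On real data the remainder would also contain noise, so the recovered trend and seasonal pattern are only approximate.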
Autoregressive models operate under the premise that past values have an effect on current values. AR models are commonly used in analyzing nature, economics, and other time-varying processes. As long as the assumption holds, we can build a linear regression model that attempts to predict the value of a dependent variable today, given the values it had on previous days.
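The regression view of an AR model can be sketched in a few lines of plain Python. The example below simulates an AR(1) process y_t = c + phi * y_{t-1} + noise with assumed values c = 2.0 and phi = 0.5 (chosen only for illustration), then recovers both by ordinary least squares of y_t on y_{t-1}. The function name fit_ar1 is hypothetical.

```python
import random

random.seed(0)

TRUE_C, TRUE_PHI = 2.0, 0.5  # assumed parameters, for illustration only

# Simulate y_t = c + phi * y_{t-1} + e_t with standard normal noise.
y = [TRUE_C / (1 - TRUE_PHI)]  # start at the long-run mean
for _ in range(2000):
    y.append(TRUE_C + TRUE_PHI * y[-1] + random.gauss(0, 1))


def fit_ar1(series):
    """Least-squares fit of series[t] = c + phi * series[t-1]."""
    x = series[:-1]  # yesterday's values (predictor)
    z = series[1:]   # today's values (target)
    n = len(x)
    mean_x = sum(x) / n
    mean_z = sum(z) / n
    cov = sum((a - mean_x) * (b - mean_z) for a, b in zip(x, z))
    var = sum((a - mean_x) ** 2 for a in x)
    phi = cov / var
    c = mean_z - phi * mean_x
    return c, phi


c_hat, phi_hat = fit_ar1(y)
print(f"estimated c = {c_hat:.3f}, phi = {phi_hat:.3f}")

# One-step-ahead prediction for the next, not yet observed, value.
next_value = c_hat + phi_hat * y[-1]
print(f"prediction for next step = {next_value:.3f}")
```

Higher-order AR(p) models follow the same idea with p lagged columns as predictors instead of one.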
AutoRegressive Integrated Moving Average Model (ARIMA)
The ARIMA (aka Box-Jenkins) model adds differencing to an ARMA model. Differencing subtracts the previous value from the current one and can be used to transform a time series into one that is stationary. For example, first-order differencing addresses linear trends and employs the transformation $z_i = y_i - y_{i-1}$. Second-order differencing addresses quadratic trends and employs a first-order difference of a first-order difference, namely $z_i = (y_i - y_{i-1}) - (y_{i-1} - y_{i-2})$, and so on.
Three integers (p, d, q) are typically used to parametrize ARIMA models.
p: number of autoregressive terms (AR order)
d: number of nonseasonal differences (differencing order)
q: number of moving-average terms (MA order)
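Differencing is easy to check by hand. The sketch below uses a hypothetical helper, difference, on two synthetic series: one differencing pass turns a linear trend into a constant, and two passes do the same for a quadratic trend.

```python
def difference(values):
    """First-order differencing: z[i] = y[i] - y[i-1]."""
    return [curr - prev for prev, curr in zip(values, values[1:])]


linear = [3 + 2 * i for i in range(8)]   # 3, 5, 7, ...
quadratic = [i * i for i in range(8)]    # 0, 1, 4, 9, ...

print(difference(linear))                 # [2, 2, 2, 2, 2, 2, 2]
print(difference(quadratic))              # [1, 3, 5, 7, 9, 11, 13]
print(difference(difference(quadratic)))  # [2, 2, 2, 2, 2, 2]
```

In ARIMA terms, the linear series would need d = 1 and the quadratic series d = 2 before the AR and MA terms are fitted.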
### 3) Deactivation forecasting
# Install the libraries needed for the time series analysis.
!pip install pmdarima
!pip install matplotlib
!pip install statsmodels
# !pip uninstall statsmodels -y
# !pip install statsmodels==0.11.0

# Import the libraries
from pmdarima import auto_arima
from pmdarima.arima import ADFTest
from statsmodels.tsa.seasonal import seasonal_decompose

# Ignore harmless warnings
import warnings
warnings.filterwarnings("ignore")

# Count unique deactivated accounts per month; auto_arima is fitted to this series.
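# `impute` is assumed to be the account-level DataFrame prepared in an earlier step (not shown).
churn = impute[impute['active'] == 0].groupby(['deact_yymm']).agg(churn=('acctno', 'nunique'))
churn.info()
churn['yymm_day'] = churn.index + '01'
churn

# ETS Decomposition
# Note: period=1 leaves no seasonal component to extract. For monthly data with a yearly cycle,
# period=12 is the usual choice and needs at least 24 observations.
result = seasonal_decompose(churn['churn'],
                            model='multiplicative', period=1)

# ETS plot
result.plot()

### Test for Stationarity: H0: unit root is present (non-stationary). H1: the series is stationary. ###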
# In statistics and econometrics, an augmented Dickey–Fuller (ADF) test tests the null hypothesis
# that a unit root is present in a time series sample. The alternative hypothesis differs depending
# on which version of the test is used, but is usually stationarity or trend-stationarity.
# It is an augmented version of the Dickey–Fuller test for a larger and more complicated set of time series models.

### should_diff(x): Test whether the time series is stationary or needs differencing.
ADF_Test = ADFTest(alpha=0.05)
pval, should_diff = ADF_Test.should_diff(churn['churn'])
print('p-value={}, should differencing = {}'.format(pval, should_diff))

# Test results: p-value=0.9770785716280405, should differencing = True.
# The p-value is above 0.05, so the unit-root null cannot be rejected and the series is treated
# as non-stationary. True means that it needs differencing.

stepwise_fit = auto_arima(churn['churn'],
                          start_p=1, start_q=1,
                          max_p=3, max_q=3, m=12,
                          start_P=0, seasonal=True,
                          d=None, D=1, trace=True,
                          error_action='ignore',   # we don't want to know if an order does not work
                          suppress_warnings=True,  # we don't want convergence warnings
                          stepwise=True)           # use the stepwise search

# To print the summary
stepwise_fit.summary()

# Best model: ARIMA(2,0,1)(0,1,0)[12] intercept
# Total fit time: 6.703 seconds
# ARIMA(2,0,1)(0,1,0)[12]: AIC=197.856, Time=0.13 sec

# Fit the best model
from statsmodels.tsa.statespace.sarimax import SARIMAX

model = SARIMAX(churn['churn'],
                order=(2, 0, 1),
                seasonal_order=(0, 1, 0, 12))
result = model.fit()
result.summary()

##### Use the time series model to forecast the next 6 months. #####
# predict(): with start at len(churn), the first index past the data, this returns out-of-sample forecasts.
# No typ argument is needed: SARIMAX predictions are already on the original (undifferenced) scale.
forecast = result.predict(start=len(churn),          # start=25, len(churn)=25
                          end=(len(churn) - 1) + 6   # end=25-1+6=30
                          ).rename('Forecast')

# Plot the forecast values
churn['churn'].plot(figsize=(12, 5), legend=True)
forecast.plot(legend=True)
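The seasonal part of the selected model, (0,1,0)[12], is a single seasonal difference with a 12-month period, and the start/end arguments pick six steps past the end of the data. The plain-Python sketch below illustrates both ideas on a made-up 25-month series (the numbers are synthetic, not the churn data above). It uses a seasonal-naive forecast, which repeats the value from 12 months earlier. That is a deliberate simplification and ignores the AR and MA terms that SARIMAX also fits.

```python
SEASON = 12   # months per seasonal cycle (m = 12)
HORIZON = 6   # number of months to forecast

# Synthetic monthly counts: 25 observations, indices 0..24.
history = [100 + (t % SEASON) * 3 + t for t in range(25)]

# Seasonal differencing (D = 1): z[t] = y[t] - y[t - 12].
seasonal_diff = [
    history[t] - history[t - SEASON] for t in range(SEASON, len(history))
]

# Same index arithmetic as the predict() call above.
start = len(history)               # 25: first index after the data
end = len(history) - 1 + HORIZON   # 30: last forecast index (inclusive)

# Seasonal-naive forecast: repeat the value observed 12 months earlier.
forecast = {t: history[t - SEASON] for t in range(start, end + 1)}

print(seasonal_diff)
print(forecast)
```

Because both start and end are inclusive, indices 25 through 30 give exactly six forecast values.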