When you’re working with time series data, naturally things are going to point towards forecasting.

And two of the most reliable classical tools for this are the ARIMA and SARIMA models.

To be honest I was quite confused when I first learnt about these a year ago but they do make a ton of intuitive sense once you understand them on a deeper level.

The names are similar, and so is their underlying logic, but there is a key difference that more or less decides which one you will pick in what scenario.

ARIMA SARIMA Data Science Meme

So, what separates them, and how do you decide which model fits your forecasting needs?

That’s what I want to take you through in this post, are you ready? Are you ready??????

Okay, seems like you are, so let’s do it.

First, Let’s Look at ARIMA

ARIMA, or AutoRegressive Integrated Moving Average, is a model designed to understand and predict future values in a time series.

Best you didn’t know that and this wasn’t redundant at all.

Anyways, it breaks down a series into three components:

  • AR (AutoRegressive): This part of the model assumes that future values have a dependency on past values. It’s the “look-back” component. For eg. this month’s sales might be influenced by last month’s sales.
  • I (Integrated): This is the differencing step used to make the data stationary. In normal human language, many time series models work best when the data’s statistical properties (like its mean and variance) are stable over time. Differencing which means subtracting the previous value from the current value is a common way to achieve this stability.
  • MA (Moving Average): This component looks at past forecast errors to improve the current prediction. It helps the model adjust for shocks or randomness that weren’t captured by the autoregressive term.

ARIMA is at its best when your data has a clear trend but lacks any repeating, cyclical patterns. (Although most business metrics especially sales numbers etc do tend to have some seasonality at least)

A good use case for ARIMA?

Forecasting the monthly electricity consumption for a manufacturing plant.

The usage might be steadily increasing as the company grows, but it doesn’t necessarily spike in the same month each year.(at least that’s the assumption.) Which it might, for eg. AC usage spikes in summer months every year, heater usage spikes in winter months - but manufacturing facility might stay consistent throughout the year to maintain temperatures either ways.

So, cut me some slack, its difficult to find good examples!

Anyways, this is a classic trend-based pattern, making it a perfect time series data set for an ARIMA model.

What Happens When There is Seasonality in Our Time Series Data?

This is where SARIMA enters the picture.

If your data has predictable, repeating patterns over a fixed period, you’re dealing with seasonality, and a standard ARIMA model might think of these cycles as noise.

SARIMA, or Seasonal ARIMA, extends the base model by adding a seasonal component.

AR, I, and MA remain exactly the same but it applies them to the seasonal patterns as well.

SARIMA = ARIMA (for the trend) + A Seasonal Layer (for the cycles)

The model doesn’t just look back at the last few data points (like t-1, t-2); it also looks back at seasonal intervals (like t-12 for yearly data or t-7 for weekly data).

When would you use it?

Well, think a business forecasting monthly sales for wine (my project that I did for time series forecasting).

You’d almost certainly see a massive spike in sales during the summer months and another smaller peak around the holidays.

An ARIMA model might capture the general upward or downward trend, but it would likely miss these seasonal peaks. SARIMA is built to pick on them and forecast them properly.

A Quick Example for SARIMA

Let’s say your data looks something like this, showing a clear yearly pattern:

MonthSales
Jan30
Feb28
Mar40
Dec80
Jan32
Feb27

An ARIMA model would process this as a general trend with some random fluctuations.

A SARIMA model will identify the repeating 12-month cycle as a primary signal for making predictions.

Real-World Scenarios for SARIMA vs ARIMA

Choosing between them often comes down to the nature of your data.

ScenarioBest FitRationale
Forecasting monthly website visits for a B2B SaaS productARIMAData is mainly driven by a clean, non-cyclical trend.
Predicting daily ice cream salesSARIMASales will most probably have strong seasonal peaks in summer.
Estimating weekly foot traffic for a grocery storeSARIMACaptures both weekly cycles (weekends) and holiday spikes.
Projecting the annual depreciation of a piece of equipmentARIMAFollows a slow, steady trend without seasonal cycles.

Okay, Let’s Look at the Code

The difference is also clear in the code.

In Python’s statsmodels library, you’ll see a separate argument for the seasonal component in SARIMA.

ARIMA Model: The order parameter (p,d,q) is for AR, I, and MA terms.

from statsmodels.tsa.arima.model import ARIMA

# For non-seasonal data
model = ARIMA(data, order=(2,1,2))
result = model.fit()
forecast = result.forecast(steps=12)

SARIMA Model:

Here, we add the seasonal_order parameter (P,D,Q,m), where m is the length of the seasonal cycle (e.g., 12 for yearly seasonality on monthly data).

from statsmodels.tsa.statespace.sarimax import SARIMAX

# For seasonal data
model = SARIMAX(data, order=(2,1,2), seasonal_order=(1,0,1,12))
result = model.fit()
forecast = result.forecast(steps=12)

When to Use Which?

The decision is quite straightforward:

  • Use ARIMA when your time series data shows a trend but has no obvious repeating, seasonal patterns.
  • Use SARIMA when the data exhibits clear and predictable seasonal cycles.

Bonus Tip: Let the Model Do the Work

If you’re unsure whether a seasonal component would improve your forecast, you don’t have to guess.

The pmdarima library includes an auto_arima function that can test various model configurations for you.

This is what I did as well apart from looking at the diagnostics myself for my project too.

By setting seasonal=True and specifying the seasonal period m, you can let the function figure out whether a SARIMA model provides a better fit.

Think Grid Search for ML hyperparams but just time series in this case.

import pmdarima as pm

# Let auto_arima find the best parameters, including seasonal ones
model = pm.auto_arima(data, seasonal=True, m=12)

# You can then check the model summary and compare performance metrics like AIC or RMSE
print(model.summary())

Let’s Wrap This Up

The machine learning landscape is filled with complex models like LSTMs and Transformers, ARIMA and SARIMA imo are still quite useful.

They are workhorses for a reason: they are fast to train, their results are highly interpretable, and they perform exceptionally well on datasets that aren’t massive enough to require deep learning.

And as data scientists I believe our job is not to make the best model but to make the best model that solves the business problem while taking care of resource constraints. (time, effort & money)

For setting up a solid baseline forecast in a business context, these models are going to be the perfect place to start.

Let the data make the choice for you, and you’ll have a strong and sensible model.