10.2 Decomposition: Additive and Multiplicative
Right, let’s talk about decomposition. You’ve probably looked at a time series plot and thought, “Okay, there’s a trend, some wiggly seasonality, and a bunch of noise… but how do I actually pull them apart to see what’s really going on?” That’s what we’re here to do. Think of it as time series surgery, and we’re going to be very precise with our scalpel.
The core idea is almost stupidly simple. We assume any time series is built from three components: Trend (T), Seasonality (S), and Residuals (R), which is just a fancy word for “the stuff we can’t explain, aka the noise.” The magic, and the part where everyone gets tripped up, is how these components are assembled. There are two main models, and picking the wrong one is the fastest way to end up with a decomposed mess that makes no sense.
Additive vs. Multiplicative: The Big Choice
This isn’t just an academic distinction; it’s a question of how your data behaves. You have to choose your model based on reality, not which one has a prettier name.
Additive:
Observation = Trend + Seasonality + Residuals
Use this when your seasonal swings stay roughly constant in size over time, whatever the trend is doing. If the peaks and troughs of your seasonality look like they were drawn with a cookie cutter, getting neither bigger nor smaller as the overall trend rises, additive is your friend. Think something like monthly average temperature in a specific city: the summer-winter difference is about the same number of degrees whether it’s 1990 or 2020.
Multiplicative:
Observation = Trend * Seasonality * Residuals
Use this when the seasonal variations scale with the level of the trend. As the trend goes up, the seasonal swings get bigger in proportion. This is far more common in business and economics. Think retail sales for a growing company: the Christmas peak isn’t a fixed +$1 million every year; it’s a percentage increase. If sales are $10 million a month, the peak might be +$2M. If sales grow to $20 million a month, the peak might be +$4M. The absolute swing gets bigger, but the relative swing (the pattern) is consistent.
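The retail arithmetic is worth seeing in numbers. Here is a minimal sketch; the 20% lift and the flat +$2M bump are just the illustrative figures from the example above, not real data, and the variable names are made up for this snippet.

```python
# Baseline monthly sales (in $ millions) before and after the company grows.
levels = [10.0, 20.0]

# Multiplicative view: the seasonal peak is a fixed *percentage* of the level.
peak_factor = 1.2  # illustrative assumption: a 20% seasonal lift
# Additive view: the seasonal peak is a fixed *dollar amount*.
peak_bump = 2.0    # illustrative assumption: +$2M whatever the level

# Size of the peak above the baseline under each model.
multiplicative_swings = [level * peak_factor - level for level in levels]
additive_swings = [(level + peak_bump) - level for level in levels]

print("multiplicative swings:", multiplicative_swings)
print("additive swings:      ", additive_swings)
```

Under the multiplicative model the swing doubles when the level doubles; under the additive model it stays put. That is the whole distinction.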
Why does this matter? If you apply an additive model to a multiplicative series, the decomposition fits one fixed-size seasonal swing to every year: too big for the early, low-level years and too small for the recent, high-level ones. The leftover seasonality has nowhere to go but the residuals, which end up large and visibly patterned. It’ll be obvious you screwed up. The joke is that statsmodels in Python won’t guess for you: seasonal_decompose defaults to model='additive' and will cheerfully apply it to whatever you hand it. You need to know how to check.
Let’s make this concrete. Let’s generate some fake, but realistic, data and tear it apart.
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from statsmodels.tsa.seasonal import seasonal_decompose
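# Fix the random seed so the noise (and the plots) are reproducible
np.random.seed(42)
# Create a monthly date range ('MS' = month start; the older 'M' alias is deprecated in recent pandas)
dates = pd.date_range(start='2018-01-01', end='2023-12-31', freq='MS')
n_periods = len(dates)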
# Create a trend component (upward curve)
trend = np.linspace(10, 50, n_periods) ** 1.2
# Create a seasonal component (12-month cycle)
seasonal = 5 * np.sin(2 * np.pi * np.arange(n_periods) / 12)
# Let's create an additive and a multiplicative series
additive_series = trend + seasonal + np.random.normal(scale=2, size=n_periods)
multiplicative_series = trend * (1 + 0.2 * np.sin(2 * np.pi * np.arange(n_periods) / 12)) * np.random.lognormal(mean=0, sigma=0.1, size=n_periods)
# Create DataFrames
additive_df = pd.DataFrame({'value': additive_series}, index=dates)
multiplicative_df = pd.DataFrame({'value': multiplicative_series}, index=dates)
# Plot them side-by-side
fig, (ax1, ax2) = plt.subplots(2, 1, figsize=(10, 6))
additive_df.plot(ax=ax1, title='Additive Series (Constant Swings)')
multiplicative_df.plot(ax=ax2, title='Multiplicative Series (Growing Swings)')
plt.tight_layout()
plt.show()
Run that. See how the wiggles in the top chart are the same size throughout, while the ones on the bottom get bigger as the line goes up? That’s the visual cue.
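If you don’t trust your eyes, you can put a rough number on that visual cue. The sketch below is one possible heuristic, not a standard test: it strips out the trend with a 12-month rolling mean, then compares how much the detrended series wiggles over its last two years against its first two. The helper name swing_ratio is hypothetical, and the noise is deliberately turned down from the example above so the effect is easy to see.

```python
import numpy as np
import pandas as pd

# Rebuild two synthetic monthly series with the same recipe as above
# (assumption: lower noise than the main example, to keep the ratio stable).
rng = np.random.default_rng(0)
dates = pd.date_range(start="2018-01-01", periods=72, freq="MS")
t = np.arange(len(dates))
trend = np.linspace(10, 50, len(dates)) ** 1.2
cycle = np.sin(2 * np.pi * t / 12)

additive = pd.Series(trend + 5 * cycle + rng.normal(scale=0.5, size=len(t)), index=dates)
multiplicative = pd.Series(
    trend * (1 + 0.2 * cycle) * rng.lognormal(mean=0.0, sigma=0.02, size=len(t)),
    index=dates,
)

def swing_ratio(series, period=12, cycles=2):
    """Spread of the detrended series over its last `cycles` seasons,
    divided by the spread over its first `cycles` seasons."""
    # A rolling mean over one full season removes the seasonal pattern,
    # so subtracting it leaves (roughly) seasonality plus noise.
    detrended = (series - series.rolling(period, center=True).mean()).dropna()
    n = period * cycles
    return detrended.iloc[-n:].std() / detrended.iloc[:n].std()

additive_ratio = swing_ratio(additive)
multiplicative_ratio = swing_ratio(multiplicative)
print(f"additive series:       late/early swing ratio = {additive_ratio:.2f}")
print(f"multiplicative series: late/early swing ratio = {multiplicative_ratio:.2f}")
```

A ratio near 1 says the swings are holding steady (additive territory). A ratio well above 1 on a rising series says the swings are growing with the level, which points to multiplicative.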
Performing the Decomposition
Now, let’s run the actual decomposition. We’ll use the seasonal_decompose function from statsmodels. The critical parameter is model.
# Decompose the additive series correctly
additive_decomposition = seasonal_decompose(additive_df, model='additive', period=12)
# Decompose the multiplicative series correctly
multiplicative_decomposition = seasonal_decompose(multiplicative_df, model='multiplicative', period=12)
# And for a laugh, let's do it wrong
multiplicative_wrong_decomposition = seasonal_decompose(multiplicative_df, model='additive', period=12)
# Plot the correct multiplicative result
multiplicative_decomposition.plot()
plt.suptitle('Correct Multiplicative Decomposition', y=1.02)
plt.show()
# Now plot the horribly wrong one
multiplicative_wrong_decomposition.plot()
plt.suptitle('Incorrect Additive Decomposition of Multiplicative Data', y=1.02)
plt.show()
Look at the residuals (resid) in the wrong decomposition. They’re not random noise centered around zero; they show a clear, repeating pattern. This is a dead giveaway that your model is misspecified. The residuals are where the model’s lack of understanding went to die, and they’re screaming at you to try a multiplicative approach.
The Gotchas and Best Practices
Defining the Period: This is the biggest pitfall. Treat the period parameter as mandatory. statsmodels can infer it when your series has a pandas datetime index with a set frequency, but passing it explicitly removes any doubt about what you asked for. For monthly data with a yearly pattern, it’s 12. For quarterly data, it’s 4. For daily data with a weekly pattern, it’s 7. If you get this wrong, the entire decomposition is nonsense. There’s no joke here; just double-check it.

Edge Cases and Missing Data: seasonal_decompose does not handle NaN values gracefully. It will simply raise an error. Your first job is to make sure your time series is complete and has no gaps. Use .asfreq() and .fillna() with extreme prejudice before you even think about decomposition.

It’s a Blunt Instrument: This classical decomposition is useful for explanation and visualization, but it’s often not the final step for advanced forecasting. The trend is often oversmoothed, and the seasonal component is assumed to be fixed, which is rarely true over long time horizons. We use it to understand the anatomy of our series before we move on to more sophisticated models like SARIMA or Facebook Prophet, which can handle evolving trends and seasonality.
The takeaway? Always plot your raw data first and ask yourself: “Do the seasonal swings scale with the trend level?” Let the data tell you which model to use. Then, always, always plot the residuals. If they look structured and not like random noise, your decomposition has failed its physical, and it’s time to go back to the choice between additive and multiplicative.