LSTM Time Series Forecasting
Using a stacked LSTM model to predict future sales values from historical time-series data.
What It Is
This notebook applies an LSTM to a regression task: predicting the next value in a time series. Instead of text tokens, the input is a sliding window of past numeric values, and the target is the next value in the sequence.
In short: a lookback window of past values predicts the next value, letting the model learn temporal patterns such as trend and seasonality.
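As a quick standalone illustration of the lookback idea (a toy numpy sketch, separate from the notebook's own code below):

```python
import numpy as np

series = np.arange(1, 11)  # toy series: 1..10
lookback = 3

# Each row of X holds `lookback` past values; y is the value that follows
X = np.array([series[i:i+lookback] for i in range(len(series) - lookback)])
y = series[lookback:]

print(X[0], "->", y[0])  # [1 2 3] -> 4
print(X.shape, y.shape)  # (7, 3) (7,)
```

Each training pair is simply "three past values in, the fourth out", slid one step at a time along the series.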
How It Works
Pipeline Steps
- Generate data -- Create synthetic sales data with a trend
- Normalize -- Scale values to [0, 1] using MinMaxScaler
- Create sequences -- Sliding window of 5 past values predicts the next value
- Reshape for LSTM -- Input shape: (samples, timesteps, features)
- Train stacked LSTM -- Two LSTM layers (50 and 25 units)
- Predict and inverse-transform -- Convert predictions back to original scale
Code: Generate and Prepare Data
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dense
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
# Create a simple sales dataset (hypothetical)
np.random.seed(42)
data = np.cumsum(np.random.randn(100)) + 100  # Cumulative sum of noise -> random walk with drift
# Convert to DataFrame
df = pd.DataFrame(data, columns=['Sales'])
df.plot(title="Sales Over Time", figsize=(10, 5))
plt.show()
Code: Create Sequences and Split
# Create sequences (lookback window of 5)
def create_sequences(data, lookback=5):
    X, y = [], []
    for i in range(len(data) - lookback):
        X.append(data[i:i+lookback])
        y.append(data[i+lookback])
    return np.array(X), np.array(y)
# Normalize Data
scaler = MinMaxScaler(feature_range=(0, 1))
scaled_data = scaler.fit_transform(df)
# Create sequences
X, y = create_sequences(scaled_data)
# Split into Train and Test
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, shuffle=False)
# Reshape for LSTM (samples, timesteps, features)
X_train = X_train.reshape((X_train.shape[0], X_train.shape[1], 1))
X_test = X_test.reshape((X_test.shape[0], X_test.shape[1], 1))
Code: Build and Train Model
# Define LSTM Model
model = Sequential([
    LSTM(50, activation='relu', return_sequences=True, input_shape=(X_train.shape[1], 1)),
    LSTM(25, activation='relu', return_sequences=False),
    Dense(1)  # Output layer (regression -- no activation)
])
# Compile Model
model.compile(optimizer='adam', loss='mse')
# Train Model
history = model.fit(X_train, y_train, epochs=50, batch_size=8,
                    validation_data=(X_test, y_test), verbose=1)
Key Architecture Details
- return_sequences=True on the first LSTM: outputs a sequence (needed to stack LSTM layers)
- return_sequences=False on the second LSTM: outputs only the last hidden state
- Dense(1) with no activation: linear output for regression
- Loss: MSE (Mean Squared Error) -- standard for regression tasks
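To make the return_sequences distinction concrete, here is a minimal numpy sketch of one LSTM layer's forward pass (random weights, illustration only -- not Keras's actual implementation):

```python
import numpy as np

def lstm_forward(x, units, rng):
    """Run a toy LSTM over x of shape (timesteps, features).
    Returns (all_hidden_states, last_hidden_state)."""
    timesteps, features = x.shape
    # One combined weight matrix for the 4 gates (input, forget, cell, output)
    W = rng.standard_normal((features + units, 4 * units)) * 0.1
    b = np.zeros(4 * units)
    h = np.zeros(units)
    c = np.zeros(units)
    sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
    hs = []
    for t in range(timesteps):
        z = np.concatenate([x[t], h]) @ W + b
        i, f, g, o = np.split(z, 4)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # update cell state
        h = sigmoid(o) * np.tanh(c)                   # update hidden state
        hs.append(h)
    return np.stack(hs), h

rng = np.random.default_rng(0)
x = rng.standard_normal((5, 1))        # 5 timesteps, 1 feature
seq, last = lstm_forward(x, units=50, rng=rng)
print(seq.shape)   # (5, 50) -- what return_sequences=True passes to the next layer
print(last.shape)  # (50,)   -- what return_sequences=False passes to Dense
```

The second LSTM layer needs the full (timesteps, units) sequence as input, which is why the first layer must set return_sequences=True; the Dense head only needs the final state.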
Code: Predict and Visualize
# Plot Training Loss
plt.plot(history.history['loss'], label='Train Loss')
plt.plot(history.history['val_loss'], label='Validation Loss')
plt.legend()
plt.title("Loss Over Epochs")
plt.show()
# Predict Test Data
y_pred = model.predict(X_test)
# Inverse Transform to Original Scale
y_pred_inv = scaler.inverse_transform(y_pred)
y_test_inv = scaler.inverse_transform(y_test.reshape(-1, 1))
# Plot Predictions vs Actual
plt.figure(figsize=(10, 5))
plt.plot(y_test_inv, label="Actual Sales", marker='o')
plt.plot(y_pred_inv, label="Predicted Sales", marker='x')
plt.legend()
plt.title("LSTM Predictions vs Actual Sales")
plt.show()
Always normalize your data before feeding it to an LSTM; neural networks train much better when inputs lie in the [0, 1] range. Remember to inverse-transform predictions before interpreting the results.
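The scale/inverse-scale round trip can be sketched with plain numpy (the same arithmetic MinMaxScaler performs under the hood):

```python
import numpy as np

values = np.array([[100.0], [105.0], [98.0], [110.0]])

# "Fit": remember the min and range of the training data
vmin, vmax = values.min(), values.max()
scaled = (values - vmin) / (vmax - vmin)   # now in [0, 1]

# A model prediction lives on the scaled axis...
pred_scaled = np.array([[0.75]])
# ...so invert the transform before reading it as "sales"
pred = pred_scaled * (vmax - vmin) + vmin
print(pred)  # [[107.]]
```

Reading 0.75 as "sales of 0.75" would be meaningless; only after the inverse transform does the prediction land back on the original scale.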
When to Use LSTM for Time Series
| Good For | Not Ideal For |
| --- | --- |
| Data with temporal dependencies | Stationary data (ARIMA may suffice) |
| Non-linear trends | Very short sequences |
| Multi-step forecasting | When interpretability is needed |
| Multivariate time series | Small datasets (overfitting risk) |
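Whichever column your problem falls in, it is worth comparing against a naive persistence baseline (predict "tomorrow = today"). A sketch on the same kind of random walk as the notebook's synthetic data:

```python
import numpy as np

np.random.seed(42)
series = np.cumsum(np.random.randn(100)) + 100  # random walk, like the notebook's data

# Persistence baseline: predict each value as the previous one
actual = series[1:]
naive = series[:-1]
mae = np.mean(np.abs(actual - naive))
print(f"Naive baseline MAE: {mae:.3f}")
```

On a pure random walk, no model can beat persistence in expectation; if the LSTM's test error is not clearly below this baseline, the extra model complexity isn't paying off.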