Convolutional Layers: From Images to Time Series

Course
Advanced Concepts
Understanding convolution operations in neural networks and their applications beyond computer vision.
Author: Rémi Genet

Published: 2025-04-03

Convolutional Operations: A Unified Mathematical Framework

Section 4.9 - The Mathematical Foundation of Convolutions

At its core, a convolution is an operation between two functions that produces a third function expressing how the shape of one is modified by the other. In deep learning, we use discrete convolutions where one function is our input data and the other is our learnable kernel.

The Basic Convolution Operation

For a one-dimensional input signal \(x\) and a kernel \(w\), the convolution operation is defined as:

\[ (x * w)(t) = \sum_{k} x(t - k) \, w(k) \]

In practice, we work with finite, discrete signals. For an input vector \(x \in \mathbb{R}^n\) and a kernel \(w \in \mathbb{R}^K\), the discrete convolution becomes:

\[ y[t] = \sum_{k=0}^{K-1} x[t - k] \, w[k], \]

where \(K\) is the kernel size.
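
As a sanity check of this definition, here is a minimal NumPy sketch that evaluates the sum directly and compares it against `np.convolve` (the example kernel is symmetric, so the distinction between convolution and cross-correlation does not show up):

```python
import numpy as np

def conv1d(x, w):
    """Discrete convolution y[t] = sum_k x[t - k] w[k], valid positions only."""
    n, K = len(x), len(w)
    y = np.empty(n - K + 1)
    for t in range(K - 1, n):                    # positions where all x[t - k] exist
        y[t - K + 1] = sum(x[t - k] * w[k] for k in range(K))
    return y

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
w = np.array([0.5, 0.5])                         # small averaging kernel
print(conv1d(x, w))                              # [1.5 2.5 3.5 4.5]
print(np.convolve(x, w, mode="valid"))           # same result with NumPy's built-in
```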

Section 4.10 - From 1D to Multi-Dimensional Convolutions

While convolutions are often associated with image processing (2D convolutions), the operation generalizes naturally across dimensions:

1D Convolution (Time Series)

Used for temporal data, where the convolution slides over time:

\[ y[t] = \sum_{k} x[t - k] \, w[k] \]

2D Convolution (Images)

For spatial data with input \(X\) and kernel \(W\):

\[ Y[i,j] = \sum_{m} \sum_{n} X[i - m,\, j - n] \, W[m,n] \]
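
For illustration only, here is a naive NumPy sketch of this double sum over the valid region (the 2×2 averaging kernel is an arbitrary choice; it produces local 2×2 means of the input):

```python
import numpy as np

def conv2d(X, W):
    """Naive 2D convolution Y[i, j] = sum_{m, n} X[i - m, j - n] W[m, n], valid region."""
    H, L = X.shape
    kH, kL = W.shape
    Y = np.zeros((H - kH + 1, L - kL + 1))
    for i in range(kH - 1, H):
        for j in range(kL - 1, L):
            Y[i - kH + 1, j - kL + 1] = sum(
                X[i - m, j - n] * W[m, n]
                for m in range(kH) for n in range(kL)
            )
    return Y

X = np.arange(16, dtype=float).reshape(4, 4)
W = np.ones((2, 2)) / 4.0                        # 2x2 averaging kernel
print(conv2d(X, W))                              # each entry is a local 2x2 mean of X
```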

3D Convolution (Videos/Volumes)

Extends to three dimensions for spatio-temporal or volumetric data:

\[ Y[i,j,k] = \sum_{l} \sum_{m} \sum_{n} X[i - l,\, j - m,\, k - n] \, W[l,m,n] \]

Section 4.11 - Convolutions in Time Series Analysis

In time series analysis, 1D convolutions serve several crucial purposes:

Moving Average as Convolution

A simple moving average can be expressed as a convolution with a uniform kernel:

\[ w = \left[\frac{1}{k},\, \frac{1}{k},\, \dots,\, \frac{1}{k}\right] \]

The output at each point is the average of the \(k\) most recent points:

\[ y[t] = \frac{1}{k} \sum_{i=0}^{k-1} x[t - i] \]
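
To make the equivalence concrete, the short NumPy sketch below computes a 3-point moving average of a toy series as a convolution with a uniform kernel:

```python
import numpy as np

# A moving average is a convolution with the uniform kernel w = [1/k, ..., 1/k].
k = 3
x = np.array([1.0, 2.0, 4.0, 8.0, 16.0, 32.0])
w = np.full(k, 1.0 / k)

ma = np.convolve(x, w, mode="valid")             # average over each window of k points
print(ma)                                        # [ 2.333  4.667  9.333 18.667]
```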

Learnable Temporal Patterns

In neural networks, the kernel weights are learned from data. A 1D convolutional layer with input \(x \in \mathbb{R}^n\) and \(c\) kernels \(w_{(i)} \in \mathbb{R}^K\) produces the output:

\[ y_{(i)}[t] = \sigma\Bigl(\sum_{k} x[t - k] \, w_{(i)}[k] + b_{(i)}\Bigr) \]

where:

  • \(\sigma\) is a nonlinear activation function,
  • \(b_{(i)}\) is a learnable bias term,
  • \(i\) ranges from 1 to \(c\) (the number of output channels).

This operation can learn to detect various temporal patterns:

  • Short-term dependencies: captured with small kernel sizes.
  • Long-term patterns: captured using dilated convolutions.
  • Multi-scale features: achieved using parallel convolutions with different kernel sizes.
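
The sketch below shows one possible Keras 3 realization of such a layer on a univariate series; the sequence length, number of filters, kernel size, and the small head on top are arbitrary choices made for illustration:

```python
import numpy as np
import keras
from keras import layers

# A 1D convolutional block for a univariate time series:
# sequence length 64, 1 input channel, c = 8 learnable kernels of size K = 5.
model = keras.Sequential([
    keras.Input(shape=(64, 1)),
    layers.Conv1D(filters=8, kernel_size=5, padding="causal", activation="relu"),
    layers.GlobalAveragePooling1D(),
    layers.Dense(1),
])

x = np.random.randn(32, 64, 1).astype("float32")  # batch of 32 dummy series
print(model(x).shape)                             # (32, 1)
```

Each of the 8 filters learns its own temporal pattern, and the causal padding ensures that the output at time \(t\) depends only on inputs up to \(t\), which is the natural choice for forecasting.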

Dilated Convolutions

To capture long-range dependencies without increasing the parameter count, dilated convolutions introduce gaps in the kernel:

\[ y[t] = \sum_{k} x[t - d\, k] \, w[k], \]

where \(d\) is the dilation rate. Stacking such layers with dilation rates that double at each level (\(d = 1, 2, 4, 8, \dots\)) makes the receptive field grow exponentially with depth while the number of parameters grows only linearly, preserving computational efficiency.
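
A common way to exploit this is to stack causal convolutions whose dilation rates double at each layer; the sketch below (filter counts chosen arbitrarily) shows this in Keras 3 together with the receptive-field arithmetic it implies:

```python
import keras
from keras import layers

# Stack of causal Conv1D layers with exponentially increasing dilation rates.
# With kernel size K and dilations 1, 2, 4, ..., 2**(L-1), the receptive field is
# 1 + (K - 1) * (2**L - 1): exponential in the number of layers L.
K = 3
dilations = [1, 2, 4, 8]

inputs = keras.Input(shape=(None, 1))            # variable-length univariate series
h = inputs
for d in dilations:
    h = layers.Conv1D(16, K, padding="causal", dilation_rate=d, activation="relu")(h)
outputs = layers.Dense(1)(h)
model = keras.Model(inputs, outputs)

receptive_field = 1 + (K - 1) * sum(dilations)   # 1 + 2 * 15 = 31 time steps
print(receptive_field)
```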

Section 4.12 - Theoretical Properties

Convolutions possess several important properties that make them particularly effective for pattern recognition:

  1. Translation Equivariance: If the input is shifted by \(\delta\), the output shifts by \(\delta\) (a numerical check of this property is sketched at the end of this section):

    \[ \operatorname{Conv}(T_\delta x) = T_\delta \operatorname{Conv}(x), \]

    where \(T_\delta\) represents translation by \(\delta\).

  2. Local Connectivity: Each output point depends only on a local region of the input, reducing computational complexity.

  3. Parameter Sharing: The same kernel is applied across all positions, dramatically reducing the number of parameters compared to fully connected layers.

These properties make convolutional layers particularly effective for tasks where patterns may appear at different positions in the input sequence, while maintaining both computational and statistical efficiency in learning.
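
As a quick numerical illustration of the first property, the sketch below convolves a signal containing a localized pattern, shifts that pattern, and checks that the output shifts by the same amount away from the boundaries:

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal(3)                       # arbitrary kernel
pattern = rng.standard_normal(5)                 # arbitrary local pattern

x = np.zeros(50)
x[10:15] = pattern                               # pattern placed at position 10
x_shifted = np.zeros(50)
x_shifted[17:22] = pattern                       # same pattern shifted by 7 steps

y = np.convolve(x, w, mode="valid")
y_shifted = np.convolve(x_shifted, w, mode="valid")

# Away from the boundaries, shifting the input by 7 shifts the output by 7.
print(np.allclose(y_shifted[15:25], y[8:18]))    # True
```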
