Deep Learning For Finance


From Traditional Models to Deep Learning

Course
Fundamentals
Foundations of machine learning and econometric modeling, introducing deep learning as a flexible function-approximation paradigm for financial problems.

Author: Rémi Genet
Published: 2025-04-03

From Standard Models to Neural Networks


Section 1.1 - Linear Regression: The Simplest Machine Learning Model

What It Does

Predicts a number (e.g., tomorrow's stock price) from a weighted combination of input features (e.g., P/E ratio, volatility):

Mathematical Form
For input features \[ \mathbf{x} = [x_1, \dots, x_n] \] and weights \[ \boldsymbol{\theta} = [\theta_1, \dots, \theta_n], \] the prediction is given by: \[ \hat{y} = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \dots + \theta_n x_n. \]

How We Find the Weights
The best weights are those that minimize the prediction error on the training data: \[ \hat{\theta} = \operatorname{argmin}_{\theta} \sum_{i=1}^{N} \left( y_i - \hat{y}_i \right)^2. \]

Closed-form solution: \[ \hat{\theta} = (X^\top X)^{-1} X^\top y, \] where \(X\) is the data matrix with a column of 1s for the intercept.
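
As a quick illustration (not part of the original course material), the closed-form solution can be computed directly with NumPy; the feature matrix and targets below are synthetic placeholders:

import numpy as np

# Synthetic example: N observations, 2 features (e.g., P/E ratio, volatility)
rng = np.random.default_rng(0)
N = 500
X_raw = rng.normal(size=(N, 2))
y = 1.0 + 0.5 * X_raw[:, 0] - 2.0 * X_raw[:, 1] + rng.normal(scale=0.1, size=N)

# Prepend a column of 1s so theta_0 plays the role of the intercept
X = np.column_stack([np.ones(N), X_raw])

# Closed-form OLS solution: theta_hat = (X^T X)^{-1} X^T y
theta_hat = np.linalg.solve(X.T @ X, X.T @ y)
y_hat = X @ theta_hat
print("Estimated weights:", theta_hat)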


Section 1.2 - Decision Trees: Learning Simple Rules

How It Works Step-by-Step

  1. Start: All data in one group (e.g., all historical stock returns).
  2. Find Split: Test all possible feature thresholds (e.g., “P/E ratio < 15?”) to create two groups where predictions are most accurate.
  3. Repeat: Keep splitting subgroups until reaching stopping criteria (max depth or minimum samples).

Example: Predicting stock outperformance

Is P/E ratio < 20?  
 ├─ Yes → Check ROE > 15%  
 │    ├─ Yes → Predict "Outperform"  
 │    └─ No → Predict "Neutral"  
 └─ No → Predict "Underperform"
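
A hedged sketch of how such a tree could be fit in practice, assuming scikit-learn is available; the data, the feature names PE_ratio and ROE, and the label rule are synthetic placeholders chosen to mirror the example above:

import numpy as np
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical synthetic sample: P/E ratio, ROE (%) and a label generated
# from the same rules as the hand-written tree above
rng = np.random.default_rng(0)
pe = rng.uniform(5, 40, size=1000)
roe = rng.uniform(0, 30, size=1000)
label = np.where(pe >= 20, "Underperform",
                 np.where(roe > 15, "Outperform", "Neutral"))

X = np.column_stack([pe, roe])
tree = DecisionTreeClassifier(max_depth=2)  # stopping criterion: maximum depth
tree.fit(X, label)

# The learned splits should roughly recover the hand-written rules
print(export_text(tree, feature_names=["PE_ratio", "ROE"]))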

Mathematical Criterion (Classification)
At each split, maximize the purity gain: \[ \text{Gain} = H(\text{parent}) - \left[\frac{N_{\text{left}}}{N}\, H(\text{left}) + \frac{N_{\text{right}}}{N}\, H(\text{right})\right], \] where \(H\) represents the impurity (e.g., Gini or entropy).
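
To make the criterion concrete, here is a small self-contained sketch that computes the gain of a single split, using Gini impurity as one possible choice for \(H\):

import numpy as np

def gini(labels):
    """Gini impurity H of a set of class labels."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return 1.0 - np.sum(p ** 2)

def purity_gain(parent, left, right):
    """H(parent) minus the size-weighted impurity of the two children."""
    n = len(parent)
    return gini(parent) - (len(left) / n) * gini(left) - (len(right) / n) * gini(right)

# Toy labels: the split isolates most of class 0 on the left -> positive gain
parent = np.array([0, 0, 0, 1, 1, 1])
left, right = np.array([0, 0, 0, 1]), np.array([1, 1])
print(purity_gain(parent, left, right))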


Section 1.3 - GARCH Models: Handling Time-Dependent Variance

Why We Need It

Financial returns often exhibit volatility clustering (calm vs. turbulent periods). GARCH models capture this time-varying variance.

Model Equations

Return at time \(t\): \[ r_t = \mu + \varepsilon_t, \quad \varepsilon_t \sim \mathcal{N}(0, \sigma_t^2). \]

Volatility dynamics (GARCH(1,1)): \[ \sigma_t^2 = \omega + \alpha \varepsilon_{t-1}^2 + \beta \sigma_{t-1}^2. \]

Calibration Process

  1. Initialize parameters \(\omega\), \(\alpha\), \(\beta\).
  2. Compute the volatility series \(\{\sigma_t^2\}\) using past \(\varepsilon_t\)’s.
  3. Adjust parameters to maximize the likelihood of observing the returns.
  4. Repeat until convergence (no closed-form solution exists).
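
A minimal sketch of this calibration loop, assuming SciPy's optimizer and a placeholder return series (in practice one would use observed returns, or a dedicated package such as arch):

import numpy as np
from scipy.optimize import minimize

def garch11_neg_log_likelihood(params, returns):
    """Negative Gaussian log-likelihood of a GARCH(1,1) model."""
    mu, omega, alpha, beta = params
    eps = returns - mu
    sigma2 = np.empty_like(returns)
    sigma2[0] = np.var(returns)  # initialize the volatility recursion
    for t in range(1, len(returns)):
        sigma2[t] = omega + alpha * eps[t - 1] ** 2 + beta * sigma2[t - 1]
    return 0.5 * np.sum(np.log(2 * np.pi * sigma2) + eps ** 2 / sigma2)

# Placeholder data: replace with an observed return series
rng = np.random.default_rng(0)
returns = rng.normal(scale=0.01, size=1000)

# Steps 1-4: initialize, then let the optimizer adjust (mu, omega, alpha, beta)
# until the likelihood stops improving
x0 = [0.0, 1e-6, 0.05, 0.90]
bounds = [(None, None), (1e-12, None), (0.0, 1.0), (0.0, 1.0)]
result = minimize(garch11_neg_log_likelihood, x0, args=(returns,), bounds=bounds)
mu_hat, omega_hat, alpha_hat, beta_hat = result.x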


Section 1.4 - Enter Deep Learning

Core Idea

Instead of hand-crafting models (linear terms, tree splits, GARCH lags), let the algorithm learn the feature transformations:

Traditional Approach
\[ y = f(\mathbf{x}), \] where \(f\) is designed by humans (e.g., \( y = \theta_0 + \theta_1 x_1 + \dots \)).

Deep Learning Approach
\[ y = f(\mathbf{x}; \theta), \] where (f) is a learned sequence of nonlinear transformations: \[ h_1 = \varphi(W_1 \mathbf{x} + b_1), \] \[ h_2 = \varphi(W_2 h_1 + b_2), \] \[ \vdots \] \[ \hat{y} = W_{\text{out}} h_n + b_{\text{out}}. \]

Here, \(\varphi\) is the activation function (e.g., ReLU: \(\varphi(z) = \max(0, z)\)).
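
As a hedged sketch of what such a learned stack of transformations looks like in code (using Keras, which the course covers in later chapters; the layer widths and the 10-feature input are arbitrary placeholders):

import keras
from keras import layers

# Each Dense layer computes h = phi(W h_prev + b); ReLU supplies the
# non-linearity phi, and the last layer is linear for a regression output
model = keras.Sequential([
    layers.Input(shape=(10,)),            # 10 input features (placeholder)
    layers.Dense(32, activation="relu"),  # h1 = relu(W1 x + b1)
    layers.Dense(32, activation="relu"),  # h2 = relu(W2 h1 + b2)
    layers.Dense(1),                      # y_hat = W_out h2 + b_out
])
model.compile(optimizer="adam", loss="mse")
model.summary()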

Why It Matters for Finance

  • Handles raw, high-dimensional data (order books, news text).
  • Discovers complex patterns (nonlinear factor interactions).
  • Offers flexible architecture design (time series, graphs, etc.).