The Multi-Layer Perceptron (MLP)

Course: Fundamentals

The MLP is the fundamental building block of deep learning. This chapter breaks down its biological inspiration, mathematical formulation, and financial applications.

Author: Remi Genet
Published: 2025-04-03

The Building Block: MLP Architecture


Section 1.6 - From Biological Neurons to Artificial Networks

Inspiration

MLPs are loosely inspired by how biological neurons process information:

  • Dendrites: Receive inputs like the feature vector \[ \mathbf{x} \]
  • Cell body: Aggregates signals by computing the weighted sum \[ \mathbf{w}^\top \mathbf{x} + b \]
  • Axon: Transmits the output through synapses via the activation function \[ \varphi \]

[Figure: MLP Structure]

Section 1.7 - Mathematical Formulation

Single Neuron (Perceptron)

For an input vector \[ \mathbf{x} \in \mathbb{R}^d, \] the computation is as follows:

Computation:
\[ z = \mathbf{w}^\top \mathbf{x} + b, \] \[ a = \varphi(z). \]

Where:
  • \[\mathbf{w} \in \mathbb{R}^d\] is the weight vector (learnable)
  • \[b \in \mathbb{R}\] is the bias term (learnable)
  • \[\varphi\] is the non-linear activation function
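
As a minimal sketch of the two equations above, assuming ReLU as the activation and placeholder values for the weights (none of these numbers come from the text), the single-neuron computation in NumPy reads:

import numpy as np

# Illustrative input with three features (placeholder values)
x = np.array([0.02, 0.15, -0.40])

# Learnable parameters; arbitrary here, in practice they are fitted
w = np.array([0.5, -0.3, 0.1])
b = 0.05

z = w @ x + b            # weighted sum  z = w^T x + b
a = np.maximum(0.0, z)   # activation    a = phi(z), here phi = ReLU

print(z, a)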

Full MLP Layer

A layer with \[ n \] neurons produces an output vector \[ \mathbf{a} \in \mathbb{R}^n \]:

Matrix Form:
\[ \mathbf{a} = \varphi(\mathbf{W}\mathbf{x} + \mathbf{b}), \]

Where:
  • \[\mathbf{W} \in \mathbb{R}^{n \times d}\] is the weight matrix
  • \[\mathbf{b} \in \mathbb{R}^n\] is the bias vector
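
The matrix form is a single matrix-vector product. A small sketch follows; the dimensions d = 3, n = 4 and the choice of tanh are illustrative assumptions:

import numpy as np

d, n = 3, 4                   # input dimension and number of neurons (illustrative)
rng = np.random.default_rng(0)

W = rng.normal(size=(n, d))   # weight matrix W in R^{n x d}
b = np.zeros(n)               # bias vector   b in R^n
x = rng.normal(size=d)        # input vector  x in R^d

a = np.tanh(W @ x + b)        # a = phi(Wx + b), with phi = tanh here
print(a.shape)                # (4,)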


Section 1.8 - Layered Composition

Key Components

  1. Input Layer: Raw features (e.g., financial ratios, price returns)
  2. Hidden Layers: Successive transformations
    \[ \mathbf{h}^{(l)} = \varphi\left(\mathbf{W}^{(l)}\mathbf{h}^{(l-1)} + \mathbf{b}^{(l)}\right) \]
  3. Output Layer: Task-specific format
    • Regression: Uses a linear activation (e.g., to predict stock price)
    • Classification: Uses softmax (e.g., to generate buy/hold/sell signals)
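
Stacking the hidden-layer recursion with a task-specific output head can be sketched as follows; the layer widths and the buy/hold/sell softmax output are illustrative assumptions, not prescriptions:

import numpy as np

def dense(h, W, b, phi):
    # One layer: h_l = phi(W_l h_{l-1} + b_l)
    return phi(W @ h + b)

relu = lambda z: np.maximum(0.0, z)
softmax = lambda z: np.exp(z - z.max()) / np.exp(z - z.max()).sum()

rng = np.random.default_rng(0)
x = rng.normal(size=8)                                            # input layer: 8 raw features
h1 = dense(x,  rng.normal(size=(16, 8)), np.zeros(16), relu)      # hidden layer 1
h2 = dense(h1, rng.normal(size=(8, 16)), np.zeros(8),  relu)      # hidden layer 2
out = dense(h2, rng.normal(size=(3, 8)), np.zeros(3),  softmax)   # output: buy/hold/sell probabilities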

Section 1.9 - Activation Functions

Function | Formula | Financial Use Case
ReLU | \[\max(0, z)\] | Default for hidden layers
Sigmoid | \[\frac{1}{1+e^{-z}}\] | Probability outputs (0 to 1)
Tanh | \[\frac{e^z - e^{-z}}{e^z + e^{-z}}\] | Normalized outputs (-1 to 1)
Linear | \[z\] | Final layer for regression
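
All four activations are exposed through keras.ops; a quick sketch evaluating them on the same sample values (the input values are arbitrary):

from keras import ops

z = ops.convert_to_tensor([-2.0, -0.5, 0.0, 0.5, 2.0])

print(ops.relu(z))      # max(0, z)
print(ops.sigmoid(z))   # 1 / (1 + e^{-z})
print(ops.tanh(z))      # (e^z - e^{-z}) / (e^z + e^{-z})
print(z)                # linear: identity, typically the final regression layer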

Section 1.10 - Why Hierarchical Layers Matter

Financial Feature Learning

  • Layer 1: Extracts simple patterns
    (e.g., momentum, mean reversion)
  • Layer 2: Combines patterns
    (e.g., momentum + volatility regime)
  • Layer 3: Leads to a strategic decision
    (e.g., optimal portfolio weight)

Example:
Raw Input → Volatility Estimates → Regime Detection → Trade Signal


Section 1.11 - Code Sketch (Keras)

from keras import layers, models

num_features = 10  # number of input features (example value)

# Simple MLP for return prediction
model = models.Sequential([
    layers.Input(shape=(num_features,)),   # input layer
    layers.Dense(32, activation='relu'),   # hidden layer 1
    layers.Dense(16, activation='tanh'),   # hidden layer 2
    layers.Dense(1)                        # linear activation for regression output
])
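
A short continuation showing how the sketch above would be trained; the data here is random placeholder data, purely to make the snippet self-contained:

import numpy as np

# Placeholder data: 256 samples, one target return each (random, not real market data)
X = np.random.normal(size=(256, num_features)).astype("float32")
y = np.random.normal(size=(256, 1)).astype("float32")

model.compile(optimizer="adam", loss="mse")   # mean squared error for return regression
model.fit(X, y, epochs=5, batch_size=32, verbose=0)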

Historical Note

MLPs gained prominence in the 1980s-90s for:
  • Stock price prediction (White, 1988)
  • Credit scoring (Altman et al., 1994)

However, they were limited by computational power until the resurgence of deep learning in the 2010s.

