Keras Fundamentals: Models & Layers

Course · Fundamentals

Understanding Keras’ core abstractions for building neural networks through its layered architecture and model composition paradigms.

Author: Remi Genet
Published: 2025-04-03

Building Blocks of Keras

Section 1.27 - Core Concepts: Layer and Model

The Layer Class

What it is:

  • The fundamental building block of neural networks
  • Encapsulates state (weights) and computation (the forward pass)

Layer Lifecycle:

  1. __init__: Define layer parameters
  2. build: Create weights once the input shape is known
  3. call: Define the forward computation

from keras import layers
import keras.ops as ops

# Built-in layer example
dense = layers.Dense(units=64, activation='relu')

# Custom layer with explicit weight management
class MyLayer(layers.Layer):
    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        # Note: weights are not created here
        
    def build(self, input_shape):
        # Create weights when input shape is known
        input_dim = input_shape[-1]
        
        # Initialize weights using add_weight
        self.w = self.add_weight(
            shape=(input_dim, self.units),
            initializer='glorot_uniform',
            name='kernel',
            trainable=True
        )
        
        # Initialize bias
        self.b = self.add_weight(
            shape=(self.units,),
            initializer='zeros',
            name='bias',
            trainable=True
        )
        
        # Mark layer as built
        self.built = True
        
    def call(self, inputs):
        # Define forward pass
        return ops.relu(ops.dot(inputs, self.w) + self.b)
        
    def get_config(self):
        # Enable serialization
        config = super().get_config()
        config.update({
            "units": self.units
        })
        return config
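Because get_config is implemented, a model containing MyLayer can be saved and reloaded. One practical detail: at load time Keras must be able to resolve the custom class by name. A minimal sketch, assuming such a model was saved to a placeholder path my_model.keras:

import keras

# Passing the custom class via custom_objects lets Keras rebuild
# MyLayer instances from their get_config() dictionaries
model = keras.models.load_model(
    'my_model.keras',                       # placeholder path
    custom_objects={'MyLayer': MyLayer}
)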

Weight Management Details:

  1. Lazy Weight Creation:
    • Weights are not created until the layer sees an input shape (demonstrated in the sketch after this list)
    • Allows dynamic sizing based on the input
  2. The build Method:
    • Called automatically on first use
    • Creates weights with the proper shapes
    • Sets self.built = True
  3. add_weight Function:
self.add_weight(
    shape=...,          # tuple of dimensions
    initializer=...,    # 'zeros', 'ones', 'random_normal', etc.
    name=...,           # for debugging/serialization
    trainable=True,     # whether to update during training
    dtype=None          # optional weight dtype
)
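The lazy creation is easy to observe with the MyLayer class defined above; a quick sketch with arbitrary shapes:

import keras.ops as ops

layer = MyLayer(units=8)
print(layer.built)                        # False: __init__ ran, no weights yet
y = layer(ops.ones((2, 4)))               # first call triggers build()
print(layer.built)                        # True
print([w.shape for w in layer.weights])   # [(4, 8), (8,)]: kernel and bias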

The Model Class

What it is:

  • A container for layers that defines:
    • Training logic (compile(), fit())
    • Inference logic (predict())
    • Saving/loading (save(), load_model())

Key Insight:

  • A Model is itself a Layer
  • Models can be nested like LEGO blocks
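To make the training, inference, and saving roles concrete, a minimal end-to-end sketch on random stand-in data (hyperparameters are arbitrary; the Sequential API is introduced just below):

import numpy as np
from keras import Sequential, layers

# Random stand-in data: 32 samples, 100 features, 10 targets
x = np.random.normal(size=(32, 100)).astype('float32')
y = np.random.normal(size=(32, 10)).astype('float32')

model = Sequential([layers.Dense(64, activation='relu'), layers.Dense(10)])
model.compile(optimizer='adam', loss='mse')  # training logic
model.fit(x, y, epochs=2, verbose=0)         # builds, then trains, the weights
preds = model.predict(x, verbose=0)          # inference logic
model.save('model.keras')                    # saving (placeholder path)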

Section 1.28 - Three Paths to Build Models

1. Sequential API (Simplest)

For: Linear stacks of layers

from keras import Sequential, Input

# With an Input object first, weights are created at construction;
# without one, they are created when the model first sees data
model = Sequential([
    Input(shape=(100,)),
    layers.Dense(64, activation='relu'),
    layers.Dense(10)
])

2. Functional API (Most Flexible)

For: Complex architectures (multi-input/output, shared layers)

from keras import Input, Model

# Input shape defines weight shapes for all layers
inputs = Input(shape=(100,))
x = layers.Dense(64, activation='relu')(inputs)
outputs = layers.Dense(10)(x)
model = Model(inputs=inputs, outputs=outputs)
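That flexibility shows once the architecture stops being a straight line; a sketch of a two-input model with a shared layer (all names and shapes are illustrative):

from keras import Input, Model, layers

# Two inputs with the same feature dimension, e.g. two related instruments
x1 = Input(shape=(100,), name='asset_a')
x2 = Input(shape=(100,), name='asset_b')

shared = layers.Dense(32, activation='relu')        # one layer, one set of weights
h = layers.Concatenate()([shared(x1), shared(x2)])  # applied to both inputs
output = layers.Dense(1)(h)

model = Model(inputs=[x1, x2], outputs=output)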

3. Model Subclassing (Full Control)

For: Research-level customization

class MyModel(Model):
    def __init__(self):
        super().__init__()
        # Layers defined but not built yet
        self.dense1 = MyLayer(64)  # Using our custom layer
        self.dense2 = MyLayer(10)
        
    def build(self, input_shape):
        # Optional: explicit build if needed
        self.dense1.build(input_shape)
        # Get output shape of dense1
        dense1_out_shape = (*input_shape[:-1], self.dense1.units)
        self.dense2.build(dense1_out_shape)
        self.built = True
        
    def call(self, inputs):
        x = self.dense1(inputs)  # Builds layer if needed
        return self.dense2(x)    # Builds layer if needed

model = MyModel()

Section 1.29 - Model Composability

Models as Layers

Since Model inherits from Layer, you can nest models:

# Build a feature extractor
inputs = Input(shape=(256,))
x = MyLayer(128)(inputs)  # Using our custom layer
feature_extractor = Model(inputs, x)

# Use in larger model
main_input = Input(shape=(256,))
features = feature_extractor(main_input)  # Treated as a layer
predictions = MyLayer(5)(features)
combined_model = Model(main_input, predictions)

Financial Example:

  • Pretrain a market regime classifier
  • Use it as a feature layer in a portfolio optimization model (sketched below)
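A sketch of that idea with a hypothetical pretrained classifier (all shapes, names, and the three-regime assumption are illustrative):

from keras import Input, Model, Sequential, layers

# Stand-in for a pretrained model mapping 64 market features to 3 regimes
regime_classifier = Sequential([
    Input(shape=(64,)),
    layers.Dense(32, activation='relu'),
    layers.Dense(3, activation='softmax')
])
regime_classifier.trainable = False  # freeze the pretrained weights

# Nest the classifier inside a portfolio model
market_in = Input(shape=(64,))
regimes = regime_classifier(market_in)            # model used as a layer
x = layers.Concatenate()([market_in, regimes])    # raw features + regime signal
alloc = layers.Dense(5, activation='softmax')(x)  # e.g. weights over 5 assets
portfolio_model = Model(market_in, alloc)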

Section 1.30 - Choosing an API

| Approach    | When to Use                     | Weight Creation                          | Debugging   |
|-------------|---------------------------------|------------------------------------------|-------------|
| Sequential  | Quick prototypes, simple MLPs   | Automatic                                | Easy        |
| Functional  | Complex architectures           | Automatic, based on Input shape          | Moderate    |
| Subclassing | Custom training loops, research | Manual in build() or automatic in call() | Challenging |