Keras Fundamentals: Models & Layers
Building Blocks of Keras
Section 1.27 - Core Concepts: Layer and Model
The Layer Class
What it is:
- The fundamental building block of neural networks
- Encapsulates state (weights) and computation (the forward pass)
Layer Lifecycle:
1. `__init__`: Define layer parameters
2. `build`: Create weights when the input shape is known
3. `call`: Define the forward computation
```python
from keras import layers
import keras.ops as ops

# Built-in layer example
dense = layers.Dense(units=64, activation='relu')

# Custom layer with explicit weight management
class MyLayer(layers.Layer):
    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        # Note: weights are not created here

    def build(self, input_shape):
        # Create weights when input shape is known
        input_dim = input_shape[-1]
        # Initialize weights using add_weight
        self.w = self.add_weight(
            shape=(input_dim, self.units),
            initializer='glorot_uniform',
            name='kernel',
            trainable=True
        )
        # Initialize bias
        self.b = self.add_weight(
            shape=(self.units,),
            initializer='zeros',
            name='bias',
            trainable=True
        )
        # Mark layer as built
        self.built = True

    def call(self, inputs):
        # Define forward pass
        return ops.relu(ops.dot(inputs, self.w) + self.b)

    def get_config(self):
        # Enable serialization
        config = super().get_config()
        config.update({"units": self.units})
        return config
```
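A quick sanity check (the batch and feature sizes here are illustrative assumptions) shows the lazy build in action:

```python
import numpy as np

layer = MyLayer(units=32)
print(layer.built)                           # False: no weights yet

y = layer(np.ones((4, 8), dtype="float32"))  # First call triggers build()
print(layer.built)                           # True: w is (8, 32), b is (32,)
print(y.shape)                               # (4, 32)
```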
Weight Management Details:
- Lazy Weight Creation:
  - Weights are not created until the layer sees an input shape
  - Allows dynamic sizing based on the input
- The `build` Method:
  - Called automatically on first use
  - Creates weights with the proper shapes
  - Sets `self.built = True`
- `add_weight` Function:

```python
self.add_weight(
    shape,           # Tuple of dimensions
    initializer,     # 'zeros', 'ones', 'random_normal', etc.
    name,            # For debugging/serialization
    trainable=True,  # Whether to update in training
    dtype=None       # Optional weight dtype
)
```
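As a concrete sketch of these options, here is a small layer (the `ScaledShift` name and design are invented for illustration) mixing a trainable and a non-trainable weight:

```python
class ScaledShift(layers.Layer):
    def build(self, input_shape):
        # Trainable per-feature scale, updated by the optimizer
        self.scale = self.add_weight(
            shape=(input_shape[-1],), initializer='ones',
            name='scale', trainable=True)
        # Non-trainable per-feature shift, excluded from gradient updates
        self.shift = self.add_weight(
            shape=(input_shape[-1],), initializer='zeros',
            name='shift', trainable=False, dtype='float32')
        self.built = True

    def call(self, inputs):
        return inputs * self.scale + self.shift
```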
The Model Class
What it is:
- A container for layers that defines:
  - Training logic (`compile()`, `fit()`)
  - Inference logic (`predict()`)
  - Saving/loading (`save()`, `load_model()`)

(A minimal usage sketch of these methods follows the Sequential example below.)

Key Insight:
- A `Model` is itself a `Layer`
- Models can be nested like LEGO blocks
Section 1.28 - Three Paths to Build Models
1. Sequential API (Simplest)
For: Linear stacks of layers
```python
from keras import Sequential

# With input_shape given, weights are created immediately;
# omit it and they are created when the model first sees data
model = Sequential([
    layers.Dense(64, activation='relu', input_shape=(100,)),
    layers.Dense(10)
])
```
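With this model in hand, the Model-level methods from Section 1.27 look roughly as follows (a sketch on synthetic data; the optimizer, loss, and file name are illustrative choices):

```python
import numpy as np

x_train = np.random.rand(32, 100).astype("float32")
y_train = np.random.rand(32, 10).astype("float32")

model.compile(optimizer='adam', loss='mse')       # Training logic
model.fit(x_train, y_train, epochs=2, verbose=0)  # Fit on data
preds = model.predict(x_train, verbose=0)         # Inference: shape (32, 10)
model.save('mlp.keras')                           # Native Keras format
```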
2. Functional API (Most Flexible)
For: Complex architectures (multi-input/output, shared layers)
```python
from keras import Input, Model

# Input shape defines weight shapes for all downstream layers
inputs = Input(shape=(100,))
x = layers.Dense(64, activation='relu')(inputs)
outputs = layers.Dense(10)(x)
model = Model(inputs=inputs, outputs=outputs)
```
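To make "multi-input" and "shared layers" concrete, here is a hedged sketch of a two-input model that reuses a single Dense layer (all names and sizes are illustrative assumptions):

```python
price_input = Input(shape=(100,), name='prices')
volume_input = Input(shape=(100,), name='volumes')

shared_dense = layers.Dense(32, activation='relu')  # One layer, applied twice
price_features = shared_dense(price_input)          # Same weights...
volume_features = shared_dense(volume_input)        # ...both times

merged = layers.Concatenate()([price_features, volume_features])
output = layers.Dense(1)(merged)
two_input_model = Model(inputs=[price_input, volume_input], outputs=output)
```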
3. Model Subclassing (Full Control)
For: Research-level customization
```python
class MyModel(Model):
    def __init__(self):
        super().__init__()
        # Layers defined but not built yet
        self.dense1 = MyLayer(64)  # Using our custom layer
        self.dense2 = MyLayer(10)

    def build(self, input_shape):
        # Optional: explicit build if needed
        self.dense1.build(input_shape)
        # Get output shape of dense1
        dense1_out_shape = (*input_shape[:-1], self.dense1.units)
        self.dense2.build(dense1_out_shape)
        self.built = True

    def call(self, inputs):
        x = self.dense1(inputs)   # Builds layer if needed
        return self.dense2(x)     # Builds layer if needed

model = MyModel()
```
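A single forward pass is enough to build every sublayer (a sketch; the batch and feature sizes are assumptions):

```python
import numpy as np

out = model(np.zeros((2, 100), dtype="float32"))  # Triggers build throughout
print(out.shape)           # (2, 10)
print(len(model.weights))  # 4: a kernel and a bias per MyLayer
```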
Section 1.29 - Model Composability
Models as Layers
Since `Model` inherits from `Layer`, you can nest models:
```python
# Build a feature extractor
inputs = Input(shape=(256,))
x = MyLayer(128)(inputs)  # Using our custom layer
feature_extractor = Model(inputs, x)

# Use it in a larger model
main_input = Input(shape=(256,))
features = feature_extractor(main_input)  # Treated as a layer
predictions = MyLayer(5)(features)
combined_model = Model(main_input, predictions)
```
Financial Example:
- Pretrain a market regime classifier
- Use it as a feature layer in a portfolio optimization model (see the sketch below)
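A hedged sketch of this pattern (the regime classifier here is an untrained stand-in, and the regime and asset counts are illustrative assumptions):

```python
from keras import Input, Model, layers

# Stand-in for a pretrained market regime classifier (3 regimes assumed)
regime_inputs = Input(shape=(256,))
regime_probs = layers.Dense(3, activation='softmax')(regime_inputs)
regime_classifier = Model(regime_inputs, regime_probs)

regime_classifier.trainable = False  # Freeze the pretrained weights

# Nest it as a feature layer in a portfolio model (5 assets assumed)
returns_input = Input(shape=(256,))
regime_features = regime_classifier(returns_input)
portfolio_weights = layers.Dense(5, activation='softmax')(regime_features)
portfolio_model = Model(returns_input, portfolio_weights)
```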
Section 1.30 - Choosing an API
| Approach | When to Use | Weight Creation | Debugging |
|---|---|---|---|
| Sequential | Quick prototypes, simple MLPs | Automatic | Easy |
| Functional | Complex architectures | Automatic, based on `Input` shape | Moderate |
| Subclassing | Custom training loops, research | Manual in `build()` or automatic in `call()` | Challenging |