TP: Dunder Methods with Tensors for Automatic Differentiation
Understanding Tensors and Automatic Differentiation
The Power of Composition
Imagine you have a complex mathematical function. It might look intimidating at first, but it’s actually just a composition of simple operations. For example, consider this function:
\(f(x, y) = (x^2 + 2y) * \sin(x + y)\)
We can break this down into simpler operations:
- \(a = x^2\)
- \(b = 2y\)
- \(c = a + b\) (which is \(x^2 + 2y\))
- \(d = x + y\)
- \(e = \sin(d)\) (which is \(\sin(x + y)\))
- \(f = c * e\) (our final result)
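In plain Python (no autodiff yet), this decomposition is just a sequence of assignments; for example, with x = 2 and y = 3:

```python
import math

x, y = 2.0, 3.0
a = x**2          # a = x^2
b = 2 * y         # b = 2y
c = a + b         # c = x^2 + 2y
d = x + y         # d = x + y
e = math.sin(d)   # e = sin(x + y)
f = c * e         # f = (x^2 + 2y) * sin(x + y)
```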
Automatic Differentiation: It’s All About the Chain Rule
Now, here’s the magic: if we know how to differentiate each of these simple operations, we can automatically compute the derivative of the entire complex function. This is the essence of automatic differentiation.
Let’s say we want to find \(\frac{\partial f}{\partial x}\). We can use the chain rule:
\(\frac{\partial f}{\partial x} = \frac{\partial f}{\partial c} \cdot \frac{\partial c}{\partial x} + \frac{\partial f}{\partial e} \cdot \frac{\partial e}{\partial d} \cdot \frac{\partial d}{\partial x}\)
Breaking it down:
- \(\frac{\partial f}{\partial c} = e\)
- \(\frac{\partial c}{\partial x} = 2x\)
- \(\frac{\partial f}{\partial e} = c\)
- \(\frac{\partial e}{\partial d} = \cos(d)\)
- \(\frac{\partial d}{\partial x} = 1\)
Putting it all together:
\(\frac{\partial f}{\partial x} = e \cdot 2x + c \cdot \cos(d) \cdot 1\)
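Before building any machinery, we can sanity-check this result numerically. The plain-Python sketch below compares the analytic derivative with a central finite difference at an arbitrary point (x = 2, y = 3, chosen only for illustration):

```python
import math

def f(x, y):
    return (x**2 + 2*y) * math.sin(x + y)

def df_dx(x, y):
    # the analytic derivative derived above: e * 2x + c * cos(d)
    c, d = x**2 + 2*y, x + y
    return math.sin(d) * 2 * x + c * math.cos(d)

x, y, h = 2.0, 3.0, 1e-6
numeric = (f(x + h, y) - f(x - h, y)) / (2 * h)  # central difference
print(df_dx(x, y), numeric)  # both print roughly -0.999
```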
Tensors: Tracking Operations and Their Derivatives
This is where tensors come in. In our implementation, a tensor will not only store its value but also remember:
1. The operation that created it
2. The tensors that were inputs to this operation
3. How to compute its derivative with respect to its inputs
By doing this for each operation, we create a computational graph. When we want to compute the derivative of our final result with respect to any input, we can simply walk backwards through this graph, applying the chain rule at each step.
TP Instructions: Implementing a Basic Tensor Class with Automatic Differentiation
Your task is to implement a simplified Tensor class that supports basic mathematical operations and automatic differentiation. This class will allow us to build simple computational graphs and compute gradients automatically.
Here’s a skeleton of the Tensor class to get you started:
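One minimal sketch of what that skeleton might look like is below. The attribute names (`value`, `grad`, `_inputs`) and the closure-based `_backward` design are assumptions, one common way to structure such a class; only `__add__` and `__mul__` are filled in, and the rest is left for the tasks that follow:

```python
import math

class Tensor:
    """A scalar-valued tensor that records the operations used to build it."""

    def __init__(self, value, _inputs=()):
        self.value = value              # the numerical value of this tensor
        self.grad = 0.0                 # accumulated gradient, set by backward()
        self._inputs = _inputs          # tensors this one was computed from
        self._backward = lambda: None   # propagates this tensor's grad to its inputs

    def __add__(self, other):
        out = Tensor(self.value + other.value, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Tensor(self.value * other.value, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.value * out.grad
            other.grad += self.value * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph, then apply the chain rule in reverse.
        order, seen = [], set()

        def visit(t):
            if t not in seen:
                seen.add(t)
                for parent in t._inputs:
                    visit(parent)
                order.append(t)

        visit(self)
        self.grad = 1.0  # df/df = 1
        for t in reversed(order):
            t._backward()
```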
Your tasks:
- Implement the reflected (reversed) methods for the operations that already exist, such as `__radd__` and `__rmul__` (see the toy example after this list).
- Implement the `__sub__` and `__truediv__` methods for subtraction and division.
- Add support for operations between Tensors and regular numbers (scalars) in all methods.
- Implement a `sin()` method that computes the sine of a Tensor.
- Add proper string representation methods (`__repr__` and `__str__`).
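As a hint for the reflected methods, the toy `Number` class below (hypothetical, unrelated to Tensor) illustrates the mechanism: for `3 + Number(2)`, `int.__add__` returns `NotImplemented`, so Python falls back to `Number.__radd__`:

```python
class Number:
    """A toy class illustrating the reflected-method pattern."""

    def __init__(self, value):
        self.value = value

    def __add__(self, other):
        # accept plain ints/floats as well as Number instances
        other_value = other.value if isinstance(other, Number) else other
        return Number(self.value + other_value)

    def __radd__(self, other):
        # called for `3 + Number(2)`, where int's __add__ gives up;
        # addition commutes, so we can simply delegate to __add__
        return self + other

print((Number(2) + 3).value)  # 5, dispatched through __add__
print((3 + Number(2)).value)  # 5, dispatched through __radd__
```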
Example Usage: Computing Gradients of a Complex Function
After implementing the Tensor class, you can use it to compute gradients of complex functions. Here’s an example using the function we discussed earlier:
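The sketch below assumes all of the tasks above (reflected methods, scalar support, `sin()`) are implemented:

```python
from tensor import Tensor

x = Tensor(2.0)
y = Tensor(3.0)

# f(x, y) = (x^2 + 2y) * sin(x + y), built from simple operations
c = x * x + 2 * y   # uses __mul__, __rmul__ (for 2 * y) and __add__
e = (x + y).sin()   # uses the sin() method
f = c * e

f.backward()

print(f)        # uses __repr__ / __str__
print(x.grad)   # df/dx = 2x*sin(x+y) + (x^2 + 2y)*cos(x+y)
print(y.grad)   # df/dy = 2*sin(x+y) + (x^2 + 2y)*cos(x+y)
```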
This example demonstrates how your Tensor class can be used to automatically compute gradients of a complex function. The `backward()` method computes the gradients with respect to all input tensors.
Unit Tests
Here are some unit tests to verify your implementation. Implement your Tensor class in a file named `tensor.py`, and then use these tests to check your work:
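The cases below are a representative sketch (they assume the `value` and `grad` attribute names used in the skeleton above); save them as, say, `test_tensor.py` and run them with `python -m unittest`:

```python
import math
import unittest

from tensor import Tensor

class TestTensor(unittest.TestCase):
    def test_add_and_mul(self):
        a, b = Tensor(2.0), Tensor(3.0)
        f = a * b + a
        f.backward()
        self.assertAlmostEqual(f.value, 8.0)
        self.assertAlmostEqual(a.grad, 4.0)   # d(ab + a)/da = b + 1
        self.assertAlmostEqual(b.grad, 2.0)   # d(ab + a)/db = a

    def test_scalar_and_reversed_ops(self):
        x = Tensor(4.0)
        f = 2 * x - 1 + x / 2   # exercises __rmul__, __sub__, __truediv__
        f.backward()
        self.assertAlmostEqual(f.value, 9.0)
        self.assertAlmostEqual(x.grad, 2.5)   # d(2x - 1 + x/2)/dx = 2 + 0.5

    def test_sin(self):
        x = Tensor(1.0)
        f = x.sin()
        f.backward()
        self.assertAlmostEqual(f.value, math.sin(1.0))
        self.assertAlmostEqual(x.grad, math.cos(1.0))

if __name__ == "__main__":
    unittest.main()
```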