Building a Neural Network with torch.nn

The torch.nn module provides the building blocks for constructing neural networks. By subclassing nn.Module, developers define custom layers and a forward pass, while PyTorch tracks the learnable parameters automatically.


The nn.Module Interface

nn.Module is the base class for all neural network models in PyTorch. It registers parameters and submodules and dispatches calls to the forward method.

Module Hierarchy and Constructor

Custom neural networks in PyTorch inherit from the base nn.Module class. The constructor __init__() defines and instantiates the layers the model uses: layers with learnable parameters (such as linear or convolutional layers) and parameter-free ones (such as pooling layers).

Assigning a layer as an attribute registers it as a submodule, so PyTorch automatically tracks its weights and biases. super().__init__() must be called before any layer is assigned, because it initializes the internal state that parameter tracking relies on; assigning a submodule without it raises an AttributeError.
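To make the registration rule concrete, here is a minimal sketch. The class names TinyNet and BrokenNet are hypothetical and exist only for illustration; the second class deliberately omits the super().__init__() call to show the resulting error.

```python
import torch.nn as nn


class TinyNet(nn.Module):  # hypothetical name, for illustration only
    def __init__(self):
        super().__init__()         # must run before any submodule is assigned
        self.fc = nn.Linear(4, 2)  # attribute assignment registers the layer


class BrokenNet(nn.Module):  # hypothetical: deliberately skips super().__init__()
    def __init__(self):
        self.fc = nn.Linear(4, 2)  # fails: the module's internal state does not exist yet


net = TinyNet()
param_names = [name for name, _ in net.named_parameters()]
print(param_names)  # ['fc.weight', 'fc.bias']

broken_error = None
try:
    BrokenNet()
except AttributeError as err:
    broken_error = err
    print("BrokenNet failed:", err)
```

The parameter names are built from the attribute name (fc) plus the layer's own parameter names, which is why consistent attribute naming helps when reading checkpoints or debugging.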

The forward Method

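The forward() method defines the computation the network performs: how input tensors are transformed by the registered layers to produce predictions. Autograd records these operations as they run and derives the backward pass from them, so no backward method needs to be written by hand. In practice the model is invoked as model(x) rather than model.forward(x), so that any registered hooks also run.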

Activation functions (such as ReLU or Sigmoid) are applied inside the forward method. They can be module instances (like nn.ReLU, registered in the constructor) or functional calls (like torch.relu, applied directly in the forward pass). For stateless activations the two forms compute the same result, so the choice is a matter of preference.
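The two activation styles can be compared side by side. The sketch below uses two hypothetical classes, ModuleStyle and FunctionalStyle, and copies the weights from one to the other so that their outputs can be checked for equality; the layer sizes are arbitrary.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ModuleStyle(nn.Module):  # hypothetical name
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(8, 4)
        self.act = nn.ReLU()  # activation registered as a submodule

    def forward(self, x):
        return self.act(self.fc(x))


class FunctionalStyle(nn.Module):  # hypothetical name
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(8, 4)

    def forward(self, x):
        return F.relu(self.fc(x))  # activation applied as a plain function


a = ModuleStyle()
b = FunctionalStyle()
# nn.ReLU has no parameters, so both models share the same state_dict keys.
b.load_state_dict(a.state_dict())

x = torch.randn(3, 8)
out_a, out_b = a(x), b(x)
same = torch.allclose(out_a, out_b)
print("Outputs match:", same)
```

One practical difference: the module form shows up when the model is printed and can be used inside nn.Sequential, while the functional form keeps the constructor shorter.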

Structural Layers and Containers

PyTorch provides core layers and container modules to organize complex model architectures.

Core Structural Layers

The torch.nn module contains pre-defined classes for standard layers. nn.Linear represents fully connected layers, nn.Conv2d represents 2D convolutional layers, and nn.BatchNorm2d handles batch normalization.

These modules initialize their weights and biases automatically from the dimensions given to their constructors. Understanding the expected tensor layout (e.g. $[B, C, H, W]$ for 2D convolutions, with batch size $B$, channels $C$, height $H$, and width $W$) is critical to prevent shape mismatch errors during execution.
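As a rough illustration of how shapes move through these layers, the sketch below pushes a made-up batch of 8 RGB images of size 32x32 through a convolution, batch normalization, and a linear layer. All sizes are assumptions chosen for the example.

```python
import torch
import torch.nn as nn

conv = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, padding=1)
bn = nn.BatchNorm2d(num_features=16)
fc = nn.Linear(in_features=16 * 32 * 32, out_features=10)

x = torch.randn(8, 3, 32, 32)   # [B, C, H, W]
h = bn(conv(x))                 # [8, 16, 32, 32]; padding=1 preserves H and W
flat = h.flatten(start_dim=1)   # [8, 16384]; nn.Linear acts on the last dimension
y = fc(flat)                    # [8, 10]
print(h.shape, flat.shape, y.shape)

# Skipping the flatten step triggers a shape mismatch:
mismatch = None
try:
    fc(h)  # last dimension is 32, but fc expects 16384
except RuntimeError as err:
    mismatch = str(err)
    print("Shape error:", mismatch)
```

Printing intermediate shapes like this is often the quickest way to locate where a mismatch is introduced.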

Container Modules

To organize layers, PyTorch provides container modules like nn.Sequential, which stacks layers in order. For models with conditional paths or repetitive blocks, we use nn.ModuleList or nn.ModuleDict.

These containers register their submodules, so PyTorch can find all learnable parameters during training. A plain Python list or dict of layers is not registered, and its parameters are missing from parameters(). Grouping layers inside containers also improves code readability and structure.
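The effect of registration can be seen by counting parameters. In this sketch, ListNet and ModuleListNet are hypothetical classes that differ only in whether the layers are held in a plain Python list or in nn.ModuleList.

```python
import torch
import torch.nn as nn


class ListNet(nn.Module):  # hypothetical: layers kept in a plain Python list
    def __init__(self):
        super().__init__()
        self.layers = [nn.Linear(4, 4) for _ in range(3)]  # not registered


class ModuleListNet(nn.Module):  # hypothetical: layers kept in nn.ModuleList
    def __init__(self):
        super().__init__()
        self.layers = nn.ModuleList([nn.Linear(4, 4) for _ in range(3)])

    def forward(self, x):
        # ModuleList has no forward of its own; iterate over it explicitly.
        for layer in self.layers:
            x = torch.relu(layer(x))
        return x


n_plain = sum(p.numel() for p in ListNet().parameters())
n_registered = sum(p.numel() for p in ModuleListNet().parameters())
print(n_plain, n_registered)  # 0 60

out = ModuleListNet()(torch.randn(2, 4))
print(out.shape)  # torch.Size([2, 4])
```

Each nn.Linear(4, 4) holds 16 weights and 4 biases, so three registered layers give 60 parameters, while the plain list contributes none and would be silently skipped by an optimizer.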

PyTorch Implementation

We can write a complete custom neural network class and inspect its learnable parameters in PyTorch.

Designing a Custom Network

Here is a complete PyTorch model class showing layer construction and shape comments:

<pre><code class="language-python">import torch
import torch.nn as nn


class Classifier(nn.Module):
    def __init__(self, input_dim, hidden_dim, num_classes):
        super().__init__()
        # Construct layers
        self.network = nn.Sequential(
            nn.Linear(input_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, num_classes),
        )

    def forward(self, x):
        # x shape: [batch_size, input_dim]
        logits = self.network(x)  # [batch_size, num_classes]
        return logits


# Instantiate and test model
model = Classifier(input_dim=10, hidden_dim=20, num_classes=3)
x = torch.randn(5, 10)
out = model(x)
print("Output logits shape:", out.shape)  # torch.Size([5, 3])</code></pre>

In this code, we inherit from nn.Module and define the model architecture using nn.Sequential. The forward pass applies the layers in sequence, mapping an input of shape [batch_size, input_dim] to logits of shape [batch_size, num_classes].

Parameter Inspection

We can inspect the weights and biases of our model using the parameters() or named_parameters() methods. This is useful for debugging weight scales or verifying that gradients are flowing correctly during training:

<pre><code class="language-python">for name, param in model.named_parameters():
    print(f"Parameter: {name} | Shape: {param.shape} | Requires Grad: {param.requires_grad}")</code></pre>

This loop prints the name, shape, and requires_grad flag of each weight and bias in the model. The optimizer updates these parameters during training once they are passed to it, typically via model.parameters().
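To check that gradients actually reach each parameter, one option is to inspect param.grad after a backward pass. The following is a minimal sketch under stated assumptions: it builds a small stand-in model with nn.Sequential, uses random inputs and made-up labels, and picks SGD with an arbitrary learning rate.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Stand-in model with the same layer sizes as the classifier example.
model = nn.Sequential(nn.Linear(10, 20), nn.ReLU(), nn.Linear(20, 3))
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)  # parameters handed to the optimizer

x = torch.randn(5, 10)
targets = torch.randint(0, 3, (5,))  # made-up class labels, for illustration only

# Snapshot the parameters so the update can be verified afterwards.
before = {name: p.detach().clone() for name, p in model.named_parameters()}

optimizer.zero_grad()
loss = F.cross_entropy(model(x), targets)
loss.backward()  # populates param.grad for every parameter that requires grad

grad_norms = {name: p.grad.norm().item() for name, p in model.named_parameters()}
optimizer.step()  # applies the update using the stored gradients

for name, p in model.named_parameters():
    changed = not torch.equal(before[name], p)
    print(f"{name} | grad norm: {grad_norms[name]:.4f} | updated: {changed}")
```

A gradient that is None or a norm that stays at zero across many steps usually points to a layer that is disconnected from the loss or not registered with the model.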