Your starting code looks like this:
import torch
import torch.nn as nn

class L2Loss(nn.Module):
    def __init__(self):
        super(L2Loss, self).__init__()

    def forward(self, pred, gt):
        # compute squared differences between prediction and ground truth
        diff = torch.square(pred - gt)
        # a sum would work here too, but mean is easier to interpret
        loss = diff.mean()
        return loss

You have two tasks:
L1Loss

In PyTorch, loss functions are network layers
(nn.Module) like all of the other components of a neural
network. The job of a loss function is to quantitatively compare two
inputs (pred and gt) and produce a single
non-negative number. A result of zero means that pred and gt are identical, and larger numbers mean a greater difference.
PyTorch Modules all have a function called forward. This
is where the programmer specifies the computation that should be
performed when data flows “forwards” through this network layer. (There
is also a notion of backwards flow, to compute gradients, but that is
handled by autodifferentiation. You don’t have to worry about that.)
There are three inputs:
self : Python's reference to this layer. You don't need to use this.
pred : a torch.Tensor with dimensions (N, C, H, W)
gt : a torch.Tensor with dimensions (N, C, H, W)

In our use, we're comparing depth images, which only have one channel. In the starting code, the batch size is 8 images at a time, and the images are 320 pixels wide by 240 pixels tall. So you should be seeing tensors with dimensions (8, 1, 240, 320).
This means that the loss function isn't computing a number for a single prediction: it computes one loss value for an entire batch of depth predictions.
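For instance, a quick sanity check (using random tensors in place of real data) shows that the result is a single scalar for the whole batch:

pred = torch.rand(8, 1, 240, 320)   # stand-in batch of predicted depth maps
gt = torch.rand(8, 1, 240, 320)     # stand-in batch of ground-truth depth maps
loss = L2Loss()(pred, gt)
print(loss.shape)                   # torch.Size([]) -- one scalar for the whole batch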
L1 loss functions are popular for depth prediction. The code looks very similar to the L2 loss. Instead of averaging squared differences, an L1 loss averages absolute differences. Swap in an absolute value function.
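A minimal sketch of that swap, mirroring the structure of the starting code (torch.abs in place of torch.square):

class L1Loss(nn.Module):
    def __init__(self):
        super(L1Loss, self).__init__()

    def forward(self, pred, gt):
        # average the absolute differences instead of the squared ones
        diff = torch.abs(pred - gt)
        loss = diff.mean()
        return loss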
You'll need to write one loss function on your own. What options are out there?
Notice that L1 and L2 losses measure absolute error. If the true depth is 1 meter and the prediction is 1.5 meters, that contributes to the loss in the same way as a true depth of 20 meters and a prediction of 20.5 meters. But the latter sounds like a pretty good prediction, and the former feels off by a lot. In other words, can you write a loss function that measures relative error or percent error?
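One possible formulation (the class name here is only an illustration, not a required design) divides each absolute difference by the true depth. Note that it only ever divides by gt, which the heads-up below says we can trust to be positive:

class RelativeL1Loss(nn.Module):
    def __init__(self):
        super(RelativeL1Loss, self).__init__()

    def forward(self, pred, gt):
        # divide each error by the true depth, so a 0.5 m miss at 20 m
        # counts for much less than a 0.5 m miss at 1 m
        diff = torch.abs(pred - gt) / gt
        return diff.mean()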
Alternatively, some researchers have remarked that depth prediction works better when, instead of predicting depth directly, we predict 1 / depth. How can you write a loss function that measures how far the predictions are from being the correct inverse depth? (One possible shape for such a loss is sketched after the heads-up below.)
Heads up: any formulation that involves computing
1 / pred is sketchy. While we can trust gt to
consist of positive numbers, a poorly-trained neural network could
easily create negative numbers, or even predict depths of zero. Dividing
by zero is bad. It’s better to fix this problem by design (writing loss
functions that don’t ever compute 1 / pred) than by
detecting or modifying negative/zero values when they arise.
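As one illustration of a design that respects this (the class name and exact formulation are an assumption, not the required answer): if the network's output is read as an inverse depth, the loss can compare pred to 1 / gt directly, so only the trusted ground truth ever appears in a denominator:

class InverseDepthLoss(nn.Module):
    def __init__(self):
        super(InverseDepthLoss, self).__init__()

    def forward(self, pred, gt):
        # treat pred as an inverse depth and compare it to 1 / gt;
        # only gt (trusted to be positive) is ever used as a divisor
        diff = torch.abs(pred - 1.0 / gt)
        return diff.mean()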
A good loss function doesn't have any Python for loops. Try to work entirely with torch.Tensor objects and the functions that operate on them, since all of this code will run on a GPU. PyTorch's tensor functions are written to use the GPU as a fast back-end when one is available.
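As a small illustration of the difference (both versions compute the same mean absolute error, assuming pred and gt have the batch shapes described above):

# discouraged: a Python loop over the batch dimension
loss = 0.0
for i in range(pred.shape[0]):
    loss = loss + torch.abs(pred[i] - gt[i]).mean()
loss = loss / pred.shape[0]

# preferred: one vectorized expression over the whole batch
loss = torch.abs(pred - gt).mean()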