Weight tensor should be defined either for all 1000 classes or no classes but got weight tensor of shape: [5]

I'm trying to use VGG16 on a **5-class dataset**. I've already added 5 new layers so that the model outputs 5 logits.

model = models.vgg16(pretrained=True) #Downloads the vgg16 model which is pretrained on Imagenet dataset.

#Replace the Final layer of pretrained vgg16 with 5 new layers.
model.fc = nn.Sequential(nn.Linear(1000,512),
                         nn.ReLU(inplace=True),
                         nn.Linear(512,256),
                         nn.ReLU(inplace=True),
                         nn.Linear(256,128),
                         nn.ReLU(inplace=True),
                         nn.Linear(128,64),
                         nn.ReLU(inplace=True),
                         nn.Linear(64,5),
                    )

My loss function is as follows:

loss_fn   = nn.CrossEntropyLoss(weight=class_weights) #CrossEntropyLoss with class_weights.

where class_weights is defined as follows:

from sklearn.utils import class_weight #For calculating weights for each class.
class_weights = class_weight.compute_class_weight(class_weight='balanced',classes=np.array([0,1,2,3,4]),y=train_df['level'].values)
class_weights = torch.tensor(class_weights,dtype=torch.float).to(device)
 
print(class_weights) #Prints the calculated weights for the classes.

output: tensor([0.2556, 4.6000, 1.5333, 9.2000, 9.2000], device='cuda:0')
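For reference, scikit-learn's 'balanced' mode computes each weight as n_samples / (n_classes * count_of_that_class). The sketch below reimplements that formula in plain Python. The class counts are hypothetical, chosen only because they reproduce the weights printed above; they are not taken from the actual train_df.

```python
from collections import Counter

def balanced_class_weights(labels, classes):
    """Return n_samples / (n_classes * count) for each class, in order."""
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(classes)
    return [n_samples / (n_classes * counts[c]) for c in classes]

# Hypothetical label distribution (assumption): 46 samples over 5 classes.
labels = [0] * 36 + [1] * 2 + [2] * 6 + [3] * 1 + [4] * 1
weights = balanced_class_weights(labels, classes=[0, 1, 2, 3, 4])
print([round(w, 4) for w in weights])  # [0.2556, 4.6, 1.5333, 9.2, 9.2]
```

The result always has one entry per class, so with five classes the weight tensor has shape [5] no matter how many outputs the model produces.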

After the first epoch I get the error below.

---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
Input In [15], in <cell line: 5>()
      3 nb_epochs = 3
      4 #Call the optimize function.
----> 5 train_losses, valid_losses = optimize(train_dataloader,valid_dataloader,model,loss_fn,optimizer,nb_epochs)

Input In [14], in optimize(train_dataloader, valid_dataloader, model, loss_fn, optimizer, nb_epochs)
     21 print(f'\nEpoch {epoch+1}/{nb_epochs}')
     22 print('-------------------------------')
---> 23 train_loss = train(train_dataloader,model,loss_fn,optimizer, epoch) #Calls the train function.
     24 train_losses.append(train_loss)
     25 valid_loss = validate(valid_dataloader,model,loss_fn) #Calls the validate function.

Input In [12], in train(dataloader, model, loss_fn, optimizer, epoch)
     24 for batch,(x,y) in enumerate(dataloader): #Iterates through the batches.
     26     output = model(x.to(device)) #model's predictions.
---> 27     loss   = loss_fn(output,y.to(device)) #loss calculation.
     29     running_loss += loss.item()
     31     total        += y.size(0)

File ~/anaconda3/envs/Ammar/lib/python3.9/site-packages/torch/nn/modules/module.py:1113, in Module._call_impl(self, *input, **kwargs)
   1109 # If we don't have any hooks, we want to skip the rest of the logic in
   1110 # this function, and just call forward.
   1111 if not (self._backward_hooks or self._forward_hooks or self._forward_pre_hooks or _global_backward_hooks
   1112         or _global_forward_hooks or _global_forward_pre_hooks):
-> 1113     return forward_call(*input, **kwargs)
   1114 # Do not call functions when jit is used
   1115 full_backward_hooks, non_full_backward_hooks = [], []

File ~/anaconda3/envs/Ammar/lib/python3.9/site-packages/torch/nn/modules/loss.py:1163, in CrossEntropyLoss.forward(self, input, target)
   1162 def forward(self, input: Tensor, target: Tensor) -> Tensor:
-> 1163     return F.cross_entropy(input, target, weight=self.weight,
   1164                            ignore_index=self.ignore_index, reduction=self.reduction,
   1165                            label_smoothing=self.label_smoothing)

File ~/anaconda3/envs/Ammar/lib/python3.9/site-packages/torch/nn/functional.py:2961, in cross_entropy(input, target, weight, size_average, ignore_index, reduce, reduction, label_smoothing)
   2959 if size_average is not None or reduce is not None:
   2960     reduction = _Reduction.legacy_get_string(size_average, reduce)
-> 2961 return torch._C._nn.cross_entropy_loss(input, target, weight, _Reduction.get_enum(reduction), ignore_index, label_smoothing)

RuntimeError: weight tensor should be defined either for all 1000 classes or no classes but got weight tensor of shape: [5]


Solution 1:[1]

I faced the same problem as you. I started by changing the size of my final classifier layer (I copied the code from here):

import torch.nn as nn
from torchvision import models

model = models.mobilenet_v2(pretrained=True)

# Swap the last classifier layer for one with 129 outputs (my number of classes).
old_fc = model.classifier[-1]
new_fc = nn.Linear(in_features=old_fc.in_features,
                   out_features=129, bias=True)
model.classifier[-1] = new_fc

After changing this, I printed the model architecture using the following code:

from torchsummary import summary
summary(model, (3, 224, 224))

And it's working (number of classes in my dataset is 129):

(classifier): Sequential(
  (0): Dropout(p=0.2, inplace=False)
  (1): Linear(in_features=1280, out_features=129, bias=True)
)

Solution 2:[2]

According to the error message, your model produces 1000 outputs, but you only defined five class weights. Print the model with print(model) to check whether its last layer really has 5 outputs. Most likely model.fc has no effect: torchvision's VGG16 has no fc attribute, and its forward pass ends in model.classifier, so the layers assigned to model.fc are never called.

Sources

This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.

Source: Stack Overflow

Solution Source
Solution 1 Amine Sehaba
Solution 2 ki-ljl