Reproducibility issue with PyTorch
I'm running a script with a fixed seed. The results are reproduced on consecutive runs, but running the same script with the same seed a few days later produces different output, so I'm only getting short-term reproducibility, which is weird. For reproducibility my script already includes the following statements:
torch.backends.cudnn.benchmark = False
torch.backends.cudnn.deterministic = True
torch.use_deterministic_algorithms(True)
random.seed(args.seed)
np.random.seed(args.seed)
torch.manual_seed(args.seed)
I also checked that the sequence of instance ids produced by the RandomSampler for the train DataLoader is the same across runs, and I set num_workers=0 in the DataLoader. What could be causing the output to change?
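For reference, the DataLoader seeding pattern recommended in the PyTorch randomness notes looks roughly like this (a minimal sketch, not my actual script; the dataset and seed value are placeholders):

```python
import random
import numpy as np
import torch
from torch.utils.data import DataLoader, TensorDataset

def seed_worker(worker_id):
    # Re-seed NumPy's and Python's RNGs inside each worker process.
    worker_seed = torch.initial_seed() % 2**32
    np.random.seed(worker_seed)
    random.seed(worker_seed)

g = torch.Generator()
g.manual_seed(0)  # placeholder seed

dataset = TensorDataset(torch.arange(10).float())  # toy placeholder data
loader = DataLoader(dataset, batch_size=2, shuffle=True,
                    num_workers=0, worker_init_fn=seed_worker, generator=g)
```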
Solution 1:[1]
PyTorch is not fully deterministic by default. Even with a fixed seed, some PyTorch operations can behave differently and diverge from previous runs, because of nondeterministic algorithm selection, CUDA kernels, and backward-pass implementations.
This is a good read: https://pytorch.org/docs/stable/notes/randomness.html
The page above lists which operations are nondeterministic. Forcing deterministic behavior generally costs performance, but it can be done with:
torch.use_deterministic_algorithms(True)
Note that this also restricts which operations can be used: an operation without a deterministic implementation will raise a RuntimeError when called.
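As a minimal sketch of the deterministic setup (the environment variable is documented in the randomness notes for cuBLAS on CUDA 10.2+, and must be set before any CUDA work runs):

```python
import os
import torch

# Required for deterministic cuBLAS on CUDA >= 10.2; ":16:8" is the
# other value the PyTorch docs allow. Set it before CUDA initializes.
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

# Force deterministic implementations everywhere. Operations without a
# deterministic implementation now raise RuntimeError instead of
# silently diverging between runs.
torch.use_deterministic_algorithms(True)

torch.manual_seed(0)
x = torch.randn(8, 8)
print(x.sum())  # prints the same value on every run with this seed
```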
Solution 2:[2]
torch.cuda.manual_seed(args.seed)
torch.cuda.manual_seed_all(args.seed)
Try adding these to your current reproducibility settings. They seed the RNG on the current CUDA device and on all CUDA devices (for multi-GPU runs), respectively.
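Putting the question's flags and these two calls together, a single helper might look like this (a sketch; seed_everything is an illustrative name, not a PyTorch API):

```python
import random
import numpy as np
import torch

def seed_everything(seed: int) -> None:
    # Seed every RNG a typical training run touches.
    random.seed(seed)                  # Python's built-in RNG
    np.random.seed(seed)               # NumPy's global RNG
    torch.manual_seed(seed)            # PyTorch CPU RNG
    torch.cuda.manual_seed(seed)       # current CUDA device (no-op without CUDA)
    torch.cuda.manual_seed_all(seed)   # every CUDA device, for multi-GPU runs
    torch.backends.cudnn.benchmark = False
    torch.backends.cudnn.deterministic = True

seed_everything(42)
```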
Sources
This article follows the attribution requirements of Stack Overflow and is licensed under CC BY-SA 3.0.
Source: Stack Overflow
| Solution | Source |
|---|---|
| Solution 1 | Zoom |
| Solution 2 | starriet |