Closed
Labels: bug (Something isn't working)
Describe the bug
The diff_privacy trainer breaks with Opacus 1.1.3, which added a safeguard that checks whether the model's parameters and the optimizer's parameters are consistent.
The issue does not occur with Opacus 1.0.2, which does not have this safeguard.
Training terminates in the second or third round when this check fails:
[INFO][05:54:02]: Training on client #5 failed.
Process Process-1:3:
Traceback (most recent call last):
  File "/opt/homebrew/Caskroom/miniforge/base/envs/m1/lib/python3.9/multiprocessing/process.py", line 315, in _bootstrap
    self.run()
  File "/opt/homebrew/Caskroom/miniforge/base/envs/m1/lib/python3.9/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/bli/Playground/plato/plato/trainers/basic.py", line 143, in train_process
    raise training_exception
  File "/Users/bli/Playground/plato/plato/trainers/basic.py", line 137, in train_process
    self.train_model(config, trainset, sampler.get(), cut_layer,
  File "/Users/bli/Playground/plato/plato/trainers/diff_privacy.py", line 97, in train_model
    self.model, optimizer, train_loader = privacy_engine.make_private_with_epsilon(
  File "/opt/homebrew/Caskroom/miniforge/base/envs/m1/lib/python3.9/site-packages/opacus/privacy_engine.py", line 485, in make_private_with_epsilon
    return self.make_private(
  File "/opt/homebrew/Caskroom/miniforge/base/envs/m1/lib/python3.9/site-packages/opacus/privacy_engine.py", line 374, in make_private
    raise ValueError(
ValueError: Module parameters are different than optimizer Parameters
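For readers unfamiliar with this kind of safeguard, here is a minimal, standard-library-only sketch of a value-based consistency check between a model's parameters and an optimizer's parameter groups. It is not Opacus code: plain lists of floats stand in for tensors, the helper names `flatten_param_groups` and `check_consistency` are hypothetical, and the "stale optimizer" scenario is only one assumed way such a check could fail. The actual cause in the trainer has not been confirmed here.

```python
# Stand-in for a model/optimizer consistency check.
# Lists of floats play the role of tensors; nothing here is Opacus code.

def flatten_param_groups(param_groups):
    """Concatenate the "params" lists of all optimizer parameter groups."""
    flat = []
    for group in param_groups:
        flat.extend(group["params"])
    return flat


def check_consistency(model_params, param_groups):
    """Raise ValueError if the optimizer's parameters differ from the model's."""
    optimizer_params = flatten_param_groups(param_groups)
    if len(model_params) != len(optimizer_params) or any(
        m != o for m, o in zip(model_params, optimizer_params)
    ):
        raise ValueError(
            "Module parameters are different than optimizer Parameters"
        )


# First round: the optimizer is built from the model it will train.
model = [[0.1, 0.2], [0.3]]
optimizer_groups = [{"params": model}]
check_consistency(model, optimizer_groups)  # passes silently

# Later round (hypothetical scenario): the model has received new weights,
# but the optimizer still holds the old parameter values.
updated_model = [[0.5, 0.6], [0.7]]
try:
    check_consistency(updated_model, optimizer_groups)
    failed = False
except ValueError:
    failed = True
```

In this sketch the check passes while the optimizer is built from the current model and fails once the two hold different values, which is the same class of mismatch that the error message above reports.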
To Reproduce
Change the partition size in configs/MNIST/mnist_iid.yml to 2000 (to make the training process run faster), and then run:
./run -c configs/MNIST/fedavg_lenet5_dp.yml
Expected behavior
Training proceeds normally.