
Commit 7555297

aparna-aketi authored and facebook-github-bot committed
Remove presence of grad_sample from optimizer for FGC (#756)
Summary:
Pull Request resolved: #756

In case of FGC, grad_sample is set to None in the backward hook after computing the per-layer norm. There is no need to set p.grad_sample to None in the optimizer.

Reviewed By: EnayatUllah

Differential Revision: D74418221

fbshipit-source-id: 0f91288e0839d35887e5ec6add36fc3baf89dd85
1 parent 4acea9f commit 7555297
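
The summary above hinges on the fact that, with Fast Gradient Clipping, per-sample gradients exist only transiently inside the backward hook. The following is a minimal, self-contained sketch of that flow; the function names and the norm computation are illustrative assumptions, not Opacus APIs:

import torch

def per_sample_norm(grad_sample: torch.Tensor) -> torch.Tensor:
    # Flatten all but the batch dimension and take an L2 norm per sample
    # (illustrative; the real per-layer norm computation lives in Opacus).
    return grad_sample.flatten(start_dim=1).norm(2, dim=1)

def capture_backprops_hook_sketch(p: torch.nn.Parameter) -> None:
    # The hook reduces per-sample gradients to a per-layer norm ...
    p._norm_sample = per_sample_norm(p.grad_sample)
    # ... and releases them immediately, so p.grad_sample is already None
    # by the time the optimizer runs; zero_grad need not touch it.
    p.grad_sample = None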

File tree

2 files changed: +4 −6 lines


opacus/grad_sample/grad_sample_module_fast_gradient_clipping.py

Lines changed: 1 addition & 1 deletion

@@ -240,7 +240,7 @@ def capture_backprops_hook(
                 grad_sample=p.grad_sample,
                 max_batch_len=module.max_batch_len,
             )
-            del p.grad_sample
+            p.grad_sample = None
         if len(module.activations) == 0:
             if hasattr(module, "max_batch_len"):
                 del module.max_batch_len
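
A note on the replacement itself: assigning None keeps the grad_sample attribute defined on the parameter, whereas del removes it entirely. A small illustration of the difference follows; the motivation is an inference, not something stated in the commit message:

import torch

p = torch.nn.Parameter(torch.zeros(4, 3))
p.grad_sample = torch.zeros(2, 4, 3)

# After `p.grad_sample = None`, later code can still test the attribute cheaply:
p.grad_sample = None
assert p.grad_sample is None

# After `del p.grad_sample`, the same test would raise AttributeError
# unless guarded with hasattr/getattr.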

opacus/optimizers/optimizer_fast_gradient_clipping.py

Lines changed: 3 additions & 5 deletions

@@ -123,10 +123,10 @@ def zero_grad(self, set_to_none: bool = False):
         """
         Clear gradients.
 
-        Clears ``p.grad``, ``p.grad_sample`` and ``p.summed_grad`` for all of it's parameters
+        Clears ``p.grad`` and ``p.summed_grad`` for all of it's parameters
 
         Notes:
-            ``set_to_none`` argument only affects ``p.grad``. ``p.grad_sample`` and
+            ``set_to_none`` argument only affects ``p.grad`` and
             ``p.summed_grad`` is never zeroed out and always set to None.
             Normal grads can do this, because their shape is always the same.
             Grad samples do not behave like this, as we accumulate gradients from different
@@ -140,13 +140,11 @@ def zero_grad(self, set_to_none: bool = False):
         if set_to_none is False:
             logger.debug(
                 "Despite set_to_none is set to False, "
-                "opacus will set p.grad_sample and p.summed_grad to None due to "
+                "opacus will set p.summed_grad to None due to "
                 "non-trivial gradient accumulation behaviour"
             )
 
         for p in self.params:
-            p.grad_sample = None
-
             if not self._is_last_step_skipped:
                 p.summed_grad = None
         self.original_optimizer.zero_grad(set_to_none)
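
The docstring's rationale for always dropping the accumulated state rather than zeroing it in place can be seen from shapes alone. A short illustration, with arbitrary batch sizes chosen purely for demonstration:

import torch

weight_shape = (4, 3)

# p.grad always has the parameter's own shape, so zeroing it in place is well defined:
grad = torch.zeros(weight_shape)

# Per-sample gradients carry a leading batch dimension that can differ between
# accumulation steps (e.g. a final partial batch), so there is no single fixed
# shape to zero them to; resetting the attribute to None is the safe reset.
grad_sample_step1 = torch.randn(16, *weight_shape)  # full batch of 16
grad_sample_step2 = torch.randn(9, *weight_shape)   # partial batch of 9
print(grad_sample_step1.shape, grad_sample_step2.shape)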
