meta-pytorch · karthikprasad · May 17, 2022
diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -14,7 +14,7 @@
 ### Bug fixes
 * Fix accountant when using number of steps instead of epochs
 * Add params check when converting BatchNorm to GroupNorm (#390)
-* Fix typo in gdp accountant mechansim name (#386)
+* Fix typo in gdp accountant mechanism name (#386)
 * Fix linter errors (#392)
 * Add friendly and detailed message for unsupported layers (#401)
 * Run linter on nightly workflow (#399)

diff --git a/docs/faq.md b/docs/faq.md
@@ -88,7 +88,7 @@ This statement extends to all downstream uses of this model: its inferences, fin
 
 From the expression above it is obvious that epsilon and delta play different roles: epsilon controls the multiplicative increase in the baseline probability while delta lifts all probabilities by the same amount. For instance, if your baseline scenario (the model trained on *D*′, without your data) assigns 0 probability to some event, the bound on observing this event on *D* (that includes your data) is delta. Because of that, we’d like to target epsilon to be a small constant and select delta to be tiny. A rule of thumb is to set delta to be less than the inverse of the size of the training dataset.
 
-Epsilon and delta are computed *ex post*, following an optimizer run. In fact, for each delta there’s some epsilon, depending on that delta, such that the run satisfies (epsilon, delta)-DP. The call `privacy_engine.accountant.get_privacy_spent(delta=delta)` outputs that epsilon in its first return value.
+Epsilon and delta are computed *ex post*, following an optimizer run. In fact, for each delta there’s some epsilon, depending on that delta, such that the run satisfies (epsilon, delta)-DP. The call `privacy_engine.get_epsilon(delta=delta)` returns that epsilon.
 
 Importantly, (epsilon, delta)-DP is a *conservative upper bound* on the actual privacy loss. There’s [growing](https://arxiv.org/abs/2006.07709) [evidence](https://arxiv.org/pdf/2006.11601.pdf) that the observable privacy loss of the DP-SGD algorithm can be significantly smaller.
 
@@ -110,7 +110,7 @@ Although we report expended privacy budget using the (epsilon, delta) language,
 
 When the privacy engine needs to bound the privacy loss of a training run using (epsilon, delta)-DP for a given delta, it searches for the optimal order from among `alphas`. There’s very little additional cost in expanding the list of orders. We suggest using a list `[1 + x / 10.0 for x in range(1, 100)] + list(range(12, 64))`. You can pass your own alphas by passing `alphas=custom_alphas` when calling `privacy_engine.make_private_with_epsilon`.
 
-A call to `privacy_engine.accountant.get_privacy_spent(delta=delta)` returns a pair: an epsilon such that the training run satisfies (epsilon, delta)-DP and an optimal order alpha. An easy diagnostic to determine whether the list of `alphas` ought to be expanded is whether the returned value alpha is one of the two boundary values of `alphas`.
+A call to `privacy_engine.get_epsilon(delta=delta)` returns a pair: an epsilon such that the training run satisfies (epsilon, delta)-DP and an optimal order alpha. An easy diagnostic to determine whether the list of `alphas` ought to be expanded is whether the returned value alpha is one of the two boundary values of `alphas`.
 
 <!-- ## How do I run Opacus in Colab?
 

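The boundary diagnostic described in the FAQ hunk above can be sketched in a few lines. This is a minimal illustration, not Opacus code: `alpha_on_boundary` is a hypothetical helper, and how the optimal order is obtained from the accountant is left outside the sketch. Only the suggested list of orders is taken from the FAQ text.

```python
from typing import Sequence

# List of orders suggested in the FAQ text.
DEFAULT_ALPHAS = [1 + x / 10.0 for x in range(1, 100)] + list(range(12, 64))


def alpha_on_boundary(best_alpha: float, alphas: Sequence[float]) -> bool:
    """Hypothetical helper: True if the optimal order is the smallest or largest candidate."""
    return best_alpha in (min(alphas), max(alphas))
```

If the helper returns `True`, the search may have been cut off by the edge of the list, which is the signal the FAQ gives for expanding `alphas`.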
diff --git a/opacus/accountants/accountant.py b/opacus/accountants/accountant.py
@@ -13,15 +13,20 @@
 # limitations under the License.
 
 import abc
-from typing import Callable
+from collections import OrderedDict
+from copy import deepcopy
+from typing import Any, Callable, Mapping, TypeVar
 
 from opacus.optimizers import DPOptimizer
 
 
+T_state_dict = TypeVar("T_state_dict", bound=Mapping[str, Any])
+
+
 class IAccountant(abc.ABC):
     @abc.abstractmethod
     def __init__(self):
-        pass
+        self.history = []  # history of noise multiplier, sample rate, and steps
 
     @abc.abstractmethod
     def step(self, *, noise_multiplier: float, sample_rate: float):
@@ -67,7 +72,7 @@ def get_optimizer_hook_fn(
         """
         Returns a callback function which can be used to attach to DPOptimizer
         Args:
-            sample_rate: Expested samping rate used for accounting
+            sample_rate: Expected sampling rate used for accounting
         """
 
         def hook_fn(optim: DPOptimizer):
@@ -80,3 +85,50 @@ def hook_fn(optim: DPOptimizer):
             )
 
         return hook_fn
+
+    def state_dict(self, destination: T_state_dict = None) -> T_state_dict:
+        """
+        Returns a dictionary containing the state of the accountant.
+        Args:
+            destination: a mappable object to populate the current state_dict into.
+                If this arg is None, an OrderedDict is created and populated.
+                Default: None
+        """
+        if destination is None:
+            destination = OrderedDict()
+        destination["history"] = deepcopy(self.history)
+        destination["mechanism"] = self.__class__.mechanism
+        return destination
+
+    def load_state_dict(self, state_dict: T_state_dict):
+        """
+        Validates the supplied state_dict and populates the current
+        Privacy Accountant's state dict.
+
+        Args:
+            state_dict: state_dict to load.
+
+        Raises:
+            ValueError if supplied state_dict is invalid and cannot be loaded.
+        """
+        if state_dict is None or len(state_dict) == 0:
+            raise ValueError(
+                "state dict is either None or empty and hence cannot be loaded"
+                " into Privacy Accountant."
+            )
+        if "history" not in state_dict.keys():
+            raise ValueError(
+                "state_dict does not have the key `history`."
+                " Cannot be loaded into Privacy Accountant."
+            )
+        if "mechanism" not in state_dict.keys():
+            raise ValueError(
+                "state_dict does not have the key `mechanism`."
+                " Cannot be loaded into Privacy Accountant."
+            )
+        if self.__class__.mechanism != state_dict["mechanism"]:
+            raise ValueError(
+                f"state_dict of {state_dict['mechanism']} cannot be loaded into "
+                f"Privacy Accountant with mechanism {self.__class__.mechanism}"
+            )
+        self.history = state_dict["history"]
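The round trip that `state_dict` and `load_state_dict` enable can be sketched without Opacus installed. `ToyAccountant` below is a hypothetical stand-in that mirrors the logic in this hunk; the class name, the `"toy"` mechanism string, the shortened error messages, and the sample history tuple are assumptions made for illustration.

```python
from collections import OrderedDict
from copy import deepcopy


class ToyAccountant:
    """Hypothetical stand-in mirroring the state_dict logic above; not part of Opacus."""

    mechanism = "toy"  # assumed name, for illustration only

    def __init__(self):
        self.history = []  # history of noise multiplier, sample rate, and steps

    def state_dict(self, destination=None):
        if destination is None:
            destination = OrderedDict()
        # deepcopy so later steps on this accountant do not mutate the saved state
        destination["history"] = deepcopy(self.history)
        destination["mechanism"] = self.__class__.mechanism
        return destination

    def load_state_dict(self, state_dict):
        if not state_dict:
            raise ValueError("state_dict is None or empty")
        for key in ("history", "mechanism"):
            if key not in state_dict:
                raise ValueError(f"state_dict does not have the key `{key}`")
        if state_dict["mechanism"] != self.__class__.mechanism:
            raise ValueError("state_dict belongs to a different mechanism")
        self.history = state_dict["history"]


saved = ToyAccountant()
saved.history.append((1.5, 0.04, 100))  # made-up entry

restored = ToyAccountant()
restored.load_state_dict(saved.state_dict())
```

Note the asymmetry: `state_dict` deep-copies the history, while `load_state_dict` assigns it by reference, so the restored accountant shares its list with whatever dict was passed in.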
diff --git a/opacus/accountants/gdp.py b/opacus/accountants/gdp.py
@@ -24,7 +24,7 @@ def __init__(self):
             "GDP accounting is experimental and can underestimate privacy expenditure."
             "Proceed with caution. More details: https://arxiv.org/pdf/2106.02848.pdf"
         )
-        self.history = []  # history of noise multiplier, sample rate, and steps
+        super().__init__()
 
     def step(self, *, noise_multiplier: float, sample_rate: float):
         if len(self.history) >= 1:

diff --git a/opacus/accountants/rdp.py b/opacus/accountants/rdp.py
@@ -22,7 +22,7 @@ class RDPAccountant(IAccountant):
     DEFAULT_ALPHAS = [1 + x / 10.0 for x in range(1, 100)] + list(range(12, 64))
 
     def __init__(self):
-        self.history = []
+        super().__init__()
 
     def step(self, *, noise_multiplier: float, sample_rate: float):
         if len(self.history) >= 1:

diff --git a/opacus/grad_sample/grad_sample_module.py b/opacus/grad_sample/grad_sample_module.py
@@ -93,7 +93,7 @@ def __init__(
                 ``[K, batch_size, ...]``
             loss_reduction: Indicates if the loss reduction (for aggregating the gradients)
                 is a sum or a mean operation. Can take values "sum" or "mean"
-            strict: If set to ``True``, the input module will be validater to check that
+            strict: If set to ``True``, the input module will be validated to check that
                 ``GradSampleModule`` has grad sampler functions for all submodules of
                 the input module (i.e. if it knows how to calculate per sample gradients)
                 for all model parameters. If set to ``False``, per sample gradients will

diff --git a/opacus/optimizers/optimizer.py b/opacus/optimizers/optimizer.py
@@ -280,9 +280,9 @@ def __init__(
         self.generator = generator
         self.secure_mode = secure_mode
 
-        self.param_groups = optimizer.param_groups
+        self.param_groups = self.original_optimizer.param_groups
         self.defaults = self.original_optimizer.defaults
-        self.state = optimizer.state
+        self.state = self.original_optimizer.state
         self._step_skip_queue = []
         self._is_last_step_skipped = False
 

diff --git a/opacus/privacy_engine.py b/opacus/privacy_engine.py
@@ -12,8 +12,9 @@
 # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
 # See the License for the specific language governing permissions and
 # limitations under the License.
+import os
 import warnings
-from typing import List, Optional, Tuple, Union
+from typing import IO, Any, BinaryIO, Dict, List, Optional, Tuple, Union
 
 import torch
 from opacus.accountants import create_accountant
@@ -22,6 +23,7 @@
 from opacus.distributed import DifferentiallyPrivateDistributedDataParallel as DPDDP
 from opacus.grad_sample.grad_sample_module import GradSampleModule
 from opacus.optimizers import DPOptimizer, get_optimizer_class
+from opacus.scheduler import _NoiseScheduler
 from opacus.validators.module_validator import ModuleValidator
 from torch import nn, optim
 from torch.nn.parallel import DistributedDataParallel as DDP
@@ -493,3 +495,68 @@ def get_epsilon(self, delta):
             Privacy budget (epsilon) expended so far.
         """
         return self.accountant.get_epsilon(delta)
+
+    def save_checkpoint(
+        self,
+        *,
+        path: Union[str, os.PathLike, BinaryIO, IO[bytes]],
+        module: GradSampleModule,
+        optimizer: Optional[DPOptimizer] = None,
+        noise_scheduler: Optional[_NoiseScheduler] = None,
+        checkpoint_dict: Optional[Dict[str, Any]] = None,
+        module_state_dict_kwargs: Optional[Dict[str, Any]] = None,
+        torch_save_kwargs: Optional[Dict[str, Any]] = None,
+    ):
+        """
+        Saves the state_dict of module, optimizer, and accountant at path.
+        Args:
+            path: Path to save the state dict objects.
+            module: GradSampleModule to save; wrapped module's state_dict is saved.
+            optimizer: DPOptimizer to save; wrapped optimizer's state_dict is saved.
+            module_state_dict_kwargs: dict of kwargs to pass to ``module.state_dict()``
+            torch_save_kwargs: dict of kwargs to pass to ``torch.save()``
+
+        """
+        checkpoint_dict = checkpoint_dict or {}
+        checkpoint_dict["module_state_dict"] = module.state_dict(
+            **(module_state_dict_kwargs or {})
+        )
+        checkpoint_dict["privacy_accountant_state_dict"] = self.accountant.state_dict()
+        if optimizer is not None:
+            checkpoint_dict["optimizer_state_dict"] = optimizer.state_dict()
+        if noise_scheduler is not None:
+            checkpoint_dict["noise_scheduler_state_dict"] = noise_scheduler.state_dict()
+
+        torch.save(checkpoint_dict, path, **(torch_save_kwargs or {}))
+
+    def load_checkpoint(
+        self,
+        *,
+        path: Union[str, os.PathLike, BinaryIO, IO[bytes]],
+        module: GradSampleModule,
+        optimizer: Optional[DPOptimizer] = None,
+        noise_scheduler: Optional[_NoiseScheduler] = None,
+        module_load_dict_kwargs: Optional[Dict[str, Any]] = None,
+        torch_load_kwargs: Optional[Dict[str, Any]] = None,
+    ) -> Dict:
+        checkpoint = torch.load(path, **(torch_load_kwargs or {}))
+        module.load_state_dict(
+            checkpoint["module_state_dict"], **(module_load_dict_kwargs or {})
+        )
+        self.accountant.load_state_dict(checkpoint["privacy_accountant_state_dict"])
+
+        optimizer_state_dict = checkpoint.pop("optimizer_state_dict", {})
+        if optimizer is not None and len(optimizer_state_dict) > 0:
+            optimizer.load_state_dict(optimizer_state_dict)
+        elif (optimizer is not None) ^ (len(optimizer_state_dict) > 0):
+            # warn if only one of them is available
+            warnings.warn(
+                f"optimizer_state_dict has {len(optimizer_state_dict)} items"
+                f" but optimizer is {'' if optimizer else 'not '}provided."
+            )
+
+        noise_scheduler_state_dict = checkpoint.pop("noise_scheduler_state_dict", {})
+        if noise_scheduler is not None and len(noise_scheduler_state_dict) > 0:
+            noise_scheduler.load_state_dict(noise_scheduler_state_dict)
+
+        return checkpoint
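The optimizer branch of `load_checkpoint` has three outcomes, which the following sketch makes explicit. `optimizer_load_action` is a hypothetical helper written only to illustrate the control flow above; it takes plain values instead of a real optimizer and checkpoint.

```python
def optimizer_load_action(optimizer_provided: bool, saved_items: int) -> str:
    """Hypothetical helper mirroring the if/elif in load_checkpoint."""
    if optimizer_provided and saved_items > 0:
        return "load"  # both present: optimizer.load_state_dict(...) is called
    elif optimizer_provided ^ (saved_items > 0):
        return "warn"  # exactly one present: a warning is emitted
    return "skip"      # neither present: nothing happens
```

The noise scheduler branch in the hunk has no `elif`, so a mismatch there is skipped silently rather than warned about.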
diff --git a/opacus/tests/accountants_test.py b/opacus/tests/accountants_test.py
@@ -119,3 +119,59 @@ def test_get_noise_multiplier_gdp(self):
         )
 
         self.assertAlmostEqual(noise_multiplier, 1.3232421875)
+
+    def test_accountant_state_dict(self):
+        noise_multiplier = 1.5
+        sample_rate = 0.04
+        steps = int(90 / 0.04)
+
+        accountant = RDPAccountant()
+        for _ in range(steps):
+            accountant.step(noise_multiplier=noise_multiplier, sample_rate=sample_rate)
+
+        dummy_dest = {"dummy_k": "dummy_v"}
+        # history should be equal but not the same instance
+        self.assertEqual(accountant.state_dict()["history"], accountant.history)
+        self.assertFalse(accountant.state_dict()["history"] is accountant.history)
+        # mechanism populated to supplied dict
+        self.assertEqual(
+            accountant.state_dict(dummy_dest)["mechanism"], accountant.mechanism
+        )
+        # existing values in supplied dict unchanged
+        self.assertEqual(
+            accountant.state_dict(dummy_dest)["dummy_k"], dummy_dest["dummy_k"]
+        )
+
+    def test_accountant_load_state_dict(self):
+        noise_multiplier = 1.5
+        sample_rate = 0.04
+        steps = int(90 / 0.04)
+
+        accountant = RDPAccountant()
+        for _ in range(steps - 1000):
+            accountant.step(noise_multiplier=noise_multiplier, sample_rate=sample_rate)
+
+        new_rdp_accountant = RDPAccountant()
+        new_gdp_accountant = GaussianAccountant()
+        # check corner cases
+        with self.assertRaises(ValueError):
+            new_rdp_accountant.load_state_dict({})
+        with self.assertRaises(ValueError):
+            new_rdp_accountant.load_state_dict({"1": 2})
+        with self.assertRaises(ValueError):
+            new_rdp_accountant.load_state_dict({"history": []})
+        with self.assertRaises(ValueError):
+            new_gdp_accountant.load_state_dict(accountant.state_dict())
+        # check loading logic
+        self.assertNotEqual(new_rdp_accountant.state_dict(), accountant.state_dict())
+        new_rdp_accountant.load_state_dict(accountant.state_dict())
+        self.assertEqual(new_rdp_accountant.state_dict(), accountant.state_dict())
+
+        # ensure correct output after completion
+        for _ in range(steps - 1000, steps):
+            new_rdp_accountant.step(
+                noise_multiplier=noise_multiplier, sample_rate=sample_rate
+            )
+
+        epsilon = new_rdp_accountant.get_epsilon(delta=1e-5)
+        self.assertAlmostEqual(epsilon, 7.32911117143)
diff --git a/opacus/tests/grad_sample_module_test.py b/opacus/tests/grad_sample_module_test.py
@@ -228,3 +228,24 @@ def test_submodule_access(self):
 
         with self.assertRaises(AttributeError):
             _ = self.grad_sample_module.fc3
+
+    def test_state_dict(self):
+        gs_state_dict = self.grad_sample_module.state_dict()
+        og_state_dict = self.original_model.state_dict()
+        # check wrapped module state dict
+        for key in og_state_dict.keys():
+            self.assertTrue(f"_module.{key}" in gs_state_dict)
+            assert_allclose(og_state_dict[key], gs_state_dict[f"_module.{key}"])
+
+    def test_load_state_dict(self):
+        gs_state_dict = self.grad_sample_module.state_dict()
+        new_gs = GradSampleModule(
+            SampleConvNet(), batch_first=False, loss_reduction="mean"
+        )
+        new_gs.load_state_dict(gs_state_dict)
+        # wrapped module is the same
+        for key in self.original_model.state_dict().keys():
+            self.assertTrue(key in new_gs._module.state_dict())
+            assert_allclose(
+                self.original_model.state_dict()[key], new_gs._module.state_dict()[key]
+            )