🚀 Feature
Add support for a custom shifted window attention layer to enable efficient training of Swin Transformers with DP-SGD in Opacus. This enhancement would complement the existing multi-head attention support, extending Opacus' capabilities to handle modern vision transformer architectures.
Motivation
The Swin Transformer architecture is widely used for its hierarchical design and computational efficiency, achieved through shifted window attention. However, there is currently no native support for window attention layers in Opacus, which restricts the ability to train Swin Transformers with differential privacy guarantees.
Implementing this feature would enable privacy-preserving training on sensitive user data, such as medical images, personal photo collections, or other proprietary datasets.
Pitch
The proposed implementation will:
- Add support for a custom shifted window attention layer in Opacus, tailored for Swin Transformers.
- Ensure compatibility with existing DP-SGD mechanisms in Opacus.
- Leverage Opacus' modular design to integrate seamlessly with PyTorch and existing vision transformer workflows.
This feature would provide a robust, privacy-preserving training pipeline for Swin Transformers, expanding their usability in sensitive applications.
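As a rough illustration of the intended extension point (not an implementation of the actual window-attention math), Opacus already exposes a `register_grad_sampler` hook for supplying per-sample gradient computations for custom layers. The `PerChannelScale` layer and its sampler below are hypothetical stand-ins; a real sampler for `ShiftedWindowAttention` would follow the same pattern but would additionally need to handle window partitioning, the cyclic shift, and the relative position bias table:

```python
import torch
import torch.nn as nn
from opacus.grad_sample import register_grad_sampler


class PerChannelScale(nn.Module):
    """Hypothetical toy layer standing in for a custom attention component."""

    def __init__(self, dim: int):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(dim))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return x * self.weight


@register_grad_sampler(PerChannelScale)
def per_channel_scale_grad_sample(layer, activations, backprops):
    # Recent Opacus versions pass activations as a list of input tensors.
    x = activations[0] if isinstance(activations, (list, tuple)) else activations
    # Per-sample gradient of y = x * weight: sum x * dL/dy over every
    # non-batch dimension except the channel dimension.
    grad_sample = torch.einsum("n...c,n...c->nc", x, backprops)
    return {layer.weight: grad_sample}
```

A sampler registered this way is picked up automatically when the model is wrapped by Opacus, so the same mechanism should let a shifted window attention layer work with the existing per-sample clipping and noise machinery.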
Alternatives
- Using existing support for multi-head attention in Opacus and adapting it for window attention layers. However, this approach may lead to inefficiencies and require significant manual customization.
- Implementing a standalone DP-SGD pipeline for Swin Transformers outside of Opacus. While feasible, it would lack the integration and optimizations offered by Opacus.
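For context, here is a sketch of what the Opacus-integrated path would look like: the standard `PrivacyEngine.make_private` flow applied to torchvision's Swin-T, with random placeholder data standing in for a real dataset. With current Opacus releases this step is expected to fail (or fall back to a slower generic path) because `ShiftedWindowAttention` has no registered per-sample gradient computation; the proposed feature would make exactly this flow work out of the box:

```python
import torch
import torchvision
from torch.utils.data import DataLoader, TensorDataset
from opacus import PrivacyEngine

# Random placeholder data standing in for a real (sensitive) image dataset.
dataset = TensorDataset(
    torch.randn(8, 3, 224, 224), torch.randint(0, 10, (8,))
)
train_loader = DataLoader(dataset, batch_size=4)

model = torchvision.models.swin_t(num_classes=10)
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)

# Standard Opacus DP-SGD setup; expected to work once the shifted window
# attention layer is supported.
privacy_engine = PrivacyEngine()
model, optimizer, train_loader = privacy_engine.make_private(
    module=model,
    optimizer=optimizer,
    data_loader=train_loader,
    noise_multiplier=1.0,
    max_grad_norm=1.0,
)
```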
Additional context
I would like to fix this issue, as it would be helpful for part of my research project at my university.