Support layers with a mix of frozen and learnable parameters #437
@karthikprasad has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.
@karthikprasad has updated the pull request. You must reimport the pull request before landing.
ffuuugor
left a comment
> (5.7 GB --> 27MB)
I now bestow upon you the proud title of "The Chief Metric Gaming Officer". That's a beautiful experiment design.
But in all seriousness, great change, thanks a lot!
Having to do that in every grad sampler is not ideal, but I can't see any other option either.
Types of changes
Motivation and Context / Related issue
#436
How Has This Been Tested (if it applies)
[WIP] unit tests
In addition to fixing the issue, this change also yields large memory savings. See the CUDA memory usage before and after this change in https://colab.research.google.com/drive/1jYpZ2uVz2UXCoj2v0Vf30uqWO2oAl3de?usp=sharing (5.7 GB --> 27 MB).
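To illustrate why skipping frozen parameters can save memory, here is a minimal, framework-free sketch of a grad sampler for a linear layer. It is not the code in this PR: `Param` and `linear_grad_sample` are hypothetical names, and plain Python lists stand in for tensors. The idea it shows is that per-sample gradients are only materialized for parameters whose `requires_grad` flag is set.

```python
from dataclasses import dataclass
from typing import Dict, List

Matrix = List[List[float]]


@dataclass
class Param:
    """Hypothetical stand-in for a framework parameter (assumption)."""
    name: str
    requires_grad: bool = True


def linear_grad_sample(
    weight: Param,
    bias: Param,
    activations: Matrix,  # shape (batch, in_features)
    backprops: Matrix,    # shape (batch, out_features)
) -> Dict[str, list]:
    """Per-sample gradients of a linear layer, skipping frozen parameters."""
    grad_samples: Dict[str, list] = {}
    if weight.requires_grad:
        # One (out_features x in_features) outer product per sample.
        grad_samples[weight.name] = [
            [[b * a for a in act] for b in bp]
            for act, bp in zip(activations, backprops)
        ]
    if bias.requires_grad:
        # The per-sample bias gradient is the backprop itself.
        grad_samples[bias.name] = [list(bp) for bp in backprops]
    return grad_samples


# Usage: freeze the weight and train only the bias.
activations = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
backprops = [[0.5, -1.0], [2.0, 0.0]]
result = linear_grad_sample(
    Param("weight", requires_grad=False),
    Param("bias"),
    activations,
    backprops,
)
```

With the weight frozen, `result` holds only the bias entry, so the `batch x out_features x in_features` weight gradients are never allocated. The check has to be repeated per parameter inside each layer's grad sampler, which is the trade-off noted in the review.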
Checklist