
Conversation

@lqf96 lqf96 commented Dec 18, 2020

Rationale

While most of the torch.Generator properties and methods are implemented as thin wrappers around the corresponding at::Generator methods, torch.Generator.get_state() and torch.Generator.set_state() are implemented in legacy Torch code and are not dispatched through the c10::GeneratorImpl interface. This is not well structured and makes implementing generators for new backends (e.g. XLAGeneratorImpl for the XLA backend) inconvenient. As such, this pull request seeks to move these generator state APIs to c10 and ATen.

What is being refactored?

  • Interfaces
    • Added c10::GeneratorImpl::set_state and c10::GeneratorImpl::state for getting and setting the internal state of a random number generator.
    • at::Generator::set_state and at::Generator::state wrap the above-mentioned APIs, since at::Generator is essentially a PIMPL around c10::GeneratorImpl.
    • Added helper function at::detail::check_rng_state for checking the validity of a new RNG state tensor. (A standalone sketch of the resulting structure follows this list.)
  • CPU Generator
    • Renamed and moved THTensor_(setRNGState) and THTensor_(getRNGState) to CPUGeneratorImpl::set_state and CPUGeneratorImpl::state.
    • Renamed and moved THGeneratorState and THGeneratorStateNew to CPUGeneratorStateLegacy and CPUGeneratorState.
  • CUDA Generator
    • Renamed and moved THCRandom_setRNGState and THCRandom_getRNGState to CUDAGeneratorImpl::set_state and CUDAGeneratorImpl::state.
  • PyTorch Bindings
    • THPGenerator_setState and THPGenerator_getState now simply forward to at::Generator::set_state and at::Generator::state.
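
To make the intended structure concrete, below is a minimal standalone sketch of the dispatch pattern described above. It uses a plain byte buffer in place of the real c10::TensorImpl / at::Tensor state tensors, and the names (StateBuffer, GeneratorImpl, CpuGeneratorImpl, Generator, check_rng_state) are simplified stand-ins rather than the actual PyTorch declarations.

```cpp
#include <algorithm>
#include <cstdint>
#include <memory>
#include <stdexcept>
#include <vector>

// Stand-in for the RNG state tensor (the real code passes c10::TensorImpl /
// at::Tensor); everything below is illustrative, not the actual declarations.
using StateBuffer = std::vector<uint8_t>;

// Illustrative counterpart of at::detail::check_rng_state: reject state
// buffers whose size does not match what the generator expects.
inline void check_rng_state(const StateBuffer& state, size_t expected_size) {
  if (state.size() != expected_size) {
    throw std::runtime_error("RNG state has the wrong size");
  }
}

// Counterpart of c10::GeneratorImpl: each backend overrides these two
// virtuals, so state access dispatches through the same interface as the
// other generator methods.
struct GeneratorImpl {
  virtual ~GeneratorImpl() = default;
  virtual void set_state(const StateBuffer& new_state) = 0;
  virtual StateBuffer state() const = 0;
};

// Counterpart of CPUGeneratorImpl; a CUDA or XLA generator would override
// the same two methods with its own serialization format.
class CpuGeneratorImpl final : public GeneratorImpl {
 public:
  void set_state(const StateBuffer& new_state) override {
    check_rng_state(new_state, sizeof(seed_));
    std::copy(new_state.begin(), new_state.end(),
              reinterpret_cast<uint8_t*>(&seed_));
  }
  StateBuffer state() const override {
    const auto* begin = reinterpret_cast<const uint8_t*>(&seed_);
    return StateBuffer(begin, begin + sizeof(seed_));
  }

 private:
  uint64_t seed_ = 0;  // toy internal state for the sketch
};

// Counterpart of at::Generator: a thin PIMPL wrapper that forwards to
// whichever GeneratorImpl it holds; the Python bindings would in turn
// forward torch.Generator.get_state()/set_state() here.
class Generator {
 public:
  explicit Generator(std::shared_ptr<GeneratorImpl> impl)
      : impl_(std::move(impl)) {}
  void set_state(const StateBuffer& new_state) { impl_->set_state(new_state); }
  StateBuffer state() const { return impl_->state(); }

 private:
  std::shared_ptr<GeneratorImpl> impl_;
};

int main() {
  Generator gen(std::make_shared<CpuGeneratorImpl>());
  StateBuffer saved = gen.state();  // analogous to torch.Generator.get_state()
  gen.set_state(saved);             // analogous to torch.Generator.set_state()
  return 0;
}
```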

facebook-github-bot commented Dec 18, 2020

💊 CI failures summary and remediations

As of commit 93decdd (more details on the Dr. CI page):


  • 1/1 failures introduced in this PR

🕵️ 1 new failure recognized by patterns

The following CI failures do not appear to be due to upstream breakages:

See CircleCI build pytorch_linux_bionic_py3_8_gcc9_coverage_test1 (1/1)

Step: "Run tests" (full log | diagnosis details | 🔁 rerun)

Jan 06 11:20:48 [E request_callback_no_python.cpp:636] Received error while processing request type 258: RuntimeError: Can not pickle torch.futures.Future
Jan 06 11:20:48 At:
Jan 06 11:20:48   /opt/conda/lib/python3.8/site-packages/torch/distributed/rpc/internal.py(120): serialize
Jan 06 11:20:48   /opt/conda/lib/python3.8/site-packages/torch/distributed/rpc/internal.py(172): serialize
Jan 06 11:20:48 
Jan 06 11:20:48 [E request_callback_no_python.cpp:636] Received error while processing request type 258: RuntimeError: Can not pickle torch.futures.Future
Jan 06 11:20:48 
Jan 06 11:20:48 At:
Jan 06 11:20:48   /opt/conda/lib/python3.8/site-packages/torch/distributed/rpc/internal.py(120): serialize
Jan 06 11:20:48   /opt/conda/lib/python3.8/site-packages/torch/distributed/rpc/internal.py(172): serialize
Jan 06 11:20:48 
Jan 06 11:20:48 [E request_callback_no_python.cpp:636] Received error while processing request type 258: RuntimeError: Can not pickle torch.futures.Future
Jan 06 11:20:48 
Jan 06 11:20:48 At:
Jan 06 11:20:48   /opt/conda/lib/python3.8/site-packages/torch/distributed/rpc/internal.py(120): serialize
Jan 06 11:20:48   /opt/conda/lib/python3.8/site-packages/torch/distributed/rpc/internal.py(172): serialize
Jan 06 11:20:48 
Jan 06 11:20:48 [W tensorpipe_agent.cpp:547] RPC agent for worker2 encountered error when reading incoming request from worker1: EOF: end of file (this is expected to happen during shutdown)
Jan 06 11:20:48 [W tensorpipe_agent.cpp:547] RPC agent for worker0 encountered error when reading incoming request from worker1: EOF: end of file (this is expected to happen during shutdown)
Jan 06 11:20:48 [W tensorpipe_agent.cpp:547] RPC agent for worker0 encountered error when reading incoming request from worker3: EOF: end of file (this is expected to happen during shutdown)
Jan 06 11:20:49 ok (2.767s)
Jan 06 11:20:51   test_return_future_remote (__main__.TensorPipeRpcTestWithSpawn) ... [W tensorpipe_agent.cpp:547] RPC agent for worker0 encountered error when reading incoming request from worker2: EOF: end of file (this is expected to happen during shutdown)

1 job timed out:

  • pytorch_linux_bionic_py3_8_gcc9_coverage_test1


@lqf96 lqf96 force-pushed the aten-generator-state branch 3 times, most recently from 34b5294 to f7df038 Compare December 19, 2020 19:13
@lqf96 lqf96 changed the title from "[WIP] Move generator state APIs to ATen" to "Move generator state APIs to ATen" Dec 19, 2020
lqf96 commented Dec 19, 2020

I guess this is ready for review now. Since I'm not familiar with the C++ side of the project, maybe @ezyang or someone can take a look at this?

codecov bot commented Dec 19, 2020

Codecov Report

Merging #49589 (7293324) into master (70734f1) will increase coverage by 47.81%.
The diff coverage is 86.36%.

@@             Coverage Diff             @@
##           master   #49589       +/-   ##
===========================================
+ Coverage   32.88%   80.69%   +47.81%     
===========================================
  Files         511     1891     +1380     
  Lines       68918   204977   +136059     
===========================================
+ Hits        22665   165416   +142751     
+ Misses      46253    39561     -6692     

@ngimel ngimel requested a review from pbelevich December 21, 2020 18:37
@ngimel ngimel added the triaged label (this issue has been looked at by a team member, and triaged and prioritized into an appropriate module) Dec 21, 2020
@lqf96 lqf96 force-pushed the aten-generator-state branch from c0a9f7b to 7293324 Compare December 22, 2020 11:02
ezyang commented Jan 4, 2021

@pbelevich let me know if you need extra eyeballs on this PR

@pbelevich

> @pbelevich let me know if you need extra eyeballs on this PR

Will review it and the other PRs today, sorry for the late review.


@facebook-github-bot facebook-github-bot left a comment


@pbelevich has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

lqf96 commented Jan 5, 2021

@pbelevich I revisited the PR today and realized that although the implementation is correct, there are still a few places where the code quality could be improved:

  1. Should we name the new APIs state / set_state or rng_state / set_rng_state? The latter sounds clearer and matches the legacy API names (THTensor_(get|set)RNGState / THCRandom_(get|set)RNGState) better.
  2. c10::GeneratorImpl::state (or rng_state) should accept const c10::TensorImpl& instead of c10::TensorImpl&. Similarly, at::Generator::state (or rng_state) should accept const Tensor&.

What are your opinions on these? Since the PR is already imported to FB's internal systems, should I update this PR or open a new one?
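
As a generic C++ illustration of the const suggestion in point 2 (a standalone sketch with a stand-in type and hypothetical function names, not PyTorch code): taking the state argument by const reference documents that the callee only reads it, and lets callers pass const objects and temporaries.

```cpp
#include <cstdint>
#include <vector>

// Stand-in for the RNG state tensor type; illustrative only.
using StateBuffer = std::vector<uint8_t>;

// A non-const reference parameter forces callers to hold a mutable object,
// even if the function only reads the incoming state.
void set_state_mutable(StateBuffer& new_state) { (void)new_state; }

// A const reference documents read-only access and also accepts const
// lvalues and temporaries.
void set_state_const(const StateBuffer& new_state) { (void)new_state; }

int main() {
  const StateBuffer saved{1, 2, 3};
  // set_state_mutable(saved);         // would not compile: const argument
  set_state_const(saved);              // OK
  set_state_const(StateBuffer{4, 5});  // OK: temporary binds to const&
  return 0;
}
```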


@pbelevich pbelevich left a comment


@lqf96 thanks a lot for this work!

> 1. Should we name the new APIs state / set_state or rng_state / set_rng_state? The latter sounds clearer and matches the legacy API names (THTensor_(get|set)RNGState / THCRandom_(get|set)RNGState) better.

I think we should try to match the Python API names instead of the legacy names to make the C++ API and Python API look similar. So, let's rename state to get_state and keep set_state as is.

> 2. c10::GeneratorImpl::state (or rng_state) should accept const c10::TensorImpl& instead of c10::TensorImpl&. Similarly, at::Generator::state (or rng_state) should accept const Tensor&.

Agreed, let's make them const.

> Since the PR is already imported to FB's internal systems, should I update this PR or open a new one?

Amend this PR; it's easy to reimport it into FB's internal code again.
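
Putting the agreed suggestions together (rename to match the Python get_state/set_state names, const-qualify the state parameter), the base interface from the sketch after the PR description would end up roughly like this. Still a hedged sketch with a stand-in StateBuffer type, not the actual c10 declarations:

```cpp
#include <cstdint>
#include <vector>

using StateBuffer = std::vector<uint8_t>;  // stand-in for the real state tensor

// The getter is renamed from state() to get_state() to mirror the Python
// names torch.Generator.get_state()/set_state(), and the setter takes its
// argument by const reference.
struct GeneratorImpl {
  virtual ~GeneratorImpl() = default;
  virtual void set_state(const StateBuffer& new_state) = 0;
  virtual StateBuffer get_state() const = 0;
};

int main() { return 0; }  // declarations only; shown for the naming/constness shape
```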

@pbelevich

TODO(for pbelevich): amend this PR with meta-pytorch/csprng#106 after reimport to fbcode

@lqf96 lqf96 force-pushed the aten-generator-state branch from 7293324 to 8df4d29 Compare January 6, 2021 02:27
@lqf96 lqf96 force-pushed the aten-generator-state branch from 0520138 to 93decdd Compare January 6, 2021 08:11
lqf96 commented Jan 6, 2021

@pbelevich I applied your suggestions and this PR should be ready for review again.


@facebook-github-bot facebook-github-bot left a comment


@pbelevich has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@facebook-github-bot

This pull request has been merged in 876dfbd.

facebook-github-bot pushed a commit that referenced this pull request Jan 7, 2021
Pull Request resolved: #49589

Reviewed By: H-Huang

Differential Revision: D25785774

Pulled By: pbelevich

fbshipit-source-id: 8ed79209c4ffb1a0ae8b19952ac8871ac9e0255f
@lqf96 lqf96 deleted the aten-generator-state branch January 7, 2021 04:07
hwangdeyu pushed a commit to hwangdeyu/pytorch that referenced this pull request Jan 14, 2021
ts-alchemist659op added a commit to ts-alchemist659op/csprng that referenced this pull request Oct 20, 2025