New implementations of DeviceGuard, StreamGuard and MultiStreamGuard (with CUDA specializations) (#13342)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/13342
This PR introduces a few new concepts:
- DeviceGuardImplInterface, and implementations for CPU and CUDA, which
provide a generic interface for interfacing with device and stream state,
without requiring a direct dependency on the code in question.
- InlineDeviceGuard, a general template for generating both specialized
and dynamically dispatched device guard implementations. Dynamic
dispatch is done by specializing it on a VirtualGuardImpl.
- Provide a device-independent DeviceGuard class, which can be used even
from CPU code. It uses the aforementioned dynamic dispatch.
- CUDA-specialized CUDAGuard class, which doesn't have a dynamic dispatch
but can only be used from CUDA.
- StreamGuard, which is the same as above, but for streams rather than
devices.
- Optional variants of all the aforementioned guards, which are a no-op if
no device/stream is specified
- CUDAMultiStreamGuard, specifically for the case when we want to set
a device on every guard.
There are some subtle semantic changes, which have been thoroughly documented
in the class definition.
BC-breaking changes:
- Move constructor/assignment have been removed from all device guard
implementations.
- In some cases where you previously wrote 'set_device' (or 'set_stream'), you now must write
'reset_device', because if you switch devices/device types, the stream/device on the
previous device is unset. This is different from previous behavior.
- CUDAGuard no longer handles streams, or multiple streams. Use CUDAStreamGuard
or CUDAMultiStreamGuard as appropriate for your use case.
Reviewed By: dzhulgakov
Differential Revision: D12849620
fbshipit-source-id: f61956256f0b12be754b3234fcc73c2abc1be04e
2018-11-11 20:08:57 +00:00
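The reset semantics listed under the BC-breaking changes can be illustrated with a small, self-contained sketch. It does not use c10: `FakeRuntime`, `runtime`, `SketchStreamGuard` and `demo_reset_stream` are hypothetical names invented here, and the two-device stream table is an assumption made purely for illustration. It aims to show that resetting a guard to a stream on another device first restores the stream on the previous device.

```cpp
#include <array>
#include <cassert>

// Hypothetical stand-in for c10::Stream: a device index plus a stream id.
struct Stream {
  int device;
  int id;
};

// Fake runtime state (an assumption): current device and per-device stream id.
struct FakeRuntime {
  int current_device = 0;
  std::array<int, 2> current_stream{{0, 0}};
};

inline FakeRuntime& runtime() {
  static FakeRuntime rt;
  return rt;
}

// Sketch of an RAII stream guard with reset semantics. Not the c10 class.
class SketchStreamGuard {
 public:
  explicit SketchStreamGuard(Stream stream) { enter(stream); }
  ~SketchStreamGuard() { restore(); }

  SketchStreamGuard(const SketchStreamGuard&) = delete;
  SketchStreamGuard& operator=(const SketchStreamGuard&) = delete;
  SketchStreamGuard(SketchStreamGuard&&) = delete;
  SketchStreamGuard& operator=(SketchStreamGuard&&) = delete;

  // Undo the previous override first, then apply the new one.
  void reset_stream(Stream stream) {
    restore();
    enter(stream);
  }

 private:
  void enter(Stream stream) {
    assert(stream.device == 0 || stream.device == 1);
    FakeRuntime& rt = runtime();
    original_device_ = rt.current_device;
    original_stream_ = Stream{stream.device, rt.current_stream[stream.device]};
    rt.current_device = stream.device;
    rt.current_stream[stream.device] = stream.id;
  }

  void restore() {
    FakeRuntime& rt = runtime();
    rt.current_stream[original_stream_.device] = original_stream_.id;
    rt.current_device = original_device_;
  }

  int original_device_ = 0;
  Stream original_stream_{0, 0};
};

// Returns true if the fake runtime passes through the expected states.
inline bool demo_reset_stream() {
  FakeRuntime& rt = runtime();
  rt = FakeRuntime{};
  bool ok = true;
  {
    SketchStreamGuard guard(Stream{0, 5});
    ok = ok && rt.current_stream[0] == 5;
    guard.reset_stream(Stream{1, 7});
    // Device 0 is back on its original stream; only device 1 is overridden.
    ok = ok && rt.current_device == 1;
    ok = ok && rt.current_stream[0] == 0 && rt.current_stream[1] == 7;
  }
  // Leaving the scope restores the device and the remaining override.
  return ok && rt.current_device == 0 && rt.current_stream[1] == 0;
}
```

Under these assumptions, calling `demo_reset_stream()` from a `main` function should return `true`. The real guards dispatch through a DeviceGuardImplInterface rather than a global table, so this only mirrors the observable ordering.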
#pragma once

#include <c10/core/Device.h>
#include <c10/core/Stream.h>
#include <c10/core/impl/InlineStreamGuard.h>
#include <c10/core/impl/VirtualGuardImpl.h>
#include <c10/util/ArrayRef.h>
#include <c10/util/Optional.h>

namespace c10 {

/**
 * A StreamGuard is an RAII class that changes the current device
 * to the device corresponding to some stream, and changes the
 * current stream on that device to be this stream.
 *
 * Use of StreamGuard is HIGHLY discouraged in operator definitions.  In
 * a single operator, you probably don't know enough about the global
 * state of the world to profitably decide how to set streams.  Let
 * the caller handle this appropriately, and just use the current stream
 * in your operator code.
 *
 * This StreamGuard does NOT have an uninitialized state; it is guaranteed
 * to reset the stream and device on exit.  If you are in a situation
 * where you *might* want to set up a stream guard, see OptionalStreamGuard.
 */
struct StreamGuard {
  /// No default constructor, see Note [Omitted default constructor from RAII]
  explicit StreamGuard() = delete;
  ~StreamGuard() = default;

  /// Set the current device to the device associated with the passed stream,
  /// and set the current stream on that device to the passed stream.
  explicit StreamGuard(Stream stream) : guard_(stream) {}

  /// Copy is disallowed
  StreamGuard(const StreamGuard&) = delete;
  StreamGuard& operator=(const StreamGuard&) = delete;

  /// Move is disallowed, as StreamGuard does not have an uninitialized state,
  /// which is required for moves on types with nontrivial destructors.
  StreamGuard(StreamGuard&& other) = delete;
  StreamGuard& operator=(StreamGuard&& other) = delete;

  /// Resets the currently set stream to the original stream and
  /// the currently set device to the original device.  Then,
  /// set the current device to the device associated with the passed stream,
  /// and set the current stream on that device to the passed stream.
  ///
  /// NOTE: this implementation may skip some stream/device setting if
  /// it can prove that it is unnecessary.
  ///
  /// WARNING: reset_stream does NOT preserve previously set streams on
  /// different devices.  If you need to set streams on multiple devices,
  /// use MultiStreamGuard instead.
  void reset_stream(Stream stream) {
    guard_.reset_stream(stream);
  }

  /// Returns the stream that was set at the time the guard was constructed.
  Stream original_stream() const {
    return guard_.original_stream();
  }

  /// Returns the most recent stream that was set using this stream guard,
  /// either from construction, or via reset_stream.
  Stream current_stream() const {
    return guard_.current_stream();
  }

  /// Returns the most recent device that was set using this stream guard,
  /// either from construction, or via reset_stream.
  Device current_device() const {
    return guard_.current_device();
  }

  /// Returns the device that was set at the most recent reset_stream(),
  /// or otherwise the device at construction time.
  Device original_device() const {
    return guard_.original_device();
  }

 private:
  c10::impl::InlineStreamGuard<impl::VirtualGuardImpl> guard_;
};

/**
 * An OptionalStreamGuard is an RAII class that, if it is given a stream, sets
 * the current device and stream on initialization, and resets them to their
 * original values on destruction.  Without a stream it is a no-op.
 * See OptionalDeviceGuard for more guidance on how to use this class.
 */
struct OptionalStreamGuard {
  /// Create an uninitialized guard.
  explicit OptionalStreamGuard() = default;

  /// Set the current device to the device associated with the passed stream,
  /// and set the current stream on that device to the passed stream.
  explicit OptionalStreamGuard(Stream stream) : guard_(stream) {}

  /// Set the current device to the device associated with the passed stream,
  /// and set the current stream on that device to the passed stream,
  /// if the passed stream is not nullopt.
  explicit OptionalStreamGuard(std::optional<Stream> stream_opt)
      : guard_(stream_opt) {}

  /// Copy is disallowed
  OptionalStreamGuard(const OptionalStreamGuard&) = delete;
  OptionalStreamGuard& operator=(const OptionalStreamGuard&) = delete;

  // See Note [Move construction for RAII guards is tricky]
  OptionalStreamGuard(OptionalStreamGuard&& other) = delete;

  // See Note [Move assignment for RAII guards is tricky]
  OptionalStreamGuard& operator=(OptionalStreamGuard&& other) = delete;
  ~OptionalStreamGuard() = default;

  /// Resets the currently set stream to the original stream and
  /// the currently set device to the original device.  Then,
  /// set the current device to the device associated with the passed stream,
  /// and set the current stream on that device to the passed stream.
  /// Initializes the guard if it was not previously initialized.
  void reset_stream(Stream stream) {
    guard_.reset_stream(stream);
  }

  /// Returns the stream that was set at the time the guard was most recently
  /// initialized, or nullopt if the guard is uninitialized.
  std::optional<Stream> original_stream() const {
    return guard_.original_stream();
  }

  /// Returns the most recent stream that was set using this stream guard,
  /// either from construction, or via reset_stream, if the guard is
  /// initialized, or nullopt if the guard is uninitialized.
  std::optional<Stream> current_stream() const {
    return guard_.current_stream();
  }

  /// Restore the original device and stream, resetting this guard to
  /// uninitialized state.
  void reset() {
    guard_.reset();
  }

 private:
  c10::impl::InlineOptionalStreamGuard<impl::VirtualGuardImpl> guard_{};
};

/**
 * A MultiStreamGuard is an RAII class that sets the current streams of a set of
 * devices all at once, and resets them to their original values on destruction.
 */
struct MultiStreamGuard {
  /// Set the current streams to the passed streams on each of their respective
  /// devices.
  explicit MultiStreamGuard(ArrayRef<Stream> streams) : guard_(streams) {}

  /// Copy is disallowed
  MultiStreamGuard(const MultiStreamGuard&) = delete;
  MultiStreamGuard& operator=(const MultiStreamGuard&) = delete;

  // See Note [Move construction for RAII guards is tricky]
  MultiStreamGuard(MultiStreamGuard&& other) = delete;

  // See Note [Move assignment for RAII guards is tricky]
  MultiStreamGuard& operator=(MultiStreamGuard&& other) = delete;
  ~MultiStreamGuard() = default;

 private:
  c10::impl::InlineMultiStreamGuard<impl::VirtualGuardImpl> guard_;
};

} // namespace c10
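
The optional and multi-stream guards declared in this header can be sketched in the same spirit. The block below is a minimal illustration that assumes C++17 and does not use c10: `current_streams`, `SketchOptionalStreamGuard`, `SketchMultiStreamGuard` and `demo_optional_and_multi` are hypothetical names, the two-device table is an assumption, and the current device is deliberately left out to keep the sketch short.

```cpp
#include <array>
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical stand-in for c10::Stream: a device index plus a stream id.
struct Stream {
  int device;
  int id;
};

// Fake runtime state (an assumption): the current stream id of two devices.
inline std::array<int, 2>& current_streams() {
  static std::array<int, 2> streams{{0, 0}};
  return streams;
}

// Optional guard sketch: a no-op until it is given a stream.
class SketchOptionalStreamGuard {
 public:
  SketchOptionalStreamGuard() = default;
  explicit SketchOptionalStreamGuard(std::optional<Stream> stream_opt) {
    if (stream_opt.has_value()) {
      reset_stream(*stream_opt);
    }
  }
  ~SketchOptionalStreamGuard() { reset(); }
  SketchOptionalStreamGuard(const SketchOptionalStreamGuard&) = delete;
  SketchOptionalStreamGuard& operator=(const SketchOptionalStreamGuard&) = delete;

  // Undo any earlier override, then install the new stream.
  void reset_stream(Stream stream) {
    assert(stream.device == 0 || stream.device == 1);
    reset();
    original_ = Stream{stream.device, current_streams()[stream.device]};
    current_streams()[stream.device] = stream.id;
  }

  // Restore the saved stream, if any, and become uninitialized again.
  void reset() {
    if (original_.has_value()) {
      current_streams()[original_->device] = original_->id;
      original_.reset();
    }
  }

  bool initialized() const { return original_.has_value(); }

 private:
  std::optional<Stream> original_;
};

// Multi-stream guard sketch: overrides the current stream on several devices.
class SketchMultiStreamGuard {
 public:
  explicit SketchMultiStreamGuard(const std::vector<Stream>& streams) {
    for (const Stream& s : streams) {
      assert(s.device == 0 || s.device == 1);
      originals_.push_back(Stream{s.device, current_streams()[s.device]});
      current_streams()[s.device] = s.id;
    }
  }
  ~SketchMultiStreamGuard() {
    // Undo in reverse order so a device listed twice unwinds correctly.
    for (auto it = originals_.rbegin(); it != originals_.rend(); ++it) {
      current_streams()[it->device] = it->id;
    }
  }
  SketchMultiStreamGuard(const SketchMultiStreamGuard&) = delete;
  SketchMultiStreamGuard& operator=(const SketchMultiStreamGuard&) = delete;

 private:
  std::vector<Stream> originals_;
};

// Returns true if the fake runtime passes through the expected states.
inline bool demo_optional_and_multi() {
  current_streams().fill(0);
  bool ok = true;
  {
    std::optional<Stream> absent;
    SketchOptionalStreamGuard idle(absent);
    ok = ok && !idle.initialized() && current_streams()[0] == 0;
    std::optional<Stream> present = Stream{0, 3};
    SketchOptionalStreamGuard active(present);
    ok = ok && active.initialized() && current_streams()[0] == 3;
  }
  ok = ok && current_streams()[0] == 0;  // restored at scope exit
  {
    std::vector<Stream> wanted{Stream{0, 4}, Stream{1, 9}};
    SketchMultiStreamGuard multi(wanted);
    ok = ok && current_streams()[0] == 4 && current_streams()[1] == 9;
  }
  return ok && current_streams()[0] == 0 && current_streams()[1] == 0;
}
```

Under these assumptions, `demo_optional_and_multi()` should return `true` when called from a `main` function. Keeping the "maybe set" case in a separate optional type is what lets the plain guard promise that it always restores state on exit.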