mirror of
https://github.com/saymrwulf/pytorch.git
synced 2026-05-15 21:00:47 +00:00
Summary: Pull Request resolved: https://github.com/pytorch/pytorch/pull/28039

Right now, torch::save() uses std::ostream, which results in unnecessary data copies in practice. Similar for torch::load().

Adding a std::function<size_t(const void*, size_t)> as an output option, parallel to the existing filename and std::ostream apis, gives users the flexibility to emit directly to a backing store.

For a simple case of appending the output to a std::string, we observe significant benchmark savings (on order of -50%), even with the minor std::function<> dispatch overhead. The main reason is that std::ostringstream effectively requires 2 extra copies of the data beyond a simple string.append lambda.

We also provide a parallel api for the load(), though this one is slightly more complex due to the need to do arbitrary position reads.

Test Plan: buck test mode/dev-nosan caffe2/test/... (Basic serialization test in caffe2/test/cpp/api/serialize.cpp) Benchmark in experimental/jeremyl/c2/SerializationBench.cpp, with D17823443 (1M time goes from 90ms -> 40ms, albeit with crc patch applied)

Differential Revision: D17939034

fbshipit-source-id: 344cce46f74b6438cb638a8cfbeccf4e1aa882d7
C++ Frontend Tests
In this folder live the tests for PyTorch's C++ Frontend. They use the GoogleTest test framework.
CUDA Tests
To make a test runnable only on platforms with CUDA, you should suffix your
test with _CUDA, e.g.
TEST(MyTestSuite, MyTestCase_CUDA) { }
To make it runnable only on platforms with at least two CUDA devices, suffix
it with _MultiCUDA instead of _CUDA, e.g.
TEST(MyTestSuite, MyTestCase_MultiCUDA) { }
There is logic in main.cpp that detects the availability and number of CUDA
devices and supplies the appropriate negative filters to GoogleTest.
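The filtering described above can be sketched as follows. This is a minimal illustration, not the code in main.cpp: the helper names add_negative_pattern and filter_for_devices are hypothetical, and the device count is passed in as a plain number instead of being queried from CUDA. It relies only on GoogleTest's documented filter syntax, where patterns after a `-` are excluded and multiple patterns are separated by `:`.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: append an exclusion pattern to a GoogleTest filter.
// The first negative pattern is introduced by '-', later ones are joined by ':'.
std::string add_negative_pattern(std::string filter, const std::string& pattern) {
  filter += (filter.find('-') == std::string::npos) ? '-' : ':';
  filter += pattern;
  return filter;
}

// Hypothetical helper: choose which suffixes to exclude for a given number
// of CUDA devices (assumed to have been detected elsewhere).
std::string filter_for_devices(const std::string& filter,
                               std::size_t cuda_device_count) {
  if (cuda_device_count == 0) {
    // No CUDA at all: skip both single-device and multi-device tests.
    return add_negative_pattern(filter, "*_CUDA:*_MultiCUDA");
  }
  if (cuda_device_count == 1) {
    // One device: only the multi-device tests must be skipped.
    return add_negative_pattern(filter, "*_MultiCUDA");
  }
  // Two or more devices: every test can run, so the filter is unchanged.
  return filter;
}
```

In a real test runner the resulting string would be assigned to GoogleTest's filter flag before the tests are run; that step is omitted here so the sketch compiles without GoogleTest or CUDA.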
Integration Tests
Integration tests use the MNIST dataset. You must download it by running the following command from the PyTorch root folder:
$ python tools/download_mnist.py -d test/cpp/api/mnist
The required paths will be referenced as test/cpp/api/mnist/... in the test
code, so you must run the integration tests from the PyTorch root folder.
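Because the paths are relative, a test run started from the wrong directory will not find the dataset. A minimal sketch of a pre-flight check, assuming C++17 and a hypothetical helper named mnist_available that is not part of the test suite:

```cpp
#include <filesystem>
#include <system_error>

// Hypothetical helper: report whether the MNIST folder is reachable from
// `working_dir` under the relative path the integration tests expect.
bool mnist_available(const std::filesystem::path& working_dir) {
  std::error_code ec;  // non-throwing overload: a missing path yields false
  return std::filesystem::is_directory(working_dir / "test/cpp/api/mnist", ec);
}
```

Calling mnist_available(std::filesystem::current_path()) before the integration tests start would turn a confusing file-not-found failure into an early, explicit check. It only confirms that the folder exists, not that the download completed.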