Summary: A test case that exercises `db_options` with the Save operator is missing.
Test Plan: buck test
Differential Revision: D29642719
fbshipit-source-id: 72b7374d40430398abac26dfe91538550525384d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/61551
We aim to enable a rate limiter in C2 load, with a fixed bandwidth limit.
This diff updates LoadOp to pass down the Manifold DB options.
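A hedged sketch of what a caller might look like after this change; the db type, path, and `db_options` encoding below are illustrative assumptions, not taken from this diff:
```
from caffe2.python import core

# Illustrative only: the db path, db_type, and db_options encoding are assumptions.
load_op = core.CreateOperator(
    "Load",
    [],
    [],
    load_all=1,
    db="manifold://bucket/path/to/model",   # hypothetical Manifold path
    db_type="manifold",                     # hypothetical db type name
    db_options=b"<serialized-db-options>",  # hypothetical db-specific options blob
)
```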
Test Plan:
```
buck test mode/opt caffe2/caffe2/python/operator_test:load_save_test
```
Differential Revision: D29639102
fbshipit-source-id: cf69549adadf4c7f12a8a2b7f3ca39092cab4b99
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/59775
This operator is similar to `GetAllBlobNames` but also returns the estimated
size required to serialize each blob.
One goal of this operator is to allow checkpoint saving logic to estimate the
amount of space/bandwidth required to save a checkpoint when first starting
training, without actually serializing any blobs yet. Currently the
checkpointing logic uses `GetAllBlobNames` to determine the blobs to
checkpoint. It can instead be updated to use `EstimateAllBlobSizes` to also
get an estimate for how much space will be required for the checkpoint.
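A hedged sketch of how checkpointing code might call this operator from Python; the output names and layout here are assumptions for illustration:
```
from caffe2.python import core, workspace
import numpy as np

# Populate a couple of blobs, then ask for per-blob size estimates without
# serializing anything. Output names below are assumed, not from this diff.
workspace.FeedBlob("fc_w", np.random.rand(1000, 256).astype(np.float32))
workspace.FeedBlob("fc_b", np.random.rand(256).astype(np.float32))

op = core.CreateOperator(
    "EstimateAllBlobSizes",
    [],
    ["blob_names", "blob_sizes"],  # assumed: blob names and estimated byte sizes
)
workspace.RunOperatorOnce(op)

names = workspace.FetchBlob("blob_names")
sizes = workspace.FetchBlob("blob_sizes")
print("estimated checkpoint size:", sizes.sum(), "bytes")
```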
ghstack-source-id: 132275153
Test Plan: Included a new unit test.
Reviewed By: mraway
Differential Revision: D29020227
fbshipit-source-id: 811e5d86c4b59183e84e6424c48c97739be09043
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53735
Add an option to BlobSerializationOptions to request that float data be
serialized as bfloat16. This reduces the serialized data size at the expense
of some loss in precision.
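A hedged sketch of how the new option might be set from Python; the proto field and enum names here are assumptions rather than quotes from this diff:
```
from caffe2.proto import caffe2_pb2

# Assumed field/enum names: request bfloat16 storage for float data in blobs
# whose names match the regex, trading some precision for a smaller file.
opts = caffe2_pb2.SerializationOptions()
blob_opts = opts.options.add()
blob_opts.blob_name_regex = ".*_w"
blob_opts.float_format = caffe2_pb2.BlobSerializationOptions.FLOAT_BFLOAT16
```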
ghstack-source-id: 124317910
Test Plan: Included a new unit test.
Reviewed By: mraway
Differential Revision: D26658205
fbshipit-source-id: 74521ed161059066355a3f208488ed01a344dbb5
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53402
Add an `options` field to the `Save` operator which accepts options for how to
serialize different blobs. At the moment this simply allows controlling the
existing `chunk_size` behavior, but in the future we can add other options,
such as the ability to control compression settings or other serialization
formats.
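A hedged sketch of passing the new `options` argument from Python; the proto message names and the string encoding of the argument are assumptions:
```
from caffe2.proto import caffe2_pb2
from caffe2.python import core

# Assumed encoding: serialize the options proto into the operator argument.
options = caffe2_pb2.SerializationOptions(
    options=[
        caffe2_pb2.BlobSerializationOptions(
            blob_name_regex="fc_.*_w",  # illustrative pattern
            chunk_size=1024,            # override the default chunking for these blobs
        ),
    ],
)
save_op = core.CreateOperator(
    "Save",
    ["fc_1_w", "fc_1_b"],
    [],
    absolute_path=1,
    db="/tmp/model.minidb",  # illustrative path
    db_type="minidb",
    options=options.SerializeToString(),
)
```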
ghstack-source-id: 123567034
Test Plan:
Added a new test to `load_save_test.py` that passes in options and verifies
that blobs were serialized with the expected number of chunks.
```
buck test caffe2/caffe2:caffe2_test_cpu \
  caffe2/caffe2/core:serialization_test \
  caffe2/caffe2/python/operator_test:load_save_test
```
Reviewed By: mraway
Differential Revision: D26502577
fbshipit-source-id: 6e302e530bb96990517c2e35c505db7f14a56284
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/53401
This is a reland of D26641599 (cd9ac54ea7) after rebasing onto D26802576 (f595ba1bae).
Add some small utility functions to read the blob names back from the minidb
file so that we can verify how many chunks were written for each blob.
ghstack-source-id: 123567033
Test Plan: buck test caffe2/caffe2/python/operator_test:load_save_test
Reviewed By: mraway
Differential Revision: D26853942
fbshipit-source-id: 0b45078fdd279f547752c8fdb771e296374a00da
Summary:
Add some small utility functions to read the blob names back from the minidb
file so that we can verify how many chunks were written for each blob.
Test Plan: buck test caffe2/caffe2/python/operator_test:load_save_test
Reviewed By: mraway
Differential Revision: D26641599
fbshipit-source-id: bccb0af157d85e585e95bc7be61c4584fba3cb04
Summary:
Add a test in `load_save_test.py` that passes in a chunk_size parameter,
to ensure that we exercise the logic that passes the chunk size to the C++
serialization code.
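A minimal sketch of the kind of test being described; blob names and paths are illustrative:
```
from caffe2.python import core, workspace
import numpy as np

# Save with an explicit chunk_size so the Python -> C++ chunking path is exercised.
workspace.FeedBlob("big_tensor", np.random.rand(100, 10).astype(np.float32))
save_op = core.CreateOperator(
    "Save",
    ["big_tensor"],
    [],
    absolute_path=1,
    db="/tmp/chunked.minidb",  # illustrative path
    db_type="minidb",
    chunk_size=10,  # force the tensor to be written as multiple chunks
)
workspace.RunOperatorOnce(save_op)
```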
Test Plan:
Ran the tests with the vlog level set to 3 and manually verified the log
messages showed that we were serializing in the expected chunks.
There are existing C++ tests that confirm chunking behavior works as expected
in the pure C++ code.
Reviewed By: mraway
Differential Revision: D26502578
fbshipit-source-id: cd0074f2358da81c68b0fed2c2a94818d83a957d
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/51762
Update test_util.py to add a `make_tempdir()` function to the `TestCase`
class. The main advantage of this function is that the temporary
directory will be automatically cleaned up when the test case finishes,
so that the test case does not need to worry about manually cleaning up this
directory.
This also prefixes the directory name with `caffe2_test.` so that it is
more obvious where the temporary directories came from if they are ever
left behind after a crashed or killed test process.
This updates the tests in `operator_test/load_save_test.py` to use this
new function, so they no longer have to perform their own manual cleanup
in each test.
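A hedged usage sketch; the return type is assumed to be a `pathlib.Path` here, and the test body is illustrative:
```
from caffe2.python import test_util

class LoadSaveExample(test_util.TestCase):
    def test_save_to_tempdir(self):
        # Directory is removed automatically when the test case finishes.
        tmp_dir = self.make_tempdir()  # e.g. /tmp/caffe2_test.XXXXXX
        db_path = str(tmp_dir / "model.minidb")  # assumes a pathlib.Path return
        # ... run Save/Load operators against db_path; no manual cleanup needed.
```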
Test Plan: python caffe2/python/operator_test/load_save_test.py
Reviewed By: mraway
Differential Revision: D26271178
Pulled By: simpkins
fbshipit-source-id: 51175eefed39d65c03484482e84923e5f39a4768
Summary:
The `2to3` tool's `future` fixer can be used to remove these redundant `from __future__` imports; the `caffe2` directory has the most of them:
```
2to3 -f future -w caffe2
```
Pull Request resolved: https://github.com/pytorch/pytorch/pull/45033
Reviewed By: seemethere
Differential Revision: D23808648
Pulled By: bugra
fbshipit-source-id: 38971900f0fe43ab44a9168e57f2307580d36a38
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/43205
A number of tests that forward to `TestLoadSaveBase.load_save` are all marked as flaky because they regularly take much longer to start up than hypothesis' default deadline of 200ms. This diff fixes the problem by removing the deadline for `load_save`. This is alright as these tests aren't meant to be testing the performance of these operators.
I would set the deadline to 60s if I could; however, it appears that the caffe2 GitHub CI uses a different version of hypothesis that doesn't allow using `dateutil.timedelta`, so instead of trying to figure out an approach that works on both, I've just removed the deadline entirely.
I've also tagged all existing tasks WRT these failures.
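A minimal sketch of the change described above, using hypothesis' `settings` to drop the deadline; the test name and strategy are illustrative:
```
from hypothesis import given, settings
import hypothesis.strategies as st

@given(chunk_size=st.integers(min_value=1, max_value=10))
@settings(deadline=None)  # no deadline: slow startup should not mark the test flaky
def test_load_save(chunk_size):
    ...  # illustrative body; the real tests forward to TestLoadSaveBase.load_save
```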
Differential Revision: D23175752
fbshipit-source-id: 324f9ff034df1ac4874797f04f50067149a6ba48
Summary:
When output blob names are specified while load_all=1, they are ignored. However, this behavior is not documented. In this diff, we simply disallow users from providing output blob names when load_all=1.
See discussion at https://fb.workplace.com/groups/1405155842844877/permalink/2714909788536136/
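Illustrative usage after this change (path and db type are assumptions): with load_all=1 the operator discovers blob names from the db itself, so listing output blobs is now an error.
```
from caffe2.python import core

load_op = core.CreateOperator(
    "Load",
    [],
    [],                      # must be empty when load_all=1
    absolute_path=1,
    db="/tmp/model.minidb",  # illustrative path
    db_type="minidb",
    load_all=1,
)
```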
Pull Request resolved: https://github.com/pytorch/pytorch/pull/19133
Reviewed By: dzhulgakov
Differential Revision: D14883698
Pulled By: chandlerzuo
fbshipit-source-id: 6e4171e36c4ccc4f857e79da98b858a06b7d8ad6
Summary:
The goal of this PR is to unify the cuda and hip device types in the caffe2 Python front end.
Pull Request resolved: https://github.com/pytorch/pytorch/pull/14221
Differential Revision: D13148564
Pulled By: bddppq
fbshipit-source-id: ef9bd2c7d238200165f217097ac5727e686d887b
Summary:
Original commit changeset: f5614a5d2607
D9986213 is causing a [huge performance difference](https://our.intern.facebook.com/intern/ads/analyze_canary/412951953278781781/) for Multifeed Aggregator and has been blocking the aggregator push since last Friday night: https://fburl.com/feedtools/b6izvwjz
We need to land this revert ASAP to unblock the aggregator push.
Reviewed By: orionr
Differential Revision: D10123245
fbshipit-source-id: d83da8e00a1250f5d09811a0a587c127e377aab2