pytorch/test/cpp/rpc
Pritam Damania 8b501dfd98 Fix memory leak in TensorPipeAgent. (#50564)
Summary:
Pull Request resolved: https://github.com/pytorch/pytorch/pull/50564

When an RPC was sent, the associated future was stored in two maps:
pendingResponseMessage_ and timeoutMap_. Once the response was received, the
entry was removed only from pendingResponseMessage_, not from timeoutMap_. The
pollTimeoutRpcs method eventually removed the entry from timeoutMap_, but only
after the timeout duration had passed.

However, in scenarios with a large timeout and a large number of RPCs,
timeoutMap_ can easily grow without bound, because completed RPCs stay in it
until their timeout expires. This was discovered in
https://github.com/pytorch/pytorch/issues/50522.

To fix this issue, I've added code to clean up timeoutMap_ as well once we
receive a response.
ghstack-source-id: 119925182
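
For illustration only, the bookkeeping described above can be sketched in a
few lines of standalone C++. This is not the TensorPipeAgent code: the class
PendingRpcTracker, its methods, and the messageIdToExpiry_ reverse index are
hypothetical names chosen for the sketch, and message ids are assumed to be
unique. It only shows the idea of erasing the timeout entry when the response
arrives instead of waiting for the timeout poller.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <map>
#include <unordered_map>
#include <unordered_set>

// Hypothetical stand-in for the agent's bookkeeping (not the real agent).
class PendingRpcTracker {
 public:
  using Clock = std::chrono::steady_clock;
  using TimePoint = Clock::time_point;

  // Record an outgoing RPC in both structures. Assumes messageId is unique.
  void onSend(std::uint64_t messageId, TimePoint expiry) {
    pending_.insert(messageId);
    timeoutMap_[expiry].insert(messageId);
    messageIdToExpiry_.emplace(messageId, expiry);
  }

  // Response received: drop the RPC from the pending set AND from the
  // timeout bookkeeping, so nothing lingers until the timeout fires.
  void onResponse(std::uint64_t messageId) {
    pending_.erase(messageId);
    auto it = messageIdToExpiry_.find(messageId);
    if (it == messageIdToExpiry_.end()) {
      return; // unknown id, or the poller already expired it
    }
    auto bucket = timeoutMap_.find(it->second);
    if (bucket != timeoutMap_.end()) {
      bucket->second.erase(messageId);
      if (bucket->second.empty()) {
        timeoutMap_.erase(bucket); // do not keep empty buckets around
      }
    }
    messageIdToExpiry_.erase(it);
  }

  // Expire every RPC whose deadline is <= now; returns how many expired.
  std::size_t pollTimedOut(TimePoint now) {
    std::size_t expired = 0;
    auto end = timeoutMap_.upper_bound(now);
    for (auto it = timeoutMap_.begin(); it != end; ++it) {
      for (std::uint64_t id : it->second) {
        pending_.erase(id);
        messageIdToExpiry_.erase(id);
        ++expired;
      }
    }
    timeoutMap_.erase(timeoutMap_.begin(), end);
    return expired;
  }

  std::size_t pendingCount() const { return pending_.size(); }
  std::size_t timeoutEntryCount() const { return messageIdToExpiry_.size(); }
  std::size_t timeoutBucketCount() const { return timeoutMap_.size(); }

 private:
  std::unordered_set<std::uint64_t> pending_;
  // Ordered by deadline so the poller can stop at the first future entry.
  std::map<TimePoint, std::unordered_set<std::uint64_t>> timeoutMap_;
  // Reverse index so a response can find its timeout bucket directly.
  std::unordered_map<std::uint64_t, TimePoint> messageIdToExpiry_;
};
```

In this sketch, sending and completing many RPCs with a very long timeout
leaves timeoutEntryCount() and timeoutBucketCount() at zero. Without the
cleanup in onResponse, both would grow with the number of RPCs sent.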

Test Plan:
1) Unit test added.
2) Tested with repro in https://github.com/pytorch/pytorch/issues/50522

#Closes: https://github.com/pytorch/pytorch/issues/50522

Reviewed By: mrshenli

Differential Revision: D25919650

fbshipit-source-id: a0a42647e706d598fce2ca2c92963e540b9d9dbb
2021-01-18 16:34:28 -08:00
CMakeLists.txt
e2e_test_base.cpp
e2e_test_base.h Completely Remove FutureMessage from RPC cpp tests (#50027) 2021-01-07 19:50:50 -08:00
test_e2e_process_group.cpp [c10d] switch ProcessGroup to be managed by intrusive_ptr (#47343) 2020-11-12 07:36:23 -08:00
test_e2e_tensorpipe.cpp Fix memory leak in TensorPipeAgent. (#50564) 2021-01-18 16:34:28 -08:00
test_tensorpipe_serialization.cpp
test_wire_serialization.cpp