pytorch/caffe2/serialize
Lujia Zhang b897c57d47 [TGIF][Inplace][Perf] Copy tensor to device with pinned memory & move copy weight sleep to getRecord (#106849)
Summary:
This diff makes two changes that help optimize performance during in-place updates:
1. Read data with pinned memory.
2. Move the copy-weight sleep from between copies of whole tensors to between copies of chunks.
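
The second change can be illustrated with a minimal sketch. It is not the code from `inline_container.cc`: the function name `copyInChunks` and its parameters are hypothetical, and a plain `memcpy` stands in for the real record read. In an actual CUDA path the destination staging buffer would presumably be pinned host memory, which this sketch does not attempt to model.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstring>
#include <thread>

// Hypothetical helper (not a PyTorch API): copies n bytes from src to dst in
// pieces of at most chunkSize bytes, pausing between consecutive chunks
// instead of pausing once after the whole buffer has been copied.
// Returns the number of chunks copied.
std::size_t copyInChunks(
    const char* src,
    char* dst,
    std::size_t n,
    std::size_t chunkSize,
    std::chrono::milliseconds pause) {
  assert(chunkSize > 0);
  std::size_t chunks = 0;
  for (std::size_t offset = 0; offset < n; offset += chunkSize) {
    // The last chunk may be shorter than chunkSize.
    const std::size_t len = std::min(chunkSize, n - offset);
    std::memcpy(dst + offset, src + offset, len);
    ++chunks;
    // Sleep only between chunks, never after the final one.
    if (offset + len < n) {
      std::this_thread::sleep_for(pause);
    }
  }
  return chunks;
}
```

Spreading the pause across chunks keeps each uninterrupted copy short, which is one plausible reason the change would reduce latency spikes for requests served during an update.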

Test Plan:
**Local Test**
```
./ai_infra/inference_platform/test_platform/script/run_sigrid_4card.sh --port 7451 --local_model_dir /home/lujia/script --cuda_devices 6 --bind_node 3 --model_id 962549778_514 --gflag_config_path sigrid/predictor/predictor_x_gflags_mrs_prospector_gpu_torchscript_fusedsolution_1card_opt_fm -- --enable_thrift_warmup=false --tgif_replicate_merge_by_tempfile=false --enable_inplace_snapshot_transition --model_version_config_path sigrid/predictor/models_version/lujia_test --inplace_update_max_retries 0 --submod_to_device="merge|cuda0"
```

**Load test on job tsp_eag/smart/inference_platform_sp__sigrid_predictor_gpu_adhoc_realtimetest_m962549778_latest.s3**

Before:
(p99 latency)
{F1066957232}

(SR error rate)
{F1066957650}

After:
(p99 latency)
{F1066957141}

(SR error rate)
{F1066957376}

Differential Revision: D48182533

Pull Request resolved: https://github.com/pytorch/pytorch/pull/106849
Approved by: https://github.com/842974287, https://github.com/kit1980
2023-08-13 07:37:46 +00:00
CMakeLists.txt
crc.cc
crc_alt.h
file_adapter.cc
file_adapter.h
in_memory_adapter.h
inline_container.cc [TGIF][Inplace][Perf] Copy tensor to device with pinned memory & move copy weight sleep to getRecord (#106849) 2023-08-13 07:37:46 +00:00
inline_container.h [TGIF][Inplace][Perf] Copy tensor to device with pinned memory & move copy weight sleep to getRecord (#106849) 2023-08-13 07:37:46 +00:00
inline_container_test.cc [TGIF][Inplace][Perf] Copy tensor to device with pinned memory & move copy weight sleep to getRecord (#106849) 2023-08-13 07:37:46 +00:00
istream_adapter.cc
istream_adapter.h
read_adapter_interface.cc
read_adapter_interface.h
versions.h