Sherlock
60dbd8a1e5
Update maximum batch size for UT; Include recompute modes ( #5444 )
* Update MaxBatchSize and include recompute mode
* Minor fix for frontend test
Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-10-12 14:50:43 -07:00
Sherlock
37445d1198
Update Bert Perf Script ( #5339 )
Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-09-30 14:30:20 -07:00
Vincent Wang
7fb194d03d
Update convergence baseline for ci_test. ( #4465 )
Co-authored-by: Vincent Wang <weicwang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
2020-07-09 15:29:36 +08:00
ytaous
4380b8ba68
adjust bs size ( #4375 )
Co-authored-by: Ethan Tao <ettao@microsoft.com>
2020-06-30 10:29:48 -07:00
edgchen1
63bf587623
Use azcopy to download test data ( #4221 )
Use azcopy from download_e2e_test_data.py, add helper function for downloading azcopy.
Update download_test_data.py to use helper function.
2020-06-16 10:14:34 -07:00
ytaous
5d28efd434
opset12 code cleanup ( #4242 )
* opset12 code cleanup
* opset12 code cleanup
Co-authored-by: Ethan Tao <ettao@microsoft.com>
2020-06-15 19:45:35 -07:00
ytaous
e0334f177c
Opset12 upgrade for existing models used by perf/e2e pipelines ( #4238 )
* opset12 support
* opset12 support
* on comments
Co-authored-by: Ethan Tao <ettao@microsoft.com>
2020-06-15 14:26:53 -07:00
edgchen1
ba74914c5a
Remove evaluation output from training e2e test baseline data. ( #4092 )
2020-06-01 15:06:21 -07:00
ytaous
72d508b7a0
New perf metric - e2e throughput ( #4085 )
* new metric
* on comments
* tab to spaces
Co-authored-by: Ethan Tao <ettao@microsoft.com>
2020-06-01 12:11:34 -07:00
edgchen1
38d76cc904
Clean up training E2E test ( #4078 )
Update training E2E build to not go through CTest and call test scripts directly.
2020-05-29 09:20:47 -07:00
ytaous
fb4efafc8e
GPT-2 training perf scripts ( #3974 )
* gpt2 training perf
* gpt2 training perf
* debug
* debug
* debug
* fix bug
* minor
* on comments
* dynamic sql
* fix build
* minor
* linked hash
* on comments
* minor
* mem
* minor
Co-authored-by: Ethan Tao <ettao@microsoft.com>
2020-05-19 10:21:40 -07:00
ytaous
93eb9bcfde
Add yaml/perf scripts for new perf test pipeline ( #3909 )
* yaml/perf scripts for new pipeline
* yaml/perf scripts for new pipeline
* remove unused imports
* testing some comments change
* testing some comments change
* testing jdbc
* testing jdbc
* testing jdbc
* exclude pwd from jdbc properties
* exclude pwd from jdbc properties
* namedtuple
* on comments
Co-authored-by: Ethan Tao <ettao@microsoft.com>
2020-05-13 14:15:17 -07:00
Xueyun Zhu
f1ba9aaf34
Add pipeline transformer for wait/record node ( #3513 )
* pipeline transformer
* clean up
* address feedback
* add record/wait for first stage and updated split script
* address feedback
* make recv/send signal as initializer
* merge
* address feedback
* unify input and initializer
* address feedback and bug fix
* minor fix
* windows build
* fix
2020-04-22 23:28:01 -07:00
pengwa
2c7c45076b
MaxBatchSize E2E Test ( #3454 )
* max batch size e2e test
* update test data snapshot
2020-04-15 09:50:44 +08:00
Thiago Crepaldi
15e32b44fd
Merge pull request #3383
Merge from master into ort_training
2020-04-06 19:05:01 -07:00
Edward Chen
95707d22a5
Disable gradient clipping for E2E test.
2020-04-06 23:07:28 +00:00
Xueyun Zhu
efc8bd738f
add pipeline graph split script ( #3275 )
* pipeline graph cut
* add element type
* add input wait event and shape info
* shape inference
* support multiple cuts
* format script
* address feedback
* address feedback
2020-03-31 19:30:18 -07:00
edgchen1
d9f628cb1d
Remove orttraining/tools/scripts/profile directory. ( #3268 )
2020-03-19 14:13:05 -07:00
Jesse Benson
3a7539e071
Update bert-base convergence values
2020-03-13 23:03:34 -07:00
Edward Chen
e542cfd0e0
Introduce training changes.
2020-03-11 14:39:03 -07:00