Commit graph

1480 commits

Author SHA1 Message Date
Changming Sun
ff52d6a6bf
Delete Dockerfile.ubuntu (#12888)
The file was solely for Nuphar.
2022-09-08 10:26:40 -07:00
Changming Sun
a811c7629f
Remove "Build Python Documentation" from py-packaging-stage.yml (#12890)
Remove "Build Python Documentation" from py-packaging-stage.yml because the task has been moved to GitHub Actions by @natke in PR #10116.
2022-09-08 09:56:54 -07:00
RandySheriffH
d3b684cd9e
Drop nuphar (#11555)
* drop nuphar code and configs

* refactor test case

* format python

* remove nuphar from training test

* remove commented nuphar logic

* restore llvm setting

* drop nuphar ci

* fix compile err

* fix compile err

Co-authored-by: Randy Shuai <rashuai@microsoft.com>
2022-09-07 15:11:18 -07:00
Hariharan Seshadri
ad69aac491
Introduce ordered quantization ops for the CUDA EP [1/n] (#12582)
Initial small core set of ordered quantization ops for the CUDA EP.
2022-09-07 11:58:15 -07:00
Yi Zhang
c571b99336
Refactor setup_test_data (#12818)
* refactor setup_test_data

* mv setup test data to test stage

* model link for C# test

* add comment
2022-09-07 08:33:27 +08:00
Baiju Meswani
295bd26980
Remove orttraining-distributed CI pipeline (#12738)
2022-09-02 14:34:26 -07:00
PeixuanZuo
adbc0757ad
[UPDATE] update ROCm ci pipeline to ROCm5.2.3 (#12799)
* [Update] update to rocm5.2.3

* [Fix] cmake version

* [Fix] disable ortmodule tests

* [revert] revert performance number
2022-09-01 10:32:24 +08:00
Vincent Wang
262a597e2a
[CUDA] BiasSoftmax and Dropout Fusion (#12667)
* bias softmax dropout fusion

* fix rocm build

* move some files
2022-09-01 10:01:44 +08:00
Baiju Meswani
a52543ecd8
Generate windows training package (#12789)
2022-08-30 16:35:50 -07:00
Yulong Wang
82a28cc2c3
upgrade emsdk to 3.1.19 (#12690)
* upgrade emsdk to 3.1.19

* fix build break

* ignore '-Wunused-but-set-variable' in eigen

* add malloc and free in exported functions

* EXPORTED_FUNCTIONS
2022-08-30 13:42:45 -07:00
Yi Zhang
b4f6dad7c9
increase timeout limit of mac silicon package workflow (#12784)
increase timeout
2022-08-30 13:57:01 +08:00
PeixuanZuo
19ca2a0089
[ADD] python package pipeline for ROCm5.2.3 (#12770)
* [TEST] test rocm5.2.3

[TEST] rm torchversion

[Update]sort

Co-authored-by: Ubuntu <peixuanzuo@peixuanzuomi200vm.zvflicr54joexhdgnhvmxrxygg.phxx.internal.cloudapp.net>
2022-08-30 11:05:59 +08:00
Edward Chen
1ce14e752b
Increase timeout for clean-build-docker-image-cache-pipeline. (#12776)
2022-08-29 15:30:35 -07:00
Baiju Meswani
80c8d934b8
Add debug option to packaging pipeline (#12685)
2022-08-26 20:25:52 -07:00
mwootton
817dc94345
Add first pass of rocm kernel profiler (#10911)
* Add first pass of rocm kernel profiler

* Clean up rocm_profiler. Format args. Demangle kernel names.
Add Api EventRecords

* Remove debug output

* Temporarily disable profiling unit test 'api record check' for cupti

* Fix compile error for non-gpu builds

* Use common file for demangle and pid/tid.  Namespace ThreadUtil.  Fix gpu buffer clearing.

* Merge demangle into profiler_common

* Merge demangle into profiler_common part 2

* Style cleanup

* Resolve linking issues via ProviderHost interface

* Demangle cuda kernel names

* Clean up comments

* Fix formatting

* Fix anal retentive formatting
2022-08-26 19:38:03 -07:00
Adam Louly
ee543a47f6
upgrade cuda version on ci pipelines (training CI pipelines) (#12708)
* upgrade cuda version on ci pipelines

* keeping folder name same

* keeping folder name same

* setting manual seed for primitive test case

* resolving comments

* changing atol and rtol only for test case

Co-authored-by: Adam Louly <adamlouly@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-08-26 16:51:19 -07:00
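The last bullet of ee543a47f6 above adjusts atol and rtol for one test case. As a rough illustration of what those two tolerances control, here is a minimal sketch assuming the common convention |actual - expected| <= atol + rtol * |expected| (the one documented for numpy.isclose and torch.testing.assert_close). The helper name and default values are hypothetical and are not taken from the commit.

```python
def within_tolerance(actual, expected, rtol=1e-5, atol=1e-8):
    """Return True if actual is close to expected.

    Hypothetical helper: atol bounds the absolute error (it matters when
    expected is near zero), rtol bounds the error relative to |expected|.
    """
    return abs(actual - expected) <= atol + rtol * abs(expected)


# A difference of about 1e-4 on a value of 1.0 fails the tight default rtol ...
strict = within_tolerance(1.0001, 1.0)
# ... and passes once rtol is relaxed for this one comparison.
relaxed = within_tolerance(1.0001, 1.0, rtol=1e-3)
# Near zero, only atol can absorb the error because rtol * |expected| is 0.
near_zero = within_tolerance(1e-9, 0.0)
```

Relaxing the tolerances for a single comparison, as the commit message describes, keeps the remaining tests at their stricter defaults.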
Baiju Meswani
34d90dd5bd
mac-objc-static-analysis-ci-pipeline increase timeout (#12737)
2022-08-26 12:49:49 -07:00
Adam Louly
3bb5fb0f90
moving training pipelines from cuda 11.5 to 11.6 and deprecating 11.3 (packaging pipeline) (#12688)
* moving training pipelines from cuda 11.5 to 11.6 and deprecating cuda 11.3

* change to cuda 11.6.2

* change pytorch's & torchvision's cuda version to 11.6

* specify deps version to 11.6.2

* update pytorch and torchtext version

* torch 1.12.1

* change torchvision and torchtext version to be compatible with torch 1.12.1

* change cuda to 11.6 for cuda_home compatibility

Co-authored-by: Adam Louly <adamlouly@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-08-25 22:12:01 -07:00
Cheng
baf141a084
Enable xnnpack EP in Android AAR package (#12720)
* take new features to export symbols

* comments to explain why
2022-08-26 10:29:23 +08:00
Scott McKay
8483b9c6e3
MacOS pipeline and MAUI CoreML fixes (#12724)
* Add asm statement to model.mm to force linker to link against CoreML.Framework.

Update targets.xml as per Rolf's suggestions

* Remove explicit numpy version from macos build. We don't specify it for other CIs and the version specified doesn't have a pre-built 3.10 wheel. This leads to the CI attempting to build numpy which fails.
2022-08-26 08:51:37 +10:00
Cassie Breviu
e85dce8cea
Add csharp docfx (#12596)
* add docfx and gh action to build docs

* kick off build from feature branch

* Fix LGTM linting

* update az pipeline to win22 & remove nuget install

* remove azure ci changes

* fix implicit using to support 5.0

* fix more js issues

* remove resource designer changes

* remove space

* fix linting misspellings in autogenerated js temp

* fix misspellings in generated code

* delete log file
2022-08-25 09:51:32 -05:00
Yi Zhang
dee2fdffb0
Remove debug build/test in Mac CPU training (#12698)
* run mac training in parallel

* update jobname

* remove debug build/test
2022-08-25 13:38:53 +08:00
Yi Zhang
d91f017da1
remove redundant publish unit test results (#12697)
rm redundant publish unit test results
2022-08-25 11:18:07 +08:00
Cheng
eba4f77d00
enable xnnpack in default_full_aar_build_settings (#12682)
2022-08-25 10:41:06 +08:00
Changming Sun
7927d525a7
Remove CUDNN path from CI build scripts (#12671)
2022-08-24 18:21:50 -07:00
Adam Louly
94f76b944e
nightly pipeline build using PTCA image. (#12605)
* nightly pipeline yaml and requirements files

* changed names, removed torchvision installing

* delete old file

Co-authored-by: Adam Louly <adamlouly@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-08-24 10:40:55 -07:00
Changming Sun
cb2601c5ea
Update mac-ci.yml to increase macOS build jobs' timeout value to 3 hours (#12675)
2022-08-22 21:31:30 -07:00
Wei-Sheng Chin
dc486d146b
Make ORT callable from various Pytorch compilers (LazyTensor, TorchDynamo, etc) (#10460)
* Make ORT as Pytorch JIT backend

LORT likely doesn't work with aten fallback so we only test LORT in its own CI.

* Revert changes to enable external CUDA allocator. Will add it later.

Revert "Revert changes to enable external CUDA allocator. Will add it later."

This reverts commit d5487f2e193014c805505afae8fb577c53667658.

Fix external allocator

* Relax tolerance and remove commented code

* Print more information in CI

* Fix pointer

* Address comments.
1. Reuse ORT-eager mode's environment.
2. Remove unused ctor.

* Use Pytorch master branch as all PRs are merged

Fix

* Refine based on cpplint feedback

* Revert changes to allow custom CUDA allocator in public APIs

* Use torch.testing.assert_close

* Use unittest framework

* Switch docker repo

* Rename *.cpp to *.cc

* Address comments

* Add comment

* Use same pipeline file for eager and lort pipelines

* Address comments

* Add yaml comment

* Fix cmake files

* Address comments

* Rename flags, remove printing code, remove dead comment
2022-08-22 09:40:40 -07:00
Changming Sun
b270334e1e
Update numpy version from 1.21.0 to 1.21.6 to avoid building it from source (#12644)
2022-08-18 22:11:48 -07:00
Thiago Crepaldi
d1ba801570
Add BuildError for --gen_doc and --enable_training (#12630)
2022-08-17 14:18:37 -04:00
yf711
9d10badc55
Add build option to link TensorRT prebuilt parser (#12602)
* Add build option to link prebuilt TensorRT parser

* Test without the build option to link prebuilt TRTParser

* Minor: update name of build option

* Minor: update name of build option
2022-08-16 14:09:58 -07:00
Xinya Zhang
eb827bd3e5
[ROCm] NGramRepeatBlock, LongformerAttention and DecoderAttention Ops (#11971)
* [ROCm] enable NGramRepeatBlock Op

* [ROCm] Enable testing ROCm in NGramRepeatBlockTest.NGramSize_3

Also link onnxruntime_test_all with amdhip64 when USE_ROCM=1

* [ROCm] add LongformerAttention Op

* [ROCm] Enable LongformerAttentionTest

* [ROCm] Add DecoderAttention Op

* Enable DecoderAttention Test for ROCm.

* [ROCM] Updates according to reviews
2022-08-11 19:32:08 -07:00
Changming Sun
ac7538b909
Remove CUDA 10.2 support (#12541)
2022-08-10 22:46:41 -07:00
Baiju Meswani
3e78f3cf1f
Add win-ci pipeline for on-device training (#12513)
2022-08-10 14:45:39 -07:00
Changming Sun
c0d396d176
Restrict "Component Detection" task to Lotus project only (#12536)
It is related to PR #12426
2022-08-10 03:25:29 -07:00
Changming Sun
e810480403
Replace the occurrences of "master" with "main" in yaml files (#12534)
2022-08-09 22:03:21 -07:00
Vincent Wang
e85e31ee80
Update ORTModule Default Opset Version to 15 (#12419)
* update ortmodule opset to 15

* update torch version

* fix ut

* fix ut

* rollback

* rollback for orttrainer
2022-08-05 16:55:04 +08:00
PeixuanZuo
3e1b0ac4b3
[DELETE] delete python package rocm4.3.1 (#12480)
[delete] delete rocm4.3.1
2022-08-05 13:27:42 +08:00
Vincent Wang
37995a7245
[CUDA] BiasSoftmax Supporting New Pattern (#12361)
2022-08-05 06:59:24 +08:00
Xinya Zhang
77cab7a3a5
[ROCm] Add AveragePool, GlobalAveragePool, MaxPool, GlobalMaxPool Ops (#11968)
* [ROCm] disable expected failure tests PoolTest.MaxPool_10_DilationPadding_?d

* [ROCm] Add AveragePool, GlobalAveragePool, MaxPool, GlobalMaxPool Ops

* (To squash after review) Replace rocm/nn/pool.cc with amd_hipify.py changes

* [ROCM] Replace miCompat with Helper functions

* (to squash) fix the compiling error of SetPoolingNdDescriptorHelper
2022-08-03 14:36:36 -07:00
Xinya Zhang
01f3a197d7
[ROCm] InstanceNormalization, BatchNormalization and LRN Ops (#11972)
* [ROCm] Add InstanceNormalization Op

* Enable InstanceNormBatch1_fp16 and InstanceNormBatch2_fp16 for ROCm

* [ROCm] Add BatchNormalization for fp32 and fp16

* Enable BatchNormTest for ROCm

* [ROCm] Add LRN Op

* [ROCM] replace miCompat functions with Helper functions
2022-08-02 23:14:26 -07:00
Changming Sun
5d610bc8eb
Disable CG task in PR pipelines (#12426)
2022-08-02 19:01:41 -07:00
Yulong Wang
feed5da435
[js] loosen test timeout (#12427)
Loosen the following test timeouts:

1. "Test Web Multi-Browsers" stage in "ONNX Runtime Web CI Pipeline": 30min -> 60min
2. Node.js binding default per-case timeout: 30 sec -> 90 sec
2022-08-02 19:01:19 -07:00
Changming Sun
1a64b94f60
Fix a small issue in nuget packaging pipeline (#12405)
In #12358 I typed a wrong path in the yaml file.
2022-08-02 15:44:43 -07:00
Yi Zhang
5d1173fe68
Run IOS pipeline concurrently (#12400)
split ios pipelines
2022-08-02 11:07:17 +08:00
Yi Zhang
63d64636f6
Add the comment linking to wiki (#12398)
add the comment
2022-08-02 10:09:16 +08:00
Yi Zhang
8b4ad77ea2
pipeline can use last run's artifacts (#12379)
* first step

* depends on stage

* temp change

* specific

* runId

* parameters

* fix typo

* fix typo

* add nnapi

* add nnapi

* fix typo

* minor fix

* condition on stage

* format

* format
2022-07-30 21:34:57 +08:00
Changming Sun
7b4ce0c1e1
Delete the build scripts that were copied from manylinux project (#12358)
1. Delete the build scripts that were copied from manylinux project. Use "git checkout" instead.
2. Update manylinux version to get python 3.11. Related issue: Python 3.11 support #12343
3. Change the CUDA version of the Linux GPU build job in the NuGet packaging pipeline from CUDA 11.4 to CUDA 11.6 to match the TRT job within the same pipeline. (Many other places need to be updated as well, but I'd prefer to put them in another PR.)
4. Make Dockerfile names static. For example, replace tools/ci_build/github/linux/docker/$(DockerFile) with tools/ci_build/github/linux/docker/Dockerfile.manylinux2014_cpu. The former relies on the runtime variable $(DockerFile), but template parameters are expanded early in processing a pipeline run, when most variables are not yet available. It is like C++ macros versus variables.
2022-07-29 18:24:19 -07:00
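Item 4 of 7b4ce0c1e1 above turns on the difference between values expanded when a template is processed and values that only exist at run time. The following is a minimal sketch of that two-phase idea, not the pipeline engine itself. It assumes the `${{ parameters.name }}` template syntax and the `$(Name)` macro syntax; every function name here is hypothetical.

```python
import re

# Assumed syntaxes: ${{ parameters.name }} for template parameters,
# $(Name) for runtime variables.
TEMPLATE_PARAM = re.compile(r"\$\{\{\s*parameters\.(\w+)\s*\}\}")
RUNTIME_VAR = re.compile(r"\$\((\w+)\)")


def expand_template(text, parameters):
    """Phase 1 (early): substitute template parameters only."""
    return TEMPLATE_PARAM.sub(lambda m: parameters[m.group(1)], text)


def expand_runtime(text, variables):
    """Phase 2 (late): substitute runtime variables while the job runs."""
    return RUNTIME_VAR.sub(lambda m: variables[m.group(1)], text)


def is_static(text):
    """A path is static if no runtime variable is left in it."""
    return RUNTIME_VAR.search(text) is None


docker_dir = "tools/ci_build/github/linux/docker"
dynamic_path = docker_dir + "/$(DockerFile)"
static_path = docker_dir + "/Dockerfile.manylinux2014_cpu"

# After the early phase the dynamic path still holds an unresolved variable,
# so anything that needs the real file name at that point cannot use it.
after_phase1 = expand_template(dynamic_path, {})
resolved = expand_runtime(
    after_phase1, {"DockerFile": "Dockerfile.manylinux2014_cpu"}
)
```

The static path needs no second phase, which is why hard-coding the Dockerfile name sidesteps the ordering problem the commit describes.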
ytaous
e4bd41fb3b
[ROCm] Enable Einsum for inferencing perf (#12360)
* enable einsum

* address comments

* comments

Co-authored-by: Ethan Tao <ettao@microsoft.com>
2022-07-28 20:26:25 -07:00
Jian Chen
7a7e372b9f
Remove training cuda 10.2 pipeline (#12347)
* update to 2022

* Update the VS version

* Rolling back to gcc 10

* Rolling back

* Update cuda home

* remove "CMAKE_CUDA_ARCHITECTURES=52"

* update cuda architecture to 70

* Delete cuda 10.2 training pipeline

* rolling back a mistake

* Update win-gpu-reduce-op-ci-pipeline.yml

* Update win-gpu-reduce-op-ci-pipeline.yml

* Update win-gpu-reduce-op-ci-pipeline.yml

* Delete tools/ci_build/github/linux/docker/scripts/training/ortmodule/stage1/requirements_torch1.10.0_cu10.2 directory

* Delete tools/ci_build/github/linux/docker/scripts/training/ortmodule/stage1/requirements_torch1.11.0_cu10.2 directory
2022-07-28 14:58:17 -04:00