898 commits

Author SHA1 Message Date
Vincent Wang
9a22b5d253
Strided Tensor Support for Eager Mode (#10578)
* strided tensor for eager mode

* fix build and resolve comments

* fix win x86 build
2022-03-01 14:25:31 +08:00
Thiago Crepaldi
e788cc2a23
Convert com.microsoft::ATen into org.pytorch.aten::ATen onnx op (#10060)
Signed-off-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>
2022-02-28 14:14:45 -05:00
harshithapv
037f08f1ff
Fix unsqueeze for opset 13 for ReduceMean Grad (#10668)
* fix unsqueeze for opset 13 for reducemean grad

* fix input for reduce mean
2022-02-28 09:55:52 -08:00
David Fan
617474e298
Stop gradient edges for aten::argmax (#10650)
2022-02-24 21:14:53 -08:00
Dmitri Smirnov
2679711bee
Refactor transformers and other code to reduce memory allocation calls (#10523)
Work on minimizing memory management calls by
  reducing number of allocations and copies.
  Replace std::unordered_set to InlinedHashSet
  and add usage of InlinedVector.
  Employ std::move() to minimize copying and memory allocations.
  Remove copying of the const shared data into each of the
  PropagateCast transformer instances.
  Move inlined_containers.h header to include/common
  Adjust AsSpan implementation for C++ < 17
2022-02-24 16:17:14 -08:00
Tang, Cheng
7660eeef3e
fix ortmodule's output device info when it runs on ort device (#10616)
Co-authored-by: Cheng Tang <chenta@microsoft.com@orttrainingdev9.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-02-24 10:22:55 -08:00
Justin D. Harris
742694f679
[python] [orttraining] Add utility to export a graph to compute gradients (#8125)
2022-02-18 14:00:49 -08:00
Scott McKay
df841ee87d
Fix incorrect type constraint registration for operator kernels. (#10489)
* Fix incorrect type constraint registration for RoiAlign. This led to the input type not actually being checked when matching a kernel as the invalid constraint name is treated as a missing optional input.
  * fix missing dependency for the unit test exe. Whilst it doesn't link against the CUDA providers lib, without the dependency VS doesn't know it needs to rebuild the library if there are changes.
* Add check for invalid type constraints.
* Fix invalid registrations for other kernels.
* Add hash replacement logic to provide backwards compatibility in ORT format models when the registration is fixed.
* Add tests
2022-02-18 16:55:32 +10:00
Pallavi Deshmukh
ccd7a2d840
Fix build failure when using clang compiler
2022-02-16 17:52:45 -08:00
ytaous
4f76c38686
Revert "Reduce max gradient (#9859)" (#10574)
This reverts commit 7443edb0bf.
2022-02-16 16:02:30 -08:00
Anh Nguyen
7443edb0bf
Reduce max gradient (#9859)
* ReduceMax gradient builder

* Update gradient_builder.cc

* Add CI fix

* Remove whitespace

* Update gradient_builder.cc

* Update gradient_ops_test.cc

* Fix Windows CI tests

Co-authored-by: root <tuananhnguyen7198@gmail.com>
2022-02-15 22:38:19 -08:00
Anh Nguyen
0c3e88944d
Fix create ort value hardcoded memory info to CPU (#10510)
* Fix create ort value hardcoded memory info to CPU

* Remove unneeded check

* Remove unneeded header

* Remove unneeded header

* Update ort_ops.cpp

* Update ort_ops.cpp

* Update ort_ops.cpp

* Update ort_ops.cpp

Co-authored-by: root <root@QTM-ANHNGUYEN-1.northamerica.corp.microsoft.com>
2022-02-15 10:40:44 -08:00
Baiju Meswani
7691e7ed12
Introduce load balancing dataset samplers (#10163)
2022-02-14 13:46:14 -08:00
ytaous
4e2a974090
[ROCm] UTs and code clean up (#10511)
* Fix UT

* UT

* UTs

* enable ROCm UT

* fix build attempt

* minor

* fix UT

* fix UT

* fix UTs

Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
Co-authored-by: root <root@GCRAMDRR1-MI100-087.redmond.corp.microsoft.com>
2022-02-11 08:23:25 -08:00
Edward Chen
f92e47e95b
Remove onnxruntime_util dependency on onnxruntime_framework (#10512)
There's a circular dependency between onnxruntime_util and onnxruntime_framework.
Remove onnxruntime_util's dependency on onnxruntime_framework.
2022-02-10 19:17:08 -08:00
Hubert Lu
c9fbd0b15a
Optimize cuComputePartGradGammaBeta kernel for MI100 (#10475)
* Optimize cuComputePartGradGammaBeta kernel for MI100

Co-authored-by: root <root@gb-sjc2-10.local.lan>
Co-authored-by: Jeff Daily <jeff.daily@amd.com>
2022-02-09 12:51:06 -08:00
ashbhandare
7e5d68eea6
gradient and test (#10455)
Co-authored-by: Aishwarya Bhandare <aibhanda@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-02-08 10:18:22 -08:00
ytaous
435e14d60a
[ROCm] BFloat16 support (#10465)
* bf16 support

* minor clean up

* UTs

* fix build

* UTs

* UTs

* merge commit 6b5504c

* minor

* ROCm code cleanup

* fix build

* fix build

* minor

Co-authored-by: Ethan Tao <ettao@microsoft.com@orttrainingdev7.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
Co-authored-by: root <root@GCRAMDRR1-MI100-087.redmond.corp.microsoft.com>
2022-02-07 22:55:15 -08:00
ytaous
63198a6566
[ROCm] BFloat16 support (#10447)
* bf16 support

* bf16 support

* UTs

* fix build

* fix UTs

Co-authored-by: root <root@GCRAMDRR1-MI100-087.redmond.corp.microsoft.com>
2022-02-03 11:31:14 -08:00
Changming Sun
ec4362f8f3
Enable more static analysis warnings and enable the analyzer for training cpu (#10176)
2022-01-27 11:17:20 -08:00
ashbhandare
cf13b9dd5e
Symbolic export for numpy_T (#10390)
* Export numpy_T as onnx transpose

* further fixes, test

Co-authored-by: Aishwarya Bhandare <aibhanda@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-01-26 14:14:42 -08:00
Tang, Cheng
9aa51379c9
[eager mode]: add configuration for ort virtual device count (#10346)
* add configuration for ort virtual device count

* fix build break

* fix ci build break

Co-authored-by: Cheng Tang <chenta@microsoft.com@orttrainingdev9.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-01-25 16:15:54 -08:00
pallavides
790c3be7e9
Fix Reshape issue when shape size is -1 (#10356)
* Fix Reshape issue (in_place) when shape size is -1
2022-01-24 19:30:52 -08:00
Dmitri Smirnov
7e092a7e3f
Reduce number of memory allocations based on a customer profiling case (#10193)
Add abseil and inlined containers typedefs
Introduce TensorShapeVector for shape building.
Use gsl::span<const T> to make interfaces accept different types of vector like args.
Introduce InineShapeVectorT for shape capacity typed instantiations
Refactor cuda slice along with provider shared interfaces
Refactor Concat, Conv, Pad
Build with Conv Einsum and ConvTranspose refactored.
Remove TensorShape::GetDimsAsVector()
Refactor SliceIterator and SliceIteratorBase
Refactor broadcast
Refactor Pads for twice as long
Remove memory planner intermediate shapes vector
Refactor orttraining
Fix passing TensorShapeVector to tests
Remove abseil copy and submodule, use FetchContent_Declare/Fetch
Path with separate command
Make RocmAsyncBuffer accept anything convertible to span. Adjust Linux GPU pipeline.
2022-01-24 10:40:46 -08:00
Baiju Meswani
141606534c
Add support for FusedAdam to be mathematically equivalent to pytorch/AdamW (#10106)
2022-01-21 13:37:59 -08:00
Cheng Tang
13e277525c
fix whitelist
2022-01-21 13:30:53 -08:00
Tang, Cheng
2dcb69685e
support type promotion in binary operators in eager mode (#10285)
Co-authored-by: Cheng Tang <chenta@microsoft.com@orttrainingdev9.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-01-20 10:06:09 -08:00
Baiju Meswani
c67594694c
Add ability to set onnx opset version from json config (#10223)
2022-01-20 09:10:19 -08:00
Abhishek Jindal
4aa7cee0d8
Abjindal/clean eager backend (#10055)
* clearing map for eager mode backends

* clearing map for eager mode backends manager

* making OrtBackendsManager an extern variable and trying to delete it

* cleaning backends manager when the Python interpreter exits

* adding ifdef for eager mode code

* disabling warning for pybind state file

* disabling warning for python module file

* running clang auto format and reducing redundancy

* remove new line

* moving declaration to a new header file

* adding the header file for eager mode for python module

* removing source files for eager mode

* add source file for python module in eager mode

* Update orttraining/orttraining/python/orttraining_python_module_eager.h

Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>

Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com>
2022-01-19 14:20:09 -08:00
jingyanwangms
a656c55a75
Add _force_exportable_set and pass debug_options (#10282)
* Add _force_exportable_set and pass debug_options

* Update orttraining/orttraining/python/training/ortmodule/experimental/hierarchical_ortmodule/_hierarchical_ortmodule.py

Co-authored-by: Wei-Sheng Chin <wschin@outlook.com>

* nit fix

* Update orttraining/orttraining/python/training/ortmodule/experimental/hierarchical_ortmodule/_hierarchical_ortmodule.py

Co-authored-by: Wei-Sheng Chin <wschin@outlook.com>

Co-authored-by: Wei-Sheng Chin <wschin@outlook.com>
2022-01-19 10:26:27 -08:00
David Fan
7b14c70cfe
[ortmodule] Ensure contiguous tensor into forward pass (#10315)
2022-01-18 22:06:37 -08:00
pengwa
e365ad7f3a
fix deadlock in model.train mode forward run only (#9960)
* fix deadlock in model.train mode forward run only

* fix tests

* clear the grad_fns before every forward run

* add clean up on exit

* fix

* refine code comments
2022-01-14 13:53:29 -08:00
Vincent Wang
44e2db9397
CUDA BFloat16 Refactor (#10085)
2022-01-14 19:38:56 +08:00
Vincent Wang
3ea7fb0f9f
fix mem leak (#10272)
2022-01-14 14:54:19 +08:00
ashari4
aff96ce081
remove hardcoded type (#10251)
2022-01-12 10:00:34 -08:00
PeixuanZuo
7d93498e0e
[FIX] register softmaxgrad_13/logsoftmaxgrad_13 for rocm (#10177)
* [FIX] register  softmaxgrad_13/logsoftmaxgrad_13 for rocm
* [FIX] update softmaxgrad_13/logsoftmaxgrad_13 implementation for rocm
2022-01-10 11:33:46 +08:00
Abhishek Jindal
4ac3277743
adding definition of concat operator for mapping it to onnx (#10062)
* adding definition of concat operator for mapping it to onnx

* adding the opgen generator file to include tensorlist type for eager mode
2022-01-06 14:56:35 -08:00
ashari4
4ab891999a
fix hardcoded type (#10205)
2022-01-06 09:28:22 -08:00
ashari4
7b5464ed7b
aten add_ op supports bf16 (#10084)
* hand implemented add_
2022-01-05 09:33:28 -08:00
Tang, Cheng
97659495d9
fix aten view op (#10050)
* fix aten view op

* add test case

* fix signature

* fix the build

Co-authored-by: Cheng Tang <chenta@microsoft.com@orttrainingdev9.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2022-01-04 08:29:30 -08:00
Edward Chen
3bc91c2151
Move reduced ops files into build directory (#10030)
In a reduced ops build, some source files get updated. This change moves the updated files into the build directory. This way, it is easier to simultaneously manage different build directories (with possibly different reduced ops configurations) based on a single source directory.
2021-12-28 19:04:20 -08:00
Vincent Wang
f780f06240
ConcatGrad for OpSet13 (#10109)
2021-12-24 10:02:52 +08:00
satyajandhyala
bd4fb4c5da
Coding style fix. (#10080)
2021-12-18 12:05:48 -08:00
ashari4
cdbd678192
Check kMSDomain already exists before registering it (#10078)
* Check domain before registration
2021-12-17 17:55:15 -08:00
Bowen Bao
102f9b05e1
Support new symbolic function api from PyTorch with PythonOp (#9880)
* Support new symbolic function api from PyTorch with PythonOp

* Specify exact exception

* add comments

* move comments and arg
2021-12-16 11:08:06 -05:00
George Nash
93636cbd20
Reduce ops for DNNL ep (#10056)
* Add Reduce Ops to DNNL ep

Combine the Reduction ops into one class

Add ReduceL1, ReduceL2, ReduceSum, ReduceMax, ReduceMin, and ReduceProd,
ReduceSumSquare, ReduceLogSum, and ReduceLogSumExp

Reduce code now also handles the keepdims attribute

Also updated code to use HandleNegativeAxis function from
the providers/common.h code instead of manually calculating.

In-code documentation now helps explain the complex reduction op code

Add elementwise ops to the Reduction op capability code and remove the
keepdims check from it.

Updated the error_tolerance for LogGrad (DNNL EP only) after finding a few
instances where the tests were slightly out of tolerance.

Signed-off-by: George Nash <george.nash@intel.com>

* Documentation cleanup in dnnl_qattention

Cleaned up the comments documenting the QAttention operator.
For some reason a number of newlines had been introduced into the
comment, making it harder to read.

Signed-off-by: George Nash <george.nash@intel.com>
2021-12-16 07:31:16 -08:00
Tang, Cheng
6357c12977
use inplace reshape (#9991)
Co-authored-by: Cheng Tang <chenta@microsoft.com@orttrainingdev9.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2021-12-15 21:17:29 -08:00
ashari4
af71da0ac6
Yield op supports bf16 (#10035)
2021-12-14 13:12:37 -08:00
ashari4
9e04b7e59b
Remove memcpy in in-place ATen ops (#9913)
* Make ops in-place

* Add comment
2021-12-14 08:28:12 -08:00
Changming Sun
7b63d1102b
Fix some warnings in orttraining code (#10009)
2021-12-13 15:28:21 -08:00