* Add better message for subgraph output coming directly from outer scope value.
* Use regex to match value name as the test model is processed in a different order on different platforms.
* skeleton change
* adam compute kernels
* add rtol/atol for tests
* some clean up
* optional outputs
* more clean up
* add tests
* adamw mode=1 test pass
* clean up tests
* add HF AdamW test cases
* refactor adam test file
* make test pass
* all test pass, fix comments
* rename to adamw
* make test pass again
* fix cpplint
* minor fixes
* fix python lint
* Fix build and tests
* fix builds
* fix windows build
* fix win build
* minor fix
* Refine based on comments
* resolve comments
* formatting
* resolve comments
* add ut
* Implement BitmaskDropout and associated unit tests.
* Implement BitmaskDropoutGrad and associated unit tests.
* Implement Dropout -> BitmaskDropout rewrite rule and associated unit tests.
* Implement (Dropout,DropoutGrad) -> (BitmaskDropout,BitmaskDropoutGrad) rewrite rule.
This commit does not yet include unit tests for this rewrite rule.
This commit also introduces improved documentation for all changes which will be grouped
into this PR.
* bitmask dropout
* fix win build
* bugfix for rocm
* bugfix
* fix code format
* fix ut
* fix build break
* fix ut in win
* resolve comments
* fix ut in trt
* resolve comments
* fix rocm build error
* fix typo
Co-authored-by: Aidan Beggs <aidanbeggs@microsoft.com>
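As a rough illustration of the idea behind the bitmask variants above, the dropout mask can be stored as packed bits instead of one boolean per element. This is a minimal sketch only: the 32-bit word size, the least-significant-bit-first layout, and the helper names `pack_mask` and `unpack_mask` are assumptions for illustration, not taken from the commits.

```python
def pack_mask(mask):
    """Pack a list of booleans into 32-bit words (assumed LSB-first layout)."""
    words = [0] * ((len(mask) + 31) // 32)
    for i, keep in enumerate(mask):
        if keep:
            # Set bit (i % 32) of word (i // 32) when the element is kept.
            words[i // 32] |= 1 << (i % 32)
    return words


def unpack_mask(words, n):
    """Recover the first n booleans from the packed words."""
    return [bool((words[i // 32] >> (i % 32)) & 1) for i in range(n)]


mask = [True, False, True] + [False] * 30 + [True]
packed = pack_mask(mask)
restored = unpack_mask(packed, len(mask))
```

Under these assumptions, 34 mask elements fit into two words rather than 34 separate boolean entries.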
* Fix torch cpp ext build when CPU wheel is installed but GPU card is present
Also there is a minor improvement for ATen operator that allows both
"::op" and "aten::op" name for operators
* Fix flake8 false positive
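The ATen operator naming improvement mentioned above (accepting both "::op" and "aten::op") could be handled by normalizing the name before lookup. The sketch below is hypothetical: `normalize_aten_name` is an illustrative helper, not a function from the codebase.

```python
def normalize_aten_name(name):
    """Map the short form '::op' to 'aten::op'; leave other names unchanged.

    Hypothetical helper for illustration only.
    """
    if name.startswith("::"):
        return "aten" + name
    return name


short_form = normalize_aten_name("::add")
long_form = normalize_aten_name("aten::add")
```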
This includes a series of unit tests that exercise
the MatMul fusion. This is not an exhaustive list
of tests. The tests focus on patterns seen in
models, with additional tests to cover at least
one instance of each operator type that can be part
of the fusion.
Signed-off-by: George Nash <george.nash@intel.com>
* [UPDATE] update amd ci pipeline to rocm5.1.1
* [FIX] json format error
* [ERROR] disable unit tests
* [FIX] ucx error
* [FIX] cmake version
* [FIX] unit tests
* add so_folder option to TVM EP options. add TvmSoEP class and update TVM EP factory
* compilation from so_folder was implemented
* update TVMCompiler for default pipeline and compilation from shared lib
* filter excess so-file in so_folder
* clean Compile method and vm conditions
* implementation of TVMSoCompile on native side instead of python API
* cpplint fixes
* some fixes after review
* more cpplint fixes
* more fixes after review
* align TVMso EP with new API for compilation from #10632
* small fixes for cpplint
Co-authored-by: Valery Chernov <valery.chernov@deelvin.com>
- Enable pyright and pylint (https://github.com/microsoft/pyright) in CI
- Enable pyright, pylint and bandit by default in VS code
Pylint has some good style checks. pyright is Microsoft's static type checker.
* Share thread pools between devices
* make tests reuse device
* Change cpu thread pool options for dml sessions to use 1 thread with no spinning
* fix test failure
* Update missing type constraints for dft
* Add comment and rename inference session parameter
* add missing default that caused inconsistent test behavior
Co-authored-by: Sheil Kumar <sheilk@microsoft.com>
This patch uses vector intrinsics to optimize the MlasQLinearAddKernelHelper
function for POWER processors.
Co-authored-by: Rajalakshmi Srinivasaraghavan <rajis@linux.ibm.com>
**Description**: Extract arg value from torch Value
**Motivation and Context**
Inputs to gelu are values of type `torch._C.Value`. As a result, the `if approximate == "none"` check always failed, which prevented the optimized `com.microsoft::Gelu` op from being used.
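The failure mode described above can be reproduced without torch. In this minimal sketch, `FakeValue` is a hypothetical stand-in for `torch._C.Value`, and `extract_arg`, `pick_op_buggy`, and `pick_op_fixed` are illustrative names rather than the actual functions touched by the change; the `"fallback"` string is a placeholder for whatever path is taken when the check fails.

```python
class FakeValue:
    """Hypothetical stand-in for a graph value wrapping a constant."""

    def __init__(self, const):
        self.const = const


def extract_arg(value):
    """Hypothetical helper: unwrap the constant if the argument is wrapped."""
    return value.const if isinstance(value, FakeValue) else value


def pick_op_buggy(approximate):
    # Comparing the wrapper object to a string is never true.
    return "com.microsoft::Gelu" if approximate == "none" else "fallback"


def pick_op_fixed(approximate):
    # Extract the underlying value first, then compare.
    return "com.microsoft::Gelu" if extract_arg(approximate) == "none" else "fallback"


arg = FakeValue("none")
buggy_choice = pick_op_buggy(arg)
fixed_choice = pick_op_fixed(arg)
```

With the wrapped argument, the buggy version takes the fallback path, while the fixed version selects the optimized op.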
* initial implementation of nnapi depthtospace support
* modify depthtospace output tensor shape and enable test pass
* minor update
* minor update
* modify input output layout order and hack nnapi instance to use nchw flag for optest
* address pr comments
* add depthtospace to layout logic
* format length and revert UT log level
* add nchw and android feature level check in opsupportchecker
* minor fix
* update
* update
* fix
* minor update