onnxruntime/cmake/external
George Nash a36f627a4c
Dnnl training (#6045)
* Add ReluGrad and ConvGrad ops for the dnnl provider

* The mnist sample is updated to add the --use_dnnl option, which
causes the sample to use the dnnl execution provider for
nodes that exist in the dnnl provider.

* Added the ability to find forward ops. Dnnl backward gradient
ops require the forward primitive description and workspace
from the forward operation.

* Enable specifying the execution provider for Gradient Checker Tests

* Prevent memory leak when running dnnl_provider in training mode

Prevent creating a SubgraphPrimitivePool when the code is built with the
ENABLE_TRAINING build flag. Instead create a SubgraphPrimitive directly.

The SubgraphPrimitivePool was causing a pool of SubgraphPrimitives to be
stashed in a map for reuse. Due to the way the Training Loop uses threads,
the pool of SubgraphPrimitives was not being reused; instead a new pool of
SubgraphPrimitives was being created each run. The old pool was not instantly
freed. This behavior may stem from how the language handles thread_local
memory.
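
The reuse problem described above can be reproduced in miniature. The sketch below is not the provider's code: `SubgraphPrimitive`, `SubgraphPrimitivePool`, and `RunStepOnNewThread` are hypothetical stand-ins, and it assumes the pool lives in `thread_local` storage. It only shows that a pool stored that way is never reused by a run that happens on a different thread.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <thread>
#include <utility>

// Hypothetical stand-in for the provider's SubgraphPrimitive.
struct SubgraphPrimitive {
  explicit SubgraphPrimitive(std::string k) : key(std::move(k)) {}
  std::string key;
};

// Hypothetical pool. Because the map is thread_local, every thread
// gets its own, initially empty, map.
class SubgraphPrimitivePool {
 public:
  // Returns true if the primitive was already in this thread's pool.
  static bool GetOrCreate(const std::string& key) {
    assert(!key.empty());
    auto& pool = Pool();
    if (pool.count(key) != 0) return true;
    pool.emplace(key, std::make_unique<SubgraphPrimitive>(key));
    return false;
  }

 private:
  static std::map<std::string, std::unique_ptr<SubgraphPrimitive>>& Pool() {
    thread_local std::map<std::string, std::unique_ptr<SubgraphPrimitive>> pool;
    return pool;
  }
};

// Runs one lookup on a fresh thread and reports whether the entry was reused.
bool RunStepOnNewThread(const std::string& key) {
  bool reused = false;
  std::thread worker([&] { reused = SubgraphPrimitivePool::GetOrCreate(key); });
  worker.join();
  return reused;
}
```

Calling `RunStepOnNewThread` twice with the same key builds the primitive twice, while two calls to `GetOrCreate` on one thread reuse it.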

Signed-off-by: George Nash <george.nash@intel.com>

* Added fixes to maxpoolgrad and memory leak.

Maxpoolgrad will now pass all unit tests.
With conv and convgrad disabled for dnnl, mnist is able to train up to 95%.

Signed-off-by: Chethan Palangotu Keshava <chethan.palangotu.keshava@intel.com>

* Fixed misc issues when testing training code with dnnl provider

* fix conv_grad dnnl tests with dilation to run dnnl execution provider

* update mnist training sample to accept convolution type models

  Convolution models require the input shape to be {1, 28, 28}
  instead of the flat {784} image that is used for the gemm models.

  Models that require the different shape are enabled by adding
  `--model_type conv` to the command line when running the mnist sample.
  (While testing, a workaround was used; see #4762.)
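
As an illustration of the two input shapes, here is a minimal sketch. `ImageShapeFor` and `ElementCount` are hypothetical helpers, not functions from the mnist sample. The sketch assumes that any `--model_type` value other than `conv` means a gemm model. Both shapes describe the same 28 x 28 = 784 values per image.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <numeric>
#include <string>
#include <vector>

// Number of elements described by a shape (product of its dimensions).
int64_t ElementCount(const std::vector<int64_t>& shape) {
  assert(!shape.empty());
  return std::accumulate(shape.begin(), shape.end(), int64_t{1},
                         std::multiplies<int64_t>());
}

// Hypothetical helper: picks the per-image input shape from the
// value passed to --model_type.
std::vector<int64_t> ImageShapeFor(const std::string& model_type) {
  const std::vector<int64_t> conv_shape = {1, 28, 28};
  if (model_type == "conv") return conv_shape;
  // gemm models take the same pixels flattened into one dimension.
  return {ElementCount(conv_shape)};
}
```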

* Disable weight caching in dnnl conv operator when using training

  When training we cannot use cached weights because the weights
  are updated each run. This re-enables dnnl Conv and ConvGrad Ops.
  The weight caching was the source of the error from Conv when training.
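
A minimal sketch of why cached weights go stale during training. `ConvWeightsSketch` and `kCacheWeightsByDefault` are hypothetical names, and the plain copy stands in for whatever preparation the real Conv operator performs; only the `ENABLE_TRAINING` macro name comes from this change.

```cpp
#include <cassert>
#include <vector>

// Caching is only safe when the weights never change between runs.
#ifdef ENABLE_TRAINING
constexpr bool kCacheWeightsByDefault = false;
#else
constexpr bool kCacheWeightsByDefault = true;
#endif  // ENABLE_TRAINING

class ConvWeightsSketch {
 public:
  explicit ConvWeightsSketch(bool cache_weights = kCacheWeightsByDefault)
      : cache_weights_(cache_weights) {}

  // Returns the weights the kernel would compute with on this run.
  const std::vector<float>& Prepare(const std::vector<float>& weights) {
    assert(!weights.empty());
    if (!cache_weights_ || !has_cached_) {
      prepared_ = weights;  // stand-in for the real weight preparation
      has_cached_ = true;
    }
    return prepared_;
  }

 private:
  bool cache_weights_;
  bool has_cached_ = false;
  std::vector<float> prepared_;
};
```

With caching on, a second call with updated weights still returns the first run's values, which is the failure mode described above; with caching off, each run sees the current weights.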

* Fix issues found when building grad ops on Linux
  * The dnnl_convgrad code was overusing the scope operator,
    causing a compilation problem.
  * The dnnl_maxpoolgrad code had a logic error: it was
    comparing with the source description when it should have
    been comparing with the destination description.

* Update BUILD.md so it shows DNNL for training
  * Updated the table of contents. Since the same providers
    are listed twice, once for Inference and again for Training,
    an HTML anchor was added to distinguish the second header
    from the first for the TOC.

* Fix build failure when not using --enable-training build option

* reorganize the gradient operators so they are grouped together

* Fix issues found when running onnx_backend_test_series.py

* Pooling code only supports 2 outputs when built with --enable-training

* Address code review feedback
  * class member variables end in underscore_
  * use dst instead of dist to match the pattern used elsewhere in DNNL code.

* Remove workaround that was introduced to handle problems running
  convolution based training models. See issue #4762

Signed-off-by: George Nash <george.nash@intel.com>

* Isolate training code and code cleanup

* Do not build with dnnl_gpu_runtime if enable_training is set; training code
  does not support dnnl_gpu_runtime yet.
* Isolated training code inside ifdefs so that it won't affect the
  project if built without training enabled
* Inadvertent changes in whitespace were removed to make code review simpler
* Undid some code reordering that was not needed
* Comments added to closing #endif statements to simplify reading complex ifdefs
* Modified the GetPrimitiveDesc functions to return shared_ptr instead of a raw
  pointer. This matches what was done in the Pool code and is safer memory code.
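
The last item can be sketched as follows. `ConvFwdPrimitiveDesc` and `ConvKernelSketch` are hypothetical stand-ins for the provider's types; only the idea that `GetPrimitiveDesc` returns a `shared_ptr` comes from this change. With shared ownership the description stays valid for a gradient kernel even if the forward kernel that created it has been destroyed.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for a dnnl forward primitive description.
struct ConvFwdPrimitiveDesc {
  int id;
};

class ConvKernelSketch {
 public:
  explicit ConvKernelSketch(int id)
      : fwd_pd_(std::make_shared<ConvFwdPrimitiveDesc>(ConvFwdPrimitiveDesc{id})) {}

  // Returning shared_ptr (not a raw pointer) lets the caller share ownership.
  std::shared_ptr<ConvFwdPrimitiveDesc> GetPrimitiveDesc() const {
    assert(fwd_pd_ != nullptr);
    return fwd_pd_;
  }

 private:
  std::shared_ptr<ConvFwdPrimitiveDesc> fwd_pd_;
};
```

A raw pointer obtained the same way would dangle once the forward kernel went out of scope.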

Signed-off-by: George Nash <george.nash@intel.com>

* Address code review issues

- whitespace changes caused by running clang-format on the code
- Several spelling errors fixed
- Removed/changed some ifdefs to improve readability
- other misc. changes in response to code review.

Signed-off-by: George Nash <george.nash@intel.com>

* Code changes to address code review

- Simplify iteration code using `auto` keyword
- remove C style cast that was not needed
- remove instance variable that was not needed [relugrad.h]
- added the execution providers to `ComputeGradientErrorInternal()`
  and `ComputeTheoreticalJacobianTranspose()` instead of using
  a pointer to an instance variable [gradient_checker.h/.cc]

Signed-off-by: George Nash <george.nash@intel.com>

* Combined the default gradient ops test and dnnl gradient ops test for ConvGrad and MaxPoolGrad into one function with the help of a helper function.
This will reduce repeated code.
Signed-off-by: Chethan Palangotu Keshava <chethan.palangotu.keshava@intel.com>

* Replaced the stack used by convgrad with a vector so that the vector (used as a stack) can be easily cleared every time the graph is created.
This will prevent a memory leak from convolution kernels being pushed constantly onto the stack.
Signed-off-by: chethan.palangotu.keshava@intel.com

* Code cleanup and formatting updates

 - Removed empty else statement
 - updated indentation of code that was causing double curly brackets to look unusual
 - Changed check for NumDimensions to Size in Relu and ReluGrad error checking code.
 - isolated training code

Signed-off-by: George Nash <george.nash@intel.com>

* Restore inadvertently removed ConvGrad tests

When combining the DNNL and CPU versions of the ConvGrad
tests, two tests were inadvertently excluded. This adds
back the Conv3d and Conv3d with strides test cases.

Signed-off-by: George Nash <george.nash@intel.com>

* Add validation to ConvGrad

This validates that the dimensions of the ConvGrad match the
passed-in Convolution forward primitive description.

The current code for DNNL ConvGrad makes the assumption that the ConvGrad
nodes will be visited in the reverse order from the corresponding Conv nodes.

The added validation will return an error if this assumption is not true.

Signed-off-by: George Nash <george.nash@intel.com>

* Do not create new execution providers in provider_test_utils

This removes the code that generated new execution providers in the
OpTester::Run function. This was added because the std::move was
leaving the `entry` value empty so subsequent calls would cause a
segfault.

The problem is that this potentially changed the execution_provider because it
would create the default provider, dropping any custom arguments.

When the now removed code was originally added the std::move was causing
crashes when the GradientChecker unit tests were run.  However, it is no
longer causing problems even with the code removed.
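
The hazard behind the removed code can be shown with a small sketch. `ExecutionProviderSketch`, `ProviderList`, and `TakeProviders` are hypothetical names, not the `OpTester` API, and the sketch assumes the providers are held in `std::unique_ptr`. Moving a `std::unique_ptr` out of a container leaves a null entry behind, so any later use of that entry dereferences null.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for an execution provider.
struct ExecutionProviderSketch {
  std::string type;
};

using ProviderList = std::vector<std::unique_ptr<ExecutionProviderSketch>>;

// Moves every provider out of `entries`, as a session taking ownership would.
// After the call, `entries` keeps its size but every element is null.
ProviderList TakeProviders(ProviderList& entries) {
  ProviderList taken;
  for (auto& entry : entries) {
    assert(entry != nullptr);  // a second call on the same list would trip this
    taken.push_back(std::move(entry));
  }
  return taken;
}
```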

Signed-off-by: George Nash <george.nash@intel.com>

* Change the forward conv stack to a forward conv map

This changes how the forward conv kernel is mapped to the bwd ConvGrad
kernel. The problematic stack is no longer used.

The convolution stack made the assumption that the corresponding
ConvGrad operator would be visited in reverse order of the forward
Conv operators.  This was always problematic and was unlikely to
work for inception models.

Important changes:
- The weight_name is added to the ConvGrad dnnl_node making it
  possible to use the weight_name as a lookup key to find the
  Conv forward Kernel
- the `std::vector fwd_conv_stack_` has been replaced with a
  `std::map fwd_conv_kernel_map_`
- Although they are not needed, lock_guards were added when writing
  to and reading from the fwd_conv_kernel_map_ as well as the
  fwd_kernel_map_. These should always be accessed by a single
  thread when preparing the dnnl subgraphs, so the guard should not
  be needed, but it's added just in case.
- Updated the comments in the ConvGrad.h code to no longer mention the
  stack. The error check is not removed. It will be good to verify
  there are no errors as we continue to test against more models.

Signed-off-by: George Nash <george.nash@intel.com>

Co-authored-by: Chethan Palangotu Keshava <chethan.palangotu.keshava@intel.com>
Co-authored-by: unknown <63478620+jeyblu@users.noreply.github.com>
2021-01-29 16:05:58 -08:00
..
coremltools@523d5e03d8 Initial version of CoreML EP (#6392) 2021-01-27 10:43:17 -08:00
cub@c3cceac115 add dependency 'cub' as submodule (#1924) 2019-09-26 16:10:39 +08:00
cxxopts@3c73d91c0b Introduce training changes. 2020-03-11 14:39:03 -07:00
date@e7e1482087
eigen@d10b27fe37 Update eigen to the latest to support C++20 (#4817) 2020-08-17 10:19:48 -07:00
FeaturizersLibrary@fd5fe3de50 FeaturizersLibrary update and add variadic Input/Output to TimeSeriesImputer (#3674) 2020-04-24 08:53:00 -07:00
flatbuffers@6df40a2471 Move flatbuffers to 1.12 release (#5392) 2020-10-07 09:23:03 -07:00
googletest@703bd9caab Upgrade gtest to the latest version (#2827) 2020-01-13 20:16:48 -08:00
json@d98bf0278d Add provision in ORT for session options to be parsed when available via model file (#2449) 2019-12-03 16:56:07 -08:00
libprotobuf-mutator@7a2ed51a6b Onnxruntime fuzzing (#4341) 2020-07-06 16:34:34 -07:00
mimalloc@2d54553b7a Use a custom allocator for temporary buffers in reduction_ops.cc (#2775) 2020-02-23 16:04:30 +10:00
mp11@21cace4e57 Op kernel type reduction infrastructure. (#6466) 2021-01-28 07:27:19 -08:00
nsync@436617053d Update nsync 2020-02-20 11:25:34 -08:00
onnx@174de7d086 Refine auto_pad based pad computation in ConvTranspose (#6305) 2021-01-19 19:01:49 -08:00
onnx-tensorrt@b3eda616d3 Expose graph ModelPath to TensorRT shared library (#6353) 2021-01-26 10:41:31 -08:00
optional-lite@4acf4553ba Upgrade optional implementation to https://github.com/martinmoene/optional-lite. (#5563) 2020-11-03 15:27:47 -08:00
protobuf@498de9f761 Upgrade protobuf to 3.11.3 2020-02-12 14:47:00 -08:00
re2@30cad26715
SafeInt Revert to using release SafeInt repo now that it supports a build with exceptions disabled. (#5233) 2020-09-22 06:29:28 +10:00
tensorboard@373eb09e4c Introduce training changes. 2020-03-11 14:39:03 -07:00
tvm@eab844a872 update tvm submodule (#4284) 2020-06-19 14:51:18 -07:00
wil@e8c599bca6 Add DirectML Execution Provider (#2057) 2019-10-15 06:13:07 -07:00
dml.cmake Switch to unified DirectML 1.4.0 redistributable (#5794) 2020-11-17 13:42:23 -08:00
dnnl.cmake Dnnl training (#6045) 2021-01-29 16:05:58 -08:00
eigen.cmake apply eigen patch only for ACL. 2019-11-05 13:53:53 -08:00
featurizers.cmake Fix WCOS/Win32 linking bugs (#3126) 2020-03-19 08:52:40 -07:00
FindNumPy.cmake
jemalloc.cmake
mimalloc.cmake Use a custom allocator for temporary buffers in reduction_ops.cc (#2775) 2020-02-23 16:04:30 +10:00
onnx_minimal.cmake Support opset-13 specs of controlflow ops (Loop, If) (#5665) 2020-11-11 23:44:14 -08:00
pybind11.cmake Add python 3.9 support (#5874) 2020-11-30 12:02:48 -08:00
pyxir.cmake Initial release of Vitis-AI Execution Provider (#3771) 2020-05-19 05:32:32 -07:00
zlib.cmake