* Add iOS test pipeline and a sample app.
* clean up the unused code.
* clean up.
* revert the unknown change
* disable the shared library for iOS.
* add open source notice text.
* ignore the skipped test.
* extract the common ortenv setup
* Add minimal build option to build.py
Group some of the build settings so binary size reduction options are all together
Make some cmake variable naming more consistent
Replace usage of std::hash with murmurhash3 for kernel. std::hash is implementation dependent so can't be used.
Add initial doco and ONNX to ORT model conversion script
Misc cleanups of minimal build breaks.
* correct some errors in the flatbuffers schema, move flatbuffers submodule to cmake/external
* update the ort flatbuffers schema to use less namespace
* minor update
Co-authored-by: gwang0000 <62914304+gwang0000@users.noreply.github.com>
* update onnx to latest master
* implement per-channel for quantizelinear and dequantizelinear
* refine the unit test
* exclude sequence_insert tests
* refine onnx cmake
* add failure tests to broken_tests
* move qdq common code to a seperate function
* refine code
* add flatbuffers submodule
* test version of flat buffer schema
* test version of flat buffer schema
* minor updates
* add serialization of the value info, group defs in different namespace
* update comments
* update cgmanifest.json
* update namespace, changed typeinfovalue to use union, added root_type and file_identifier
* add new container type, add max_node_index to graph
* add serializing session state
* addressed review comments
* minor updates
Co-authored-by: gwang0000 <62914304+gwang0000@users.noreply.github.com>
1. Publish the image ACR, instead of building it every time for every PR
2. Make USE_MKLML and USE_OPENMP be able to co-exist. Currently both of them are enabled in our Linux CI build but indeed only one of them is taking effect.
3. Split nuphar and DNNL to separated pipelines.
4. Fix two warnings in onnxruntime/core/optimizer/matmul_scale_fusion.cc and onnxruntime/test/tvm/tvm_basic_test.cc.
5. Update the manylinux2010_x86_64 image to the latest.
* add training dockerfile tested for examples repo
* forgot pytorch patch for build from source
* make apt-get update -y adjacent apt-get install -y due to Docker caching rules
* comment for mellanox libraries
* mpi4py comment as I forgot where it came from
* apparently curl not included anymore
* grr.. nvidia change nccl location
* dont need findnccl.patch after nvidia changed nccl location
* pr comment /opt/ompi4 => /opt/openmpi-xxx
* switch to pip install pytorch
* use Release instead of RelWithDebInfo
* comment wording
* wordin
* missed RelWithDebInfo => Release
* replace Mellanox with libibverbs
* stale comment
* ordering
* no more ninja
* add / at end of copy
* update cgmanifest.json
* pr comments
Co-authored-by: suffian khan <sukha@OrtTrainingDev1.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>
* Move nnapi dnnlib to subfolder
* dnnlib compile settings
* add nnapi buildin build.py
* add onnxruntime_USE_NNAPI_BUILTIN
* compile using onnxruntime_USE_NNAPI_BUILTIN
* remove dnnlib from built in code
* Group onnxruntime_USE_NNAPI_BUILTIN sources
* add file stubs
* java 32bit compile error
* built in nnapi support 5-26
* init working version
* initializer support
* fix crash on free execution
* add dynamic input support
* bug fixes for dynamic input shape, add mul support, working on conv and batchnorm
* Add batchnormalization, add overflow check for int64 attributes
* add global average/max pool and reshape
* minor changes
* minor changes
* add skip relu and options to use different type of memory
* small bug fix for in operator relu
* bug fix for nnapi
* add transpose support, minor bug fix
* Add transpose support
* minor bug fixes, depthwise conv weight fix
* fixed the bug where the onnx model input has mismatch order than the nnapi model input
* add helper to add scalar operand
* add separated opbuilder to handle single operator
* add cast operator
* fixed reshape, moved some logs to verbose
* Add softmax and identity support, change shaper calling signature, and add support for int32 output
* changed the way to execute the NNAPI
* move NNMemory and InputOutputInfo into Model class
* add limited support for input dynamic shape
* add gemm support, fixed crash when allocating big array on stack
* add abs/exp/floor/log/sigmoid/neg/sin/sqrt/tanh support
* better dynamic input shape support;
* add more check for IsOpSupportedImpl, refactored some code
* some code style fix, switch to safeint
* Move opbuilders to a map with single instance, minor bug fixes
* add GetUniqueName for new temp tensors
* change from throw std to ort_throw
* build settings change and 3rd party notice update
* add readme for nnapi_lib, move to ort log, add comments to public functions, clean the code
* add android log sink and more logging changes, add new string for NnApiErrorDescription
* add nnapi execution options/fp16 relax
* fix a dnnlibrary build break
* addressed review comments
* address review comments, changed adding output for subgraph in NnapiExecutionProvider::GetCapability, minor issue fixes
* formatting in build.py
* more formatting fix in build.py, return fail status instead of throw in compute_func
* moved android_log_sink to platform folder, minor coding style changes
* addressed review comments