onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-08 00:23:03 +00:00

Author	SHA1	Message	Date
Chi Lo	5ae4c54ab8	Fix bug for validating GPU packages (#8997 )	2021-09-08 02:06:53 -07:00
George Wu	a30d9f5317	fix windows gpu pipelines that use cuda 10.2 (training, reduced_ops and 10.2 validation) (#8994 ) * build for arch 52 * arch 52 * gpu arch 52	2021-09-07 22:01:06 -07:00
Sunghoon	450524359e	[js/web] WebAssembly profiling (#8932 ) * add p50 in test * Preallocate WebAssembly worker threads to minimize worker creation time * WebAssembly profiling * merge master * merge with proxy changes * disable profiling tests from WebAssembly build * fix e2e test failure Co-authored-by: Yulong Wang <yulongw@microsoft.com>	2021-09-07 17:18:08 -07:00
ytaous	0193490cbf	ReduceMin - add int64 cuda kernel support for opset12/13 (#8966 ) * ReduceMin - int64 support * fix doc Co-authored-by: Ethan Tao <ettao@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2021-09-07 17:01:26 -07:00
Changming Sun	91c15843cd	Fix a directml python packaging error (#8981 )	2021-09-07 16:29:33 -07:00
Ye Wang	e2194797a7	bumping up to version 1.9 (#8982 ) * bump up version * makes the windowAI column align with ORT version * update the hardcoded version string * fix a typo	2021-09-07 14:30:55 -07:00
George Wu	00eca42413	make_policy(SET CMP0104 OLD) (#8793 )	2021-09-07 13:12:50 -07:00
Ryan Hill	b7971575f8	Fix python manylinux to not load cuda if it fails to load dependencies (#8882 ) * Fix python manylinux to not load cuda if it fails to load dependencies	2021-09-07 11:09:25 -07:00
Changming Sun	0bb56a18cf	Add TRT header file to ORT GPU nuget package (#8962 )	2021-09-07 09:50:09 -07:00
senysenyseny16	3be96f8a15	fix: import error in TrtTable::Dict method (#8940 )	2021-09-07 00:28:49 -07:00
Ye Wang	5d47b2e431	Add Einsum and Reciprocal op support in symbolic shape inference (#8931 ) * fix 1 * fix 2 * update * support einsum * format * test * format * add test for einsum	2021-09-06 16:54:48 -07:00
Changming Sun	60c98a86b7	CMake file changes for macOS universal2 support (#8953 )	2021-09-04 13:30:33 -07:00
stevenlix	a9776d1c70	Add QDQ model support in TensorRT EP (#8969 ) * disable setting dynamic range for QDQ model * update cgmanifest * Update cgmanifest.json	2021-09-03 19:33:34 -07:00
ytaous	53eb79f9f6	Gemm/Transpose fusion - additional pattern coverage (#8941 ) * gemm transpose fixes * enforce condition * add comments * rm redundant code Co-authored-by: Ethan Tao <ettao@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2021-09-03 15:24:47 -07:00
Scott McKay	eebcc20f10	Add netstandard2.0 framework to nuget managed package. (#8960 ) * Add netstandard2.0 to nuget managed package. Re-does PR that was backed out due to packaging pipeline changes. Allows deprecation of netstandard1.1 in the following release as netstandard2 is the preferred lowest level framework.	2021-09-04 08:01:46 +10:00
Olivia Jain	a0c9408f0d	Make TRT Version Configurable (#8864 ) * copy changes from trt_and_mem * second edits * Update linux-gpu-tensorrt-ci-perf-pipeline.yml for Azure Pipelines * Update linux-gpu-tensorrt-ci-perf-pipeline.yml for Azure Pipelines * Update linux-gpu-tensorrt-ci-perf-pipeline.yml for Azure Pipelines * change to cuda 11.4 * build with cuda 11.4 * Update Dockerfile.ubuntu_cuda11_1_tensorrt7_2 * add cmake extra defines * cmake architectures * fix cmake arch * Delete ubuntu-18.04.Dockerfile * Rename Dockerfile.ubuntu_cuda11_1_tensorrt7_2 to Dockerfile.ubuntu_cuda11_4_tensorrt7_2 * Update linux-gpu-tensorrt-ci-perf-pipeline.yml * Update linux-gpu-tensorrt-ci-perf-pipeline.yml for Azure Pipelines * removing previous ort args * rename to cuda 11.4 * remove cuda 10_2 * delete trt 7.1 * remove 7.1 * Passing in cuda architecture to reduce build time * always add submodule sync due to recursive cloning * fix run command * add and * take away unused arms and share python installation script * Update linux-gpu-tensorrt-ci-perf-pipeline.yml * Update Dockerfile.tensorrt * cleanup file * install python directly on dockerfile - move to scripts in future * Update Dockerfile.custom-trt-perf * adding cuda 11.1 for missing Libnvrtc.so.11.1 * Delete install_python.sh	2021-09-03 13:32:27 -07:00
Chi Lo	1f576e1766	Detect necessary files inside GPU packages (#8955 ) * Rename files * Update YAML files * Update validation script and YAML	2021-09-03 13:28:28 -07:00
liqun Fu	a7f5bd226b	retarget torch181 to torch182 (#8947 ) Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2021-09-03 09:44:42 -07:00
baijumeswani	0cc2909573	Auto forward non method attribute lookups to the user's model and bind custom methods to ORTModule (#8798 )	2021-09-03 08:25:44 -07:00
Vincent Wang	c343f7cb43	Add Algorithm Search for ConvGrad (#8613 ) * algo search for conv grad * global cache, bigger workspace size * fix build error * refactor * refactor * resolve comments * fix rocm * change lock places * rename variable * remove setting for inference * resolve comments	2021-09-03 11:25:17 +08:00
Tianlei Wu	91f05f387a	Update embed layer norm fusion to work with transformers v4.9 (#8914 )	2021-09-02 19:48:07 -07:00
Hariharan Seshadri	e348929019	Minor cleanup from #7592 (#8952 )	2021-09-02 18:46:57 -07:00
Scott McKay	5f30be3e92	Exclude training support from BatchNorm in minimal build (#8939 ) * Exclude changes to BatchNorm that are training specific from minimal build. Previous changes [excluded](https://github.com/microsoft/onnxruntime/pull/7704) training specific code but that was recently [undone](https://github.com/microsoft/onnxruntime/pull/8269) to support a pytorch CI need that isn't relevant to minimal builds.	2021-09-03 08:02:19 +10:00
Gary Miguel	47435311f4	Include pytorch_export_contrib_ops in inference builds (#8878 ) * Include pytorch_export_contrib_ops in inference builds Rename / move it from tools/python/register_custom_ops_pytorch_exporter to onnxruntime/python/tools/pytorch_export_contrib_ops. Rationale for inclusion in inference builds: This code is potentially useful for anyone using ORT, not just training. Rationale for new name: "Contrib op" is the nomenclature used within ORT to refer to the set of ops that are not in the standard op set but are included by default with ORT. This is more specific than "custom op", which is what the PyTorch exporter uses to refer to any non-standard op. Step 1 of addressing #8818. After this is merged I will update the docs. * Enable test_pytorch_export_contrib_ops.py in CI Fixes AB#1342330	2021-09-02 14:26:58 -07:00
Gary Miguel	06bb2ec561	ignore direnv configs (#8861 ) https://direnv.net/ is a useful tool but its configs are developer-specific	2021-09-02 11:53:57 -07:00
Thiago Crepaldi	fe7f30aa14	Enable all-or-nothing fallback by default (#8911 )	2021-09-02 10:45:14 -07:00
Changming Sun	1a34775fe9	Fix the benchmark code (#8926 )	2021-09-02 10:36:24 -07:00
Tianlei Wu	6490191f58	Fix non deterministic of --input_int32 of transformer optimizer (#8927 )	2021-09-02 10:20:48 -07:00
Ye Wang	7647caa520	update Tensorflow_Tf2onnx_Bert-Squad_OnnxRuntime_CPU.ipynb (#8898 ) * init checkin * update * update * Update Tensorflow_Tf2onnx_Bert-Squad_OnnxRuntime_CPU.ipynb * Update Tensorflow_Tf2onnx_Bert-Squad_OnnxRuntime_CPU.ipynb * use pretrained model * re-run * re-run	2021-09-02 09:59:40 -07:00
satyajandhyala	4570d85f20	Move setdlopenflags calls into _pybind_state.py (#8916 ) * Use PROTOBUF_LIB instead of protobuf::libprotbuf * Moved setdlopenflags to _pybind_state.py * Copy the generated _pybind_state.py to required location for Windows.	2021-09-02 09:54:32 -07:00
Wei-Sheng Chin	f711d8992a	Not to calc memory for inference (#8935 )	2021-09-02 09:49:54 -07:00
Changming Sun	fbb6f0f599	Fix an error in Nuget pipeline caused by merge conflict	2021-09-02 09:26:25 -07:00
Scott McKay	b058dee648	Fix a couple of issues mentioned in the PR comments. (#8936 )	2021-09-02 17:58:29 +10:00
Hariharan Seshadri	ddbc8bc5fc	Fix CPU Xor implementation (#8934 )	2021-09-01 21:38:55 -07:00
Edward Chen	1985616262	Trim InferenceSession binary size. (#8917 ) - Move flatbuffers SessionState access code into helper functions instead of duplicating them between InferenceSession and SessionState. - Trim VerifyEachNodeIsAssignedToAnEp(), e.g., disable verbose log output in a minimal build.	2021-09-01 18:18:32 -07:00
Sunghoon	332c2ba4f4	[js/web] Integrate ONNX Runtime Web CI with BrowserStack (#8859 ) * Integrate ONNX Runtime Web CI with BrowserStack * Rename a pipeline from browserstack to multi-platform	2021-09-01 17:25:57 -07:00
liqun Fu	757e9e6df7	do not post cuda version mismatch warning if cannot find local cudart version (#8924 ) Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2021-09-01 17:11:54 -07:00
liqun Fu	f126a12699	decouple pytorch from onnxruntime training build (#8815 )	2021-09-01 16:31:53 -07:00
Tianlei Wu	9467f511ac	Disable some ORT graph optimizers in offline transformers optimization tool (#8923 ) workaround for "Unsupported operator FusedMatMul" during symbolic shape inference	2021-09-01 15:47:57 -07:00
Suffian Khan	225439193e	Optimize Concat and Split on CUDA to eliminate host-to-device copies when sizes are all the same (#8833 ) * special case concat and split when sizes are equal * add tests for 16 and 32 inputs with same dim * add tests for 16/64 inputs on concat or 16/64 outputs on split * try eliminate windows warning * outter => outer	2021-09-01 15:25:45 -07:00
Scott McKay	858989293d	Reduce binary size of strided copy used by Concat (#8913 ) * Change the strided copy to switch on data size not data type. Move to header so we can reduce on the enabled types. Setup type reduction for Concat now that it's using this implementation.	2021-09-02 08:19:20 +10:00
satyajandhyala	9e661b64ae	Fix cast propagation to not change casts from bool type. (#8925 ) * Added new models to test bool->float and bool->float16 casts * Fixed bool casts. Added new test cases.	2021-09-01 15:15:37 -07:00
Changming Sun	6299a60bf8	Nuget: splitting PDB files to a separated package (#8903 )	2021-09-01 09:07:24 -07:00
Suffian Khan	00b0a9c127	Add hugging-face models loss curve and performance guards to ROCm CI pipeline. (#8915 ) * test running hf bert-large * try again * try again * include other models * correct names * disable deberta-v2-xxlarge * avoid torch.distributed * add compare json loss and perf for bert-large to test * fix sed expression * remove pytest * add more models * move unit tests u * display samples/sec	2021-09-01 09:03:10 -07:00
Chi Lo	43d6951fa5	Add warning message for combined trt +cuda python pkg (#8906 ) * Add warning message * update message * fix line too long * fix flake8 issue	2021-09-01 07:28:01 -07:00
Hariharan Seshadri	acd9db7fad	Fix location planning for initializers used only in nested subgraphs (#8642 )	2021-09-01 00:02:08 -07:00
Tang, Cheng	4dc0ddf606	support register external ep lib information (#8897 ) * support register external ep lib information; make eager mode share the same ep pools with training workloads * fix inference code * fix build break * fix the message	2021-08-31 20:51:22 -07:00
pengwa	3eb08d4dc7	custom autograd func memory (#8901 ) * remove PythonOpGrad control dependency && avoid segmentation fault * comment alignment * fix bugs	2021-09-01 09:29:26 +08:00
Yulong Wang	feb747173e	[js/web] Update browser support table (#8900 ) * [js/web] Update browser support table update section 'Compatibility' for Edge browser * update linux	2021-08-31 17:39:51 -07:00
Guoyu Wang	8404a2d011	Add NNAPI E2E test for Android java package (#8912 ) * Add NNAPI E2E test for Android java package * address cr comment	2021-08-31 17:34:33 -07:00

5517 commits