onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-04 23:59:56 +00:00

Author	SHA1	Message	Date
Xavier Dupré	1b85a262fa	Propagate documentation modification from rel-1.0.0 (#2713 )	2020-01-17 15:18:02 -08:00
Yufeng Li	c33dab394f	fix the CUDNN_BN_MIN_EPSILON difference issue between cudnn7.3 and cudnn7.6 (#2680 )	2019-12-17 15:39:39 -08:00
Ryan Hill	82d35ded39	Ryanunderhill/rel 1.1.0 (#2651 ) * Add missig env variables for mac pipeline test (#2595) * Java API for onnxruntime (#2215) * Rename automl python tools folder to featurizer_ops. (#2593) * Make sure fenced tensor could not reuse other tensor. (#2561) * Add support for opset 11 in reshape fusion (#2592) * Support opset 11 subgraph of Squad model in Embed Layer Normalization (#2605) * Allow providers to be set for InferenceSession at construction (#2606) * EmbedLayerNormalization Fusion For Dynamic Squad Model Opset 10 (#2613) * Improve Embed Layer Norm Fusion for SQuAD with static input shape (#2621) * Improve cuda expand() opeator's performance. (#2624) * Cuda pad optimize when no padding is needed. (#2625) * Shortcut cuda Pad() when no padding is needed. * Improve performance of resize() in Nearest mode (#2626) * Optimize cuda scatter() on 2D compatible. (#2628) * Optimize cuda scatter() on 2D compatible. * fix float16 comparison in initializer (#2629) * epsilon attribute for layernormalization fusion (#2639) * Fix memory exception in Layer Norm Fusion (#2644)	2019-12-13 17:15:33 -08:00
Ryan Hill	6049de8d26	Ryanunderhill/rel 1.1.0 (#2615 ) * Add missig env variables for mac pipeline test (#2595) * Java API for onnxruntime (#2215) * Rename automl python tools folder to featurizer_ops. (#2593)	2019-12-13 13:37:02 -08:00
Ryan Hill	36eb1771ba	Update version (#2584 )	2019-12-08 18:00:12 -08:00
liuziyue	200f4b4ea6	EmbedLayerNormalization Fusion Improvement (#2553 ) Embedding layer norm fusion improvements - add more checks	2019-12-07 23:14:26 -08:00
KeDengMS	0f12346d76	[Nuphar EP] fixes for some object detection models (#2581 ) Update notebook tutorial with multi-threaded int8 GEMM from #2517	2019-12-07 13:37:00 -08:00
Ryan Hill	cbc398bb75	Ryanunderhill/packagename test (#2582 )	2019-12-07 12:08:46 -08:00
Ashwini Khade	c06dbd8311	Add ConvTranspose1D (#2578 )	2019-12-07 08:50:02 -08:00
Mark	79847f39b3	Fix file not found error during docker build. (#2569 )	2019-12-07 08:49:47 -08:00
Yufeng Li	5575766a53	Add more check on SkipLayerNorm and BiasGelu fusion (#2574 )	2019-12-06 15:36:02 -08:00
Changming Sun	262ee9dc5a	Fix a warning found in the latest VS release	2019-12-06 15:07:21 -08:00
Yufeng Li	34beafc51c	make layernorm fusion to support opset 11 (#2545 )	2019-12-06 13:06:36 -08:00
shahasad	eeb28a80c0	setup java ci mac (#2570 )	2019-12-06 11:43:40 -08:00
Tianlei Wu	038ee91da5	Allow sequence length to be symbolic (#2559 )	2019-12-06 10:13:56 -08:00
George Wu	73c682b97c	disable onnx_test_runner -x invocations for dnnl (#2568 )	2019-12-05 23:05:34 -08:00
Changming Sun	7eddac16c2	Re-enable Windows C# tests (#2564 )	2019-12-05 21:22:31 -08:00
Ryan Hill	854362cf05	Update win-x86-ci.yml (#2557 ) Fix build pipeline break	2019-12-05 18:44:12 -08:00
Changming Sun	ace132f9aa	Fix android build (#2558 )	2019-12-05 15:03:22 -08:00
Sreekanth Yalachigere	4c996a8699	DNNL CMAKE update (#2548 )	2019-12-05 13:48:57 -08:00
Hariharan Seshadri	53a6bc2f07	Fix a bug handling negative begin pad values in Pad op (#2550 ) * Fix bug in Pad op * Update	2019-12-05 11:29:45 -08:00
Changming Sun	bec4abf074	Add back executable bit to build.py	2019-12-04 21:22:02 -08:00
Ashwini Khade	281933fa1c	Fix C API tests for centos and mac (#2544 ) * change c++14 to c++11 * add ld lib path for centos * enable csharp tests on macos * fix C API test on MacOS + fix manylinux dotnet install * fix manylinux dotnet install * fix lib link	2019-12-04 18:01:35 -08:00
Dmitri Smirnov	d34fb62012	Introduce container type runtime checks and other improvements (#2522 ) Rework TensorSeq in a manner consistent with Tensor and SparseTensor in terms of type system setup. Reduce templating. Introduce helpers to ensure the same data type. Make OrtValue __dtor not virtual. Introduce ContainerChecker	2019-12-04 16:04:17 -08:00
Yulong Wang	be56d77a66	Fix integer overflow in cuda NonMaxSuppression implementation (#2540 ) * add test case that should pass but fail * fix nms * extract int_max_output_boxes_per_class	2019-12-04 13:27:04 -08:00
Xiang Zhang	3e7aaf8fa1	User/xianz/telemetry (#2458 ) * enabme telemetry * enable telemetry * set enable telemetry as default * for debugging * remove log and set disable telemetry as default back * delete private file while testing * resolve comment: mainly add license header, rename macro and update docs * rewording in privacy.md	2019-12-03 23:34:53 -08:00
stevenlix	293b15480b	Add dynamic shape support in TensorRT execution provider (#2450 ) * remove onnx-tensorrt submodule * add new onnx-tensorrt submodule (experiment) for trt6 * update engine build for trt6 * update compile and compute for tensorrt6.0 * Update tensorrt_execution_provider.cc * Update tensorrt_execution_provider.cc * Update tensorrt_execution_provider.cc * Update tensorrt_execution_provider.cc * switch to onnx-tensorrt master for TensorRT6' * Update tensorrt_execution_provider.cc * Handle dynamic batch size and add memcpy in TensorRT EP * update test cases * Update tensorrt_execution_provider.cc * update onnx-tensorrt submodule * Update Dockerfile.ubuntu_tensorrt * Update Dockerfile.ubuntu_tensorrt * Update run_dockerbuild.sh * Update run_dockerbuild.sh * Update install_ubuntu.sh * Update concat_op_test.cc * Update tensorrt_execution_provider.cc * Upgrade TensorRT to version 6.0.1.5 * Update onnxruntime_providers.cmake * Update CMakeLists.txt * Update reduction_ops_test.cc * Update install_ubuntu.sh * Update Dockerfile.ubuntu_tensorrt * Update Dockerfile.tensorrt * Update BUILD.md * Update run_dockerbuild.sh * Update install_ubuntu.sh * Update onnxruntime_providers.cmake * Update install_ubuntu.sh * Update install_ubuntu.sh * Update gemm_test.cc * Update gather_op_test.cc * Update CMakeLists.txt * Removed submodule * update onnx-tensorrt submodule * update header file * Removed submodule * add submodule onnx-tensorrt kevin's branch shape-test' * add debugging code * Update tensorrt_execution_provider.cc * Update tensorrt_execution_provider.cc * merge master * Removed submodule * update onnx-tensorrt submodule * add more changes for dynamic shapes * Update tensorrt_execution_provider.cc * update for dynamic shape * update dynamic shape processing * fix logger issue * remove submodule onnx-tensorrt * add submodule onnx-tensorrt * add env variable min_subgraph_size * remove redundency * update document * use onnxruntime::make_unique * fix multi-run issue * remove some tests to save CI build time * Add dynamic shape test * Update TensorRT-ExecutionProvider.md * Add example of running Faster R-CNN model on TensorRT EP * Add more details on env variables * update environment variables * Update tensorrt_basic_test.cc * Update model tests * Update tensor_op_test.cc * remove --use_full_protobuf * Update build.py	2019-12-03 23:18:33 -08:00
Yulong Wang	d748f891d8	Revert "Disable thread pool creation when enabled OpenMP (#2485 )" (#2535 ) This reverts commit `7c7d5a149c`.	2019-12-03 22:09:02 -08:00
Hariharan Seshadri	5c2e474751	Add provision in ORT for session options to be parsed when available via model file (#2449 ) * Initial commit * Fix gitmodules * Nits * Nits * Updates * Update * More changes * Updates * Update * Some updates * More changes * Update * Update * Merge * Update * Updates * More changes * Update * Fix nits * Updates * Fix warning * Fix build * Add comment * PR feedback * PR feedback * Updates * Updates * Update * More changes * Fix build break * Comment test for now * Updates * Updates * PR feedback * Updates * Nits * Add tests * Fix build * Fix build * Fix build * Fix build break * Fix build * Nits * PR feedback * More change * Expose GetSessionOptions in pybind logic and add unit test for python * Fix build * PR feedback * PR feedback	2019-12-03 16:56:07 -08:00
shahasad	178d059111	Setup java ci (#2528 )	2019-12-03 14:21:51 -08:00
Tianlei Wu	b50878dcf0	Disable Attention fusion tests when DISABLE_CONTRIB_OPS is defined (#2529 )	2019-12-03 14:21:21 -08:00
Ashwini Khade	e32eff826c	enable nuget package testing on centos7 (#2527 ) * add centos tests to linux cpu ci pipeline * Disable failing test * use centos6 instead of centos7 * change back to centos7 * add dotnet runtime dependency * fix dotnet runtime dependencies * install dotnet sdk instead of runtimes * add more dotnet dependencies * temporary skip failing test * ix lib path * reenable failing test	2019-12-03 10:16:45 -08:00
RandySheriffH	85a4ed8cf7	fix cuda kernel causing invalid mem access (#2523 )	2019-12-03 09:16:00 -08:00
Tianlei Wu	66254eb25a	Update BERT model optimization python script (#2521 ) Add support of GPT2 model optimization: * Match subgraph of Gelu Approximation (using Tanh). * Fuse LayerNormalization if SkipLayerNormalization is not ready. * Output model even if embedding layer is not fused. * Improve Reshape Fusion to improve coverage. * Refine constant input checking, and output fused op counter. Update script according to latest op improvements: * Fusion of Add Bias and Gelu. * Fuse SkipLayerNormalization and Add Bias. Other: * Add ReduceSum for mask as intermediate step. * Refactor verbose setting.	2019-12-03 08:40:51 -08:00
Sreekanth Yalachigere	31ea11a696	Renaming MKL-DNN as DNNL (#2515 ) * DNNL: Moving Files to rename file names * DNNL name change * azure pipeline updated * disable ceil/dialation and enable Opset10 * disable ceil/dialation tests in Python * mlperf_ssd_resnet34_1200 disabled	2019-12-03 07:34:23 -08:00
Changming Sun	3d627362a0	Upgrade Windows CPU CI pipeline to use VS 2019 (#2519 )	2019-12-02 23:05:35 -08:00
Scott McKay	e8b327d657	Fix constant folding of node assigned to CUDA (#2510 ) * Constant folding bug fix/improvements - Handle constant folding for node that is assigned to a non cpu EP - Check for errors in optimizer execution frame setup - Improve CUDA partitioning to look for initializers in parent graphs - Add unit test Fixes #2474	2019-12-03 16:28:44 +10:00
Changming Sun	4354023913	Make link time optimization work on Linux (#2477 )	2019-12-02 22:25:41 -08:00
baowenlei	25c260fdef	Add parallel for tensorized gemm (#2517 ) * add parallel for tensorize gemm * add option to control parallel * change to a more clean way to control	2019-12-02 22:05:46 -08:00
KeDengMS	c1be615c45	[NupharEP] refine parallel schedule control (#2514 ) * [NupharEP] Add parallel schedule to JIT function name Update Nuphar docker to use Python 3.6 and ubuntu 18.04 * Update notebook * Avoid JIT cache file name conflict	2019-12-02 17:40:51 -08:00
Zhang Lei	784eca0dcd	Cuda pad() for opset 11 (#2490 ) * Cuda pad opset 11. * Handle type conversion issue in building.	2019-12-02 16:28:17 -08:00
Jeff Bloomfield	b9faa0b6fd	Fix kernel registry validation to reenable DML kernels	2019-12-02 15:43:44 -08:00
Scott McKay	ddaad86605	CUDA Loop (#2444 ) * Implement CUDA Loop operator. * Add control flow node implicit input handling to the memcpy transformer and allocation planner.	2019-12-03 08:29:21 +10:00
Zhang Lei	50eb140119	Cuda Resize Operator for opset 11. (#2484 ) * Cuda Resize Operator for opset 11.	2019-12-02 13:42:21 -08:00
xavier dupré	c42148a0c3	Improves softmax function for standard ml	2019-12-02 10:48:46 -08:00
Dmitri Smirnov	ec88f6d8d6	Add DataFrameTool (#2456 ) Add DataFrameTool to feed inputs from Panda DataFrame	2019-12-02 10:12:03 -08:00
Yulong Wang	89824b35e9	optimize CPU implementation of Attention (#2496 )	2019-12-01 14:43:38 -08:00
Tianlei Wu	0f57e0a49e	Change mask input of EmbedLayerNormalization op to be optional (#2495 ) Change mask input of EmbedLayerNormalization op to be optional	2019-12-01 08:36:06 -08:00
liuziyue	0edd4ef6ca	EmbedLayerNormalization fusion (#2452 ) Embed Layer Normalization Fusion	2019-11-28 14:03:58 -08:00
KeDengMS	60208463a9	[NupharEP] Enable parallel schedule (#2505 ) * [NupharEP] Enable parallel schedule * Update TVM with the fix to TVM threadpool to use OpenMP if possible * Add parallel schedule when trying to vectorize With this change, BERT squad perf on a 4-core (8 HT) CPU goes from 187ms to 150ms * Address CR, docs and cmake update * Doc fix * Fix mkl * Fix TVM windows build when using mklml	2019-11-28 08:35:56 -08:00


1677 commits