onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-09 00:30:53 +00:00

Author	SHA1	Message	Date
Saquib Nadeem Hashmi	daff4240f0	Updated README.md (#2910 ) Corrected spelling mistake.	2020-01-27 13:37:22 -08:00
Yufeng Li	cd876720d9	Only fuse when output count of add is 1 (#2884 ) * Only fuse when output count of add is 1 * add unit test for add with multi output	2020-01-24 13:47:34 -08:00
Scott McKay	a92e924ab2	Revert "Use IArenaAllocator::Reserve for initializers and mem pattern planner blocks (#2835 )" (#2904 ) This reverts commit `724ff0753b`.	2020-01-24 14:02:30 +10:00
Changming Sun	e0c9cdaa73	Fix the nuget pipelines (#2901 )	2020-01-23 20:02:18 -08:00
Tracy Sharpe	17b72d5578	Fix NCHWc BatchNormalization regression (#2903 ) Fix the BatchNormalization optimization in the NCHWC optimizer. If the node has the optional training outputs specified, then skip the transform.	2020-01-23 18:54:11 -08:00
Jeff	ba336b5583	Disable DML EP on software adapter, fix float16 fallback bug, re-enable DML in CI (#2896 ) * Re-enable DML in CI pipeline * Fix bug with float16 fallback + fusion, and disallow DML EP with software adapter * Address PR comments	2020-01-23 15:18:28 -08:00
Changming Sun	201b089a36	Fix some warnings on Windows (#2560 ) 1. Enable warning "4503" # Decorated name length exceeded. 2. Enable warning "4146" # unary minus operator applied to unsigned type. 3. Enable float64 support for the Softmax operator 4. Enable compliance checks for Windows x86 32bits build 5. Use TryBatchParallelFor to replace some fallback code in mlas pooling.cc 6. Fix Android CI pipeline.	2020-01-22 15:59:11 -08:00
Pranav Sharma	49725f896c	Disable openmp for the nocontribops pipeline. (#2888 )	2020-01-22 12:07:44 -08:00
Scott McKay	fc51473b09	Update BFCArena logic to use backoff if cudaMalloc fails. Makes behaviour equivalent to when a CPU allocation fails. Add unit test. (#2748 ) Clear error when throwing an exception for a failed CUDA call so that there is only one error mechanism being used at a time. Minor improvements to logging to aid debugging of BFCArena behaviour.	2020-01-22 14:21:21 +10:00
edgchen1	061f10fcd5	Fixed typo in ORT_RETURN_IF_NOT() message. (#2862 )	2020-01-21 20:03:41 -08:00
Scott McKay	9f5e8c4ae8	InferenceSession::Run needs to call OnRunEnd for any EP that OnRunStart was called for so they can cleanup. Currently it only calls OnRunEnd if the Status is OK. Due to this the CUDA EP will throw during shutdown as the per-thread information has not been cleaned up prior to the CUDA library shutting down. (#2881 ) Also update onnxruntime_perf_test to catch the exception from the call to Run and return a Status. Otherwise it exits with an 'unknown exception' error.	2020-01-22 12:17:52 +10:00
RandySheriffH	38b34babe0	Rashuai/boost cuda TopK performance (#2826 ) * Implement Bitonic and Radix TopK * remove needless print out * fix com err * add negative support * fix comments Co-authored-by: Randy <45701928+RandyShuai@users.noreply.github.com>	2020-01-21 13:40:38 -08:00
Tracy Sharpe	08113b80cc	Optimize BatchNormalization to NCHWc Conv (#2855 ) Update the NCHWc transformer to convert BatchNormalization ops to NCHWc convolutions when the input tensor is already in NCHWc.	2020-01-20 16:35:03 -08:00
Ashwini Khade	807a59c55d	Add calibration tool (#2845 ) * add calibration tool * add model for e2e example * format readme * some more formatting updates * plus a few more updates * plus review comments * plus updates * more updates	2020-01-20 14:49:35 -08:00
Xavier Dupré	22d9f3998e	Fix positive raw scores for TreeEnsembleClassifier (#2824 ) Fix positive raw scores for TreeEnsembleClassifier	2020-01-20 16:48:37 +01:00
Hariharan Seshadri	b21576eeb0	Support non-sequence tensor fed through as a python list (#2782 ) * Support list feeds in Python	2020-01-20 09:45:10 +10:00
KeDengMS	f9f25ec047	Fix spurious component detection warning (#2857 ) Fix spurious component detection warning Use component detection template for all pipelines	2020-01-18 20:10:35 -08:00
Yufeng Li	25d7ad187f	Add float16 support back in the bert fusion script (#2870 ) * Add float16 support back in the bert fusion script * update readme	2020-01-17 20:00:39 -08:00
Yufeng Li	95f3eb6aeb	Bert fusion script for Tensorflow squad (#2858 )	2020-01-17 15:27:04 -08:00
Tracy Sharpe	01f3a33c38	update protoc path to match protobuf version (#2865 )	2020-01-17 14:48:39 -08:00
Changming Sun	e6f7658ade	Update Windows GPU build to use cudnn 7.6	2020-01-17 12:23:13 -08:00
Pranav Sharma	3853ddf9c7	Fix topk type handling to accommodate more types. (#2842 ) * Fix topk type handling to accommodate more types + add unit test for int64_t. * Fix Linux build	2020-01-17 11:57:29 -08:00
Changming Sun	47e27ec9a1	Disable DML in Windows GPU CI build (#2856 ) Disable DML in Windows GPU CI build for now, because there are some weird model test failures and I don't know how to fix them. Will seek help from WinML team.	2020-01-16 18:47:30 -08:00
Scott McKay	724ff0753b	Use IArenaAllocator::Reserve for initializers and mem pattern planner blocks (#2835 ) * Use IArenaAllocator::Reserve for initializers and mem pattern planner blocks.	2020-01-17 07:41:48 +10:00
Tracy Sharpe	928b6bb210	MLAS: enable threading for quantized GEMMs (#2844 )	2020-01-15 19:25:40 -08:00
Tianlei Wu	5db8543018	update optimization doc for BERT related fusions (#2819 ) * Add bert related transformers to doc * Add execution provider and comment for bert optimizations * Add comment about accuracy impact of approximation	2020-01-15 16:01:11 -08:00
Changming Sun	56030f8d74	Fix Linux CUDA nuget packaging pipeline break	2020-01-14 21:13:41 -08:00
Tiago Koji Castro Shibata	cff266e1b9	Fix cgmanifest.json generating script (#2770 ) * Fix protobuf submodule name * Workaround pygit2 bug	2020-01-14 14:59:07 -08:00
Ori Levari	db05436fc0	User/orilevari/32bit comparison warning (#2800 ) * use correct type for for loop * explicitly specify void for parameters of OrtGetApiBase because the function is defined in c, so when the function is just (), it is interpreted as having an unknown number of parameters. This was causing compiler warning C4276.	2020-01-14 14:59:07 -08:00
Ashwini Khade	8643f3ebbb	add domain check for nodes + update documentation (#2831 )	2020-01-14 11:15:50 -08:00
Dmitri Smirnov	aa37dea598	Convert ExternalProject Featurizers into git submodule (#2834 ) Add git submodule for Featurizer library. Update cmake to build for git submodule.	2020-01-14 10:32:06 -08:00
Scott McKay	98cb41aa03	Ignore allocator type in ExecutionProviders allocator map. Make default initialization of OrtMemoryInfo more clearly invalid. (#2768 ) * Remove allocator type from the key comparison in ExecutionProviders. Remove usage of DummyArena as it's no longer necessary. * Fix x86 tests where arena allocator is disabled. Make initialization of OrtMemoryInfo clearer by adding Invalid enum value. * Make OrtValueNameIdxMap::MaxIdx more intuitive.	2020-01-14 18:14:55 +10:00
Pranav Sharma	b308e826a8	Add support for int64_t for topk CPU. Fixes github issue #2806 . (#2833 )	2020-01-13 20:26:16 -08:00
Changming Sun	5c391854f4	Upgrade gtest to the latest version (#2827 ) WinML would like to update the googletest submodule. They want some newer features (namely GTEST_SKIP to skip tests programmatically and be able to skip entire fixtures easily) and would need to update the submodule version. However, the new version of the code hits a bug in gcc. The bug is already fixed in the latest gcc, but we're using gcc 4.8.x, which won't get patched, so as a compromise we have to change our code a little to make it work. The gcc bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51213	2020-01-13 20:16:48 -08:00
Dmitri Smirnov	120433c29d	Add OneHotEncoder and HashOneHotEncoder kernels. (#2830 ) Add defs and implementation for OneHotEncoders, adjust date_time_transformer kernel and test. Add OneHotEncoder kernel test. Add HashOneHotVectorizerTransformer unit test. This does not link due to multiple definitions of functions that are included into header from a CPP file.	2020-01-13 17:58:33 -08:00
Qing	723cf83793	Update Ubuntu & TensorRT version in README (#2820 ) Dockerfile.tensorrt is using nvcr.io/nvidia/tensorrt:19.09-py3 as base Image, update Ubuntu and TensorRT version according to https://docs.nvidia.com/deeplearning/sdk/tensorrt-container-release-notes/rel_19-09.html#rel_19-09	2020-01-13 14:37:32 -08:00
Yingge WAN	07502ec14e	Fix dnnl wheel package name (#2823 ) * Append '-dnnl' to whl package name when --use_dnnl * Update build.py	2020-01-13 14:37:11 -08:00
Ashwini Khade	7c6242b024	update default optimization level + fix gemm_activation fusion (#2791 ) * update default optimization level + fix gemm_activation fusion * fix typo * add unit test and incorporate review comments * fix test comment	2020-01-13 14:05:38 -08:00
Ashwini Khade	cc75e5a162	update quantization doc (#2783 ) * update documentation for quantization script * plus some spell corrections	2020-01-13 10:52:46 -08:00
Changming Sun	c4e4abce73	Run static code analyzer on most of our code (#2817 )	2020-01-10 22:17:17 -08:00
Dmitri Smirnov	e37cdbed74	Add manifest missing comma	2020-01-10 16:02:19 -08:00
stevenlix	c4f6db7796	Fix memory leak in TRT (#2815 ) * fix memory leak issue * revert EP_FAIL on enqueueV2	2020-01-10 14:07:40 -08:00
Dmitri Smirnov	afa48b7e13	Add timeseries imputer transformer featurizer kernel (#2813 ) Make kernels non-template. Add input constraint for learnt data. Fixup tests. Add two more featurizers along with tests. Tests fail. min_max_scalar_transformer robust_scalar_transformer Fix tests serialized stream by prepending version bytes. Add inputation_marker_transfomer and the test. Fix up float/double type designations. Added label_encoder_transformer along with a test. string_throw case is broken at the moment. Fix labelencodertransfomer_test.cc string_throw case Rename maxabsscalertransformer_test.cc Add MissingDummiesTransformer along with the test. Update manifest. Add TimeSeriesImputerTransformer definition, implementation and tests	2020-01-10 13:27:51 -08:00
Changming Sun	48e042868f	Update test data (#2356 )	2020-01-10 10:52:23 -08:00
George Wu	31200ed92c	speed up Windows TRT CI (#2811 ) * don't run cuda tests if building with tensorrt * remove unnecessary build options for win trt ci * refactor win gpu tensorrt ci yml * --numpy_version=1.17 * update * update * azcopy and cuda path	2020-01-10 08:40:40 -08:00
Ke Zhang	b0019ac7fe	add interface to copy batch tensors. (#2807 ) * add interface to copy batch tensors. * onnxruntime	2020-01-09 16:52:34 -08:00
Tracy Sharpe	7ef6570e27	MLAS: update SGEMM threading parameters (#2808 )	2020-01-09 14:48:20 -08:00
Yufeng Li	71b5165ed3	Initialize max of softmax with lowest of float (#2786 )	2020-01-09 13:48:18 -08:00
Dmitri Smirnov	2c8179bee4	ML.NET team needs featurizers within a package (#2789 ) Add auto ml featurizers to Windows, MacOS as well as to GPU packaging-pipelines.	2020-01-09 10:54:12 -08:00
George Wu	1978376e1e	add session creation time cost. (#2798 )	2020-01-08 11:17:48 -10:00


1808 commits