onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-06-25 02:50:42 +00:00

Author	SHA1	Message	Date
RandySheriffH	38b34babe0	Rashuai/boost cuda TopK performance (#2826 ) * Implement Bitonic and Radix TopK * remove needless print out * fix com err * add negative support * fix comments Co-authored-by: Randy <45701928+RandyShuai@users.noreply.github.com>	2020-01-21 13:40:38 -08:00
Tracy Sharpe	08113b80cc	Optimize BatchNormalization to NCHWc Conv (#2855 ) Update the NCHWc transformer to convert BatchNormalization ops to NCHWc convolutions when the input tensor is already in NCHWc.	2020-01-20 16:35:03 -08:00
Ashwini Khade	807a59c55d	Add calibration tool (#2845 ) * add calibration tool * add model for e2e example * format readme * some more formatting updates * plus a few more updates * plus review comments * plus updates * more updates	2020-01-20 14:49:35 -08:00
Xavier Dupré	22d9f3998e	Fix positive raw scores for TreeEnsembleClassifier (#2824 ) Fix positive raw scores for TreeEnsembleClassifier	2020-01-20 16:48:37 +01:00
Hariharan Seshadri	b21576eeb0	Support non-sequence tensor fed through as a python list (#2782 ) * Support list feeds in Python	2020-01-20 09:45:10 +10:00
KeDengMS	f9f25ec047	Fix spurious component detection warning (#2857 ) Fix spurious component detection warning Use component detection template for all pipelines	2020-01-18 20:10:35 -08:00
Yufeng Li	25d7ad187f	Add float16 support back in the bert fusion script (#2870 ) * Add float16 support back in the bert fusion script * update readme	2020-01-17 20:00:39 -08:00
Yufeng Li	95f3eb6aeb	Bert fusion script for Tensorflow squad (#2858 )	2020-01-17 15:27:04 -08:00
Tracy Sharpe	01f3a33c38	update protoc path to match protobuf version (#2865 )	2020-01-17 14:48:39 -08:00
Changming Sun	e6f7658ade	Update Windows GPU build to use cudnn 7.6	2020-01-17 12:23:13 -08:00
Pranav Sharma	3853ddf9c7	Fix topk type handling to accommodate more types. (#2842 ) * Fix topk type handling to accommodate more types + add unit test for int64_t. * Fix Linux build	2020-01-17 11:57:29 -08:00
Changming Sun	47e27ec9a1	Disable DML in Windows GPU CI build (#2856 ) Disable DML in Windows GPU CI build for now, because there are some weird model test failures and I don't know how to fix them. Will seek help from WinML team.	2020-01-16 18:47:30 -08:00
Scott McKay	724ff0753b	Use IArenaAllocator::Reserve for initializers and mem pattern planner blocks (#2835 ) * Use IArenaAllocator::Reserve for initializers and mem pattern planner blocks.	2020-01-17 07:41:48 +10:00
Tracy Sharpe	928b6bb210	MLAS: enable threading for quantized GEMMs (#2844 )	2020-01-15 19:25:40 -08:00
Tianlei Wu	5db8543018	update optimization doc for BERT related fusions (#2819 ) * Add bert related transformers to doc * Add execution provider and comment for bert optimizations * Add comment about accuracy impact of approximation	2020-01-15 16:01:11 -08:00
Changming Sun	56030f8d74	Fix Linux CUDA nuget packaging pipeline break	2020-01-14 21:13:41 -08:00
Tiago Koji Castro Shibata	cff266e1b9	Fix cgmanifest.json generating script (#2770 ) * Fix protobuf submodule name * Workaround pygit2 bug	2020-01-14 14:59:07 -08:00
Ori Levari	db05436fc0	User/orilevari/32bit comparison warning (#2800 ) * use correct type for for loop * explicitly specify void for parameters of OrtGetApiBase because the function is defined in c, so when the function is just (), it is interpreted as having an unknown number of parameters. This was causing compiler warning C4276.	2020-01-14 14:59:07 -08:00
Ashwini Khade	8643f3ebbb	add domain check for nodes + update documentation (#2831 )	2020-01-14 11:15:50 -08:00
Dmitri Smirnov	aa37dea598	Convert ExternalProject Featurizers into git submodule (#2834 ) Add git submodule for Featurizer library. Update cmake to build for git submodule.	2020-01-14 10:32:06 -08:00
Scott McKay	98cb41aa03	Ignore allocator type in ExecutionProviders allocator map. Make default initialization of OrtMemoryInfo more clearly invalid. (#2768 ) * Remove allocator type from the key comparison in ExecutionProviders. Remove usage of DummyArena as it's no longer necessary. * Fix x86 tests where arena allocator is disabled. Make initialization of OrtMemoryInfo clearer by adding Invalid enum value. * Make OrtValueNameIdxMap::MaxIdx more intuitive.	2020-01-14 18:14:55 +10:00
Pranav Sharma	b308e826a8	Add support for int64_t for topk CPU. Fixes github issue #2806 . (#2833 )	2020-01-13 20:26:16 -08:00
Changming Sun	5c391854f4	Upgrade gtest to the latest version (#2827 ) WinML would like to update the googletest submodule. They want some newer features (namely GTEST_SKIP, to skip tests programmatically and to skip entire fixtures easily) and would need to update the submodule version. However, the new version of the code hits a bug in gcc. The bug is already fixed in the latest gcc, but we're using gcc 4.8.x, which won't get the patch, so as a compromise we change our code a little to make it work. The gcc bug: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=51213	2020-01-13 20:16:48 -08:00
Dmitri Smirnov	120433c29d	Add OneHotEncoder and HashOneHotEncoder kernels. (#2830 ) Add defs and implementation for OneHotEncoders, adjust date_time_transformer kernel and test. Add OneHotEncoder kernel test. Add HashOneHotVectorizerTransformer unit test. This does not link due to multiple definitions of functions that are included into header from a CPP file.	2020-01-13 17:58:33 -08:00
Qing	723cf83793	Update Ubuntu & TensorRT version in README (#2820 ) Dockerfile.tensorrt is using nvcr.io/nvidia/tensorrt:19.09-py3 as base Image, update Ubuntu and TensorRT version according to https://docs.nvidia.com/deeplearning/sdk/tensorrt-container-release-notes/rel_19-09.html#rel_19-09	2020-01-13 14:37:32 -08:00
Yingge WAN	07502ec14e	Fix dnnl wheel package name (#2823 ) * Append '-dnnl' to whl package name when --use_dnnl * Update build.py	2020-01-13 14:37:11 -08:00
Ashwini Khade	7c6242b024	update default optimization level + fix gemm_activation fusion (#2791 ) * update default optimization level + fix gemm_activation fusion * fix typo * add unit test and incorporate review comments * fix test comment	2020-01-13 14:05:38 -08:00
Ashwini Khade	cc75e5a162	update quantization doc (#2783 ) * update documentation for quantization script * plus some spell corrections	2020-01-13 10:52:46 -08:00
Changming Sun	c4e4abce73	Run static code analyzer on most of our code (#2817 )	2020-01-10 22:17:17 -08:00
Dmitri Smirnov	e37cdbed74	Add manifest missing comma	2020-01-10 16:02:19 -08:00
stevenlix	c4f6db7796	Fix memory leak in TRT (#2815 ) * fix memory leak issue * revert EP_FAIL on enqueueV2	2020-01-10 14:07:40 -08:00
Dmitri Smirnov	afa48b7e13	Add timeseries imputer transformer featurizer kernel (#2813 ) Make kernels non-template. Add input constraint for learnt data. Fixup tests. Add two more featurizers along with tests. Tests fail. min_max_scalar_transformer robust_scalar_transformer Fix tests serialized stream by prepending version bytes. Add inputation_marker_transfomer and the test. Fix up float/double type designations. Added label_encoder_transformer along with a test. string_throw case is broken at the moment. Fix labelencodertransfomer_test.cc string_throw case Rename maxabsscalertransformer_test.cc Add MissingDummiesTransformer along with the test. Update manifest. Add TimeSeriesImputerTransformer definition, implementation and tests	2020-01-10 13:27:51 -08:00
Changming Sun	48e042868f	Update test data (#2356 )	2020-01-10 10:52:23 -08:00
George Wu	31200ed92c	speed up Windows TRT CI (#2811 ) * don't run cuda tests if building with tensorrt * remove unnecessary build options for win trt ci * refactor win gpu tensorrt ci yml * --numpy_version=1.17 * update * update * azcopy and cuda path	2020-01-10 08:40:40 -08:00
Ke Zhang	b0019ac7fe	add interface to copy batch tensors. (#2807 ) * add interface to copy batch tensors. * onnxruntime	2020-01-09 16:52:34 -08:00
Tracy Sharpe	7ef6570e27	MLAS: update SGEMM threading parameters (#2808 )	2020-01-09 14:48:20 -08:00
Yufeng Li	71b5165ed3	Initialize max of softmax with lowest of float (#2786 )	2020-01-09 13:48:18 -08:00
Dmitri Smirnov	2c8179bee4	ML.NET team needs featurizers within a package (#2789 ) Add auto ml featurizers to Windows, MacOS as well as to GPU packaging-pipelines.	2020-01-09 10:54:12 -08:00
George Wu	1978376e1e	add session creation time cost. (#2798 )	2020-01-08 11:17:48 -10:00
Tianlei Wu	32c5e76a16	Improve bert optimization script: (#2712 ) (1) Move input int64=>int32 conversion to embed layer fusion. (2) Output epsilon attribute for LayerNormalization fusion.	2020-01-08 11:32:27 -08:00
Nathan	f84240db2b	add uint8 support to where op (#2792 )	2020-01-08 09:59:42 -08:00
Hariharan Seshadri	ebfcad1c90	Add script for release Nuget validation (#2719 ) * Initial commit * Nits * Disable a test temporarily * Change working directory * Test * Add download python step * Test update * More changes * Fix space issue * Fix * Verify nuget signing * Fix * Spaces * PR feedback * Nit * Fix * Fix * Remove temporary changes	2020-01-08 18:42:22 +05:30
Andrews548	3e6f1836eb	ACL EP convolution improvements (#2774 ) Added the optimized implementation for depthwise convolution for both ACL v19.02 and ACL 19.05. Also the pointwise convolution seems to be more optimal in the CPU implementation so we opted for that instead.	2020-01-07 06:42:03 -10:00
Andrews548	fdc0106f83	ACL EP GEMM improvements (#2780 ) When possible, we use a fully connected layer instead of the GEMM implementation. This lets the library choose the best implementation based on the input data.	2020-01-07 06:35:18 -10:00
Maher Jendoubi	f22bffe0f6	Contributing: Fix a typo (#2784 )	2020-01-07 06:32:13 -10:00
Yufeng Li	72bdfc8cd4	Implement a more stable softmax (#2715 ) * Implement a more stable SoftMax e^x is represented as infinity if x is large enough, like 100.f. Infinity divided by infinity is NaN. Thus, softmax yields NaN if one or more items are large enough. The following identity is used to get a stable softmax: e^xi / (e^x1 + ... + e^xn) = e^(xi - max) / (e^(x1 - max) + ... + e^(xn - max)). For convenience, max is forced to 0.f if all xi are negative.	2020-01-06 14:28:12 -08:00
Dmitri Smirnov	6f66260372	Import more featurizers (#2781 ) Make kernels non-template. Add input constraint for learnt data. Add min_max_scalar_transformer, robust_scalar_transformer, inputation_marker_transfomer, label_encoder_transformer, missing_dummies_transformer along with tests. Advance Featurizers library commit.	2020-01-06 13:43:44 -08:00
Changming Sun	1b23118056	Fix nightly build version number issue	2020-01-06 11:16:44 -08:00
Changming Sun	e3f674b563	Disable featurizers in python packages	2020-01-06 11:16:44 -08:00
Changming Sun	7ace7a5bcd	Pass BUILD_BUILDNUMBER to linux docker	2020-01-06 11:16:44 -08:00
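
The identity described in commit 72bdfc8cd4 (subtract the maximum before exponentiating) can be illustrated with a minimal sketch. This is not the onnxruntime kernel: the function name `stable_softmax` is hypothetical, it works on plain Python floats rather than float32 tensors, and it subtracts the true maximum instead of clamping the maximum to 0 for all-negative inputs, which is an assumption made here so that very negative inputs do not underflow to 0/0.

```python
import math


def stable_softmax(xs):
    """Softmax over a list of floats using the max-subtraction identity.

    e^xi / sum_j e^xj == e^(xi - m) / sum_j e^(xj - m) for any constant m.
    Choosing m = max(xs) keeps every exponent <= 0, so exp() cannot overflow.
    (Hypothetical helper for illustration; not the library's implementation.)
    """
    if not xs:
        return []
    m = max(xs)  # assumption: use the true maximum, even when it is negative
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)  # total >= 1.0 because the max element contributes exp(0)
    return [e / total for e in exps]


if __name__ == "__main__":
    # Inputs this large would overflow a direct exp(x) computation.
    print(stable_softmax([1000.0, 1000.0]))
    print(stable_softmax([1.0, 2.0, 3.0]))
```

Because the largest element always contributes exp(0) = 1 to the denominator, the division is well defined for any finite input, including inputs that are all very negative.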

11997 commits