onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-09 17:28:58 +00:00

Author	SHA1	Message	Date
Brian Martin	09c9caab2d	Brianma/cpu (#2583 ) * don't include dml stuff in cpu builds * tests that link the image lib also need the telemetry lib now	2019-12-07 08:59:42 -08:00
Brian Martin	09ca58044e	Merge branch 'layer_dev' into windowsai	2019-12-06 16:23:05 -08:00
Ori Levari	b3c568cf4d	Layer dev dml delayload (#2580 )	2019-12-06 15:44:08 -08:00
Ori Levari	be48f05c64	Cmake and preprocessor fixes that where uncovered by building on agents without DML available via SDK	2019-12-06 13:30:19 -08:00
Paul McDaniel	56cbd82c71	Layer dev paulm (#2567 ) * commetns for dml graph transformer fixed ort value passing using the allocatir info * fixed and coded maps and sequences across the abi * cleaned up w4's cleaned up the model info ABI delayload directml.dll from winml * cleaned up namepsace aliases. renamed _winmla to winmla this was good PR feedback from tiago a while back. * moved files from inc to lib\api.core cleaned up some of the cmake * staged changes * making windowsAI azure dev ops work. * code review comments. * revert changes	2019-12-05 18:14:20 -08:00
Ryan Lai	9933b8a5d6	Fix custom ops scenario tests (#2562 ) * Do not shutdown protobuf after ort environment gets destroyed. Lazy load lotus environment first time it is needed * comment typo * pr comment about calling phoenix singleton * Make lotus_environment static in winmladapter	2019-12-05 15:41:01 -08:00
Ori Levari	8294fa72a4	various changes to unblock windowsai ADO build	2019-12-05 13:50:13 -08:00
Xiang Zhang	8fb7b88e0a	Handle exception thrown from all apis in WinMLAdapter (#2539 )	2019-12-04 14:08:16 -08:00
Ori Levari	2b8d6d3e31	add missing namespace to winml_trace_logging_provider in lotusenvironment.h (#2542 )	2019-12-04 11:30:45 -08:00
Paul McDaniel	a7cf316efb	Layer dev paulm (#2536 ) ori said yes	2019-12-03 16:35:27 -08:00
Ryan Lai	3afb7a89fe	Spawn child process to run DeviceLostRecovery scenario test (#2530 ) * Spawn child process to run DeviceLostRecovery scenario test	2019-12-03 15:38:04 -08:00
Paul McDaniel	c615002f5d	Layer dev paulm (#2533 ) * commetns for dml graph transformer fixed ort value passing using the allocatir info * fixed and coded maps and sequences across the abi * cleaned up w4's cleaned up the model info ABI delayload directml.dll from winml * cleaned up namepsace aliases. renamed _winmla to winmla this was good PR feedback from tiago a while back. * moved files from inc to lib\api.core cleaned up some of the cmake * staged changes	2019-12-03 15:31:22 -08:00
Jeff Bloomfield	a437d43420	merge master	2019-12-03 13:50:20 -08:00
Ashwini Khade	e32eff826c	enable nuget package testing on centos7 (#2527 ) * add centos tests to linux cpu ci pipeline * Disable failing test * use centos6 instead of centos7 * change back to centos7 * add dotnet runtime dependency * fix dotnet runtime dependencies * install dotnet sdk instead of runtimes * add more dotnet dependencies * temporary skip failing test * ix lib path * reenable failing test	2019-12-03 10:16:45 -08:00
Brian Martin	f54625f7c5	re-enable warnings for winml builds and fix the warnings that were hiding (#2526 ) * turn devmode back on for winml builds * fix some warnings. include protobuf in a way that disables some warnings * undo protobufhelpers changes and just ignore 4100 errors in pb code * attempt to isolate protobufhelpers errors * add template specialization for getting tensor proto data	2019-12-03 09:57:56 -08:00
RandySheriffH	85a4ed8cf7	fix cuda kernel causing invalid mem access (#2523 )	2019-12-03 09:16:00 -08:00
Tianlei Wu	66254eb25a	Update BERT model optimization python script (#2521 ) Add support of GPT2 model optimization: * Match subgraph of Gelu Approximation (using Tanh). * Fuse LayerNormalization if SkipLayerNormalization is not ready. * Output model even if embedding layer is not fused. * Improve Reshape Fusion to improve coverage. * Refine constant input checking, and output fused op counter. Update script according to latest op improvements: * Fusion of Add Bias and Gelu. * Fuse SkipLayerNormalization and Add Bias. Other: * Add ReduceSum for mask as intermediate step. * Refactor verbose setting.	2019-12-03 08:40:51 -08:00
Sreekanth Yalachigere	31ea11a696	Renaming MKL-DNN as DNNL (#2515 ) * DNNL: Moving Files to rename file names * DNNL name change * azure pipeline updated * disable ceil/dialation and enable Opset10 * disable ceil/dialation tests in Python * mlperf_ssd_resnet34_1200 disabled	2019-12-03 07:34:23 -08:00
Changming Sun	3d627362a0	Upgrade Windows CPU CI pipeline to use VS 2019 (#2519 )	2019-12-02 23:05:35 -08:00
Scott McKay	e8b327d657	Fix constant folding of node assigned to CUDA (#2510 ) * Constant folding bug fix/improvements - Handle constant folding for node that is assigned to a non cpu EP - Check for errors in optimizer execution frame setup - Improve CUDA partitioning to look for initializers in parent graphs - Add unit test Fixes #2474	2019-12-03 16:28:44 +10:00
Changming Sun	4354023913	Make link time optimization work on Linux (#2477 )	2019-12-02 22:25:41 -08:00
baowenlei	25c260fdef	Add parallel for tensorized gemm (#2517 ) * add parallel for tensorize gemm * add option to control parallel * change to a more clean way to control	2019-12-02 22:05:46 -08:00
KeDengMS	c1be615c45	[NupharEP] refine parallel schedule control (#2514 ) * [NupharEP] Add parallel schedule to JIT function name Update Nuphar docker to use Python 3.6 and ubuntu 18.04 * Update notebook * Avoid JIT cache file name conflict	2019-12-02 17:40:51 -08:00
Zhang Lei	784eca0dcd	Cuda pad() for opset 11 (#2490 ) * Cuda pad opset 11. * Handle type conversion issue in building.	2019-12-02 16:28:17 -08:00
Ori Levari	da897d76e7	add dml binaries to DirectML package and be more explicit about condition variables (#2520 )	2019-12-02 16:10:38 -08:00
Jeff Bloomfield	b9faa0b6fd	Fix kernel registry validation to reenable DML kernels	2019-12-02 15:43:44 -08:00
Scott McKay	ddaad86605	CUDA Loop (#2444 ) * Implement CUDA Loop operator. * Add control flow node implicit input handling to the memcpy transformer and allocation planner.	2019-12-03 08:29:21 +10:00
Zhang Lei	50eb140119	Cuda Resize Operator for opset 11. (#2484 ) * Cuda Resize Operator for opset 11.	2019-12-02 13:42:21 -08:00
xavier dupré	c42148a0c3	Improves softmax function for standard ml	2019-12-02 10:48:46 -08:00
Dmitri Smirnov	ec88f6d8d6	Add DataFrameTool (#2456 ) Add DataFrameTool to feed inputs from Panda DataFrame	2019-12-02 10:12:03 -08:00
Yulong Wang	89824b35e9	optimize CPU implementation of Attention (#2496 )	2019-12-01 14:43:38 -08:00
Tianlei Wu	0f57e0a49e	Change mask input of EmbedLayerNormalization op to be optional (#2495 ) Change mask input of EmbedLayerNormalization op to be optional	2019-12-01 08:36:06 -08:00
Tiago Koji Castro Shibata	092d8f2866	Make tests dependend on winml_dll (#2509 )	2019-11-30 15:05:50 -08:00
Brian Martin	ecb3228e43	Merge branch 'windowsai' into layer_dev	2019-11-29 08:18:18 -08:00
Brian Martin	5adab88eed	Merge branch 'master' into windowsai	2019-11-29 07:50:17 -08:00
liuziyue	0edd4ef6ca	EmbedLayerNormalization fusion (#2452 ) Embed Layer Normalization Fusion	2019-11-28 14:03:58 -08:00
KeDengMS	60208463a9	[NupharEP] Enable parallel schedule (#2505 ) * [NupharEP] Enable parallel schedule * Update TVM with the fix to TVM threadpool to use OpenMP if possible * Add parallel schedule when trying to vectorize With this change, BERT squad perf on a 4-core (8 HT) CPU goes from 187ms to 150ms * Address CR, docs and cmake update * Doc fix * Fix mkl * Fix TVM windows build when using mklml	2019-11-28 08:35:56 -08:00
Yufeng Li	005305be6e	Implement AddGelu and SkipLayerNorm (#2487 ) * Implement AddGelu and SkipLayerNorm	2019-11-28 08:29:59 -08:00
Zhang Lei	ee0bde6b69	Enable three type of Equal() to version 11. (#2508 )	2019-11-28 03:03:43 -08:00
Paul McDaniel	301d407b39	Layer dev paulm (#2507 ) * commetns for dml graph transformer fixed ort value passing using the allocatir info * fixed and coded maps and sequences across the abi * cleaned up w4's cleaned up the model info ABI delayload directml.dll from winml * cleaned up namepsace aliases. renamed _winmla to winmla this was good PR feedback from tiago a while back.	2019-11-27 15:50:49 -08:00
Dmitri Smirnov	75b4747701	Fix a memleak in pybind. (#2503 )	2019-11-27 15:32:05 -08:00
Ryan Lai	197fd9ea3d	Remove usage of IOBinding in WinML and use C_API Run method (#2504 ) * remove usage of iobinding * Change data structure to use vector of Ort::Values * Polish bind input / output * Use C APIrun method * Update providers on evaluate getresults * Remove run and IObinding interface from WinMLAdapter * Remove use of IObinding * bind unbound outputs code moved to learningmodelbinding * clean up unneeded istensor adapter function * Fix comment * Check if session is closed before binding and clearing * PR feedback	2019-11-27 15:31:30 -08:00
Paul McDaniel	e8e285dd97	Layer dev paulm (#2506 ) * commetns for dml graph transformer fixed ort value passing using the allocatir info * fixed and coded maps and sequences across the abi * cleaned up w4's cleaned up the model info ABI delayload directml.dll from winml	2019-11-27 15:04:47 -08:00
Scott McKay	1fdf1006ac	Various fixes coming out of discussions in #2436 (#2497 ) - Add --skip_tests option to build.py based on github feedback - Add debug output at end of run_subprocess so it's clearer when the output is from a different process running - Add check for scipy as it's required by gen_test_models.py for the onnx tests - Use log.warning instead of warnings.warn for consistency. We use the logger almost everywhere and somewhat randomly used warnings.warn in two places. - Add check for 'wheel' dependency not being found in setup.py and handle more gracefully - Fix invalid input name in Keras tests	2019-11-28 07:03:23 +10:00
Zhang Lei	04b6097db4	Cuda Clip() for op set 11. (#2411 ) * Cuda Clip() for op set 11. * make min_val and max_value input CPU memory directly. * Remove original cu file useless "#pragma once" * merge duplicate logic into one class.	2019-11-27 12:42:45 -08:00
Yulong Wang	ccbd778d0d	optimize CPU implementation of EmbedLayerNorm (#2491 ) * optimize CPU implementation of EmbedLayerNorm * use atomic in parallelization	2019-11-27 12:34:57 -08:00
Ori Levari	2cfee5744b	Layer dev release pipeline (#2488 ) Adds winml binaries to existing cpu nuget package, and creates new gpu dml nuget package with winml binaries and DML EP.	2019-11-27 11:36:20 -08:00
Tiago Koji Castro Shibata	fd8105640f	Link scenario tests to DML when it's enabled (#2502 )	2019-11-27 11:29:09 -08:00
Tiago Koji Castro Shibata	9169c95a0e	Add CLI parameters to test runner, build WinML in ARM and x86 CI (#2479 ) * Support test parameters through CLI arguments * Add WinML do Windows x86/ARM CI builds * Code style fixes * Update googletest Remove GPUTEST macros everywhere now that GTEST_SKIP is supported * Refactor main.cpp * Build scenario tests without DML	2019-11-27 10:33:00 -08:00
Tianlei Wu	e57b735bb9	Add a transformer to use Gelu approximation for cuda provider (#2480 ) * Add Gelu Approximation Transformer to convert Gelu or AddGeluFusion to FastGelu to get better inference performance.	2019-11-27 10:15:50 -08:00

1725 commits