onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-29 23:06:41 +00:00

Author	SHA1	Message	Date
Tianlei Wu	e57b735bb9	Add a transformer to use Gelu approximation for cuda provider (#2480) * Add Gelu Approximation Transformer to convert Gelu or AddGeluFusion to FastGelu to get better inference performance.	2019-11-27 10:15:50 -08:00
Changming Sun	109b3cb450	Avoid using the default logger in the graph lib and optimizers (#2361) 1. Use the session logger if it is available. 2. Don't disable warning 4100 globally. We should fix the warnings instead of disabling it.	2019-11-14 13:23:28 -08:00
Scott McKay	db0dd09ded	Cleanup some aspects of the Initializer class used by optimizers (#2005) * Move check on data type outside of the Initializer class as it's specific to Conv processing. Use references for arguments that can't be null.	2019-10-09 10:37:44 +10:00
Dmitri Smirnov	d1b1cdc5c4	Replace GSL with GSL-LITE submodule and fix up refs (#1920) Remove gsl submodule and replace with a local copy of gsl-lite Refactor for onnxruntime::make_unique gsl::span size and index are now size_t Remove lambda auto argument type detection. Remove constexpr from fail_fast in gsl due to Linux not being happy. Comment out std::stream support due to MacOS std lib broken. Move make_unique into include/core/common so it is accessible for server builds. Relax requirements for onnxruntime/test/providers/cpu/ml/write_scores_test.cc due to x86 build. Add ONNXRUNTIME_ROOT to Server Lib includes so gsl is recognized	2019-10-01 12:43:29 -07:00
Adrian Tsai	a7beed798e	Implement L1 graph transformer for free dimension override (#1825) * Implement FreeDimensionOverrideTransformer * Add test * Fix compiler warnings * Update comment * LOGS_DEFAULT * Merge from master	2019-09-20 10:52:14 -07:00
Pranav Sharma	818c023535	Add/correct missing SAL annotations + avoid using unsigned types (except where counts are involved). (#1451) * Add/correct missing SAL annotations + other cosmetic changes. * Add Outptr * Don't use unsigned types	2019-07-22 23:25:53 -07:00
Tracy Sharpe	823fa3f39c	Integrate MLAS NCHWc support into ONNX Runtime (#1327) This change integrates the NCHWc support recently added to MLAS into ONNX Runtime. When using "-o 3" optimizations, the runtime will do an NCHWc layout optimization pass to convert standard ONNX operators such as Conv/MaxPool to the com.microsoft.nchwc domain with weights and biases reordered for speed.	2019-07-09 20:41:19 -07:00
Konstantinos Karanasos	ee6217972b	Fix when rewrite rule gets registered to multiple op types; update constness of rule methods; enable dropout elimination (#1098)	2019-05-24 13:47:55 -07:00
Changming Sun	99556b111d	Make MemPatternPlanner on/off switchable in model weight loading (#989)	2019-05-16 14:39:09 -07:00
Konstantinos Karanasos	feab3088fb	Conv(Add|Mul|BN)Fusion as rewrite rules (#863) * Converted ConvAddFusion, ConvMulFusion, and ConvBNFusion to rewrite rules * Extended graph_utils::RemoveNode * Introduced RewriteRuleEffect enum	2019-05-01 13:23:29 -07:00
Konstantinos Karanasos	1b7d1f2645	Convert constant folding to a transformer (#866)	2019-04-29 18:12:49 -07:00
Konstantinos Karanasos	ada90086f7	More efficient rule-based transformer (#815) Introduce a quick pre-filtering of rules based on the node op types they are targeting. The goal is to avoid evaluating all rules for all nodes. Instead, for each node, we will only be evaluating the rules associated with its op type.	2019-04-18 17:10:13 -07:00
Ashwini Khade	14d63b5f45	generate transformers bug fix (#838) * fix graph transformer generation * add more tests * cosmetic changes * more changes per review	2019-04-16 14:10:33 -07:00
Ashwini Khade	77b981824a	fix graph transformers and refactor tests (#696) * fix graph transformers and refactor tests * fix merge master * Set default optimization level to Level1 * fix build warnings for Linux * try root cause tensorrt test failures * try root cause tensorrt test failure * Test level2 transformers with all CI builds * remove ConvActivation fusion transformer * change default level back to level1 * remove providers from apply api * more changes	2019-03-26 20:38:12 -07:00
Konstantinos Karanasos	a872ba7894	Convert Unsqueeze elimination to rewrite rule + improvements in graph utils and graph transformer utils (#670) * Convert unsqueeze elimination to rewrite rule * Simplify the way we register predefined transformers and rules in the inference session (all details are now moved to the graph transformer utils) * Some reorganization and renaming of methods in graph_utils * Updates in graph transformers test * Update in edge removal to not perform unnecessary check of node args that led to race conditions when updating the graph * Improve documentation for rewrite rules * Remove top-down rule-based transformer (given we currently have only one type of rule-based transformer)	2019-03-26 13:58:15 -07:00
Ashwini Khade	2f1c3028b7	add capi to set graph optimization level (#657) * add capi to set graph optimization level * remove 1 unnecessary check + review comment * plus updates	2019-03-20 17:14:46 -07:00
Ashwini Khade	481eb971ec	graph transformers update (#608) * graph transformers update * some updates * plus changes * more updates * fixes per review comments * enable tests * adding more tests * more changes * update api in inference session * changes per review * Linux CI fix * fix linux CI failure * fix MAC CI failure * more updates * add more documentation and add level param to register transformer	2019-03-18 14:52:16 -07:00
Konstantinos Karanasos	2ae83c580c	Constant folding (#168) Constant folding rewrite rule computes nodes that have only constant inputs at compile time and avoids these computations at run time.	2019-03-13 15:44:26 -07:00
Weixing Zhang	696ab8a194	Create a separate component for graph optimization. (#421) * Create a project for graph optimizer. Move optimizer related code to the folder optimizer. * Fix build failures. * rebase and fix build failures. * fix build failure. * fix build failure with cuda path. * fix python build failure. * Move two transformers(memcpy and insert_cast) from framework to optimizer. * rebase. * SessionState should not depend on optimizer.	2019-02-04 15:45:12 -08:00

19 commits