onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-10 17:37:14 +00:00

Author	SHA1	Message	Date
Edward Chen	32366fea02	[Objective-C API] WIgnore clang documentation warnings from C/C++ header usage. (#9057 )	2021-09-14 13:03:48 -07:00
Tianlei Wu	3ec3e9f705	Add t-test to compare experiments in GPT-2 mixed precision conversion (#9042 ) * Add t-test to compare two experiments * Ranking based on pair-wise T-test results and a custom scoring function	2021-09-14 12:40:25 -07:00
G. Ramalingam	7d28b596f4	Add function-body to opschema of FastGeluGrad (#9028 ) * Add function body to FastGeluGrad * Add test case	2021-09-14 12:27:55 -07:00
Suffian Khan	4322f7e647	Fix ROCm wheels CI pipeline break by installing latest protobuf from source (#9047 ) * install protobuf from source * fix rm command in Dockerfile * fix options on rm command * fix cd into protobuf source directory * try again * remove strip step * debug list the files * ls on /usr * more debug * more debug * adjust LD_LIBRARY_PATH * try remove protobuf before ORT build	2021-09-14 12:07:00 -07:00
Guoyu Wang	cf70635d2a	Add Android executable drop in the Package pipeline (#9050 ) * add copy executable for android job * minor fix * Variable fix * Move to use tgz because zip is not part of the docker image * update compression	2021-09-14 11:45:33 -07:00
Yulong Wang	be80698698	[js/web] a bugfix and add tests for wasm proxy worker (#9048 ) * [js/web] add tests for wasm proxy worker * fix script src override	2021-09-14 10:38:58 -07:00
Edward Chen	e574be4a53	[C API Docs] Add docs for run options tag/log level accessors/modifiers. (#9045 ) Add documentation for these C API functions: RunOptionsGetRunLogSeverityLevel RunOptionsGetRunLogVerbosityLevel RunOptionsGetRunTag RunOptionsSetRunLogSeverityLevel RunOptionsSetRunLogVerbosityLevel RunOptionsSetRunTag Update some existing documentation.	2021-09-14 08:53:35 -07:00
mindest	6036a6b915	Add type int64 for Equal, float types for ReduceSum (ROCm) (#9010 )	2021-09-14 00:07:30 -07:00
Sherlock	9174cbe3d5	Optimize CUDA Kernel for 3D and 4D Transpose (#8928 ) * Optimize Transpose120 and Transpose102 * Generalize Transpose0123 for more input shapes * Add Transpose3D test cases * update rocm kernel	2021-09-13 23:00:53 -07:00
Tianlei Wu	5969d576e5	Revert "disable half2 kernel by dfault (#9034 )" (#9044 ) This reverts commit `289999af35`.	2021-09-13 17:25:25 -07:00
baijumeswani	34f37d2920	Disable fallback for ortmodule api tests (#9018 )	2021-09-13 16:00:13 -07:00
Guoyu Wang	c709380c52	Add full iOS job in package pipeline (#9036 ) * Add full ios xcframework job * create zip file of the xcframework	2021-09-13 15:54:11 -07:00
baijumeswani	1422a9ba6b	Remove previous temporary fixes and address TODOs (#9020 )	2021-09-13 10:10:07 -07:00
Edward Chen	011cb8fd48	Fix Where op type reduction processing (#9033 ) * Update type reduction script to track Where Op's second input type. * Clean up op_kernel_type_control.h includes. * Use more maintainable include.	2021-09-13 08:37:58 -07:00
mindest	a1021a1cf4	Add BatchNorm kernel for ROCm (#9014 ) * Add BatchNorm kernel for ROCm, update BN test * correct epsilon_ setting; limit min epsilon	2021-09-13 15:15:05 +08:00
Rajalakshmi Srinivasaraghavan	e83cc534d4	Fix cmake POWER10 detection Recent commit `60c98a8` changed variable mlas_common_srcs which affects POWER10 detection.	2021-09-12 11:56:55 -07:00
Hariharan Seshadri	c674343d94	Remove document text from error message in a couple of ops (#9003 )	2021-09-11 08:37:52 -07:00
Ryan Hill	c3321b1778	Fix NVTX profiling so it can run in the shared CUDA provider (#9035 ) * Move NVTX profiling so it can run in the shared provider properly	2021-09-11 00:35:54 -07:00
Tianlei Wu	289999af35	disable half2 kernel by dfault (#9034 )	2021-09-10 20:09:21 -07:00
Tang, Cheng	8eb6546e8e	enable eager mode with ortmodule (#8961 ) * initial change for eager/ortmodule integration * pdate to latest pytorch api * add test model;fix torch version issue * fix comments in pr * fix python test break * fix api change * fix comments in PR * pass device into the fw function	2021-09-10 15:09:23 -07:00
Edward Chen	29d6573f3d	Increase timeouts for Mac CI builds. (#9024 ) Increase timeouts for "orttraining-mac-ci-pipeline" and "iOS CI Pipeline" CI builds.	2021-09-10 12:57:08 -07:00
Chen Fu	b3c2725862	fix cpuinfo compilation flag usage (#9029 ) Co-authored-by: Chen Fu <fuchen@microsoft.com> Bug was introduced from PR #8716 When restricting cpuinfo to only known platforms, compilation flag change was not thorough, which accidentally turned off hybrid core detection for ARM systems. This PR fixes this bug	2021-09-10 12:43:38 -07:00
satyajandhyala	ce7b12bf5d	Added new fp16 allow/safe opcodes in PropagateCastOps (#8964 ) * Removed RemoveInputOutputUpDownCasts strategy in PropagatCastOps. * Added Expand, Squeeze and Unsqueeze ops to fp16 allow ops * Added onnx models for squeeze/unsqueeze tests.	2021-09-10 11:53:26 -07:00
Bowen Bao	31af88c0bc	Update cross_entropy_loss symbolic for new argument from upstream torch (#9007 ) In torch 1.10, `label_smoothing` is added as additional input to `cross_entropy_loss`. Update the symbolic function to handle this change.	2021-09-10 10:32:59 -07:00
Zuwei Zhao	ff66cfdfa6	Enable linking in exception throwing support library when build onnxruntime wasm. (#8973 ) * Enable linking in exception throwing support library when build onnxruntime webassembly containing onnxruntime-extensions. * Add flag in build.py to enable linking exceptions throwing library. * Update onnxruntime-extensions document and bind custom_ops build flag with use_extensions. * Update doc. * Update cgmanifest.json. Co-authored-by: Zuwei Zhao <zuzhao@microsoft.com>	2021-09-10 22:09:16 +08:00
Tianlei Wu	e5ee0b435d	Attention Fusion for GPT-2 from Megatron (#8987 ) (1) Attention Fusion for gpt-2 model from Megatron. (2) Update symbolic shape inference of Attention to support 4D mask. (3) Add an otpion in save_model_to_file to save external data in one file or not, and warning of existing external data (4) Fix deprecation: logger.warn => logger.warning (5) Add model loader to test model without external data (6) Add an API of optimize_by_fusion, and topological sort after optimization.	2021-09-10 00:29:40 -07:00
Du Li	57b7ab56cd	Adding async fetching for webgl backend (#8951 ) * Adding async fetching for webgl backend * fix PR comments and CI failure. * fixing a bug * adding a flag	2021-09-09 22:17:42 -07:00
Yulong Wang	5145fa236f	[js/web] fix ort web e2e test (#9025 )	2021-09-09 22:08:27 -07:00
Ryan Hill	2439ced3ec	API Documentation (#8948 ) * Make help information compile properly	2021-09-09 22:04:51 -07:00
liqun Fu	6412c6a362	do not add pkg wheel entry to the index html file if it already exists (#9004 ) * do not add pkg wheel entry to the index html file if it already exists	2021-09-09 16:20:19 -07:00
Gary Miguel	e357022362	Remove onnxruntime team from CODEOWNERS (#8954 ) There are currently 98 members in the team. Requesting review from all of them for every PR is too noisy.	2021-09-09 15:26:59 -07:00
Spike Curtis	00fbc3b0bc	Instruct dockerfile users to do submodule updates Signed-off-by: Spike Curtis <spike@lodestar.ai>	2021-09-09 11:17:21 -07:00
baijumeswani	d78e90d1af	Adding preprocessor checks for torch version during torch cpp extensions compilation (#8989 )	2021-09-09 10:26:38 -07:00
Chi Lo	0367e1f1c2	Update Nuget Packge Pipline to CUDA11.4 and TensorRT8 on Windows (#9000 ) * Update to CUDA11.4 and TensorRT-8.0.3.4 * update trt pool, remove cudnn from setup_env_gpu.bat * revert pool * test gpu package pipeline on t4 * back out changes * back out changes Co-authored-by: George Wu <jywu@microsoft.com>	2021-09-09 06:56:37 -07:00
pengwa	d209fe29b9	custom autograd func memory refinement (#8993 ) * Release torch tensor referenced by torch gradient graph (created in PythonOp) * Update orttraining/orttraining/python/training/ortmodule/torch_cpp_extensions/torch_interop_utils/torch_interop_utils.cc * refine with comments Co-authored-by: Wei-Sheng Chin <wschin@outlook.com>	2021-09-09 18:37:24 +08:00
Pranav Sharma	d39959172f	Fix fuzz testing build blocking release. (#9008 )	2021-09-09 00:44:40 -07:00
Guoyu Wang	1533f574e4	Add full Android job in package pipeline (#9009 ) * Add full Android job in package pipeline * Address CR comments	2021-09-08 21:12:59 -07:00
Hariharan Seshadri	c20cb766be	Optimize sequence type usage on CUDA [3/n] (#9002 )	2021-09-08 16:01:38 -07:00
Yulong Wang	2e8792ca42	[js/web] fix karma launch with chrome headless (#8998 )	2021-09-08 11:52:41 -07:00
Ashwini Khade	ec63d10303	add model local function support (#8540 ) * updates for picking pnnx commit * add tests filter to c# tests * plus test fixes * fix versioning for contrib ops * fix tests * test filter for optional ops * more versioning related updates * fix test * fix layernorm spec * more updates * update docs * add more test filters * more filters * update binary size threshold * update docs * draft - enable model local function * enable model local functions in ORT * update to latest rel onnx commit * plus tests * plus more updates * plus updates * test updates * Fix for nested functions + shape inference * plus bug fix and updates per review * plus fixes per review * plus test updates * plus updates per review * plus fixes * fix a test	2021-09-08 11:47:01 -07:00
Vincent Wang	b7b42e0c5d	fast reduction for reducemean (#8976 )	2021-09-08 10:28:57 -07:00
stevenlix	1c872f9d74	Fix issues in TensorRT EP (#8996 ) * fix big engine load issue and add cuda_cpu_alloc * remove redundancy * fix minor issues	2021-09-08 10:28:16 -07:00
Olivia Jain	6fbd0a8233	Change cmake_cuda_architectures to double quotes (#8990 )	2021-09-08 09:41:52 -07:00
Chi Lo	5ae4c54ab8	Fix bug for validating GPU packages (#8997 )	2021-09-08 02:06:53 -07:00
George Wu	a30d9f5317	fix windows gpu pipelines that use cuda 10.2 (training, reduced_ops and 10.2 validation) (#8994 ) * build for arch 52 * arch 52 * gpu arch 52	2021-09-07 22:01:06 -07:00
Sunghoon	450524359e	[js/web] WebAssembly profiling (#8932 ) * add p50 in test * Preallocate WebAssembly worker threads to minimize worker creation time * WebAssembly profiling * merge master * merge with proxy changes * disable profiling tests from WebAssembly build * fix e2e test failure Co-authored-by: Yulong Wang <yulongw@microsoft.com>	2021-09-07 17:18:08 -07:00
ytaous	0193490cbf	ReduceMin - add int64 cuda kernel support for opset12/13 (#8966 ) * ReduceMin - int64 support * fix doc Co-authored-by: Ethan Tao <ettao@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2021-09-07 17:01:26 -07:00
Changming Sun	91c15843cd	Fix a directml python packaging error (#8981 )	2021-09-07 16:29:33 -07:00
Ye Wang	e2194797a7	bumping up to version 1.9 (#8982 ) * bump up version * makes the windowAI column align with ORT version * update the hardcoded version string * fix a typo	2021-09-07 14:30:55 -07:00
George Wu	00eca42413	make_policy(SET CMP0104 OLD) (#8793 )	2021-09-07 13:12:50 -07:00

5560 commits