onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-07-12 17:57:38 +00:00

Author	SHA1	Message	Date
Vincent Wang	f6a8d2aa5f	split graphs info	2020-12-15 09:03:08 -08:00
Vincent Wang	cfd57c0136	fix input order, and input grad.	2020-12-15 09:03:08 -08:00
Vincent Wang	e759da178d	bugfix for graph inputs and outputs.	2020-12-15 09:03:08 -08:00
Thiago Crepaldi	b7564d0732	Refactor after Vincent work on splitting on backend	2020-12-15 09:03:08 -08:00
Vincent Wang	6d8fde8324	sample code change.	2020-12-15 09:03:08 -08:00
Vincent Wang	934feb0c99	gradient graph split in backend.	2020-12-15 09:03:08 -08:00
Thiago Crepaldi	ea5871ac15	Change DropouGrad.input[1].input_type and del logits_grad from backward graph	2020-12-15 09:03:08 -08:00
Thiago Crepaldi	f1dc6e4007	Refactor BERT classifier fine tune for better debugging	2020-12-15 09:03:08 -08:00
Thiago Crepaldi	d4917f2d65	Hard-code input types for DropoutGrad on BERT	2020-12-15 09:03:08 -08:00
Thiago Crepaldi	3b267d1d60	Add BERT classifier example	2020-12-15 09:03:08 -08:00
Thiago Crepaldi	30042b6e0e	Update InferenceSession usage to match master	2020-12-15 09:03:08 -08:00
Thiago Crepaldi	8b0ade0e83	Integrate automatic graph split into ORTModule	2020-12-15 09:03:08 -08:00
Vincent Wang	c36c8e14a7	refactor	2020-12-15 09:03:08 -08:00
Vincent Wang	26e6d6d004	module transformer	2020-12-15 09:03:08 -08:00
Thiago Crepaldi	3524fb04e8	Add working example for MNIST (MVP)	2020-12-15 09:03:08 -08:00
Thiago Crepaldi	f1b5c25b2d	Improve example to display grads before and after optim step	2020-12-15 09:03:07 -08:00
Thiago Crepaldi	f06cafdebd	Fix path on test script	2020-12-15 09:03:07 -08:00
Thiago Crepaldi	56ca4ab05b	Add flag to allow pytorch-only or ORT flexible api runs	2020-12-15 09:03:07 -08:00
Thiago Crepaldi	d4449d86b9	Add script to run Flexible API MVP PoC	2020-12-15 09:03:07 -08:00
Thiago Crepaldi	e71e08851a	Basic plumbing for backward pass. Not fully working	2020-12-15 09:03:07 -08:00
Thiago Crepaldi	77cefcd6c2	Perform forward pass using training graph with intermediate outputs	2020-12-15 09:03:07 -08:00
Thiago Crepaldi	11b69f141e	Forward pass using InferenceSession on exported ONNX. Although forward pass works, this has the limitation of not working for backward pass due to the lack of intermediate tensors needed for gradient. Next step is to export a training graph and split it manually	2020-12-15 09:03:07 -08:00
Jesse Benson	a8d549e181	Minor changes to AMD element-wise kernels to converge with CUDA element-wise kernels.	2020-12-15 08:46:36 -08:00
Pranav Sharma	a9548283d0	Don't mark issues that are marked as enhancement as stale (#6134 )	2020-12-14 18:57:40 -08:00
Edward Chen	9810b9e02b	Reduce amount of compiled CUDA device code (#6118 ) Move CudaKernel from cuda_common.h to a new separate header, cuda_kernel.h. Update include sites to use cuda_kernel.h instead if they need CudaKernel. Inclusions of cuda_common.h are now more lightweight. Make corresponding changes for ROCM execution provider code. Other minor cleanup.	2020-12-14 15:27:40 -08:00
Sheil Kumar	a6a23db130	Enable C# .NET5 for WinML (#6120 ) * build for .net5 * only reference cswinrt for .net5 * remove netstandard2.0 references * upgrade language version * net5 * remove extra comment closure * add targetframework * set target framework * remove net* * pep8 errors * make test project build with .net windows SDK projection * disable c# builds for non-x64 builds * fix pep8 errors * disable for store build * fix tests * remove cswinrt and sdk references from package * bump cswinrt down to 1.0.1 * fix bin path Co-authored-by: Sheil Kumar <sheilk@microsoft.com>	2020-12-14 15:05:15 -08:00
Sherlock	eb5c1f0fcc	Unify activation and initializer alignment value (#6109 ) * Unify activation and initializer alignment value * Fix VerifyInputTensorsAllocatedContiguously	2020-12-14 13:13:41 -08:00
liqunfu	cde723a136	Liqun/move nightly pl to linux multi gpu v100 (#6024 ) * move e2e nightly pipeline to azure devop Co-authored-by: liqun <liqun@OrtTrainingDev4.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-12-14 12:43:41 -08:00
baijumeswani	dd2e5a1a05	state_dict and load_state_dict for ORTTrainer (#6095 ) * add functions state_dict and load_state_dict to ORTTrainer * unit tests for state_dict and load_state_dict for ORTTrainer	2020-12-14 11:55:52 -08:00
dependabot[bot]	d4dddd99d9	Bump ini from 1.3.5 to 1.3.8 in /nodejs Bumps [ini](https://github.com/isaacs/ini) from 1.3.5 to 1.3.8. - [Release notes](https://github.com/isaacs/ini/releases) - [Commits](https://github.com/isaacs/ini/compare/v1.3.5...v1.3.8) Signed-off-by: dependabot[bot] <support@github.com>	2020-12-12 13:06:43 -08:00
Hariharan Seshadri	c755ca0b71	Honor auto_pad attribute in ConvTranspose (#4271 )	2020-12-11 22:30:17 -08:00
Suffian Khan	6cb5d3ac09	Fix multi-tensor LAMB reduction to be deterministic (#6028 ) * define ordering of reduction across blocks * save state * remove debug code * remove debug code * review comments * significant correction for reduction only over blocks on same tensor * addressing ocmments * update rocm/lamb.cc to build as well * remove times 2048size in multitensor test until threshold error in rocm resolved convert tuple => struct as per recomendation * update comment * apply perfect forwarding for launch_multitensor to permit passing ref rather than pointer * remove excess template arguments from rocm lamb.cc launch_multitensor as well * fixes for AMD build * pr comments * run formatter from vscode * formatter on cuda files	2020-12-11 13:13:05 -08:00
Edward Chen	c8ac34d6a5	Fix DEBUG_NODE_INPUTS_OUTPUTS test by putting it in a separate process, clean up unused test_main.cc files. (#5949 ) Move the DEBUG_NODE_INPUTS_OUTPUTS test into its own process. The implementation uses static variables which do not interact well with other tests. Clean up old test_main.cc files which are no longer used.	2020-12-11 11:36:58 -08:00
Sherlock	a53f4dd379	Introduce VariadicAlias, remove hardcoded alias limits (#6106 ) * Introduce VariadicAlias, remove hardcoded alias limits * Include optional-lite in winml build Co-authored-by: Sherlock Huang <bahuang@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-12-11 10:47:08 -08:00
Jesse Benson	38c49c2483	Make ROCM and CUDA reduction_all code more similar.	2020-12-11 09:35:07 -08:00
Ryan Lai	1eb146f561	Implement conversion from ORT String to WinML Tensor String (#6097 ) * Implement conversion from ort string to winml string * NIT:comment	2020-12-10 17:47:50 -08:00
Ryan Lai	8bcb5fd119	Add skip test reason for onnx model zoo models and tier 2 models (#6081 )	2020-12-10 14:41:17 -08:00
Ryan Lai	753af576c4	If building inbox, hook up winrt_activation_handler for WinML Tests (#6074 ) * If building inbox, hook up winrt_activation_handler with what is already defined in in dllload.cpp * Add base.h header * Missed custom ops test	2020-12-10 14:41:01 -08:00
Du Li	e945b5fcf6	adding fp16 support for topk cuda kernel (#6082 ) * adding fp16 support for topk. * disable fp16 tests for cpu ep Co-authored-by: Du Li <duli@OrtTrainingDev0.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-12-10 11:04:19 -08:00
Vincent Wang	7ddeafdfcc	Add ReduceL2Grad and ClipGrad (#5970 ) * ReduceL2Grad and ClipGrad. * fix win build and amd ci pipeline * resolve comments. Co-authored-by: Vincent Wang <weicwang@AiFramework2080ti2.corp.microsoft.com>	2020-12-10 11:03:26 +08:00
RandySheriffH	404982ded5	Enable varied input type for custom op (#6066 ) * allow custom op taking varied types * refactor test case * add test model * refactor test case * enable copy elision * update test case * fix issue in ToString function	2020-12-09 15:10:42 -08:00
Jesse Benson	cc47cfcb31	Update AMD transpose to match CUDA transpose.	2020-12-09 11:00:18 -08:00
Edward Chen	abdbb5fc84	Reduction kernel optimization (#6088 ) Optimize reduction kernel code by moving loads from global memory before computation. Add CMake option to build CUDA code with --generate-line-info option.	2020-12-09 10:20:23 -08:00
Sergii Dymchenko	9e26e59a37	Deprecate opsets <12 for training. (#6027 )	2020-12-09 00:15:27 -08:00
Weixing Zhang	d95fc5e849	clean un-used code. (#6059 ) Co-authored-by: Weixing Zhang <wezhan@microsoft.com>	2020-12-08 23:15:30 -08:00
Weixing Zhang	2705115732	add dockerfile for ROCm3.10 and update BUILD.md for ROCm EP (#5821 ) * add HSA_NO_SCRATCH_RECLAIM=1 to dockerfile It is to work around an issue in AMD compiler which generates poor GPU ISA when the type of kernel parameter is a structure and “pass-by-value” is used * update BUILD.md * add dockerfile for rocm3.10	2020-12-08 23:14:56 -08:00
ashbhandare	b1a75d0e98	Enable passing initial optimizer state while creating training session (#5869 ) * Support to pass initial optimizer states to optimizer graph builder * Changes for passing init optim state to training session config * Pass optimizer state through cpp and python frontend * Cleanup * Review comments * Fix windows and mac CI * Review comments * review comments * Review comments * Frontend review changes * Fix CI	2020-12-08 21:20:51 -05:00
Sherlock	7a43fa0028	Fix AllReduce kernel for contiguous buffer (#6064 )	2020-12-08 15:55:13 -08:00
Edward Chen	e357486707	Fix build definition template typo, add logging (#6065 ) Fix a typo in tools/ci_build/github/azure-pipelines/templates/get-docker-image-steps.yml. Add logging to tools/ci_build/get_docker_image.py for easier debugging.	2020-12-08 15:16:50 -08:00
baijumeswani	523d187193	save data to and load data from an hdf5 file for checkpointing (#5975 ) * save python dictionary to hdf5 representation and load an hdf5 file into a python dictionary * unit tests for saving data to and loading data from hdf5 file	2020-12-08 11:40:57 -08:00

3947 commits