onnxruntime

mirror of https://github.com/saymrwulf/onnxruntime.git synced 2026-05-18 21:21:17 +00:00

Author	SHA1	Message	Date
Ksenija Stanojevic	ea37a4d89b	Add Trilu custom op (#4537 ) Co-authored-by: neginraoof <neginmr@utexas.edu>	2020-08-17 14:42:26 -07:00
Thiago Crepaldi	42408aa3ed	Add new PyTorch front-end (#4815 ) * Add ORTTrainerOptions class for the new pytorch frontend (#4382) Add ORTTrainerOptions class and some placeholders * Add _ORTTrainerModelDesc to perform validation for model description (#4416) * Add Loss Scaler classes to the new frontend (#4306) * Add TrainStepInfo used on the new frontend API (#4256) * Add Optimizer classes to the new frontend (#4280) * Add LRScheduler implementation (#4357) * Add basic ORTTrainer API (#4435) This PR presents the public API for ORTTrainer for the short term development. It also validates and saves input parameters, which will be used in the next stages, such as building ONNX model, post processing the model and configuring the training session * Add opset_version into ORTTrainerOptions and change type of ORTTrainer.loss_fn (#4592) * Update ModelDescription and minor fix on ORTTrainer ctor (#4605) * Update ModelDescription and minor fix on ORTTrainer/ORTTrainerOptions This PR keeps the public API intact, but changes how model description is stored on the backend. Currently, users create a dict with two lists of tuples. One list is called 'inputs' and each tuple has the following format tuple(name, shape). The second list is called 'outputs' and each tuple can be either tuple(name, shape) or tuple(name, shape, is_loss). With this PR, when this dict is passed in to ORTTrainer, it is fully validated as usual. However, tuples are internally replaced by namedtuples and all output tuples will have tuple(name, shape, is_loss) format instead of is_loss being optionally present. In addition to that normalization in the internal representation (which eases coding), two internal methods were created to replace a namedtuple(name, shape) with namedtuple(name, shape, dtype) or namedtuple(name, shape, is_loss, dtype) depending on whether the tuple is an input or output. This is necessary as ORTTrainer finds out data types of each input/output during model export to onnx.
Finally, a minor fix was done on ORTTrainer. It could initialize ORTTrainerOptions incorrectly when options=None * Rename input name for test * Add ONNX Model Export to New Frontend (#4612) Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com> * Create training session + minor improvements (#4668) Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> * Save ONNX model in file (#4671) Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> * Add eval step (#4674) Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> * Add train_step (#4677) Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> * Add LR Scheduler (#4694) Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com> * Add deterministic compute tests (#4716) Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com> * Add legacy vs experimental ORTTrainer accuracy comparison (#4727) Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com> * Add Mixed precision/LossScaler + several fixes (#4739) In addition to the mixed precision/loss scaler code, this PR includes: * Fix CUDA training * Add optimization_step into TrainStepInfo class * Refactor LRScheduler to use optimization_step instead of step * Updated several default values at ORTTrainerOptions * Add initial Gradient Accumulation support.
Untested * Fix ONNX model post processing * Refactor unit tests * Add ONNX BERT example + minor fixes (#4757) * Fix training issue when passing ONNX file into ORTTrainer Co-authored-by: Thiago Crepaldi <thiago.crepaldi@microsoft.com> Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> * Add Dynamic Shape support (#4758) * Update DeepSpeed Zero Stage option to a separate option group (#4772) * Add support to fetches (#4777) * Add Gradient Accumulation Steps support (#4793) * Fix Dynamic Axes feature and add unit test (#4795) * Add frozen weights test (#4807) * Move new pytorch front-end to 'experimental' namespace (#4814) * Fix build Co-authored-by: Rayan-Krishnan <rayankrishnan@live.com> Co-authored-by: Rayan Krishnan <t-rakr@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-08-17 09:45:25 -07:00
Changming Sun	5eec4f66ed	Refactor manylinux docker image and the related pipelines (#4751 ) 1. Publish the image to ACR, instead of building it every time for every PR 2. Make USE_MKLML and USE_OPENMP be able to co-exist. Currently both of them are enabled in our Linux CI build but only one of them actually takes effect. 3. Split nuphar and DNNL into separate pipelines. 4. Fix two warnings in onnxruntime/core/optimizer/matmul_scale_fusion.cc and onnxruntime/test/tvm/tvm_basic_test.cc. 5. Update the manylinux2010_x86_64 image to the latest.	2020-08-17 09:40:31 -07:00
Yulong Wang	aa993e95c9	enable build flag '--use_openmp' on MacOS (#4774 ) * enable build flag '--use_openmp' on MacOS * cmake 3.16.1 to enable find_package(OpenMP) on mac	2020-08-13 15:56:42 -07:00
jingyanwangms	adda8c66d9	Docker image release pipeline (#4682 ) * create orttraining-1p-linux-gpu-ci-pipeline.yml * fix syntax * fix file path * fix template path * publish docker image to test acr * use right task name * change parameter list * use variables * use python.version * remove --enable_onnx_tests due to segfault * add back --enable_onnx_tests * fix docker push command line * change docker login command * login differently * fix docker tag script * create password.txt * add ortrelease docker image * enable test in build.sh * add pipeline parameter * add pipeline parameter * change timeout * change timeout * fix run_dockerbuild.sh * use PR checkin build docker * fix strategy syntax * fix strategy syntax * change dockerfile * change run_dockerbuild.sh * change tag name * build with root user * use build id for docker image tag * remove all user lines * change docker tag * add mpi, mellanox * add missing args * use release dockerfile for ci build * remove install wheel * use release docker image * fix syntax * use different pool * add Dockerfile.training * remove sudo to run on Linux-Multi-GPU-V100 * change docker file path * update dockerfile * use latest dockerfile * change agent pool * remove --preserve-env * add back parameter * Add test_flag * use azuredevops docker * change repository * use cmd for docker login * echo build script * use ortrelease ACR * change key vault connection * Move --build flag * change build command * add parameter for image tag * clean up for PR * remove unnecessary changes * whitespace changes * whitespace changes * change build flag * change flag name * change flag * use latest dockerfile * enable build tests * build builder stage and run test * Add back python.version * change build directory * always run build entire dockerfile * fix yml syntax * fix syntax * add en-UTF8 locale * rename * remove unused template * Update orttraining-linux-gpu-docker-release-pipeline.yml for Azure Pipelines * Update
orttraining-linux-gpu-docker-release-pipeline.yml for Azure Pipelines * Test commit sha1 in pipeline * fix parameter * update docker file * fix --from=build * remove commented blocks * PR comments * fix syntax * fix syntax * use timestamp as build number * remove latest tag * add build_timestamp variable * remove wrong property * fix docker run command * test build id * Use datestamp build id * change build tags * add no-cache to docker build * rename BUILD_VERSION -> BUILD_CONFIG Co-authored-by: Jingyan Wang <jingywa@OrtDevTest2v100.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net> Co-authored-by: Jingyan Wang <jingywa@OrtTrainingDev3.af05slrtruoetgaxwwjv5nsq5e.px.internal.cloudapp.net>	2020-08-12 13:29:37 -07:00
Dmitri Smirnov	ac4997665a	Make Java Publishing and Java GPU pipelines run nightly (#4749 ) Schedule Java daily Bump up Linux GPU build timeout	2020-08-10 17:38:45 -07:00
stevenlix	77c69a0325	Upgrade TensorRT to v7.1.3.4 (#4704 ) * upgrade to TensorRT 7.1.3.4 * Upgrade onnx-tensorrt parser for TensorRT 7.1.3.4 * fix format issue * fix format issue * fix format issue * Update tensorrt_execution_provider.cc * change cmake version to 3.14 * Remove --msvc_toolset 14.16 * change to onnxruntime::make_unique * use onnxruntime::make_unique * disable some tests for TensorRT * disable some tests for TensorRT * Update upsample_op_test.cc * Update tile_op_test.cc * disable some tests for TensorRT * Update constant_of_shape_test.cc * update parser * Update Dockerfile.ubuntu_tensorrt	2020-08-07 17:43:56 -07:00
Sheil Kumar	5c5efa900d	Add .NET Core 3.0 nuget e2e pipeline tests (#4695 ) * bump cswinrt version * add cswinrt * test dotnetcore 3.0 * rename buildpackage source * set folder path to the package source and not the version * refactor .netframework tests * build .net core anycpu Co-authored-by: Sheil Kumar <sheilk@microsoft.com>	2020-08-05 13:02:24 -07:00
Changming Sun	d0297f8d24	Add 'Install ONNX' step to Windows GPU pipeline (#4696 ) Add 'Install ONNX' step to Windows GPU pipeline. Previously this was not a problem because the onnxruntime python package explicitly declared a dependency on ONNX, so ONNX was installed whenever we tested onnxruntime. However, that dependency was removed in #4073	2020-08-03 18:51:24 -07:00
Changming Sun	01ca6392cb	Avoid building ONNX for every historical ONNX version in our CI (#4678 ) 1. Avoid building ONNX for every historical ONNX version in our CI; it is costly and prone to failure. 2. Run docker command without sudo. Previously the user was not in the docker group; Azure DevOps Service has now added it.	2020-08-03 10:18:10 -07:00
Changming Sun	f9f25c5559	Remove featurizer from CI build (#4661 )	2020-07-30 18:37:55 -07:00
Changming Sun	51332e3c81	Change Linux CI build timeout value to 3 hours (#4664 ) Because the build often needs more than 1 hr 55 minutes, increase the value so that pipeline failures are less likely.	2020-07-30 02:52:05 -07:00
Xiang Zhang	d73e01e5b9	remove ENABLE_TELEMETRY macro (#4633 )	2020-07-27 20:06:11 -07:00
gwang-msft	c2ec3b734b	[Android NNAPI EP] Remove dependency on external JD/DNNLibrary (#4576 ) * remove dependency of external jd-dnnlibrary * remove extra variables not used any more * update /cgmanifest.json	2020-07-22 14:08:12 -07:00
Sheil Kumar	fa6d035090	Create WindowsAI zip files automatically as part of the pipeline (#4584 ) * copy rename nupkg to zip as part of build task * update both symbols and regular package Co-authored-by: Sheil Kumar <sheilk@microsoft.com>	2020-07-22 10:53:47 -07:00
Changming Sun	c2c4e6760b	Fix code sign validation errors in nuget and nodejs pipeline (#4527 )	2020-07-20 14:18:47 -07:00
Changming Sun	bc1d197ddf	Re-enable dnnl in CI build (#4544 ) * Revert "Temporarily remove dnnl from Linux CI build to unblock the whole team (#4266)" Previously it failed because it used too much memory. Now we only run dnnl EP with opset12 models in unit tests, to reduce peak memory usage.	2020-07-19 23:20:03 -07:00
Yulong Wang	5086e55a35	Fix condition of running tests in win CI (#4459 )	2020-07-16 16:33:30 -07:00
Changming Sun	8ada440961	Move model tests to onnxruntime_test_all (#4521 ) 1. Move model tests to onnxruntime_test_all 2. Publish TestResults of Windows CI build.	2020-07-15 16:46:18 -07:00
edgchen1	34f73fa1aa	Add sudo --preserve-env option to allow environment to go through to docker commands. (#4512 )	2020-07-14 18:12:31 -07:00
liqunfu	f721f5f1cd	Liqun/multiple choice (#4480 ) * multiple choice runner * add docker cleanup task to frontend pipeline	2020-07-14 17:57:58 -07:00
Sheil Kumar	ee5ca27ae2	Split Microsoft.AI.MachineLearning.nupkg in a NuGet package and symbol NuGet package (#4503 ) * add threadpool interface * generate snupkgs * include_pdb check * fix snupkg generation * Add task to merge snupkgs * folder exists * check dir * revert thread pool stuff Co-authored-by: Sheil Kumar <sheilk@microsoft.com>	2020-07-14 14:52:39 -07:00
gwang-msft	5f8f443ac4	Android CI build, test copy, emulator boot improvement (#4481 ) * Enable onnxruntime_test_all for NNAPI EP * switch to use ninja for Android CI * make android emulator boot faster in android ci * simplify adb push * more style change * more tweaking on android ci * build.py style update	2020-07-13 14:18:34 -07:00
Dmitri Smirnov	35ee00d888	Pin typing version. (#4490 )	2020-07-13 11:48:30 -07:00
Hariharan Seshadri	26ebcfab88	Fix Nuget GPU pipeline (#4462 )	2020-07-10 14:02:28 -07:00
Yulong Wang	bec18eb3f4	[Node.js binding] support CentOS 7 in CI (#4447 )	2020-07-09 00:59:50 -07:00
Negin Raoof	71aec2adcb	Custom op export test template (#4383 ) * Adding pytorch custom op export tests to CI * Test clean build * Fix export for intended failure * update export script * Build onnxruntime	2020-07-08 10:14:56 -07:00
Hariharan Seshadri	6d6b6b54a5	Support binding a graph output to a specific device via the Python binding (#4439 )	2020-07-07 21:09:37 -07:00
Sheil Kumar	fdb4a3a2e8	Add cppwinrt and cswinrt tests in windowsai nuget pipeline (#4381 ) * build e2e cppwinrt tests * add use nuget task * make all referenced to package version prop/target-ified * remove dupe props/targets reference * work around project.assets.json error by deleting it * powershell test invocation * switch to batch script * print debug info * update x86->x64 * stdio.h * pushd/popd * add csharp tests * package.config -> packages.config * typo * x86 -> anycpu * debug is default * add test path * update csproj as well * debug * really replace all package versions * debug output * really use [PackageVersion] * sleep instead of converting async operation to task and waiting * dont close software bitmap * switch to powershell script * remove binding check * continue on failure * continue on error action * continueOnError and errorActionPreference * tabbing Co-authored-by: Sheil Kumar <sheilk@microsoft.com>	2020-07-07 09:36:42 -07:00
suffiank	7a05b3ca87	Increase python packaging pipeline timeout (#4412 ) * increase python packaging pipeline from 90 to 110 min * change timeout to Linux GPU and do 120 min to match Win GPU	2020-07-02 15:38:39 -07:00
gwang-msft	0bef9d5114	Fix the broken Android NNAPI CI (#4403 ) * Change NNAPI CI to run on new NNAPI EP * update android ci to mac 10.15 and remove in install cmake * update the android ci to target android api level 29 * remove unnecessary ndk install git submodule call	2020-07-02 10:22:18 -07:00
Changming Sun	3bb6a865cc	Revert "remove openmp and scipy from build pipelines (#4305 )"	2020-07-02 00:30:02 -07:00
Tiago Koji Castro Shibata	7fea332f93	Support builds without RTTI (#4333 ) * Support builds without RTTI * Disable RTTI in all builds	2020-07-01 13:05:35 -07:00
Dmitri Smirnov	49268c42da	Change the way java home is set on Mac OS for CI and Java publishing pipeline (#4385 ) * Change the way java_home is set on Mac. * Change the way JAVA_HOME is set on Mac OS	2020-07-01 07:37:14 -07:00
Negin Raoof	37cbe8551d	Adding export registration and tests for custom ops (#4248 )	2020-06-25 22:29:02 -07:00
Changming Sun	5db67ec000	Fix python package issue and upgrade the linux image to 2010 (#4342 ) 1. Increase job timeout, while we are investigating why the tests take much longer 2. Upgrade the linux docker image to manylinux2010, by request from Tianlei. (We had an offline discussion with Pranav and Tracy) 3. Remove the installation of "devtoolset-7" in the CUDA image. It was added for CUDA 10.0, it is not needed for CUDA 10.1. We have moved to CUDA 10.1.	2020-06-25 20:22:39 -07:00
Dmitri Smirnov	a08805daf9	Fix a minor typo in POM file name (#4250 ) Co-authored-by: Changming Sun <chasun@microsoft.com>	2020-06-25 11:15:14 -07:00
Changming Sun	deea945f80	Remove openmp and scipy from build pipelines (#4305 ) 1. Remove openmp because the default thread pool is already good enough. 2. Remove scipy from build pipelines because it stopped supporting Python 3.5.	2020-06-23 20:18:16 -07:00
edgchen1	4e39fda06a	Fix version of torch and torchvision in install_deps.sh. (#4316 )	2020-06-23 14:55:18 -07:00
edgchen1	737c22a911	Refactor Python packaging builds (#4283 ) Reuse the same template file for all Python packaging builds.	2020-06-22 17:13:22 -07:00
Pranav Sharma	2204d39a06	Add build option to disable traditional ML ops from the binary. (#4272 ) * Add build option to disable traditional ML ops from the binary. * Fix python tests by splitting tests for ML ops to a separate file. Exclude ML tests from onnx_test_runner and C# tests. Exclude ML op sources. * Update Edge pkg pipelines with new MLops env variable and fix C# packaging pipeline tests to skip ML ops.	2020-06-20 06:36:06 -07:00
Changming Sun	0349479b19	Fix component governance and codesign validation errors (#4277 ) Adjust the job steps so that these security tasks run before the build directory clean up.	2020-06-18 15:54:18 -07:00
Changming Sun	43deec2174	Temporarily remove dnnl from Linux CI build to unblock the whole team (#4266 )	2020-06-17 16:25:24 -07:00
edgchen1	63bf587623	Use azcopy to download test data (#4221 ) Use azcopy from download_e2e_test_data.py, add helper function for downloading azcopy. Update download_test_data.py to use helper function.	2020-06-16 10:14:34 -07:00
Hariharan Seshadri	91a41298cc	Fix ORT build when onnxruntime_PYBIND_EXPORT_OPSCHEMA is enabled (#3954 )	2020-06-12 19:32:57 -07:00
Changming Sun	6f4320fb85	Fix the python package name issue (#4207 ) Fix the package name issue. In my last change (#4197) about enabling code sign, I forgot to pass the additional flags to setup.py.	2020-06-12 08:32:59 -07:00
Changming Sun	8f8d899bf2	Enable code sign in c api pipeline and python pipeline	2020-06-10 19:31:22 -07:00
Yulong Wang	73bc6be5d1	build: split nodejs binding build and test to avoid timeout issue (#4188 ) * split nodejs binding build and test * enable nodejs tests	2020-06-10 19:16:32 -07:00
Dmitri Smirnov	af0750ba1b	Java GPU artifact naming (#4179 ) Modify gradle build so artifactID has _gpu for GPU builds. Pass USE_CUDA flag on CUDA build Adjust publishing pipelines to extract POM from a correct path. Co-Authored-By: @Craigacp	2020-06-10 11:15:48 -07:00
Changming Sun	c0bdbc0b39	Enable telemetry for the C API and python pipeline (#4174 )	2020-06-10 00:07:46 -07:00
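
Note on commit 42408aa3ed: its message describes validating a user-supplied model description dict and replacing its tuples with namedtuples, so that every output carries an is_loss field and a dtype can be attached later. The following is only a minimal sketch of that idea, not the ORTTrainer implementation. The names InputDesc, OutputDesc, normalize_model_desc and add_dtype are hypothetical, and the default is_loss=False for two-element output tuples is an assumption.

```python
from collections import namedtuple

# Hypothetical type names: the commit message does not name the internal classes.
InputDesc = namedtuple("InputDesc", ["name", "shape"])
OutputDesc = namedtuple("OutputDesc", ["name", "shape", "is_loss"])


def normalize_model_desc(model_desc):
    """Validate a {'inputs': [...], 'outputs': [...]} dict and normalize its tuples."""
    if not isinstance(model_desc, dict):
        raise TypeError("model_desc must be a dict")
    for key in ("inputs", "outputs"):
        if not isinstance(model_desc.get(key), list):
            raise ValueError(f"model_desc[{key!r}] must be a list of tuples")

    inputs = []
    for entry in model_desc["inputs"]:
        if len(entry) != 2:
            raise ValueError("each input must be (name, shape)")
        inputs.append(InputDesc(*entry))

    outputs = []
    for entry in model_desc["outputs"]:
        if len(entry) == 2:
            name, shape = entry
            is_loss = False  # assumption: a missing is_loss means "not a loss"
        elif len(entry) == 3:
            name, shape, is_loss = entry
        else:
            raise ValueError("each output must be (name, shape[, is_loss])")
        outputs.append(OutputDesc(name, shape, bool(is_loss)))

    return {"inputs": inputs, "outputs": outputs}


def add_dtype(desc, dtype):
    """Return a new namedtuple equal to desc with a trailing dtype field."""
    typed = namedtuple(type(desc).__name__ + "Typed", desc._fields + ("dtype",))
    return typed(*desc, dtype)


if __name__ == "__main__":
    desc = normalize_model_desc({
        "inputs": [("x", [1, 2])],
        "outputs": [("loss", [], True), ("logits", [1, 3])],
    })
    print(desc["outputs"])
    print(add_dtype(desc["inputs"][0], "float32"))
```

Returning new namedtuples rather than mutating the user's dict leaves the caller's input untouched, which matches the message's statement that the public API stays intact while the internal representation changes.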


466 commits