onnxruntime/docs/ONNX_Runtime_Server_Usage.md

110 lines
5.2 KiB
Markdown
Raw Normal View History

<h1><span style="color:red">Note: ONNX Runtime Server has been deprecated.</span></h1>
Add an HTTP server for hosting of ONNX models (#806) * Simple integration into CMake build system * Adds vcpkg as a submodule and updates build.py to install hosting dependencies * Don't create vcpkg executable if already created * Fixes how CMake finds toolchain file and quick changes to build.py * Removes setting the CMAKE_TOOLCHAIN_FILE in build.py * Adds Boost Beast echo server and Boost program_options * Fixes spacing problem with program_options * Adds Microsoft headers to all the beast server headers * Removes CXX 14 from CMake file * Adds TODO to create configuration class * Run clang-format on main * Better exception handling of program_options * Remove vckpg submodule via ssh * Add vcpkg as https * Adds onnxruntime namespace to call classes * Fixed places where namespaces were anonymous * Adds a TODO to use the logger * Moves all setting namespace shortnames outside of onnxruntime namespace * Add onnxruntime session options to force app to link with it * Set CMAKE_TOOLCHAIN_FILE in build.py * Remove whitespace * Adds initial ONNX Hosting tests (#5) * Add initial test which is failing linking with no main * Adds test_main to get hosting tests working * Deletes useless add_executable line * Merge changes from upstream * Enable CI build in Vienna environment * make hosting_run*.sh executable * Add boost path in unittest * Add boost to TEST_INC_DIR * Add component detection task in ci yaml * Get tests and hosting to compile with re2 (#7) * Add finding boost packages before using it in unit tests * Add predict.proto and build * Ignore unused parameters in generated code * Removes std::regex in favor of re2 (#8) * Removes std::regex in favor of re2 * Adds back find_package in unit tests and fixes regexes * Adds more negative test cases * Adding more protos * Fix google protobuf file path in the cmake file * Ignore unused parameters for pb generated code * Updates onnx submodule (#10) * Remove duplicated lib in link * Follow Google style guide (#11) * Google style names * Adds more * Adds an additional namespace * Fixes header guards to match filepaths * Consume protobuf * Unit Test setup * Json deserialization simple test cases * Split hosting app to lib and exe for testability * Add more cases * Clean up * Add more comments * Update namespace and format the cmake files * Update cmake/external/onnx to checkout 1ec81bc6d49ccae23cd7801515feaadd13082903 * Separate h and cc in http folder * Clean up hosting application cmake file * Enable logging and proper initialize the session * Update const position for GetSession() * Take latest onnx and onnx-tensorrt * Creates configuration header file for program_options (#15) * Sets up PredictRequest callback (#16) * Init version, porting from prototype, e2e works * More executor implementation * Adds function on application startup (#17) * Attempts to pass HostingEnvironment as a shared_ptr * Removes logging and environment from all http classes * Passes http details to OnStart function * Using full protobuf for hosting app build * MLValue2TensorProto * Revert back changes in inference_session.cc * Refactor logger access and predict handler * Create an error handling callback (#19) * Creates error callback * Logs error and returns back as JSON * Catches exceptions in user functions * Refactor executor and add some test cases * Fix build warning * Add onnx as a dependency and in includes to hosting app (#20) * Converter for specific types and more UTs * More unit tests * Update onnx submodule * Fix string data test * Clean up code * Cleanup code * Refactor logging to use unique id per request and take logging level from user (#21) * Removes capturing env by reference in main * Uses uuid for logging ids * Take logging_level as a program argument * Pass logging_level to default_logging_manager * Change name of logger to HostingApp * Log if request id is null * Update GetHttpStatusCode signature * Fix random result issue and camel-case names * Rollback accidentally changed pybin_state.cc * Rollback pybind_state.cc * Generate protobuf status from onnxruntime status * Fix function name in error message * Clean up comments * Support protobuf byte array as input * Refactor predict handler and add unit tests * Add one more test * update cmake/external/onnx * Accept more protobuf MIME types * Update onnx-tensorrt * Add build instruction and usage doc * Address PR comments * Install g++-7 in the Ubuntu 16.04 build image for vcpkg * Fix onnx-tensorrt version * Check return value during initialization * Fix infinite loop when http port is in use (#29) * Simplify Executor.cc by breaking up Run method (#27) * Move request id to Executor constructor * Refactor the logger to respect user verbosity level * Use Arena allocator instead of device * Creates initial executor tests * Merge upstream master (#31) * Remove all possible shared_ptrs (#30) * Changes GetLogger to unique_ptr * Reserve BFloat raw data vector size * Change HostingEnvironment to being passed by lvalue and rvalue references * Change routes to getting passed by const references * Enable full protobuf if building hosting (#32) * Building hosting application no longer needs use_full_protobuf flag * Improve hosting application docs * Move server core into separate folder (#34) * Turn hosting project off by default (#38) * Remove vcpkg as a submodule and download/install Boost from source (#39) * Remove vcpkg * Use CMake script to download and build Boost as part of the project * Remove std::move for const references * Remove error_code.proto * Change wording of executable help description * Better GenerateProtobufStatus description * Remove error_code protobuf from CMake files * Use all outputs if no filter is given * Pass MLValue by const reference in MLValueToTensorProto * Rename variables to argc and argv * Revert "Use all outputs if no filter is given" This reverts commit 7554190ab8e50ba6947648c2f3e2a3d4d9606ce0. * Remove all header guards in favor of #pragma once * Reserve size for output vector and optimize for-loop * Use static libs by default for Boost * Improves documentation for GenerateResponseInJson function * Start Result enum at 0 instead of 1 * Remove g++ from Ubuntu's install.sh * Update cmake files * Give explanation for Result enum type * Remove all program options shortcuts except for -h * Add comments for predict.proto * Fix JSON for error codes * Add notice on hosting application docs that it's in beta * Change HostingEnvironment back to a shared_ptr * Handle empty output_filter field * Fix build break * Refactor unit tests location and groups * First end-to-end test * Add missing log * Missing req id and client req id in error response * Add one test case to validate failed resp header * Add build flag for hosting app end to end tests * Update pipeline setup to run e2e test for CI build * Model Zoo data preparation and tests * Add protobuf tests * Remove mention of needing g++-7 in BUILD.md * Make GetAppLogger const * Make using_raw_data_ match the styling of other fields * Avoid copy of strings when initializing model * Escape JSON strings correctly for error messages (#44) * Escape JSON strings correctly * Add test examples with lots of carriage returns * Add result validation * Remove temporary path * Optimize model zoo test execution * Improve reliability of test cases * Generate _pb2.py during the build time * README for integration tests * Pass environment by pointer instead of shared_ptr to executor (#49) * More Integration tests * Remove generated files * Make session private and use a getter instead (#53) * logging_level to log_level for CLI * Single model prediction shortcut * Health endpoint * Integration tests * Rename to onnxruntime server * Build ONNX Server application on Windows (#57) * Gets Boost compiling on Windows * Fix integer conversion and comparison problems * Use size_t in converter_tests instead of int * Fix hosting integration tests on Windows * Removes checks for port because it's an unsigned short * Fixes comparison between signed and unsigned data types * Pip install protobuf and numpy * Missing test data from the rename change * Fix server app path (#58) * Pass shared_ptr by const reference to avoid ref count increase (#59) * Download test model during test setup * Make download into test_util * Rename ci pipeline for onnx runtime server * Support up to 10MiB http request (#61) * Changes minimum request size to 10MB to support all models in ONNX Model Zoo
2019-05-01 01:21:23 +00:00
# How to Use build ONNX Runtime Server for Prediction
ONNX Runtime Server provides an easy way to start an inferencing server for prediction with both HTTP and GRPC endpoints.
The CLI command to build the server is
Default CPU:
```
python3 /onnxruntime/tools/ci_build/build.py --build_dir /onnxruntime/build --config Release --build_server --parallel --cmake_extra_defines ONNXRUNTIME_VERSION=$(cat ./VERSION_NUMBER
```
# How to Use ONNX Runtime Server for Prediction
Add an HTTP server for hosting of ONNX models (#806) * Simple integration into CMake build system * Adds vcpkg as a submodule and updates build.py to install hosting dependencies * Don't create vcpkg executable if already created * Fixes how CMake finds toolchain file and quick changes to build.py * Removes setting the CMAKE_TOOLCHAIN_FILE in build.py * Adds Boost Beast echo server and Boost program_options * Fixes spacing problem with program_options * Adds Microsoft headers to all the beast server headers * Removes CXX 14 from CMake file * Adds TODO to create configuration class * Run clang-format on main * Better exception handling of program_options * Remove vckpg submodule via ssh * Add vcpkg as https * Adds onnxruntime namespace to call classes * Fixed places where namespaces were anonymous * Adds a TODO to use the logger * Moves all setting namespace shortnames outside of onnxruntime namespace * Add onnxruntime session options to force app to link with it * Set CMAKE_TOOLCHAIN_FILE in build.py * Remove whitespace * Adds initial ONNX Hosting tests (#5) * Add initial test which is failing linking with no main * Adds test_main to get hosting tests working * Deletes useless add_executable line * Merge changes from upstream * Enable CI build in Vienna environment * make hosting_run*.sh executable * Add boost path in unittest * Add boost to TEST_INC_DIR * Add component detection task in ci yaml * Get tests and hosting to compile with re2 (#7) * Add finding boost packages before using it in unit tests * Add predict.proto and build * Ignore unused parameters in generated code * Removes std::regex in favor of re2 (#8) * Removes std::regex in favor of re2 * Adds back find_package in unit tests and fixes regexes * Adds more negative test cases * Adding more protos * Fix google protobuf file path in the cmake file * Ignore unused parameters for pb generated code * Updates onnx submodule (#10) * Remove duplicated lib in link * Follow Google style guide (#11) * Google style names * Adds more * Adds an additional namespace * Fixes header guards to match filepaths * Consume protobuf * Unit Test setup * Json deserialization simple test cases * Split hosting app to lib and exe for testability * Add more cases * Clean up * Add more comments * Update namespace and format the cmake files * Update cmake/external/onnx to checkout 1ec81bc6d49ccae23cd7801515feaadd13082903 * Separate h and cc in http folder * Clean up hosting application cmake file * Enable logging and proper initialize the session * Update const position for GetSession() * Take latest onnx and onnx-tensorrt * Creates configuration header file for program_options (#15) * Sets up PredictRequest callback (#16) * Init version, porting from prototype, e2e works * More executor implementation * Adds function on application startup (#17) * Attempts to pass HostingEnvironment as a shared_ptr * Removes logging and environment from all http classes * Passes http details to OnStart function * Using full protobuf for hosting app build * MLValue2TensorProto * Revert back changes in inference_session.cc * Refactor logger access and predict handler * Create an error handling callback (#19) * Creates error callback * Logs error and returns back as JSON * Catches exceptions in user functions * Refactor executor and add some test cases * Fix build warning * Add onnx as a dependency and in includes to hosting app (#20) * Converter for specific types and more UTs * More unit tests * Update onnx submodule * Fix string data test * Clean up code * Cleanup code * Refactor logging to use unique id per request and take logging level from user (#21) * Removes capturing env by reference in main * Uses uuid for logging ids * Take logging_level as a program argument * Pass logging_level to default_logging_manager * Change name of logger to HostingApp * Log if request id is null * Update GetHttpStatusCode signature * Fix random result issue and camel-case names * Rollback accidentally changed pybin_state.cc * Rollback pybind_state.cc * Generate protobuf status from onnxruntime status * Fix function name in error message * Clean up comments * Support protobuf byte array as input * Refactor predict handler and add unit tests * Add one more test * update cmake/external/onnx * Accept more protobuf MIME types * Update onnx-tensorrt * Add build instruction and usage doc * Address PR comments * Install g++-7 in the Ubuntu 16.04 build image for vcpkg * Fix onnx-tensorrt version * Check return value during initialization * Fix infinite loop when http port is in use (#29) * Simplify Executor.cc by breaking up Run method (#27) * Move request id to Executor constructor * Refactor the logger to respect user verbosity level * Use Arena allocator instead of device * Creates initial executor tests * Merge upstream master (#31) * Remove all possible shared_ptrs (#30) * Changes GetLogger to unique_ptr * Reserve BFloat raw data vector size * Change HostingEnvironment to being passed by lvalue and rvalue references * Change routes to getting passed by const references * Enable full protobuf if building hosting (#32) * Building hosting application no longer needs use_full_protobuf flag * Improve hosting application docs * Move server core into separate folder (#34) * Turn hosting project off by default (#38) * Remove vcpkg as a submodule and download/install Boost from source (#39) * Remove vcpkg * Use CMake script to download and build Boost as part of the project * Remove std::move for const references * Remove error_code.proto * Change wording of executable help description * Better GenerateProtobufStatus description * Remove error_code protobuf from CMake files * Use all outputs if no filter is given * Pass MLValue by const reference in MLValueToTensorProto * Rename variables to argc and argv * Revert "Use all outputs if no filter is given" This reverts commit 7554190ab8e50ba6947648c2f3e2a3d4d9606ce0. * Remove all header guards in favor of #pragma once * Reserve size for output vector and optimize for-loop * Use static libs by default for Boost * Improves documentation for GenerateResponseInJson function * Start Result enum at 0 instead of 1 * Remove g++ from Ubuntu's install.sh * Update cmake files * Give explanation for Result enum type * Remove all program options shortcuts except for -h * Add comments for predict.proto * Fix JSON for error codes * Add notice on hosting application docs that it's in beta * Change HostingEnvironment back to a shared_ptr * Handle empty output_filter field * Fix build break * Refactor unit tests location and groups * First end-to-end test * Add missing log * Missing req id and client req id in error response * Add one test case to validate failed resp header * Add build flag for hosting app end to end tests * Update pipeline setup to run e2e test for CI build * Model Zoo data preparation and tests * Add protobuf tests * Remove mention of needing g++-7 in BUILD.md * Make GetAppLogger const * Make using_raw_data_ match the styling of other fields * Avoid copy of strings when initializing model * Escape JSON strings correctly for error messages (#44) * Escape JSON strings correctly * Add test examples with lots of carriage returns * Add result validation * Remove temporary path * Optimize model zoo test execution * Improve reliability of test cases * Generate _pb2.py during the build time * README for integration tests * Pass environment by pointer instead of shared_ptr to executor (#49) * More Integration tests * Remove generated files * Make session private and use a getter instead (#53) * logging_level to log_level for CLI * Single model prediction shortcut * Health endpoint * Integration tests * Rename to onnxruntime server * Build ONNX Server application on Windows (#57) * Gets Boost compiling on Windows * Fix integer conversion and comparison problems * Use size_t in converter_tests instead of int * Fix hosting integration tests on Windows * Removes checks for port because it's an unsigned short * Fixes comparison between signed and unsigned data types * Pip install protobuf and numpy * Missing test data from the rename change * Fix server app path (#58) * Pass shared_ptr by const reference to avoid ref count increase (#59) * Download test model during test setup * Make download into test_util * Rename ci pipeline for onnx runtime server * Support up to 10MiB http request (#61) * Changes minimum request size to 10MB to support all models in ONNX Model Zoo
2019-05-01 01:21:23 +00:00
The CLI command to start the server is shown below:
Add an HTTP server for hosting of ONNX models (#806) * Simple integration into CMake build system * Adds vcpkg as a submodule and updates build.py to install hosting dependencies * Don't create vcpkg executable if already created * Fixes how CMake finds toolchain file and quick changes to build.py * Removes setting the CMAKE_TOOLCHAIN_FILE in build.py * Adds Boost Beast echo server and Boost program_options * Fixes spacing problem with program_options * Adds Microsoft headers to all the beast server headers * Removes CXX 14 from CMake file * Adds TODO to create configuration class * Run clang-format on main * Better exception handling of program_options * Remove vckpg submodule via ssh * Add vcpkg as https * Adds onnxruntime namespace to call classes * Fixed places where namespaces were anonymous * Adds a TODO to use the logger * Moves all setting namespace shortnames outside of onnxruntime namespace * Add onnxruntime session options to force app to link with it * Set CMAKE_TOOLCHAIN_FILE in build.py * Remove whitespace * Adds initial ONNX Hosting tests (#5) * Add initial test which is failing linking with no main * Adds test_main to get hosting tests working * Deletes useless add_executable line * Merge changes from upstream * Enable CI build in Vienna environment * make hosting_run*.sh executable * Add boost path in unittest * Add boost to TEST_INC_DIR * Add component detection task in ci yaml * Get tests and hosting to compile with re2 (#7) * Add finding boost packages before using it in unit tests * Add predict.proto and build * Ignore unused parameters in generated code * Removes std::regex in favor of re2 (#8) * Removes std::regex in favor of re2 * Adds back find_package in unit tests and fixes regexes * Adds more negative test cases * Adding more protos * Fix google protobuf file path in the cmake file * Ignore unused parameters for pb generated code * Updates onnx submodule (#10) * Remove duplicated lib in link * Follow Google style guide (#11) * Google style names * Adds more * Adds an additional namespace * Fixes header guards to match filepaths * Consume protobuf * Unit Test setup * Json deserialization simple test cases * Split hosting app to lib and exe for testability * Add more cases * Clean up * Add more comments * Update namespace and format the cmake files * Update cmake/external/onnx to checkout 1ec81bc6d49ccae23cd7801515feaadd13082903 * Separate h and cc in http folder * Clean up hosting application cmake file * Enable logging and proper initialize the session * Update const position for GetSession() * Take latest onnx and onnx-tensorrt * Creates configuration header file for program_options (#15) * Sets up PredictRequest callback (#16) * Init version, porting from prototype, e2e works * More executor implementation * Adds function on application startup (#17) * Attempts to pass HostingEnvironment as a shared_ptr * Removes logging and environment from all http classes * Passes http details to OnStart function * Using full protobuf for hosting app build * MLValue2TensorProto * Revert back changes in inference_session.cc * Refactor logger access and predict handler * Create an error handling callback (#19) * Creates error callback * Logs error and returns back as JSON * Catches exceptions in user functions * Refactor executor and add some test cases * Fix build warning * Add onnx as a dependency and in includes to hosting app (#20) * Converter for specific types and more UTs * More unit tests * Update onnx submodule * Fix string data test * Clean up code * Cleanup code * Refactor logging to use unique id per request and take logging level from user (#21) * Removes capturing env by reference in main * Uses uuid for logging ids * Take logging_level as a program argument * Pass logging_level to default_logging_manager * Change name of logger to HostingApp * Log if request id is null * Update GetHttpStatusCode signature * Fix random result issue and camel-case names * Rollback accidentally changed pybin_state.cc * Rollback pybind_state.cc * Generate protobuf status from onnxruntime status * Fix function name in error message * Clean up comments * Support protobuf byte array as input * Refactor predict handler and add unit tests * Add one more test * update cmake/external/onnx * Accept more protobuf MIME types * Update onnx-tensorrt * Add build instruction and usage doc * Address PR comments * Install g++-7 in the Ubuntu 16.04 build image for vcpkg * Fix onnx-tensorrt version * Check return value during initialization * Fix infinite loop when http port is in use (#29) * Simplify Executor.cc by breaking up Run method (#27) * Move request id to Executor constructor * Refactor the logger to respect user verbosity level * Use Arena allocator instead of device * Creates initial executor tests * Merge upstream master (#31) * Remove all possible shared_ptrs (#30) * Changes GetLogger to unique_ptr * Reserve BFloat raw data vector size * Change HostingEnvironment to being passed by lvalue and rvalue references * Change routes to getting passed by const references * Enable full protobuf if building hosting (#32) * Building hosting application no longer needs use_full_protobuf flag * Improve hosting application docs * Move server core into separate folder (#34) * Turn hosting project off by default (#38) * Remove vcpkg as a submodule and download/install Boost from source (#39) * Remove vcpkg * Use CMake script to download and build Boost as part of the project * Remove std::move for const references * Remove error_code.proto * Change wording of executable help description * Better GenerateProtobufStatus description * Remove error_code protobuf from CMake files * Use all outputs if no filter is given * Pass MLValue by const reference in MLValueToTensorProto * Rename variables to argc and argv * Revert "Use all outputs if no filter is given" This reverts commit 7554190ab8e50ba6947648c2f3e2a3d4d9606ce0. * Remove all header guards in favor of #pragma once * Reserve size for output vector and optimize for-loop * Use static libs by default for Boost * Improves documentation for GenerateResponseInJson function * Start Result enum at 0 instead of 1 * Remove g++ from Ubuntu's install.sh * Update cmake files * Give explanation for Result enum type * Remove all program options shortcuts except for -h * Add comments for predict.proto * Fix JSON for error codes * Add notice on hosting application docs that it's in beta * Change HostingEnvironment back to a shared_ptr * Handle empty output_filter field * Fix build break * Refactor unit tests location and groups * First end-to-end test * Add missing log * Missing req id and client req id in error response * Add one test case to validate failed resp header * Add build flag for hosting app end to end tests * Update pipeline setup to run e2e test for CI build * Model Zoo data preparation and tests * Add protobuf tests * Remove mention of needing g++-7 in BUILD.md * Make GetAppLogger const * Make using_raw_data_ match the styling of other fields * Avoid copy of strings when initializing model * Escape JSON strings correctly for error messages (#44) * Escape JSON strings correctly * Add test examples with lots of carriage returns * Add result validation * Remove temporary path * Optimize model zoo test execution * Improve reliability of test cases * Generate _pb2.py during the build time * README for integration tests * Pass environment by pointer instead of shared_ptr to executor (#49) * More Integration tests * Remove generated files * Make session private and use a getter instead (#53) * logging_level to log_level for CLI * Single model prediction shortcut * Health endpoint * Integration tests * Rename to onnxruntime server * Build ONNX Server application on Windows (#57) * Gets Boost compiling on Windows * Fix integer conversion and comparison problems * Use size_t in converter_tests instead of int * Fix hosting integration tests on Windows * Removes checks for port because it's an unsigned short * Fixes comparison between signed and unsigned data types * Pip install protobuf and numpy * Missing test data from the rename change * Fix server app path (#58) * Pass shared_ptr by const reference to avoid ref count increase (#59) * Download test model during test setup * Make download into test_util * Rename ci pipeline for onnx runtime server * Support up to 10MiB http request (#61) * Changes minimum request size to 10MB to support all models in ONNX Model Zoo
2019-05-01 01:21:23 +00:00
```
$ ./onnxruntime_server
Version: <Build number>
Commit ID: <The latest commit ID>
Add an HTTP server for hosting of ONNX models (#806) * Simple integration into CMake build system * Adds vcpkg as a submodule and updates build.py to install hosting dependencies * Don't create vcpkg executable if already created * Fixes how CMake finds toolchain file and quick changes to build.py * Removes setting the CMAKE_TOOLCHAIN_FILE in build.py * Adds Boost Beast echo server and Boost program_options * Fixes spacing problem with program_options * Adds Microsoft headers to all the beast server headers * Removes CXX 14 from CMake file * Adds TODO to create configuration class * Run clang-format on main * Better exception handling of program_options * Remove vckpg submodule via ssh * Add vcpkg as https * Adds onnxruntime namespace to call classes * Fixed places where namespaces were anonymous * Adds a TODO to use the logger * Moves all setting namespace shortnames outside of onnxruntime namespace * Add onnxruntime session options to force app to link with it * Set CMAKE_TOOLCHAIN_FILE in build.py * Remove whitespace * Adds initial ONNX Hosting tests (#5) * Add initial test which is failing linking with no main * Adds test_main to get hosting tests working * Deletes useless add_executable line * Merge changes from upstream * Enable CI build in Vienna environment * make hosting_run*.sh executable * Add boost path in unittest * Add boost to TEST_INC_DIR * Add component detection task in ci yaml * Get tests and hosting to compile with re2 (#7) * Add finding boost packages before using it in unit tests * Add predict.proto and build * Ignore unused parameters in generated code * Removes std::regex in favor of re2 (#8) * Removes std::regex in favor of re2 * Adds back find_package in unit tests and fixes regexes * Adds more negative test cases * Adding more protos * Fix google protobuf file path in the cmake file * Ignore unused parameters for pb generated code * Updates onnx submodule (#10) * Remove duplicated lib in link * Follow Google style guide (#11) * Google style names * Adds more * Adds an additional namespace * Fixes header guards to match filepaths * Consume protobuf * Unit Test setup * Json deserialization simple test cases * Split hosting app to lib and exe for testability * Add more cases * Clean up * Add more comments * Update namespace and format the cmake files * Update cmake/external/onnx to checkout 1ec81bc6d49ccae23cd7801515feaadd13082903 * Separate h and cc in http folder * Clean up hosting application cmake file * Enable logging and proper initialize the session * Update const position for GetSession() * Take latest onnx and onnx-tensorrt * Creates configuration header file for program_options (#15) * Sets up PredictRequest callback (#16) * Init version, porting from prototype, e2e works * More executor implementation * Adds function on application startup (#17) * Attempts to pass HostingEnvironment as a shared_ptr * Removes logging and environment from all http classes * Passes http details to OnStart function * Using full protobuf for hosting app build * MLValue2TensorProto * Revert back changes in inference_session.cc * Refactor logger access and predict handler * Create an error handling callback (#19) * Creates error callback * Logs error and returns back as JSON * Catches exceptions in user functions * Refactor executor and add some test cases * Fix build warning * Add onnx as a dependency and in includes to hosting app (#20) * Converter for specific types and more UTs * More unit tests * Update onnx submodule * Fix string data test * Clean up code * Cleanup code * Refactor logging to use unique id per request and take logging level from user (#21) * Removes capturing env by reference in main * Uses uuid for logging ids * Take logging_level as a program argument * Pass logging_level to default_logging_manager * Change name of logger to HostingApp * Log if request id is null * Update GetHttpStatusCode signature * Fix random result issue and camel-case names * Rollback accidentally changed pybin_state.cc * Rollback pybind_state.cc * Generate protobuf status from onnxruntime status * Fix function name in error message * Clean up comments * Support protobuf byte array as input * Refactor predict handler and add unit tests * Add one more test * update cmake/external/onnx * Accept more protobuf MIME types * Update onnx-tensorrt * Add build instruction and usage doc * Address PR comments * Install g++-7 in the Ubuntu 16.04 build image for vcpkg * Fix onnx-tensorrt version * Check return value during initialization * Fix infinite loop when http port is in use (#29) * Simplify Executor.cc by breaking up Run method (#27) * Move request id to Executor constructor * Refactor the logger to respect user verbosity level * Use Arena allocator instead of device * Creates initial executor tests * Merge upstream master (#31) * Remove all possible shared_ptrs (#30) * Changes GetLogger to unique_ptr * Reserve BFloat raw data vector size * Change HostingEnvironment to being passed by lvalue and rvalue references * Change routes to getting passed by const references * Enable full protobuf if building hosting (#32) * Building hosting application no longer needs use_full_protobuf flag * Improve hosting application docs * Move server core into separate folder (#34) * Turn hosting project off by default (#38) * Remove vcpkg as a submodule and download/install Boost from source (#39) * Remove vcpkg * Use CMake script to download and build Boost as part of the project * Remove std::move for const references * Remove error_code.proto * Change wording of executable help description * Better GenerateProtobufStatus description * Remove error_code protobuf from CMake files * Use all outputs if no filter is given * Pass MLValue by const reference in MLValueToTensorProto * Rename variables to argc and argv * Revert "Use all outputs if no filter is given" This reverts commit 7554190ab8e50ba6947648c2f3e2a3d4d9606ce0. * Remove all header guards in favor of #pragma once * Reserve size for output vector and optimize for-loop * Use static libs by default for Boost * Improves documentation for GenerateResponseInJson function * Start Result enum at 0 instead of 1 * Remove g++ from Ubuntu's install.sh * Update cmake files * Give explanation for Result enum type * Remove all program options shortcuts except for -h * Add comments for predict.proto * Fix JSON for error codes * Add notice on hosting application docs that it's in beta * Change HostingEnvironment back to a shared_ptr * Handle empty output_filter field * Fix build break * Refactor unit tests location and groups * First end-to-end test * Add missing log * Missing req id and client req id in error response * Add one test case to validate failed resp header * Add build flag for hosting app end to end tests * Update pipeline setup to run e2e test for CI build * Model Zoo data preparation and tests * Add protobuf tests * Remove mention of needing g++-7 in BUILD.md * Make GetAppLogger const * Make using_raw_data_ match the styling of other fields * Avoid copy of strings when initializing model * Escape JSON strings correctly for error messages (#44) * Escape JSON strings correctly * Add test examples with lots of carriage returns * Add result validation * Remove temporary path * Optimize model zoo test execution * Improve reliability of test cases * Generate _pb2.py during the build time * README for integration tests * Pass environment by pointer instead of shared_ptr to executor (#49) * More Integration tests * Remove generated files * Make session private and use a getter instead (#53) * logging_level to log_level for CLI * Single model prediction shortcut * Health endpoint * Integration tests * Rename to onnxruntime server * Build ONNX Server application on Windows (#57) * Gets Boost compiling on Windows * Fix integer conversion and comparison problems * Use size_t in converter_tests instead of int * Fix hosting integration tests on Windows * Removes checks for port because it's an unsigned short * Fixes comparison between signed and unsigned data types * Pip install protobuf and numpy * Missing test data from the rename change * Fix server app path (#58) * Pass shared_ptr by const reference to avoid ref count increase (#59) * Download test model during test setup * Make download into test_util * Rename ci pipeline for onnx runtime server * Support up to 10MiB http request (#61) * Changes minimum request size to 10MB to support all models in ONNX Model Zoo
2019-05-01 01:21:23 +00:00
the option '--model_path' is required but missing
Allowed options:
-h [ --help ] Shows a help message and exits
--log_level arg (=info) Logging level. Allowed options (case sensitive):
verbose, info, warning, error, fatal
--model_path arg Path to ONNX model
--address arg (=0.0.0.0) The base HTTP address
--http_port arg (=8001) HTTP port to listen to requests
--num_http_threads arg (=<# of your cpu cores>) Number of http threads
--grpc_port arg (=50051) GRPC port to listen to requests
Add an HTTP server for hosting of ONNX models (#806) * Simple integration into CMake build system * Adds vcpkg as a submodule and updates build.py to install hosting dependencies * Don't create vcpkg executable if already created * Fixes how CMake finds toolchain file and quick changes to build.py * Removes setting the CMAKE_TOOLCHAIN_FILE in build.py * Adds Boost Beast echo server and Boost program_options * Fixes spacing problem with program_options * Adds Microsoft headers to all the beast server headers * Removes CXX 14 from CMake file * Adds TODO to create configuration class * Run clang-format on main * Better exception handling of program_options * Remove vckpg submodule via ssh * Add vcpkg as https * Adds onnxruntime namespace to call classes * Fixed places where namespaces were anonymous * Adds a TODO to use the logger * Moves all setting namespace shortnames outside of onnxruntime namespace * Add onnxruntime session options to force app to link with it * Set CMAKE_TOOLCHAIN_FILE in build.py * Remove whitespace * Adds initial ONNX Hosting tests (#5) * Add initial test which is failing linking with no main * Adds test_main to get hosting tests working * Deletes useless add_executable line * Merge changes from upstream * Enable CI build in Vienna environment * make hosting_run*.sh executable * Add boost path in unittest * Add boost to TEST_INC_DIR * Add component detection task in ci yaml * Get tests and hosting to compile with re2 (#7) * Add finding boost packages before using it in unit tests * Add predict.proto and build * Ignore unused parameters in generated code * Removes std::regex in favor of re2 (#8) * Removes std::regex in favor of re2 * Adds back find_package in unit tests and fixes regexes * Adds more negative test cases * Adding more protos * Fix google protobuf file path in the cmake file * Ignore unused parameters for pb generated code * Updates onnx submodule (#10) * Remove duplicated lib in link * Follow Google style guide (#11) * Google style names * Adds more * Adds an additional namespace * Fixes header guards to match filepaths * Consume protobuf * Unit Test setup * Json deserialization simple test cases * Split hosting app to lib and exe for testability * Add more cases * Clean up * Add more comments * Update namespace and format the cmake files * Update cmake/external/onnx to checkout 1ec81bc6d49ccae23cd7801515feaadd13082903 * Separate h and cc in http folder * Clean up hosting application cmake file * Enable logging and proper initialize the session * Update const position for GetSession() * Take latest onnx and onnx-tensorrt * Creates configuration header file for program_options (#15) * Sets up PredictRequest callback (#16) * Init version, porting from prototype, e2e works * More executor implementation * Adds function on application startup (#17) * Attempts to pass HostingEnvironment as a shared_ptr * Removes logging and environment from all http classes * Passes http details to OnStart function * Using full protobuf for hosting app build * MLValue2TensorProto * Revert back changes in inference_session.cc * Refactor logger access and predict handler * Create an error handling callback (#19) * Creates error callback * Logs error and returns back as JSON * Catches exceptions in user functions * Refactor executor and add some test cases * Fix build warning * Add onnx as a dependency and in includes to hosting app (#20) * Converter for specific types and more UTs * More unit tests * Update onnx submodule * Fix string data test * Clean up code * Cleanup code * Refactor logging to use unique id per request and take logging level from user (#21) * Removes capturing env by reference in main * Uses uuid for logging ids * Take logging_level as a program argument * Pass logging_level to default_logging_manager * Change name of logger to HostingApp * Log if request id is null * Update GetHttpStatusCode signature * Fix random result issue and camel-case names * Rollback accidentally changed pybin_state.cc * Rollback pybind_state.cc * Generate protobuf status from onnxruntime status * Fix function name in error message * Clean up comments * Support protobuf byte array as input * Refactor predict handler and add unit tests * Add one more test * update cmake/external/onnx * Accept more protobuf MIME types * Update onnx-tensorrt * Add build instruction and usage doc * Address PR comments * Install g++-7 in the Ubuntu 16.04 build image for vcpkg * Fix onnx-tensorrt version * Check return value during initialization * Fix infinite loop when http port is in use (#29) * Simplify Executor.cc by breaking up Run method (#27) * Move request id to Executor constructor * Refactor the logger to respect user verbosity level * Use Arena allocator instead of device * Creates initial executor tests * Merge upstream master (#31) * Remove all possible shared_ptrs (#30) * Changes GetLogger to unique_ptr * Reserve BFloat raw data vector size * Change HostingEnvironment to being passed by lvalue and rvalue references * Change routes to getting passed by const references * Enable full protobuf if building hosting (#32) * Building hosting application no longer needs use_full_protobuf flag * Improve hosting application docs * Move server core into separate folder (#34) * Turn hosting project off by default (#38) * Remove vcpkg as a submodule and download/install Boost from source (#39) * Remove vcpkg * Use CMake script to download and build Boost as part of the project * Remove std::move for const references * Remove error_code.proto * Change wording of executable help description * Better GenerateProtobufStatus description * Remove error_code protobuf from CMake files * Use all outputs if no filter is given * Pass MLValue by const reference in MLValueToTensorProto * Rename variables to argc and argv * Revert "Use all outputs if no filter is given" This reverts commit 7554190ab8e50ba6947648c2f3e2a3d4d9606ce0. * Remove all header guards in favor of #pragma once * Reserve size for output vector and optimize for-loop * Use static libs by default for Boost * Improves documentation for GenerateResponseInJson function * Start Result enum at 0 instead of 1 * Remove g++ from Ubuntu's install.sh * Update cmake files * Give explanation for Result enum type * Remove all program options shortcuts except for -h * Add comments for predict.proto * Fix JSON for error codes * Add notice on hosting application docs that it's in beta * Change HostingEnvironment back to a shared_ptr * Handle empty output_filter field * Fix build break * Refactor unit tests location and groups * First end-to-end test * Add missing log * Missing req id and client req id in error response * Add one test case to validate failed resp header * Add build flag for hosting app end to end tests * Update pipeline setup to run e2e test for CI build * Model Zoo data preparation and tests * Add protobuf tests * Remove mention of needing g++-7 in BUILD.md * Make GetAppLogger const * Make using_raw_data_ match the styling of other fields * Avoid copy of strings when initializing model * Escape JSON strings correctly for error messages (#44) * Escape JSON strings correctly * Add test examples with lots of carriage returns * Add result validation * Remove temporary path * Optimize model zoo test execution * Improve reliability of test cases * Generate _pb2.py during the build time * README for integration tests * Pass environment by pointer instead of shared_ptr to executor (#49) * More Integration tests * Remove generated files * Make session private and use a getter instead (#53) * logging_level to log_level for CLI * Single model prediction shortcut * Health endpoint * Integration tests * Rename to onnxruntime server * Build ONNX Server application on Windows (#57) * Gets Boost compiling on Windows * Fix integer conversion and comparison problems * Use size_t in converter_tests instead of int * Fix hosting integration tests on Windows * Removes checks for port because it's an unsigned short * Fixes comparison between signed and unsigned data types * Pip install protobuf and numpy * Missing test data from the rename change * Fix server app path (#58) * Pass shared_ptr by const reference to avoid ref count increase (#59) * Download test model during test setup * Make download into test_util * Rename ci pipeline for onnx runtime server * Support up to 10MiB http request (#61) * Changes minimum request size to 10MB to support all models in ONNX Model Zoo
2019-05-01 01:21:23 +00:00
```
**Note**: The only mandatory argument for the program here is `model_path`
Add an HTTP server for hosting of ONNX models (#806) * Simple integration into CMake build system * Adds vcpkg as a submodule and updates build.py to install hosting dependencies * Don't create vcpkg executable if already created * Fixes how CMake finds toolchain file and quick changes to build.py * Removes setting the CMAKE_TOOLCHAIN_FILE in build.py * Adds Boost Beast echo server and Boost program_options * Fixes spacing problem with program_options * Adds Microsoft headers to all the beast server headers * Removes CXX 14 from CMake file * Adds TODO to create configuration class * Run clang-format on main * Better exception handling of program_options * Remove vckpg submodule via ssh * Add vcpkg as https * Adds onnxruntime namespace to call classes * Fixed places where namespaces were anonymous * Adds a TODO to use the logger * Moves all setting namespace shortnames outside of onnxruntime namespace * Add onnxruntime session options to force app to link with it * Set CMAKE_TOOLCHAIN_FILE in build.py * Remove whitespace * Adds initial ONNX Hosting tests (#5) * Add initial test which is failing linking with no main * Adds test_main to get hosting tests working * Deletes useless add_executable line * Merge changes from upstream * Enable CI build in Vienna environment * make hosting_run*.sh executable * Add boost path in unittest * Add boost to TEST_INC_DIR * Add component detection task in ci yaml * Get tests and hosting to compile with re2 (#7) * Add finding boost packages before using it in unit tests * Add predict.proto and build * Ignore unused parameters in generated code * Removes std::regex in favor of re2 (#8) * Removes std::regex in favor of re2 * Adds back find_package in unit tests and fixes regexes * Adds more negative test cases * Adding more protos * Fix google protobuf file path in the cmake file * Ignore unused parameters for pb generated code * Updates onnx submodule (#10) * Remove duplicated lib in link * Follow Google style guide (#11) * Google style names * Adds more * Adds an additional namespace * Fixes header guards to match filepaths * Consume protobuf * Unit Test setup * Json deserialization simple test cases * Split hosting app to lib and exe for testability * Add more cases * Clean up * Add more comments * Update namespace and format the cmake files * Update cmake/external/onnx to checkout 1ec81bc6d49ccae23cd7801515feaadd13082903 * Separate h and cc in http folder * Clean up hosting application cmake file * Enable logging and proper initialize the session * Update const position for GetSession() * Take latest onnx and onnx-tensorrt * Creates configuration header file for program_options (#15) * Sets up PredictRequest callback (#16) * Init version, porting from prototype, e2e works * More executor implementation * Adds function on application startup (#17) * Attempts to pass HostingEnvironment as a shared_ptr * Removes logging and environment from all http classes * Passes http details to OnStart function * Using full protobuf for hosting app build * MLValue2TensorProto * Revert back changes in inference_session.cc * Refactor logger access and predict handler * Create an error handling callback (#19) * Creates error callback * Logs error and returns back as JSON * Catches exceptions in user functions * Refactor executor and add some test cases * Fix build warning * Add onnx as a dependency and in includes to hosting app (#20) * Converter for specific types and more UTs * More unit tests * Update onnx submodule * Fix string data test * Clean up code * Cleanup code * Refactor logging to use unique id per request and take logging level from user (#21) * Removes capturing env by reference in main * Uses uuid for logging ids * Take logging_level as a program argument * Pass logging_level to default_logging_manager * Change name of logger to HostingApp * Log if request id is null * Update GetHttpStatusCode signature * Fix random result issue and camel-case names * Rollback accidentally changed pybin_state.cc * Rollback pybind_state.cc * Generate protobuf status from onnxruntime status * Fix function name in error message * Clean up comments * Support protobuf byte array as input * Refactor predict handler and add unit tests * Add one more test * update cmake/external/onnx * Accept more protobuf MIME types * Update onnx-tensorrt * Add build instruction and usage doc * Address PR comments * Install g++-7 in the Ubuntu 16.04 build image for vcpkg * Fix onnx-tensorrt version * Check return value during initialization * Fix infinite loop when http port is in use (#29) * Simplify Executor.cc by breaking up Run method (#27) * Move request id to Executor constructor * Refactor the logger to respect user verbosity level * Use Arena allocator instead of device * Creates initial executor tests * Merge upstream master (#31) * Remove all possible shared_ptrs (#30) * Changes GetLogger to unique_ptr * Reserve BFloat raw data vector size * Change HostingEnvironment to being passed by lvalue and rvalue references * Change routes to getting passed by const references * Enable full protobuf if building hosting (#32) * Building hosting application no longer needs use_full_protobuf flag * Improve hosting application docs * Move server core into separate folder (#34) * Turn hosting project off by default (#38) * Remove vcpkg as a submodule and download/install Boost from source (#39) * Remove vcpkg * Use CMake script to download and build Boost as part of the project * Remove std::move for const references * Remove error_code.proto * Change wording of executable help description * Better GenerateProtobufStatus description * Remove error_code protobuf from CMake files * Use all outputs if no filter is given * Pass MLValue by const reference in MLValueToTensorProto * Rename variables to argc and argv * Revert "Use all outputs if no filter is given" This reverts commit 7554190ab8e50ba6947648c2f3e2a3d4d9606ce0. * Remove all header guards in favor of #pragma once * Reserve size for output vector and optimize for-loop * Use static libs by default for Boost * Improves documentation for GenerateResponseInJson function * Start Result enum at 0 instead of 1 * Remove g++ from Ubuntu's install.sh * Update cmake files * Give explanation for Result enum type * Remove all program options shortcuts except for -h * Add comments for predict.proto * Fix JSON for error codes * Add notice on hosting application docs that it's in beta * Change HostingEnvironment back to a shared_ptr * Handle empty output_filter field * Fix build break * Refactor unit tests location and groups * First end-to-end test * Add missing log * Missing req id and client req id in error response * Add one test case to validate failed resp header * Add build flag for hosting app end to end tests * Update pipeline setup to run e2e test for CI build * Model Zoo data preparation and tests * Add protobuf tests * Remove mention of needing g++-7 in BUILD.md * Make GetAppLogger const * Make using_raw_data_ match the styling of other fields * Avoid copy of strings when initializing model * Escape JSON strings correctly for error messages (#44) * Escape JSON strings correctly * Add test examples with lots of carriage returns * Add result validation * Remove temporary path * Optimize model zoo test execution * Improve reliability of test cases * Generate _pb2.py during the build time * README for integration tests * Pass environment by pointer instead of shared_ptr to executor (#49) * More Integration tests * Remove generated files * Make session private and use a getter instead (#53) * logging_level to log_level for CLI * Single model prediction shortcut * Health endpoint * Integration tests * Rename to onnxruntime server * Build ONNX Server application on Windows (#57) * Gets Boost compiling on Windows * Fix integer conversion and comparison problems * Use size_t in converter_tests instead of int * Fix hosting integration tests on Windows * Removes checks for port because it's an unsigned short * Fixes comparison between signed and unsigned data types * Pip install protobuf and numpy * Missing test data from the rename change * Fix server app path (#58) * Pass shared_ptr by const reference to avoid ref count increase (#59) * Download test model during test setup * Make download into test_util * Rename ci pipeline for onnx runtime server * Support up to 10MiB http request (#61) * Changes minimum request size to 10MB to support all models in ONNX Model Zoo
2019-05-01 01:21:23 +00:00
## Start the Server
To host an ONNX model as an inferencing server, simply run:
Add an HTTP server for hosting of ONNX models (#806) * Simple integration into CMake build system * Adds vcpkg as a submodule and updates build.py to install hosting dependencies * Don't create vcpkg executable if already created * Fixes how CMake finds toolchain file and quick changes to build.py * Removes setting the CMAKE_TOOLCHAIN_FILE in build.py * Adds Boost Beast echo server and Boost program_options * Fixes spacing problem with program_options * Adds Microsoft headers to all the beast server headers * Removes CXX 14 from CMake file * Adds TODO to create configuration class * Run clang-format on main * Better exception handling of program_options * Remove vckpg submodule via ssh * Add vcpkg as https * Adds onnxruntime namespace to call classes * Fixed places where namespaces were anonymous * Adds a TODO to use the logger * Moves all setting namespace shortnames outside of onnxruntime namespace * Add onnxruntime session options to force app to link with it * Set CMAKE_TOOLCHAIN_FILE in build.py * Remove whitespace * Adds initial ONNX Hosting tests (#5) * Add initial test which is failing linking with no main * Adds test_main to get hosting tests working * Deletes useless add_executable line * Merge changes from upstream * Enable CI build in Vienna environment * make hosting_run*.sh executable * Add boost path in unittest * Add boost to TEST_INC_DIR * Add component detection task in ci yaml * Get tests and hosting to compile with re2 (#7) * Add finding boost packages before using it in unit tests * Add predict.proto and build * Ignore unused parameters in generated code * Removes std::regex in favor of re2 (#8) * Removes std::regex in favor of re2 * Adds back find_package in unit tests and fixes regexes * Adds more negative test cases * Adding more protos * Fix google protobuf file path in the cmake file * Ignore unused parameters for pb generated code * Updates onnx submodule (#10) * Remove duplicated lib in link * Follow Google style guide (#11) * Google style names * Adds more * Adds an additional namespace * Fixes header guards to match filepaths * Consume protobuf * Unit Test setup * Json deserialization simple test cases * Split hosting app to lib and exe for testability * Add more cases * Clean up * Add more comments * Update namespace and format the cmake files * Update cmake/external/onnx to checkout 1ec81bc6d49ccae23cd7801515feaadd13082903 * Separate h and cc in http folder * Clean up hosting application cmake file * Enable logging and proper initialize the session * Update const position for GetSession() * Take latest onnx and onnx-tensorrt * Creates configuration header file for program_options (#15) * Sets up PredictRequest callback (#16) * Init version, porting from prototype, e2e works * More executor implementation * Adds function on application startup (#17) * Attempts to pass HostingEnvironment as a shared_ptr * Removes logging and environment from all http classes * Passes http details to OnStart function * Using full protobuf for hosting app build * MLValue2TensorProto * Revert back changes in inference_session.cc * Refactor logger access and predict handler * Create an error handling callback (#19) * Creates error callback * Logs error and returns back as JSON * Catches exceptions in user functions * Refactor executor and add some test cases * Fix build warning * Add onnx as a dependency and in includes to hosting app (#20) * Converter for specific types and more UTs * More unit tests * Update onnx submodule * Fix string data test * Clean up code * Cleanup code * Refactor logging to use unique id per request and take logging level from user (#21) * Removes capturing env by reference in main * Uses uuid for logging ids * Take logging_level as a program argument * Pass logging_level to default_logging_manager * Change name of logger to HostingApp * Log if request id is null * Update GetHttpStatusCode signature * Fix random result issue and camel-case names * Rollback accidentally changed pybin_state.cc * Rollback pybind_state.cc * Generate protobuf status from onnxruntime status * Fix function name in error message * Clean up comments * Support protobuf byte array as input * Refactor predict handler and add unit tests * Add one more test * update cmake/external/onnx * Accept more protobuf MIME types * Update onnx-tensorrt * Add build instruction and usage doc * Address PR comments * Install g++-7 in the Ubuntu 16.04 build image for vcpkg * Fix onnx-tensorrt version * Check return value during initialization * Fix infinite loop when http port is in use (#29) * Simplify Executor.cc by breaking up Run method (#27) * Move request id to Executor constructor * Refactor the logger to respect user verbosity level * Use Arena allocator instead of device * Creates initial executor tests * Merge upstream master (#31) * Remove all possible shared_ptrs (#30) * Changes GetLogger to unique_ptr * Reserve BFloat raw data vector size * Change HostingEnvironment to being passed by lvalue and rvalue references * Change routes to getting passed by const references * Enable full protobuf if building hosting (#32) * Building hosting application no longer needs use_full_protobuf flag * Improve hosting application docs * Move server core into separate folder (#34) * Turn hosting project off by default (#38) * Remove vcpkg as a submodule and download/install Boost from source (#39) * Remove vcpkg * Use CMake script to download and build Boost as part of the project * Remove std::move for const references * Remove error_code.proto * Change wording of executable help description * Better GenerateProtobufStatus description * Remove error_code protobuf from CMake files * Use all outputs if no filter is given * Pass MLValue by const reference in MLValueToTensorProto * Rename variables to argc and argv * Revert "Use all outputs if no filter is given" This reverts commit 7554190ab8e50ba6947648c2f3e2a3d4d9606ce0. * Remove all header guards in favor of #pragma once * Reserve size for output vector and optimize for-loop * Use static libs by default for Boost * Improves documentation for GenerateResponseInJson function * Start Result enum at 0 instead of 1 * Remove g++ from Ubuntu's install.sh * Update cmake files * Give explanation for Result enum type * Remove all program options shortcuts except for -h * Add comments for predict.proto * Fix JSON for error codes * Add notice on hosting application docs that it's in beta * Change HostingEnvironment back to a shared_ptr * Handle empty output_filter field * Fix build break * Refactor unit tests location and groups * First end-to-end test * Add missing log * Missing req id and client req id in error response * Add one test case to validate failed resp header * Add build flag for hosting app end to end tests * Update pipeline setup to run e2e test for CI build * Model Zoo data preparation and tests * Add protobuf tests * Remove mention of needing g++-7 in BUILD.md * Make GetAppLogger const * Make using_raw_data_ match the styling of other fields * Avoid copy of strings when initializing model * Escape JSON strings correctly for error messages (#44) * Escape JSON strings correctly * Add test examples with lots of carriage returns * Add result validation * Remove temporary path * Optimize model zoo test execution * Improve reliability of test cases * Generate _pb2.py during the build time * README for integration tests * Pass environment by pointer instead of shared_ptr to executor (#49) * More Integration tests * Remove generated files * Make session private and use a getter instead (#53) * logging_level to log_level for CLI * Single model prediction shortcut * Health endpoint * Integration tests * Rename to onnxruntime server * Build ONNX Server application on Windows (#57) * Gets Boost compiling on Windows * Fix integer conversion and comparison problems * Use size_t in converter_tests instead of int * Fix hosting integration tests on Windows * Removes checks for port because it's an unsigned short * Fixes comparison between signed and unsigned data types * Pip install protobuf and numpy * Missing test data from the rename change * Fix server app path (#58) * Pass shared_ptr by const reference to avoid ref count increase (#59) * Download test model during test setup * Make download into test_util * Rename ci pipeline for onnx runtime server * Support up to 10MiB http request (#61) * Changes minimum request size to 10MB to support all models in ONNX Model Zoo
2019-05-01 01:21:23 +00:00
```
./onnxruntime_server --model_path /<your>/<model>/<path>
```
## HTTP Endpoint
The prediction URL for HTTP endpoint is in this format:
Add an HTTP server for hosting of ONNX models (#806) * Simple integration into CMake build system * Adds vcpkg as a submodule and updates build.py to install hosting dependencies * Don't create vcpkg executable if already created * Fixes how CMake finds toolchain file and quick changes to build.py * Removes setting the CMAKE_TOOLCHAIN_FILE in build.py * Adds Boost Beast echo server and Boost program_options * Fixes spacing problem with program_options * Adds Microsoft headers to all the beast server headers * Removes CXX 14 from CMake file * Adds TODO to create configuration class * Run clang-format on main * Better exception handling of program_options * Remove vckpg submodule via ssh * Add vcpkg as https * Adds onnxruntime namespace to call classes * Fixed places where namespaces were anonymous * Adds a TODO to use the logger * Moves all setting namespace shortnames outside of onnxruntime namespace * Add onnxruntime session options to force app to link with it * Set CMAKE_TOOLCHAIN_FILE in build.py * Remove whitespace * Adds initial ONNX Hosting tests (#5) * Add initial test which is failing linking with no main * Adds test_main to get hosting tests working * Deletes useless add_executable line * Merge changes from upstream * Enable CI build in Vienna environment * make hosting_run*.sh executable * Add boost path in unittest * Add boost to TEST_INC_DIR * Add component detection task in ci yaml * Get tests and hosting to compile with re2 (#7) * Add finding boost packages before using it in unit tests * Add predict.proto and build * Ignore unused parameters in generated code * Removes std::regex in favor of re2 (#8) * Removes std::regex in favor of re2 * Adds back find_package in unit tests and fixes regexes * Adds more negative test cases * Adding more protos * Fix google protobuf file path in the cmake file * Ignore unused parameters for pb generated code * Updates onnx submodule (#10) * Remove duplicated lib in link * Follow Google style guide (#11) * Google style names * Adds more * Adds an additional namespace * Fixes header guards to match filepaths * Consume protobuf * Unit Test setup * Json deserialization simple test cases * Split hosting app to lib and exe for testability * Add more cases * Clean up * Add more comments * Update namespace and format the cmake files * Update cmake/external/onnx to checkout 1ec81bc6d49ccae23cd7801515feaadd13082903 * Separate h and cc in http folder * Clean up hosting application cmake file * Enable logging and proper initialize the session * Update const position for GetSession() * Take latest onnx and onnx-tensorrt * Creates configuration header file for program_options (#15) * Sets up PredictRequest callback (#16) * Init version, porting from prototype, e2e works * More executor implementation * Adds function on application startup (#17) * Attempts to pass HostingEnvironment as a shared_ptr * Removes logging and environment from all http classes * Passes http details to OnStart function * Using full protobuf for hosting app build * MLValue2TensorProto * Revert back changes in inference_session.cc * Refactor logger access and predict handler * Create an error handling callback (#19) * Creates error callback * Logs error and returns back as JSON * Catches exceptions in user functions * Refactor executor and add some test cases * Fix build warning * Add onnx as a dependency and in includes to hosting app (#20) * Converter for specific types and more UTs * More unit tests * Update onnx submodule * Fix string data test * Clean up code * Cleanup code * Refactor logging to use unique id per request and take logging level from user (#21) * Removes capturing env by reference in main * Uses uuid for logging ids * Take logging_level as a program argument * Pass logging_level to default_logging_manager * Change name of logger to HostingApp * Log if request id is null * Update GetHttpStatusCode signature * Fix random result issue and camel-case names * Rollback accidentally changed pybin_state.cc * Rollback pybind_state.cc * Generate protobuf status from onnxruntime status * Fix function name in error message * Clean up comments * Support protobuf byte array as input * Refactor predict handler and add unit tests * Add one more test * update cmake/external/onnx * Accept more protobuf MIME types * Update onnx-tensorrt * Add build instruction and usage doc * Address PR comments * Install g++-7 in the Ubuntu 16.04 build image for vcpkg * Fix onnx-tensorrt version * Check return value during initialization * Fix infinite loop when http port is in use (#29) * Simplify Executor.cc by breaking up Run method (#27) * Move request id to Executor constructor * Refactor the logger to respect user verbosity level * Use Arena allocator instead of device * Creates initial executor tests * Merge upstream master (#31) * Remove all possible shared_ptrs (#30) * Changes GetLogger to unique_ptr * Reserve BFloat raw data vector size * Change HostingEnvironment to being passed by lvalue and rvalue references * Change routes to getting passed by const references * Enable full protobuf if building hosting (#32) * Building hosting application no longer needs use_full_protobuf flag * Improve hosting application docs * Move server core into separate folder (#34) * Turn hosting project off by default (#38) * Remove vcpkg as a submodule and download/install Boost from source (#39) * Remove vcpkg * Use CMake script to download and build Boost as part of the project * Remove std::move for const references * Remove error_code.proto * Change wording of executable help description * Better GenerateProtobufStatus description * Remove error_code protobuf from CMake files * Use all outputs if no filter is given * Pass MLValue by const reference in MLValueToTensorProto * Rename variables to argc and argv * Revert "Use all outputs if no filter is given" This reverts commit 7554190ab8e50ba6947648c2f3e2a3d4d9606ce0. * Remove all header guards in favor of #pragma once * Reserve size for output vector and optimize for-loop * Use static libs by default for Boost * Improves documentation for GenerateResponseInJson function * Start Result enum at 0 instead of 1 * Remove g++ from Ubuntu's install.sh * Update cmake files * Give explanation for Result enum type * Remove all program options shortcuts except for -h * Add comments for predict.proto * Fix JSON for error codes * Add notice on hosting application docs that it's in beta * Change HostingEnvironment back to a shared_ptr * Handle empty output_filter field * Fix build break * Refactor unit tests location and groups * First end-to-end test * Add missing log * Missing req id and client req id in error response * Add one test case to validate failed resp header * Add build flag for hosting app end to end tests * Update pipeline setup to run e2e test for CI build * Model Zoo data preparation and tests * Add protobuf tests * Remove mention of needing g++-7 in BUILD.md * Make GetAppLogger const * Make using_raw_data_ match the styling of other fields * Avoid copy of strings when initializing model * Escape JSON strings correctly for error messages (#44) * Escape JSON strings correctly * Add test examples with lots of carriage returns * Add result validation * Remove temporary path * Optimize model zoo test execution * Improve reliability of test cases * Generate _pb2.py during the build time * README for integration tests * Pass environment by pointer instead of shared_ptr to executor (#49) * More Integration tests * Remove generated files * Make session private and use a getter instead (#53) * logging_level to log_level for CLI * Single model prediction shortcut * Health endpoint * Integration tests * Rename to onnxruntime server * Build ONNX Server application on Windows (#57) * Gets Boost compiling on Windows * Fix integer conversion and comparison problems * Use size_t in converter_tests instead of int * Fix hosting integration tests on Windows * Removes checks for port because it's an unsigned short * Fixes comparison between signed and unsigned data types * Pip install protobuf and numpy * Missing test data from the rename change * Fix server app path (#58) * Pass shared_ptr by const reference to avoid ref count increase (#59) * Download test model during test setup * Make download into test_util * Rename ci pipeline for onnx runtime server * Support up to 10MiB http request (#61) * Changes minimum request size to 10MB to support all models in ONNX Model Zoo
2019-05-01 01:21:23 +00:00
```
http://<your_ip_address>:<port>/v1/models/<your-model-name>/versions/<your-version>:predict
```
**Note**: Since we currently only support one model, the model name and version can be any string length > 0. In the future, model_names and versions will be verified.
### Request and Response Payload
Add an HTTP server for hosting of ONNX models (#806) * Simple integration into CMake build system * Adds vcpkg as a submodule and updates build.py to install hosting dependencies * Don't create vcpkg executable if already created * Fixes how CMake finds toolchain file and quick changes to build.py * Removes setting the CMAKE_TOOLCHAIN_FILE in build.py * Adds Boost Beast echo server and Boost program_options * Fixes spacing problem with program_options * Adds Microsoft headers to all the beast server headers * Removes CXX 14 from CMake file * Adds TODO to create configuration class * Run clang-format on main * Better exception handling of program_options * Remove vckpg submodule via ssh * Add vcpkg as https * Adds onnxruntime namespace to call classes * Fixed places where namespaces were anonymous * Adds a TODO to use the logger * Moves all setting namespace shortnames outside of onnxruntime namespace * Add onnxruntime session options to force app to link with it * Set CMAKE_TOOLCHAIN_FILE in build.py * Remove whitespace * Adds initial ONNX Hosting tests (#5) * Add initial test which is failing linking with no main * Adds test_main to get hosting tests working * Deletes useless add_executable line * Merge changes from upstream * Enable CI build in Vienna environment * make hosting_run*.sh executable * Add boost path in unittest * Add boost to TEST_INC_DIR * Add component detection task in ci yaml * Get tests and hosting to compile with re2 (#7) * Add finding boost packages before using it in unit tests * Add predict.proto and build * Ignore unused parameters in generated code * Removes std::regex in favor of re2 (#8) * Removes std::regex in favor of re2 * Adds back find_package in unit tests and fixes regexes * Adds more negative test cases * Adding more protos * Fix google protobuf file path in the cmake file * Ignore unused parameters for pb generated code * Updates onnx submodule (#10) * Remove duplicated lib in link * Follow Google style guide (#11) * Google style names * Adds more * Adds an additional namespace * Fixes header guards to match filepaths * Consume protobuf * Unit Test setup * Json deserialization simple test cases * Split hosting app to lib and exe for testability * Add more cases * Clean up * Add more comments * Update namespace and format the cmake files * Update cmake/external/onnx to checkout 1ec81bc6d49ccae23cd7801515feaadd13082903 * Separate h and cc in http folder * Clean up hosting application cmake file * Enable logging and proper initialize the session * Update const position for GetSession() * Take latest onnx and onnx-tensorrt * Creates configuration header file for program_options (#15) * Sets up PredictRequest callback (#16) * Init version, porting from prototype, e2e works * More executor implementation * Adds function on application startup (#17) * Attempts to pass HostingEnvironment as a shared_ptr * Removes logging and environment from all http classes * Passes http details to OnStart function * Using full protobuf for hosting app build * MLValue2TensorProto * Revert back changes in inference_session.cc * Refactor logger access and predict handler * Create an error handling callback (#19) * Creates error callback * Logs error and returns back as JSON * Catches exceptions in user functions * Refactor executor and add some test cases * Fix build warning * Add onnx as a dependency and in includes to hosting app (#20) * Converter for specific types and more UTs * More unit tests * Update onnx submodule * Fix string data test * Clean up code * Cleanup code * Refactor logging to use unique id per request and take logging level from user (#21) * Removes capturing env by reference in main * Uses uuid for logging ids * Take logging_level as a program argument * Pass logging_level to default_logging_manager * Change name of logger to HostingApp * Log if request id is null * Update GetHttpStatusCode signature * Fix random result issue and camel-case names * Rollback accidentally changed pybin_state.cc * Rollback pybind_state.cc * Generate protobuf status from onnxruntime status * Fix function name in error message * Clean up comments * Support protobuf byte array as input * Refactor predict handler and add unit tests * Add one more test * update cmake/external/onnx * Accept more protobuf MIME types * Update onnx-tensorrt * Add build instruction and usage doc * Address PR comments * Install g++-7 in the Ubuntu 16.04 build image for vcpkg * Fix onnx-tensorrt version * Check return value during initialization * Fix infinite loop when http port is in use (#29) * Simplify Executor.cc by breaking up Run method (#27) * Move request id to Executor constructor * Refactor the logger to respect user verbosity level * Use Arena allocator instead of device * Creates initial executor tests * Merge upstream master (#31) * Remove all possible shared_ptrs (#30) * Changes GetLogger to unique_ptr * Reserve BFloat raw data vector size * Change HostingEnvironment to being passed by lvalue and rvalue references * Change routes to getting passed by const references * Enable full protobuf if building hosting (#32) * Building hosting application no longer needs use_full_protobuf flag * Improve hosting application docs * Move server core into separate folder (#34) * Turn hosting project off by default (#38) * Remove vcpkg as a submodule and download/install Boost from source (#39) * Remove vcpkg * Use CMake script to download and build Boost as part of the project * Remove std::move for const references * Remove error_code.proto * Change wording of executable help description * Better GenerateProtobufStatus description * Remove error_code protobuf from CMake files * Use all outputs if no filter is given * Pass MLValue by const reference in MLValueToTensorProto * Rename variables to argc and argv * Revert "Use all outputs if no filter is given" This reverts commit 7554190ab8e50ba6947648c2f3e2a3d4d9606ce0. * Remove all header guards in favor of #pragma once * Reserve size for output vector and optimize for-loop * Use static libs by default for Boost * Improves documentation for GenerateResponseInJson function * Start Result enum at 0 instead of 1 * Remove g++ from Ubuntu's install.sh * Update cmake files * Give explanation for Result enum type * Remove all program options shortcuts except for -h * Add comments for predict.proto * Fix JSON for error codes * Add notice on hosting application docs that it's in beta * Change HostingEnvironment back to a shared_ptr * Handle empty output_filter field * Fix build break * Refactor unit tests location and groups * First end-to-end test * Add missing log * Missing req id and client req id in error response * Add one test case to validate failed resp header * Add build flag for hosting app end to end tests * Update pipeline setup to run e2e test for CI build * Model Zoo data preparation and tests * Add protobuf tests * Remove mention of needing g++-7 in BUILD.md * Make GetAppLogger const * Make using_raw_data_ match the styling of other fields * Avoid copy of strings when initializing model * Escape JSON strings correctly for error messages (#44) * Escape JSON strings correctly * Add test examples with lots of carriage returns * Add result validation * Remove temporary path * Optimize model zoo test execution * Improve reliability of test cases * Generate _pb2.py during the build time * README for integration tests * Pass environment by pointer instead of shared_ptr to executor (#49) * More Integration tests * Remove generated files * Make session private and use a getter instead (#53) * logging_level to log_level for CLI * Single model prediction shortcut * Health endpoint * Integration tests * Rename to onnxruntime server * Build ONNX Server application on Windows (#57) * Gets Boost compiling on Windows * Fix integer conversion and comparison problems * Use size_t in converter_tests instead of int * Fix hosting integration tests on Windows * Removes checks for port because it's an unsigned short * Fixes comparison between signed and unsigned data types * Pip install protobuf and numpy * Missing test data from the rename change * Fix server app path (#58) * Pass shared_ptr by const reference to avoid ref count increase (#59) * Download test model during test setup * Make download into test_util * Rename ci pipeline for onnx runtime server * Support up to 10MiB http request (#61) * Changes minimum request size to 10MB to support all models in ONNX Model Zoo
2019-05-01 01:21:23 +00:00
The request and response need to be a protobuf message. The Protobuf definition can be found [here](../server/protobuf/predict.proto).
A protobuf message could have two formats: binary and JSON. Usually the binary payload has better latency, in the meanwhile the JSON format is easy for human readability.
The HTTP request header field `Content-Type` tells the server how to handle the request and thus it is mandatory for all requests. Requests missing `Content-Type` will be rejected as `400 Bad Request`.
Add an HTTP server for hosting of ONNX models (#806) * Simple integration into CMake build system * Adds vcpkg as a submodule and updates build.py to install hosting dependencies * Don't create vcpkg executable if already created * Fixes how CMake finds toolchain file and quick changes to build.py * Removes setting the CMAKE_TOOLCHAIN_FILE in build.py * Adds Boost Beast echo server and Boost program_options * Fixes spacing problem with program_options * Adds Microsoft headers to all the beast server headers * Removes CXX 14 from CMake file * Adds TODO to create configuration class * Run clang-format on main * Better exception handling of program_options * Remove vckpg submodule via ssh * Add vcpkg as https * Adds onnxruntime namespace to call classes * Fixed places where namespaces were anonymous * Adds a TODO to use the logger * Moves all setting namespace shortnames outside of onnxruntime namespace * Add onnxruntime session options to force app to link with it * Set CMAKE_TOOLCHAIN_FILE in build.py * Remove whitespace * Adds initial ONNX Hosting tests (#5) * Add initial test which is failing linking with no main * Adds test_main to get hosting tests working * Deletes useless add_executable line * Merge changes from upstream * Enable CI build in Vienna environment * make hosting_run*.sh executable * Add boost path in unittest * Add boost to TEST_INC_DIR * Add component detection task in ci yaml * Get tests and hosting to compile with re2 (#7) * Add finding boost packages before using it in unit tests * Add predict.proto and build * Ignore unused parameters in generated code * Removes std::regex in favor of re2 (#8) * Removes std::regex in favor of re2 * Adds back find_package in unit tests and fixes regexes * Adds more negative test cases * Adding more protos * Fix google protobuf file path in the cmake file * Ignore unused parameters for pb generated code * Updates onnx submodule (#10) * Remove duplicated lib in link * Follow Google style guide (#11) * Google style names * Adds more * Adds an additional namespace * Fixes header guards to match filepaths * Consume protobuf * Unit Test setup * Json deserialization simple test cases * Split hosting app to lib and exe for testability * Add more cases * Clean up * Add more comments * Update namespace and format the cmake files * Update cmake/external/onnx to checkout 1ec81bc6d49ccae23cd7801515feaadd13082903 * Separate h and cc in http folder * Clean up hosting application cmake file * Enable logging and proper initialize the session * Update const position for GetSession() * Take latest onnx and onnx-tensorrt * Creates configuration header file for program_options (#15) * Sets up PredictRequest callback (#16) * Init version, porting from prototype, e2e works * More executor implementation * Adds function on application startup (#17) * Attempts to pass HostingEnvironment as a shared_ptr * Removes logging and environment from all http classes * Passes http details to OnStart function * Using full protobuf for hosting app build * MLValue2TensorProto * Revert back changes in inference_session.cc * Refactor logger access and predict handler * Create an error handling callback (#19) * Creates error callback * Logs error and returns back as JSON * Catches exceptions in user functions * Refactor executor and add some test cases * Fix build warning * Add onnx as a dependency and in includes to hosting app (#20) * Converter for specific types and more UTs * More unit tests * Update onnx submodule * Fix string data test * Clean up code * Cleanup code * Refactor logging to use unique id per request and take logging level from user (#21) * Removes capturing env by reference in main * Uses uuid for logging ids * Take logging_level as a program argument * Pass logging_level to default_logging_manager * Change name of logger to HostingApp * Log if request id is null * Update GetHttpStatusCode signature * Fix random result issue and camel-case names * Rollback accidentally changed pybin_state.cc * Rollback pybind_state.cc * Generate protobuf status from onnxruntime status * Fix function name in error message * Clean up comments * Support protobuf byte array as input * Refactor predict handler and add unit tests * Add one more test * update cmake/external/onnx * Accept more protobuf MIME types * Update onnx-tensorrt * Add build instruction and usage doc * Address PR comments * Install g++-7 in the Ubuntu 16.04 build image for vcpkg * Fix onnx-tensorrt version * Check return value during initialization * Fix infinite loop when http port is in use (#29) * Simplify Executor.cc by breaking up Run method (#27) * Move request id to Executor constructor * Refactor the logger to respect user verbosity level * Use Arena allocator instead of device * Creates initial executor tests * Merge upstream master (#31) * Remove all possible shared_ptrs (#30) * Changes GetLogger to unique_ptr * Reserve BFloat raw data vector size * Change HostingEnvironment to being passed by lvalue and rvalue references * Change routes to getting passed by const references * Enable full protobuf if building hosting (#32) * Building hosting application no longer needs use_full_protobuf flag * Improve hosting application docs * Move server core into separate folder (#34) * Turn hosting project off by default (#38) * Remove vcpkg as a submodule and download/install Boost from source (#39) * Remove vcpkg * Use CMake script to download and build Boost as part of the project * Remove std::move for const references * Remove error_code.proto * Change wording of executable help description * Better GenerateProtobufStatus description * Remove error_code protobuf from CMake files * Use all outputs if no filter is given * Pass MLValue by const reference in MLValueToTensorProto * Rename variables to argc and argv * Revert "Use all outputs if no filter is given" This reverts commit 7554190ab8e50ba6947648c2f3e2a3d4d9606ce0. * Remove all header guards in favor of #pragma once * Reserve size for output vector and optimize for-loop * Use static libs by default for Boost * Improves documentation for GenerateResponseInJson function * Start Result enum at 0 instead of 1 * Remove g++ from Ubuntu's install.sh * Update cmake files * Give explanation for Result enum type * Remove all program options shortcuts except for -h * Add comments for predict.proto * Fix JSON for error codes * Add notice on hosting application docs that it's in beta * Change HostingEnvironment back to a shared_ptr * Handle empty output_filter field * Fix build break * Refactor unit tests location and groups * First end-to-end test * Add missing log * Missing req id and client req id in error response * Add one test case to validate failed resp header * Add build flag for hosting app end to end tests * Update pipeline setup to run e2e test for CI build * Model Zoo data preparation and tests * Add protobuf tests * Remove mention of needing g++-7 in BUILD.md * Make GetAppLogger const * Make using_raw_data_ match the styling of other fields * Avoid copy of strings when initializing model * Escape JSON strings correctly for error messages (#44) * Escape JSON strings correctly * Add test examples with lots of carriage returns * Add result validation * Remove temporary path * Optimize model zoo test execution * Improve reliability of test cases * Generate _pb2.py during the build time * README for integration tests * Pass environment by pointer instead of shared_ptr to executor (#49) * More Integration tests * Remove generated files * Make session private and use a getter instead (#53) * logging_level to log_level for CLI * Single model prediction shortcut * Health endpoint * Integration tests * Rename to onnxruntime server * Build ONNX Server application on Windows (#57) * Gets Boost compiling on Windows * Fix integer conversion and comparison problems * Use size_t in converter_tests instead of int * Fix hosting integration tests on Windows * Removes checks for port because it's an unsigned short * Fixes comparison between signed and unsigned data types * Pip install protobuf and numpy * Missing test data from the rename change * Fix server app path (#58) * Pass shared_ptr by const reference to avoid ref count increase (#59) * Download test model during test setup * Make download into test_util * Rename ci pipeline for onnx runtime server * Support up to 10MiB http request (#61) * Changes minimum request size to 10MB to support all models in ONNX Model Zoo
2019-05-01 01:21:23 +00:00
* For `"Content-Type: application/json"`, the payload will be deserialized as JSON string in UTF-8 format
* For `"Content-Type: application/vnd.google.protobuf"`, `"Content-Type: application/x-protobuf"` or `"Content-Type: application/octet-stream"`, the payload will be consumed as protobuf message directly.
Clients can control the response type by setting the request with an `Accept` header field and the server will serialize in your desired format. The choices currently available are the same as the `Content-Type` header field. If this field is not set in the request, the server will use the same type as your request.
Add an HTTP server for hosting of ONNX models (#806) * Simple integration into CMake build system * Adds vcpkg as a submodule and updates build.py to install hosting dependencies * Don't create vcpkg executable if already created * Fixes how CMake finds toolchain file and quick changes to build.py * Removes setting the CMAKE_TOOLCHAIN_FILE in build.py * Adds Boost Beast echo server and Boost program_options * Fixes spacing problem with program_options * Adds Microsoft headers to all the beast server headers * Removes CXX 14 from CMake file * Adds TODO to create configuration class * Run clang-format on main * Better exception handling of program_options * Remove vckpg submodule via ssh * Add vcpkg as https * Adds onnxruntime namespace to call classes * Fixed places where namespaces were anonymous * Adds a TODO to use the logger * Moves all setting namespace shortnames outside of onnxruntime namespace * Add onnxruntime session options to force app to link with it * Set CMAKE_TOOLCHAIN_FILE in build.py * Remove whitespace * Adds initial ONNX Hosting tests (#5) * Add initial test which is failing linking with no main * Adds test_main to get hosting tests working * Deletes useless add_executable line * Merge changes from upstream * Enable CI build in Vienna environment * make hosting_run*.sh executable * Add boost path in unittest * Add boost to TEST_INC_DIR * Add component detection task in ci yaml * Get tests and hosting to compile with re2 (#7) * Add finding boost packages before using it in unit tests * Add predict.proto and build * Ignore unused parameters in generated code * Removes std::regex in favor of re2 (#8) * Removes std::regex in favor of re2 * Adds back find_package in unit tests and fixes regexes * Adds more negative test cases * Adding more protos * Fix google protobuf file path in the cmake file * Ignore unused parameters for pb generated code * Updates onnx submodule (#10) * Remove duplicated lib in link * Follow Google style guide (#11) * Google style names * Adds more * Adds an additional namespace * Fixes header guards to match filepaths * Consume protobuf * Unit Test setup * Json deserialization simple test cases * Split hosting app to lib and exe for testability * Add more cases * Clean up * Add more comments * Update namespace and format the cmake files * Update cmake/external/onnx to checkout 1ec81bc6d49ccae23cd7801515feaadd13082903 * Separate h and cc in http folder * Clean up hosting application cmake file * Enable logging and proper initialize the session * Update const position for GetSession() * Take latest onnx and onnx-tensorrt * Creates configuration header file for program_options (#15) * Sets up PredictRequest callback (#16) * Init version, porting from prototype, e2e works * More executor implementation * Adds function on application startup (#17) * Attempts to pass HostingEnvironment as a shared_ptr * Removes logging and environment from all http classes * Passes http details to OnStart function * Using full protobuf for hosting app build * MLValue2TensorProto * Revert back changes in inference_session.cc * Refactor logger access and predict handler * Create an error handling callback (#19) * Creates error callback * Logs error and returns back as JSON * Catches exceptions in user functions * Refactor executor and add some test cases * Fix build warning * Add onnx as a dependency and in includes to hosting app (#20) * Converter for specific types and more UTs * More unit tests * Update onnx submodule * Fix string data test * Clean up code * Cleanup code * Refactor logging to use unique id per request and take logging level from user (#21) * Removes capturing env by reference in main * Uses uuid for logging ids * Take logging_level as a program argument * Pass logging_level to default_logging_manager * Change name of logger to HostingApp * Log if request id is null * Update GetHttpStatusCode signature * Fix random result issue and camel-case names * Rollback accidentally changed pybin_state.cc * Rollback pybind_state.cc * Generate protobuf status from onnxruntime status * Fix function name in error message * Clean up comments * Support protobuf byte array as input * Refactor predict handler and add unit tests * Add one more test * update cmake/external/onnx * Accept more protobuf MIME types * Update onnx-tensorrt * Add build instruction and usage doc * Address PR comments * Install g++-7 in the Ubuntu 16.04 build image for vcpkg * Fix onnx-tensorrt version * Check return value during initialization * Fix infinite loop when http port is in use (#29) * Simplify Executor.cc by breaking up Run method (#27) * Move request id to Executor constructor * Refactor the logger to respect user verbosity level * Use Arena allocator instead of device * Creates initial executor tests * Merge upstream master (#31) * Remove all possible shared_ptrs (#30) * Changes GetLogger to unique_ptr * Reserve BFloat raw data vector size * Change HostingEnvironment to being passed by lvalue and rvalue references * Change routes to getting passed by const references * Enable full protobuf if building hosting (#32) * Building hosting application no longer needs use_full_protobuf flag * Improve hosting application docs * Move server core into separate folder (#34) * Turn hosting project off by default (#38) * Remove vcpkg as a submodule and download/install Boost from source (#39) * Remove vcpkg * Use CMake script to download and build Boost as part of the project * Remove std::move for const references * Remove error_code.proto * Change wording of executable help description * Better GenerateProtobufStatus description * Remove error_code protobuf from CMake files * Use all outputs if no filter is given * Pass MLValue by const reference in MLValueToTensorProto * Rename variables to argc and argv * Revert "Use all outputs if no filter is given" This reverts commit 7554190ab8e50ba6947648c2f3e2a3d4d9606ce0. * Remove all header guards in favor of #pragma once * Reserve size for output vector and optimize for-loop * Use static libs by default for Boost * Improves documentation for GenerateResponseInJson function * Start Result enum at 0 instead of 1 * Remove g++ from Ubuntu's install.sh * Update cmake files * Give explanation for Result enum type * Remove all program options shortcuts except for -h * Add comments for predict.proto * Fix JSON for error codes * Add notice on hosting application docs that it's in beta * Change HostingEnvironment back to a shared_ptr * Handle empty output_filter field * Fix build break * Refactor unit tests location and groups * First end-to-end test * Add missing log * Missing req id and client req id in error response * Add one test case to validate failed resp header * Add build flag for hosting app end to end tests * Update pipeline setup to run e2e test for CI build * Model Zoo data preparation and tests * Add protobuf tests * Remove mention of needing g++-7 in BUILD.md * Make GetAppLogger const * Make using_raw_data_ match the styling of other fields * Avoid copy of strings when initializing model * Escape JSON strings correctly for error messages (#44) * Escape JSON strings correctly * Add test examples with lots of carriage returns * Add result validation * Remove temporary path * Optimize model zoo test execution * Improve reliability of test cases * Generate _pb2.py during the build time * README for integration tests * Pass environment by pointer instead of shared_ptr to executor (#49) * More Integration tests * Remove generated files * Make session private and use a getter instead (#53) * logging_level to log_level for CLI * Single model prediction shortcut * Health endpoint * Integration tests * Rename to onnxruntime server * Build ONNX Server application on Windows (#57) * Gets Boost compiling on Windows * Fix integer conversion and comparison problems * Use size_t in converter_tests instead of int * Fix hosting integration tests on Windows * Removes checks for port because it's an unsigned short * Fixes comparison between signed and unsigned data types * Pip install protobuf and numpy * Missing test data from the rename change * Fix server app path (#58) * Pass shared_ptr by const reference to avoid ref count increase (#59) * Download test model during test setup * Make download into test_util * Rename ci pipeline for onnx runtime server * Support up to 10MiB http request (#61) * Changes minimum request size to 10MB to support all models in ONNX Model Zoo
2019-05-01 01:21:23 +00:00
### Inferencing
Add an HTTP server for hosting of ONNX models (#806) * Simple integration into CMake build system * Adds vcpkg as a submodule and updates build.py to install hosting dependencies * Don't create vcpkg executable if already created * Fixes how CMake finds toolchain file and quick changes to build.py * Removes setting the CMAKE_TOOLCHAIN_FILE in build.py * Adds Boost Beast echo server and Boost program_options * Fixes spacing problem with program_options * Adds Microsoft headers to all the beast server headers * Removes CXX 14 from CMake file * Adds TODO to create configuration class * Run clang-format on main * Better exception handling of program_options * Remove vckpg submodule via ssh * Add vcpkg as https * Adds onnxruntime namespace to call classes * Fixed places where namespaces were anonymous * Adds a TODO to use the logger * Moves all setting namespace shortnames outside of onnxruntime namespace * Add onnxruntime session options to force app to link with it * Set CMAKE_TOOLCHAIN_FILE in build.py * Remove whitespace * Adds initial ONNX Hosting tests (#5) * Add initial test which is failing linking with no main * Adds test_main to get hosting tests working * Deletes useless add_executable line * Merge changes from upstream * Enable CI build in Vienna environment * make hosting_run*.sh executable * Add boost path in unittest * Add boost to TEST_INC_DIR * Add component detection task in ci yaml * Get tests and hosting to compile with re2 (#7) * Add finding boost packages before using it in unit tests * Add predict.proto and build * Ignore unused parameters in generated code * Removes std::regex in favor of re2 (#8) * Removes std::regex in favor of re2 * Adds back find_package in unit tests and fixes regexes * Adds more negative test cases * Adding more protos * Fix google protobuf file path in the cmake file * Ignore unused parameters for pb generated code * Updates onnx submodule (#10) * Remove duplicated lib in link * Follow Google style guide (#11) * Google style names * Adds more * Adds an additional namespace * Fixes header guards to match filepaths * Consume protobuf * Unit Test setup * Json deserialization simple test cases * Split hosting app to lib and exe for testability * Add more cases * Clean up * Add more comments * Update namespace and format the cmake files * Update cmake/external/onnx to checkout 1ec81bc6d49ccae23cd7801515feaadd13082903 * Separate h and cc in http folder * Clean up hosting application cmake file * Enable logging and proper initialize the session * Update const position for GetSession() * Take latest onnx and onnx-tensorrt * Creates configuration header file for program_options (#15) * Sets up PredictRequest callback (#16) * Init version, porting from prototype, e2e works * More executor implementation * Adds function on application startup (#17) * Attempts to pass HostingEnvironment as a shared_ptr * Removes logging and environment from all http classes * Passes http details to OnStart function * Using full protobuf for hosting app build * MLValue2TensorProto * Revert back changes in inference_session.cc * Refactor logger access and predict handler * Create an error handling callback (#19) * Creates error callback * Logs error and returns back as JSON * Catches exceptions in user functions * Refactor executor and add some test cases * Fix build warning * Add onnx as a dependency and in includes to hosting app (#20) * Converter for specific types and more UTs * More unit tests * Update onnx submodule * Fix string data test * Clean up code * Cleanup code * Refactor logging to use unique id per request and take logging level from user (#21) * Removes capturing env by reference in main * Uses uuid for logging ids * Take logging_level as a program argument * Pass logging_level to default_logging_manager * Change name of logger to HostingApp * Log if request id is null * Update GetHttpStatusCode signature * Fix random result issue and camel-case names * Rollback accidentally changed pybin_state.cc * Rollback pybind_state.cc * Generate protobuf status from onnxruntime status * Fix function name in error message * Clean up comments * Support protobuf byte array as input * Refactor predict handler and add unit tests * Add one more test * update cmake/external/onnx * Accept more protobuf MIME types * Update onnx-tensorrt * Add build instruction and usage doc * Address PR comments * Install g++-7 in the Ubuntu 16.04 build image for vcpkg * Fix onnx-tensorrt version * Check return value during initialization * Fix infinite loop when http port is in use (#29) * Simplify Executor.cc by breaking up Run method (#27) * Move request id to Executor constructor * Refactor the logger to respect user verbosity level * Use Arena allocator instead of device * Creates initial executor tests * Merge upstream master (#31) * Remove all possible shared_ptrs (#30) * Changes GetLogger to unique_ptr * Reserve BFloat raw data vector size * Change HostingEnvironment to being passed by lvalue and rvalue references * Change routes to getting passed by const references * Enable full protobuf if building hosting (#32) * Building hosting application no longer needs use_full_protobuf flag * Improve hosting application docs * Move server core into separate folder (#34) * Turn hosting project off by default (#38) * Remove vcpkg as a submodule and download/install Boost from source (#39) * Remove vcpkg * Use CMake script to download and build Boost as part of the project * Remove std::move for const references * Remove error_code.proto * Change wording of executable help description * Better GenerateProtobufStatus description * Remove error_code protobuf from CMake files * Use all outputs if no filter is given * Pass MLValue by const reference in MLValueToTensorProto * Rename variables to argc and argv * Revert "Use all outputs if no filter is given" This reverts commit 7554190ab8e50ba6947648c2f3e2a3d4d9606ce0. * Remove all header guards in favor of #pragma once * Reserve size for output vector and optimize for-loop * Use static libs by default for Boost * Improves documentation for GenerateResponseInJson function * Start Result enum at 0 instead of 1 * Remove g++ from Ubuntu's install.sh * Update cmake files * Give explanation for Result enum type * Remove all program options shortcuts except for -h * Add comments for predict.proto * Fix JSON for error codes * Add notice on hosting application docs that it's in beta * Change HostingEnvironment back to a shared_ptr * Handle empty output_filter field * Fix build break * Refactor unit tests location and groups * First end-to-end test * Add missing log * Missing req id and client req id in error response * Add one test case to validate failed resp header * Add build flag for hosting app end to end tests * Update pipeline setup to run e2e test for CI build * Model Zoo data preparation and tests * Add protobuf tests * Remove mention of needing g++-7 in BUILD.md * Make GetAppLogger const * Make using_raw_data_ match the styling of other fields * Avoid copy of strings when initializing model * Escape JSON strings correctly for error messages (#44) * Escape JSON strings correctly * Add test examples with lots of carriage returns * Add result validation * Remove temporary path * Optimize model zoo test execution * Improve reliability of test cases * Generate _pb2.py during the build time * README for integration tests * Pass environment by pointer instead of shared_ptr to executor (#49) * More Integration tests * Remove generated files * Make session private and use a getter instead (#53) * logging_level to log_level for CLI * Single model prediction shortcut * Health endpoint * Integration tests * Rename to onnxruntime server * Build ONNX Server application on Windows (#57) * Gets Boost compiling on Windows * Fix integer conversion and comparison problems * Use size_t in converter_tests instead of int * Fix hosting integration tests on Windows * Removes checks for port because it's an unsigned short * Fixes comparison between signed and unsigned data types * Pip install protobuf and numpy * Missing test data from the rename change * Fix server app path (#58) * Pass shared_ptr by const reference to avoid ref count increase (#59) * Download test model during test setup * Make download into test_util * Rename ci pipeline for onnx runtime server * Support up to 10MiB http request (#61) * Changes minimum request size to 10MB to support all models in ONNX Model Zoo
2019-05-01 01:21:23 +00:00
To send a request to the server, you can use any tool which supports making HTTP requests. Here is an example using `curl`:
```
curl -X POST -d "@predict_request_0.json" -H "Content-Type: application/json" http://127.0.0.1:8001/v1/models/mymodel/versions/3:predict
```
or
```
curl -X POST --data-binary "@predict_request_0.pb" -H "Content-Type: application/octet-stream" -H "Foo: 1234" http://127.0.0.1:8001/v1/models/mymodel/versions/3:predict
```
### Interactive tutorial notebook
A simple Jupyter notebook demonstrating the usage of ONNX Runtime server to host an ONNX model and perform inferencing can be found [here](https://github.com/onnx/tutorials/blob/master/tutorials/OnnxRuntimeServerSSDModel.ipynb).
## GRPC Endpoint
If you prefer using the GRPC endpoint, the protobuf could be found [here](../server/protobuf/prediction_service.proto). You could generate your client and make a GRPC call to it. To learn more about how to generate the client code and call to the server, please refer to [the tutorials of GRPC](https://grpc.io/docs/tutorials/).
Add an HTTP server for hosting of ONNX models (#806) * Simple integration into CMake build system * Adds vcpkg as a submodule and updates build.py to install hosting dependencies * Don't create vcpkg executable if already created * Fixes how CMake finds toolchain file and quick changes to build.py * Removes setting the CMAKE_TOOLCHAIN_FILE in build.py * Adds Boost Beast echo server and Boost program_options * Fixes spacing problem with program_options * Adds Microsoft headers to all the beast server headers * Removes CXX 14 from CMake file * Adds TODO to create configuration class * Run clang-format on main * Better exception handling of program_options * Remove vckpg submodule via ssh * Add vcpkg as https * Adds onnxruntime namespace to call classes * Fixed places where namespaces were anonymous * Adds a TODO to use the logger * Moves all setting namespace shortnames outside of onnxruntime namespace * Add onnxruntime session options to force app to link with it * Set CMAKE_TOOLCHAIN_FILE in build.py * Remove whitespace * Adds initial ONNX Hosting tests (#5) * Add initial test which is failing linking with no main * Adds test_main to get hosting tests working * Deletes useless add_executable line * Merge changes from upstream * Enable CI build in Vienna environment * make hosting_run*.sh executable * Add boost path in unittest * Add boost to TEST_INC_DIR * Add component detection task in ci yaml * Get tests and hosting to compile with re2 (#7) * Add finding boost packages before using it in unit tests * Add predict.proto and build * Ignore unused parameters in generated code * Removes std::regex in favor of re2 (#8) * Removes std::regex in favor of re2 * Adds back find_package in unit tests and fixes regexes * Adds more negative test cases * Adding more protos * Fix google protobuf file path in the cmake file * Ignore unused parameters for pb generated code * Updates onnx submodule (#10) * Remove duplicated lib in link * Follow Google style guide (#11) * Google style names * Adds more * Adds an additional namespace * Fixes header guards to match filepaths * Consume protobuf * Unit Test setup * Json deserialization simple test cases * Split hosting app to lib and exe for testability * Add more cases * Clean up * Add more comments * Update namespace and format the cmake files * Update cmake/external/onnx to checkout 1ec81bc6d49ccae23cd7801515feaadd13082903 * Separate h and cc in http folder * Clean up hosting application cmake file * Enable logging and proper initialize the session * Update const position for GetSession() * Take latest onnx and onnx-tensorrt * Creates configuration header file for program_options (#15) * Sets up PredictRequest callback (#16) * Init version, porting from prototype, e2e works * More executor implementation * Adds function on application startup (#17) * Attempts to pass HostingEnvironment as a shared_ptr * Removes logging and environment from all http classes * Passes http details to OnStart function * Using full protobuf for hosting app build * MLValue2TensorProto * Revert back changes in inference_session.cc * Refactor logger access and predict handler * Create an error handling callback (#19) * Creates error callback * Logs error and returns back as JSON * Catches exceptions in user functions * Refactor executor and add some test cases * Fix build warning * Add onnx as a dependency and in includes to hosting app (#20) * Converter for specific types and more UTs * More unit tests * Update onnx submodule * Fix string data test * Clean up code * Cleanup code * Refactor logging to use unique id per request and take logging level from user (#21) * Removes capturing env by reference in main * Uses uuid for logging ids * Take logging_level as a program argument * Pass logging_level to default_logging_manager * Change name of logger to HostingApp * Log if request id is null * Update GetHttpStatusCode signature * Fix random result issue and camel-case names * Rollback accidentally changed pybin_state.cc * Rollback pybind_state.cc * Generate protobuf status from onnxruntime status * Fix function name in error message * Clean up comments * Support protobuf byte array as input * Refactor predict handler and add unit tests * Add one more test * update cmake/external/onnx * Accept more protobuf MIME types * Update onnx-tensorrt * Add build instruction and usage doc * Address PR comments * Install g++-7 in the Ubuntu 16.04 build image for vcpkg * Fix onnx-tensorrt version * Check return value during initialization * Fix infinite loop when http port is in use (#29) * Simplify Executor.cc by breaking up Run method (#27) * Move request id to Executor constructor * Refactor the logger to respect user verbosity level * Use Arena allocator instead of device * Creates initial executor tests * Merge upstream master (#31) * Remove all possible shared_ptrs (#30) * Changes GetLogger to unique_ptr * Reserve BFloat raw data vector size * Change HostingEnvironment to being passed by lvalue and rvalue references * Change routes to getting passed by const references * Enable full protobuf if building hosting (#32) * Building hosting application no longer needs use_full_protobuf flag * Improve hosting application docs * Move server core into separate folder (#34) * Turn hosting project off by default (#38) * Remove vcpkg as a submodule and download/install Boost from source (#39) * Remove vcpkg * Use CMake script to download and build Boost as part of the project * Remove std::move for const references * Remove error_code.proto * Change wording of executable help description * Better GenerateProtobufStatus description * Remove error_code protobuf from CMake files * Use all outputs if no filter is given * Pass MLValue by const reference in MLValueToTensorProto * Rename variables to argc and argv * Revert "Use all outputs if no filter is given" This reverts commit 7554190ab8e50ba6947648c2f3e2a3d4d9606ce0. * Remove all header guards in favor of #pragma once * Reserve size for output vector and optimize for-loop * Use static libs by default for Boost * Improves documentation for GenerateResponseInJson function * Start Result enum at 0 instead of 1 * Remove g++ from Ubuntu's install.sh * Update cmake files * Give explanation for Result enum type * Remove all program options shortcuts except for -h * Add comments for predict.proto * Fix JSON for error codes * Add notice on hosting application docs that it's in beta * Change HostingEnvironment back to a shared_ptr * Handle empty output_filter field * Fix build break * Refactor unit tests location and groups * First end-to-end test * Add missing log * Missing req id and client req id in error response * Add one test case to validate failed resp header * Add build flag for hosting app end to end tests * Update pipeline setup to run e2e test for CI build * Model Zoo data preparation and tests * Add protobuf tests * Remove mention of needing g++-7 in BUILD.md * Make GetAppLogger const * Make using_raw_data_ match the styling of other fields * Avoid copy of strings when initializing model * Escape JSON strings correctly for error messages (#44) * Escape JSON strings correctly * Add test examples with lots of carriage returns * Add result validation * Remove temporary path * Optimize model zoo test execution * Improve reliability of test cases * Generate _pb2.py during the build time * README for integration tests * Pass environment by pointer instead of shared_ptr to executor (#49) * More Integration tests * Remove generated files * Make session private and use a getter instead (#53) * logging_level to log_level for CLI * Single model prediction shortcut * Health endpoint * Integration tests * Rename to onnxruntime server * Build ONNX Server application on Windows (#57) * Gets Boost compiling on Windows * Fix integer conversion and comparison problems * Use size_t in converter_tests instead of int * Fix hosting integration tests on Windows * Removes checks for port because it's an unsigned short * Fixes comparison between signed and unsigned data types * Pip install protobuf and numpy * Missing test data from the rename change * Fix server app path (#58) * Pass shared_ptr by const reference to avoid ref count increase (#59) * Download test model during test setup * Make download into test_util * Rename ci pipeline for onnx runtime server * Support up to 10MiB http request (#61) * Changes minimum request size to 10MB to support all models in ONNX Model Zoo
2019-05-01 01:21:23 +00:00
## Advanced Topics
### Number of Worker Threads
Add an HTTP server for hosting of ONNX models (#806) * Simple integration into CMake build system * Adds vcpkg as a submodule and updates build.py to install hosting dependencies * Don't create vcpkg executable if already created * Fixes how CMake finds toolchain file and quick changes to build.py * Removes setting the CMAKE_TOOLCHAIN_FILE in build.py * Adds Boost Beast echo server and Boost program_options * Fixes spacing problem with program_options * Adds Microsoft headers to all the beast server headers * Removes CXX 14 from CMake file * Adds TODO to create configuration class * Run clang-format on main * Better exception handling of program_options * Remove vckpg submodule via ssh * Add vcpkg as https * Adds onnxruntime namespace to call classes * Fixed places where namespaces were anonymous * Adds a TODO to use the logger * Moves all setting namespace shortnames outside of onnxruntime namespace * Add onnxruntime session options to force app to link with it * Set CMAKE_TOOLCHAIN_FILE in build.py * Remove whitespace * Adds initial ONNX Hosting tests (#5) * Add initial test which is failing linking with no main * Adds test_main to get hosting tests working * Deletes useless add_executable line * Merge changes from upstream * Enable CI build in Vienna environment * make hosting_run*.sh executable * Add boost path in unittest * Add boost to TEST_INC_DIR * Add component detection task in ci yaml * Get tests and hosting to compile with re2 (#7) * Add finding boost packages before using it in unit tests * Add predict.proto and build * Ignore unused parameters in generated code * Removes std::regex in favor of re2 (#8) * Removes std::regex in favor of re2 * Adds back find_package in unit tests and fixes regexes * Adds more negative test cases * Adding more protos * Fix google protobuf file path in the cmake file * Ignore unused parameters for pb generated code * Updates onnx submodule (#10) * Remove duplicated lib in link * Follow Google style guide (#11) * Google style names * Adds more * Adds an additional namespace * Fixes header guards to match filepaths * Consume protobuf * Unit Test setup * Json deserialization simple test cases * Split hosting app to lib and exe for testability * Add more cases * Clean up * Add more comments * Update namespace and format the cmake files * Update cmake/external/onnx to checkout 1ec81bc6d49ccae23cd7801515feaadd13082903 * Separate h and cc in http folder * Clean up hosting application cmake file * Enable logging and proper initialize the session * Update const position for GetSession() * Take latest onnx and onnx-tensorrt * Creates configuration header file for program_options (#15) * Sets up PredictRequest callback (#16) * Init version, porting from prototype, e2e works * More executor implementation * Adds function on application startup (#17) * Attempts to pass HostingEnvironment as a shared_ptr * Removes logging and environment from all http classes * Passes http details to OnStart function * Using full protobuf for hosting app build * MLValue2TensorProto * Revert back changes in inference_session.cc * Refactor logger access and predict handler * Create an error handling callback (#19) * Creates error callback * Logs error and returns back as JSON * Catches exceptions in user functions * Refactor executor and add some test cases * Fix build warning * Add onnx as a dependency and in includes to hosting app (#20) * Converter for specific types and more UTs * More unit tests * Update onnx submodule * Fix string data test * Clean up code * Cleanup code * Refactor logging to use unique id per request and take logging level from user (#21) * Removes capturing env by reference in main * Uses uuid for logging ids * Take logging_level as a program argument * Pass logging_level to default_logging_manager * Change name of logger to HostingApp * Log if request id is null * Update GetHttpStatusCode signature * Fix random result issue and camel-case names * Rollback accidentally changed pybin_state.cc * Rollback pybind_state.cc * Generate protobuf status from onnxruntime status * Fix function name in error message * Clean up comments * Support protobuf byte array as input * Refactor predict handler and add unit tests * Add one more test * update cmake/external/onnx * Accept more protobuf MIME types * Update onnx-tensorrt * Add build instruction and usage doc * Address PR comments * Install g++-7 in the Ubuntu 16.04 build image for vcpkg * Fix onnx-tensorrt version * Check return value during initialization * Fix infinite loop when http port is in use (#29) * Simplify Executor.cc by breaking up Run method (#27) * Move request id to Executor constructor * Refactor the logger to respect user verbosity level * Use Arena allocator instead of device * Creates initial executor tests * Merge upstream master (#31) * Remove all possible shared_ptrs (#30) * Changes GetLogger to unique_ptr * Reserve BFloat raw data vector size * Change HostingEnvironment to being passed by lvalue and rvalue references * Change routes to getting passed by const references * Enable full protobuf if building hosting (#32) * Building hosting application no longer needs use_full_protobuf flag * Improve hosting application docs * Move server core into separate folder (#34) * Turn hosting project off by default (#38) * Remove vcpkg as a submodule and download/install Boost from source (#39) * Remove vcpkg * Use CMake script to download and build Boost as part of the project * Remove std::move for const references * Remove error_code.proto * Change wording of executable help description * Better GenerateProtobufStatus description * Remove error_code protobuf from CMake files * Use all outputs if no filter is given * Pass MLValue by const reference in MLValueToTensorProto * Rename variables to argc and argv * Revert "Use all outputs if no filter is given" This reverts commit 7554190ab8e50ba6947648c2f3e2a3d4d9606ce0. * Remove all header guards in favor of #pragma once * Reserve size for output vector and optimize for-loop * Use static libs by default for Boost * Improves documentation for GenerateResponseInJson function * Start Result enum at 0 instead of 1 * Remove g++ from Ubuntu's install.sh * Update cmake files * Give explanation for Result enum type * Remove all program options shortcuts except for -h * Add comments for predict.proto * Fix JSON for error codes * Add notice on hosting application docs that it's in beta * Change HostingEnvironment back to a shared_ptr * Handle empty output_filter field * Fix build break * Refactor unit tests location and groups * First end-to-end test * Add missing log * Missing req id and client req id in error response * Add one test case to validate failed resp header * Add build flag for hosting app end to end tests * Update pipeline setup to run e2e test for CI build * Model Zoo data preparation and tests * Add protobuf tests * Remove mention of needing g++-7 in BUILD.md * Make GetAppLogger const * Make using_raw_data_ match the styling of other fields * Avoid copy of strings when initializing model * Escape JSON strings correctly for error messages (#44) * Escape JSON strings correctly * Add test examples with lots of carriage returns * Add result validation * Remove temporary path * Optimize model zoo test execution * Improve reliability of test cases * Generate _pb2.py during the build time * README for integration tests * Pass environment by pointer instead of shared_ptr to executor (#49) * More Integration tests * Remove generated files * Make session private and use a getter instead (#53) * logging_level to log_level for CLI * Single model prediction shortcut * Health endpoint * Integration tests * Rename to onnxruntime server * Build ONNX Server application on Windows (#57) * Gets Boost compiling on Windows * Fix integer conversion and comparison problems * Use size_t in converter_tests instead of int * Fix hosting integration tests on Windows * Removes checks for port because it's an unsigned short * Fixes comparison between signed and unsigned data types * Pip install protobuf and numpy * Missing test data from the rename change * Fix server app path (#58) * Pass shared_ptr by const reference to avoid ref count increase (#59) * Download test model during test setup * Make download into test_util * Rename ci pipeline for onnx runtime server * Support up to 10MiB http request (#61) * Changes minimum request size to 10MB to support all models in ONNX Model Zoo
2019-05-01 01:21:23 +00:00
You can change this to optimize server utilization. The default is the number of CPU cores on the host machine.
### Request ID and Client Request ID
For easy tracking of requests, we provide the following header fields:
* `x-ms-request-id`: will be in the response header, no matter the request result. It will be a GUID/uuid with dash, e.g. `72b68108-18a4-493c-ac75-d0abd82f0a11`. If the request headers contain this field, the value will be ignored.
* `x-ms-client-request-id`: a field for clients to tracking their requests. The content will persist in the response headers.
### rsyslog Support
Add an HTTP server for hosting of ONNX models (#806) * Simple integration into CMake build system * Adds vcpkg as a submodule and updates build.py to install hosting dependencies * Don't create vcpkg executable if already created * Fixes how CMake finds toolchain file and quick changes to build.py * Removes setting the CMAKE_TOOLCHAIN_FILE in build.py * Adds Boost Beast echo server and Boost program_options * Fixes spacing problem with program_options * Adds Microsoft headers to all the beast server headers * Removes CXX 14 from CMake file * Adds TODO to create configuration class * Run clang-format on main * Better exception handling of program_options * Remove vckpg submodule via ssh * Add vcpkg as https * Adds onnxruntime namespace to call classes * Fixed places where namespaces were anonymous * Adds a TODO to use the logger * Moves all setting namespace shortnames outside of onnxruntime namespace * Add onnxruntime session options to force app to link with it * Set CMAKE_TOOLCHAIN_FILE in build.py * Remove whitespace * Adds initial ONNX Hosting tests (#5) * Add initial test which is failing linking with no main * Adds test_main to get hosting tests working * Deletes useless add_executable line * Merge changes from upstream * Enable CI build in Vienna environment * make hosting_run*.sh executable * Add boost path in unittest * Add boost to TEST_INC_DIR * Add component detection task in ci yaml * Get tests and hosting to compile with re2 (#7) * Add finding boost packages before using it in unit tests * Add predict.proto and build * Ignore unused parameters in generated code * Removes std::regex in favor of re2 (#8) * Removes std::regex in favor of re2 * Adds back find_package in unit tests and fixes regexes * Adds more negative test cases * Adding more protos * Fix google protobuf file path in the cmake file * Ignore unused parameters for pb generated code * Updates onnx submodule (#10) * Remove duplicated lib in link * Follow Google style guide (#11) * Google style names * Adds more * Adds an additional namespace * Fixes header guards to match filepaths * Consume protobuf * Unit Test setup * Json deserialization simple test cases * Split hosting app to lib and exe for testability * Add more cases * Clean up * Add more comments * Update namespace and format the cmake files * Update cmake/external/onnx to checkout 1ec81bc6d49ccae23cd7801515feaadd13082903 * Separate h and cc in http folder * Clean up hosting application cmake file * Enable logging and proper initialize the session * Update const position for GetSession() * Take latest onnx and onnx-tensorrt * Creates configuration header file for program_options (#15) * Sets up PredictRequest callback (#16) * Init version, porting from prototype, e2e works * More executor implementation * Adds function on application startup (#17) * Attempts to pass HostingEnvironment as a shared_ptr * Removes logging and environment from all http classes * Passes http details to OnStart function * Using full protobuf for hosting app build * MLValue2TensorProto * Revert back changes in inference_session.cc * Refactor logger access and predict handler * Create an error handling callback (#19) * Creates error callback * Logs error and returns back as JSON * Catches exceptions in user functions * Refactor executor and add some test cases * Fix build warning * Add onnx as a dependency and in includes to hosting app (#20) * Converter for specific types and more UTs * More unit tests * Update onnx submodule * Fix string data test * Clean up code * Cleanup code * Refactor logging to use unique id per request and take logging level from user (#21) * Removes capturing env by reference in main * Uses uuid for logging ids * Take logging_level as a program argument * Pass logging_level to default_logging_manager * Change name of logger to HostingApp * Log if request id is null * Update GetHttpStatusCode signature * Fix random result issue and camel-case names * Rollback accidentally changed pybin_state.cc * Rollback pybind_state.cc * Generate protobuf status from onnxruntime status * Fix function name in error message * Clean up comments * Support protobuf byte array as input * Refactor predict handler and add unit tests * Add one more test * update cmake/external/onnx * Accept more protobuf MIME types * Update onnx-tensorrt * Add build instruction and usage doc * Address PR comments * Install g++-7 in the Ubuntu 16.04 build image for vcpkg * Fix onnx-tensorrt version * Check return value during initialization * Fix infinite loop when http port is in use (#29) * Simplify Executor.cc by breaking up Run method (#27) * Move request id to Executor constructor * Refactor the logger to respect user verbosity level * Use Arena allocator instead of device * Creates initial executor tests * Merge upstream master (#31) * Remove all possible shared_ptrs (#30) * Changes GetLogger to unique_ptr * Reserve BFloat raw data vector size * Change HostingEnvironment to being passed by lvalue and rvalue references * Change routes to getting passed by const references * Enable full protobuf if building hosting (#32) * Building hosting application no longer needs use_full_protobuf flag * Improve hosting application docs * Move server core into separate folder (#34) * Turn hosting project off by default (#38) * Remove vcpkg as a submodule and download/install Boost from source (#39) * Remove vcpkg * Use CMake script to download and build Boost as part of the project * Remove std::move for const references * Remove error_code.proto * Change wording of executable help description * Better GenerateProtobufStatus description * Remove error_code protobuf from CMake files * Use all outputs if no filter is given * Pass MLValue by const reference in MLValueToTensorProto * Rename variables to argc and argv * Revert "Use all outputs if no filter is given" This reverts commit 7554190ab8e50ba6947648c2f3e2a3d4d9606ce0. * Remove all header guards in favor of #pragma once * Reserve size for output vector and optimize for-loop * Use static libs by default for Boost * Improves documentation for GenerateResponseInJson function * Start Result enum at 0 instead of 1 * Remove g++ from Ubuntu's install.sh * Update cmake files * Give explanation for Result enum type * Remove all program options shortcuts except for -h * Add comments for predict.proto * Fix JSON for error codes * Add notice on hosting application docs that it's in beta * Change HostingEnvironment back to a shared_ptr * Handle empty output_filter field * Fix build break * Refactor unit tests location and groups * First end-to-end test * Add missing log * Missing req id and client req id in error response * Add one test case to validate failed resp header * Add build flag for hosting app end to end tests * Update pipeline setup to run e2e test for CI build * Model Zoo data preparation and tests * Add protobuf tests * Remove mention of needing g++-7 in BUILD.md * Make GetAppLogger const * Make using_raw_data_ match the styling of other fields * Avoid copy of strings when initializing model * Escape JSON strings correctly for error messages (#44) * Escape JSON strings correctly * Add test examples with lots of carriage returns * Add result validation * Remove temporary path * Optimize model zoo test execution * Improve reliability of test cases * Generate _pb2.py during the build time * README for integration tests * Pass environment by pointer instead of shared_ptr to executor (#49) * More Integration tests * Remove generated files * Make session private and use a getter instead (#53) * logging_level to log_level for CLI * Single model prediction shortcut * Health endpoint * Integration tests * Rename to onnxruntime server * Build ONNX Server application on Windows (#57) * Gets Boost compiling on Windows * Fix integer conversion and comparison problems * Use size_t in converter_tests instead of int * Fix hosting integration tests on Windows * Removes checks for port because it's an unsigned short * Fixes comparison between signed and unsigned data types * Pip install protobuf and numpy * Missing test data from the rename change * Fix server app path (#58) * Pass shared_ptr by const reference to avoid ref count increase (#59) * Download test model during test setup * Make download into test_util * Rename ci pipeline for onnx runtime server * Support up to 10MiB http request (#61) * Changes minimum request size to 10MB to support all models in ONNX Model Zoo
2019-05-01 01:21:23 +00:00
If you prefer using an ONNX Runtime Server with [rsyslog](https://www.rsyslog.com/) support([build instruction](https://www.onnxruntime.ai/docs/how-to/build.html#build-onnx-runtime-server-on-linux)), you should be able to see the log in `/var/log/syslog` after the ONNX Runtime Server runs. For detail about how to use rsyslog, please reference [here](https://www.rsyslog.com/category/guides-for-rsyslog/).
Add an HTTP server for hosting of ONNX models (#806) * Simple integration into CMake build system * Adds vcpkg as a submodule and updates build.py to install hosting dependencies * Don't create vcpkg executable if already created * Fixes how CMake finds toolchain file and quick changes to build.py * Removes setting the CMAKE_TOOLCHAIN_FILE in build.py * Adds Boost Beast echo server and Boost program_options * Fixes spacing problem with program_options * Adds Microsoft headers to all the beast server headers * Removes CXX 14 from CMake file * Adds TODO to create configuration class * Run clang-format on main * Better exception handling of program_options * Remove vckpg submodule via ssh * Add vcpkg as https * Adds onnxruntime namespace to call classes * Fixed places where namespaces were anonymous * Adds a TODO to use the logger * Moves all setting namespace shortnames outside of onnxruntime namespace * Add onnxruntime session options to force app to link with it * Set CMAKE_TOOLCHAIN_FILE in build.py * Remove whitespace * Adds initial ONNX Hosting tests (#5) * Add initial test which is failing linking with no main * Adds test_main to get hosting tests working * Deletes useless add_executable line * Merge changes from upstream * Enable CI build in Vienna environment * make hosting_run*.sh executable * Add boost path in unittest * Add boost to TEST_INC_DIR * Add component detection task in ci yaml * Get tests and hosting to compile with re2 (#7) * Add finding boost packages before using it in unit tests * Add predict.proto and build * Ignore unused parameters in generated code * Removes std::regex in favor of re2 (#8) * Removes std::regex in favor of re2 * Adds back find_package in unit tests and fixes regexes * Adds more negative test cases * Adding more protos * Fix google protobuf file path in the cmake file * Ignore unused parameters for pb generated code * Updates onnx submodule (#10) * Remove duplicated lib in link * Follow Google style guide (#11) * Google style names * Adds more * Adds an additional namespace * Fixes header guards to match filepaths * Consume protobuf * Unit Test setup * Json deserialization simple test cases * Split hosting app to lib and exe for testability * Add more cases * Clean up * Add more comments * Update namespace and format the cmake files * Update cmake/external/onnx to checkout 1ec81bc6d49ccae23cd7801515feaadd13082903 * Separate h and cc in http folder * Clean up hosting application cmake file * Enable logging and proper initialize the session * Update const position for GetSession() * Take latest onnx and onnx-tensorrt * Creates configuration header file for program_options (#15) * Sets up PredictRequest callback (#16) * Init version, porting from prototype, e2e works * More executor implementation * Adds function on application startup (#17) * Attempts to pass HostingEnvironment as a shared_ptr * Removes logging and environment from all http classes * Passes http details to OnStart function * Using full protobuf for hosting app build * MLValue2TensorProto * Revert back changes in inference_session.cc * Refactor logger access and predict handler * Create an error handling callback (#19) * Creates error callback * Logs error and returns back as JSON * Catches exceptions in user functions * Refactor executor and add some test cases * Fix build warning * Add onnx as a dependency and in includes to hosting app (#20) * Converter for specific types and more UTs * More unit tests * Update onnx submodule * Fix string data test * Clean up code * Cleanup code * Refactor logging to use unique id per request and take logging level from user (#21) * Removes capturing env by reference in main * Uses uuid for logging ids * Take logging_level as a program argument * Pass logging_level to default_logging_manager * Change name of logger to HostingApp * Log if request id is null * Update GetHttpStatusCode signature * Fix random result issue and camel-case names * Rollback accidentally changed pybin_state.cc * Rollback pybind_state.cc * Generate protobuf status from onnxruntime status * Fix function name in error message * Clean up comments * Support protobuf byte array as input * Refactor predict handler and add unit tests * Add one more test * update cmake/external/onnx * Accept more protobuf MIME types * Update onnx-tensorrt * Add build instruction and usage doc * Address PR comments * Install g++-7 in the Ubuntu 16.04 build image for vcpkg * Fix onnx-tensorrt version * Check return value during initialization * Fix infinite loop when http port is in use (#29) * Simplify Executor.cc by breaking up Run method (#27) * Move request id to Executor constructor * Refactor the logger to respect user verbosity level * Use Arena allocator instead of device * Creates initial executor tests * Merge upstream master (#31) * Remove all possible shared_ptrs (#30) * Changes GetLogger to unique_ptr * Reserve BFloat raw data vector size * Change HostingEnvironment to being passed by lvalue and rvalue references * Change routes to getting passed by const references * Enable full protobuf if building hosting (#32) * Building hosting application no longer needs use_full_protobuf flag * Improve hosting application docs * Move server core into separate folder (#34) * Turn hosting project off by default (#38) * Remove vcpkg as a submodule and download/install Boost from source (#39) * Remove vcpkg * Use CMake script to download and build Boost as part of the project * Remove std::move for const references * Remove error_code.proto * Change wording of executable help description * Better GenerateProtobufStatus description * Remove error_code protobuf from CMake files * Use all outputs if no filter is given * Pass MLValue by const reference in MLValueToTensorProto * Rename variables to argc and argv * Revert "Use all outputs if no filter is given" This reverts commit 7554190ab8e50ba6947648c2f3e2a3d4d9606ce0. * Remove all header guards in favor of #pragma once * Reserve size for output vector and optimize for-loop * Use static libs by default for Boost * Improves documentation for GenerateResponseInJson function * Start Result enum at 0 instead of 1 * Remove g++ from Ubuntu's install.sh * Update cmake files * Give explanation for Result enum type * Remove all program options shortcuts except for -h * Add comments for predict.proto * Fix JSON for error codes * Add notice on hosting application docs that it's in beta * Change HostingEnvironment back to a shared_ptr * Handle empty output_filter field * Fix build break * Refactor unit tests location and groups * First end-to-end test * Add missing log * Missing req id and client req id in error response * Add one test case to validate failed resp header * Add build flag for hosting app end to end tests * Update pipeline setup to run e2e test for CI build * Model Zoo data preparation and tests * Add protobuf tests * Remove mention of needing g++-7 in BUILD.md * Make GetAppLogger const * Make using_raw_data_ match the styling of other fields * Avoid copy of strings when initializing model * Escape JSON strings correctly for error messages (#44) * Escape JSON strings correctly * Add test examples with lots of carriage returns * Add result validation * Remove temporary path * Optimize model zoo test execution * Improve reliability of test cases * Generate _pb2.py during the build time * README for integration tests * Pass environment by pointer instead of shared_ptr to executor (#49) * More Integration tests * Remove generated files * Make session private and use a getter instead (#53) * logging_level to log_level for CLI * Single model prediction shortcut * Health endpoint * Integration tests * Rename to onnxruntime server * Build ONNX Server application on Windows (#57) * Gets Boost compiling on Windows * Fix integer conversion and comparison problems * Use size_t in converter_tests instead of int * Fix hosting integration tests on Windows * Removes checks for port because it's an unsigned short * Fixes comparison between signed and unsigned data types * Pip install protobuf and numpy * Missing test data from the rename change * Fix server app path (#58) * Pass shared_ptr by const reference to avoid ref count increase (#59) * Download test model during test setup * Make download into test_util * Rename ci pipeline for onnx runtime server * Support up to 10MiB http request (#61) * Changes minimum request size to 10MB to support all models in ONNX Model Zoo
2019-05-01 01:21:23 +00:00
## Report Issues
Add an HTTP server for hosting of ONNX models (#806) * Simple integration into CMake build system * Adds vcpkg as a submodule and updates build.py to install hosting dependencies * Don't create vcpkg executable if already created * Fixes how CMake finds toolchain file and quick changes to build.py * Removes setting the CMAKE_TOOLCHAIN_FILE in build.py * Adds Boost Beast echo server and Boost program_options * Fixes spacing problem with program_options * Adds Microsoft headers to all the beast server headers * Removes CXX 14 from CMake file * Adds TODO to create configuration class * Run clang-format on main * Better exception handling of program_options * Remove vckpg submodule via ssh * Add vcpkg as https * Adds onnxruntime namespace to call classes * Fixed places where namespaces were anonymous * Adds a TODO to use the logger * Moves all setting namespace shortnames outside of onnxruntime namespace * Add onnxruntime session options to force app to link with it * Set CMAKE_TOOLCHAIN_FILE in build.py * Remove whitespace * Adds initial ONNX Hosting tests (#5) * Add initial test which is failing linking with no main * Adds test_main to get hosting tests working * Deletes useless add_executable line * Merge changes from upstream * Enable CI build in Vienna environment * make hosting_run*.sh executable * Add boost path in unittest * Add boost to TEST_INC_DIR * Add component detection task in ci yaml * Get tests and hosting to compile with re2 (#7) * Add finding boost packages before using it in unit tests * Add predict.proto and build * Ignore unused parameters in generated code * Removes std::regex in favor of re2 (#8) * Removes std::regex in favor of re2 * Adds back find_package in unit tests and fixes regexes * Adds more negative test cases * Adding more protos * Fix google protobuf file path in the cmake file * Ignore unused parameters for pb generated code * Updates onnx submodule (#10) * Remove duplicated lib in link * Follow Google style guide (#11) * Google style names * Adds more * Adds an additional namespace * Fixes header guards to match filepaths * Consume protobuf * Unit Test setup * Json deserialization simple test cases * Split hosting app to lib and exe for testability * Add more cases * Clean up * Add more comments * Update namespace and format the cmake files * Update cmake/external/onnx to checkout 1ec81bc6d49ccae23cd7801515feaadd13082903 * Separate h and cc in http folder * Clean up hosting application cmake file * Enable logging and proper initialize the session * Update const position for GetSession() * Take latest onnx and onnx-tensorrt * Creates configuration header file for program_options (#15) * Sets up PredictRequest callback (#16) * Init version, porting from prototype, e2e works * More executor implementation * Adds function on application startup (#17) * Attempts to pass HostingEnvironment as a shared_ptr * Removes logging and environment from all http classes * Passes http details to OnStart function * Using full protobuf for hosting app build * MLValue2TensorProto * Revert back changes in inference_session.cc * Refactor logger access and predict handler * Create an error handling callback (#19) * Creates error callback * Logs error and returns back as JSON * Catches exceptions in user functions * Refactor executor and add some test cases * Fix build warning * Add onnx as a dependency and in includes to hosting app (#20) * Converter for specific types and more UTs * More unit tests * Update onnx submodule * Fix string data test * Clean up code * Cleanup code * Refactor logging to use unique id per request and take logging level from user (#21) * Removes capturing env by reference in main * Uses uuid for logging ids * Take logging_level as a program argument * Pass logging_level to default_logging_manager * Change name of logger to HostingApp * Log if request id is null * Update GetHttpStatusCode signature * Fix random result issue and camel-case names * Rollback accidentally changed pybin_state.cc * Rollback pybind_state.cc * Generate protobuf status from onnxruntime status * Fix function name in error message * Clean up comments * Support protobuf byte array as input * Refactor predict handler and add unit tests * Add one more test * update cmake/external/onnx * Accept more protobuf MIME types * Update onnx-tensorrt * Add build instruction and usage doc * Address PR comments * Install g++-7 in the Ubuntu 16.04 build image for vcpkg * Fix onnx-tensorrt version * Check return value during initialization * Fix infinite loop when http port is in use (#29) * Simplify Executor.cc by breaking up Run method (#27) * Move request id to Executor constructor * Refactor the logger to respect user verbosity level * Use Arena allocator instead of device * Creates initial executor tests * Merge upstream master (#31) * Remove all possible shared_ptrs (#30) * Changes GetLogger to unique_ptr * Reserve BFloat raw data vector size * Change HostingEnvironment to being passed by lvalue and rvalue references * Change routes to getting passed by const references * Enable full protobuf if building hosting (#32) * Building hosting application no longer needs use_full_protobuf flag * Improve hosting application docs * Move server core into separate folder (#34) * Turn hosting project off by default (#38) * Remove vcpkg as a submodule and download/install Boost from source (#39) * Remove vcpkg * Use CMake script to download and build Boost as part of the project * Remove std::move for const references * Remove error_code.proto * Change wording of executable help description * Better GenerateProtobufStatus description * Remove error_code protobuf from CMake files * Use all outputs if no filter is given * Pass MLValue by const reference in MLValueToTensorProto * Rename variables to argc and argv * Revert "Use all outputs if no filter is given" This reverts commit 7554190ab8e50ba6947648c2f3e2a3d4d9606ce0. * Remove all header guards in favor of #pragma once * Reserve size for output vector and optimize for-loop * Use static libs by default for Boost * Improves documentation for GenerateResponseInJson function * Start Result enum at 0 instead of 1 * Remove g++ from Ubuntu's install.sh * Update cmake files * Give explanation for Result enum type * Remove all program options shortcuts except for -h * Add comments for predict.proto * Fix JSON for error codes * Add notice on hosting application docs that it's in beta * Change HostingEnvironment back to a shared_ptr * Handle empty output_filter field * Fix build break * Refactor unit tests location and groups * First end-to-end test * Add missing log * Missing req id and client req id in error response * Add one test case to validate failed resp header * Add build flag for hosting app end to end tests * Update pipeline setup to run e2e test for CI build * Model Zoo data preparation and tests * Add protobuf tests * Remove mention of needing g++-7 in BUILD.md * Make GetAppLogger const * Make using_raw_data_ match the styling of other fields * Avoid copy of strings when initializing model * Escape JSON strings correctly for error messages (#44) * Escape JSON strings correctly * Add test examples with lots of carriage returns * Add result validation * Remove temporary path * Optimize model zoo test execution * Improve reliability of test cases * Generate _pb2.py during the build time * README for integration tests * Pass environment by pointer instead of shared_ptr to executor (#49) * More Integration tests * Remove generated files * Make session private and use a getter instead (#53) * logging_level to log_level for CLI * Single model prediction shortcut * Health endpoint * Integration tests * Rename to onnxruntime server * Build ONNX Server application on Windows (#57) * Gets Boost compiling on Windows * Fix integer conversion and comparison problems * Use size_t in converter_tests instead of int * Fix hosting integration tests on Windows * Removes checks for port because it's an unsigned short * Fixes comparison between signed and unsigned data types * Pip install protobuf and numpy * Missing test data from the rename change * Fix server app path (#58) * Pass shared_ptr by const reference to avoid ref count increase (#59) * Download test model during test setup * Make download into test_util * Rename ci pipeline for onnx runtime server * Support up to 10MiB http request (#61) * Changes minimum request size to 10MB to support all models in ONNX Model Zoo
2019-05-01 01:21:23 +00:00
If you see any issues or want to ask questions about the server, please feel free to do so in this repo with the version and commit id from the command line.
Add an HTTP server for hosting of ONNX models (#806) * Simple integration into CMake build system * Adds vcpkg as a submodule and updates build.py to install hosting dependencies * Don't create vcpkg executable if already created * Fixes how CMake finds toolchain file and quick changes to build.py * Removes setting the CMAKE_TOOLCHAIN_FILE in build.py * Adds Boost Beast echo server and Boost program_options * Fixes spacing problem with program_options * Adds Microsoft headers to all the beast server headers * Removes CXX 14 from CMake file * Adds TODO to create configuration class * Run clang-format on main * Better exception handling of program_options * Remove vckpg submodule via ssh * Add vcpkg as https * Adds onnxruntime namespace to call classes * Fixed places where namespaces were anonymous * Adds a TODO to use the logger * Moves all setting namespace shortnames outside of onnxruntime namespace * Add onnxruntime session options to force app to link with it * Set CMAKE_TOOLCHAIN_FILE in build.py * Remove whitespace * Adds initial ONNX Hosting tests (#5) * Add initial test which is failing linking with no main * Adds test_main to get hosting tests working * Deletes useless add_executable line * Merge changes from upstream * Enable CI build in Vienna environment * make hosting_run*.sh executable * Add boost path in unittest * Add boost to TEST_INC_DIR * Add component detection task in ci yaml * Get tests and hosting to compile with re2 (#7) * Add finding boost packages before using it in unit tests * Add predict.proto and build * Ignore unused parameters in generated code * Removes std::regex in favor of re2 (#8) * Removes std::regex in favor of re2 * Adds back find_package in unit tests and fixes regexes * Adds more negative test cases * Adding more protos * Fix google protobuf file path in the cmake file * Ignore unused parameters for pb generated code * Updates onnx submodule (#10) * Remove duplicated lib in link * Follow Google style guide (#11) * Google style names * Adds more * Adds an additional namespace * Fixes header guards to match filepaths * Consume protobuf * Unit Test setup * Json deserialization simple test cases * Split hosting app to lib and exe for testability * Add more cases * Clean up * Add more comments * Update namespace and format the cmake files * Update cmake/external/onnx to checkout 1ec81bc6d49ccae23cd7801515feaadd13082903 * Separate h and cc in http folder * Clean up hosting application cmake file * Enable logging and proper initialize the session * Update const position for GetSession() * Take latest onnx and onnx-tensorrt * Creates configuration header file for program_options (#15) * Sets up PredictRequest callback (#16) * Init version, porting from prototype, e2e works * More executor implementation * Adds function on application startup (#17) * Attempts to pass HostingEnvironment as a shared_ptr * Removes logging and environment from all http classes * Passes http details to OnStart function * Using full protobuf for hosting app build * MLValue2TensorProto * Revert back changes in inference_session.cc * Refactor logger access and predict handler * Create an error handling callback (#19) * Creates error callback * Logs error and returns back as JSON * Catches exceptions in user functions * Refactor executor and add some test cases * Fix build warning * Add onnx as a dependency and in includes to hosting app (#20) * Converter for specific types and more UTs * More unit tests * Update onnx submodule * Fix string data test * Clean up code * Cleanup code * Refactor logging to use unique id per request and take logging level from user (#21) * Removes capturing env by reference in main * Uses uuid for logging ids * Take logging_level as a program argument * Pass logging_level to default_logging_manager * Change name of logger to HostingApp * Log if request id is null * Update GetHttpStatusCode signature * Fix random result issue and camel-case names * Rollback accidentally changed pybin_state.cc * Rollback pybind_state.cc * Generate protobuf status from onnxruntime status * Fix function name in error message * Clean up comments * Support protobuf byte array as input * Refactor predict handler and add unit tests * Add one more test * update cmake/external/onnx * Accept more protobuf MIME types * Update onnx-tensorrt * Add build instruction and usage doc * Address PR comments * Install g++-7 in the Ubuntu 16.04 build image for vcpkg * Fix onnx-tensorrt version * Check return value during initialization * Fix infinite loop when http port is in use (#29) * Simplify Executor.cc by breaking up Run method (#27) * Move request id to Executor constructor * Refactor the logger to respect user verbosity level * Use Arena allocator instead of device * Creates initial executor tests * Merge upstream master (#31) * Remove all possible shared_ptrs (#30) * Changes GetLogger to unique_ptr * Reserve BFloat raw data vector size * Change HostingEnvironment to being passed by lvalue and rvalue references * Change routes to getting passed by const references * Enable full protobuf if building hosting (#32) * Building hosting application no longer needs use_full_protobuf flag * Improve hosting application docs * Move server core into separate folder (#34) * Turn hosting project off by default (#38) * Remove vcpkg as a submodule and download/install Boost from source (#39) * Remove vcpkg * Use CMake script to download and build Boost as part of the project * Remove std::move for const references * Remove error_code.proto * Change wording of executable help description * Better GenerateProtobufStatus description * Remove error_code protobuf from CMake files * Use all outputs if no filter is given * Pass MLValue by const reference in MLValueToTensorProto * Rename variables to argc and argv * Revert "Use all outputs if no filter is given" This reverts commit 7554190ab8e50ba6947648c2f3e2a3d4d9606ce0. * Remove all header guards in favor of #pragma once * Reserve size for output vector and optimize for-loop * Use static libs by default for Boost * Improves documentation for GenerateResponseInJson function * Start Result enum at 0 instead of 1 * Remove g++ from Ubuntu's install.sh * Update cmake files * Give explanation for Result enum type * Remove all program options shortcuts except for -h * Add comments for predict.proto * Fix JSON for error codes * Add notice on hosting application docs that it's in beta * Change HostingEnvironment back to a shared_ptr * Handle empty output_filter field * Fix build break * Refactor unit tests location and groups * First end-to-end test * Add missing log * Missing req id and client req id in error response * Add one test case to validate failed resp header * Add build flag for hosting app end to end tests * Update pipeline setup to run e2e test for CI build * Model Zoo data preparation and tests * Add protobuf tests * Remove mention of needing g++-7 in BUILD.md * Make GetAppLogger const * Make using_raw_data_ match the styling of other fields * Avoid copy of strings when initializing model * Escape JSON strings correctly for error messages (#44) * Escape JSON strings correctly * Add test examples with lots of carriage returns * Add result validation * Remove temporary path * Optimize model zoo test execution * Improve reliability of test cases * Generate _pb2.py during the build time * README for integration tests * Pass environment by pointer instead of shared_ptr to executor (#49) * More Integration tests * Remove generated files * Make session private and use a getter instead (#53) * logging_level to log_level for CLI * Single model prediction shortcut * Health endpoint * Integration tests * Rename to onnxruntime server * Build ONNX Server application on Windows (#57) * Gets Boost compiling on Windows * Fix integer conversion and comparison problems * Use size_t in converter_tests instead of int * Fix hosting integration tests on Windows * Removes checks for port because it's an unsigned short * Fixes comparison between signed and unsigned data types * Pip install protobuf and numpy * Missing test data from the rename change * Fix server app path (#58) * Pass shared_ptr by const reference to avoid ref count increase (#59) * Download test model during test setup * Make download into test_util * Rename ci pipeline for onnx runtime server * Support up to 10MiB http request (#61) * Changes minimum request size to 10MB to support all models in ONNX Model Zoo
2019-05-01 01:21:23 +00:00