onnxruntime/onnxruntime/core/platform/telemetry.h
ivberg 74028e4bdc
Fully dynamic ETW controlled logging for ORT and QNN logs (#20537)
### Description
Windows - Fully dynamic ETW controlled logging for ORT and QNN logs

The logging support is documented here:
- https://onnxruntime.ai/docs/performance/tune-performance/logging_tracing.html#tracing---windows
- https://onnxruntime.ai/docs/performance/tune-performance/profiling-tools.html#tracelogging-etw-windows-profiling

Also add support for logging ORT SessionCreation on ETW CaptureState

### Motivation and Context
The previous ETW support only worked if you enabled ETW before the
session started. There can commonly be long-lived AI inference processes
that need to be traced & debugged. This enables logging fully on the
fly.

Without this support, a dev would have to kill a process or stop a
service in order to get tracing. We had to do this for a recent issue
with QNN; getting the logs was painful, and it ruined the repro.

### Testing
I tested the following cases:
- Leaving the default ORT run
- Enabling ETW prior to start and leaving it running for the entire session + inferences, then stopping
- Starting an ORT session + inferences, then enabling and stopping ETW
  - Start ORT session w/ long-running inferences
  - wpr -start [ort.wprp](e6228575e4/ort.wprp (L4)) -start [etw_provider.wprp](e6228575e4/onnxruntime/test/platform/windows/logging/etw_provider.wprp)
  - Wait a few seconds
  - wpr -stop ort.etl
  - Inferences are still running
- Verify ONNXRuntimeLogEvent provider events are present and the new SessionCreation_CaptureState event appears under the Microsoft.ML.ONNXRuntime provider
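
The late-attach case above can be modeled in a few lines. What follows is a minimal sketch, not the ORT implementation: `DynamicProviderSketch`, `OnControl`, `OnCaptureState`, and `TraceLateAttach` are hypothetical names, no Windows API is called, and the event strings are made up for illustration. It only shows the idea that the enabled flag is checked at emit time and that a capture-state request replays sessions created before tracing started.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

namespace etw_sketch {

// Hypothetical stand-in for a trace provider. It is NOT ORT code and calls no
// Windows API; it only models "a controller may attach or detach at any time".
class DynamicProviderSketch {
 public:
  // Models the controller's enable/disable notification (e.g. wpr -start / -stop).
  void OnControl(bool enabled, unsigned char level, uint64_t keyword) {
    level_.store(level);
    keyword_.store(keyword);
    enabled_.store(enabled);
  }

  // Models a capture-state request: replay sessions created before tracing began.
  void OnCaptureState() {
    for (uint32_t id : live_sessions_) {
      Emit("SessionCreation_CaptureState:" + std::to_string(id));
    }
  }

  void CreateSession(uint32_t id) {
    live_sessions_.push_back(id);
    Emit("SessionCreation:" + std::to_string(id));
  }

  void RunInference(uint32_t id) { Emit("Inference:" + std::to_string(id)); }

  bool IsEnabled() const { return enabled_.load(); }
  unsigned char Level() const { return level_.load(); }
  uint64_t Keyword() const { return keyword_.load(); }
  const std::vector<std::string>& Events() const { return events_; }

 private:
  // Every event checks the flag at emit time, so nothing is decided at startup.
  void Emit(std::string event) {
    if (IsEnabled()) events_.push_back(std::move(event));
  }

  std::atomic<bool> enabled_{false};
  std::atomic<unsigned char> level_{0};
  std::atomic<uint64_t> keyword_{0};
  std::vector<uint32_t> live_sessions_;  // single-threaded sketch: no locking
  std::vector<std::string> events_;
};

// Walks through the late-attach test case: session first, tracing second.
inline std::vector<std::string> TraceLateAttach() {
  DynamicProviderSketch provider;
  provider.CreateSession(1);         // tracing is off: event is dropped
  provider.RunInference(1);          // dropped as well
  provider.OnControl(true, 4, 0);    // controller attaches mid-run
  provider.OnCaptureState();         // session 1 is replayed
  provider.RunInference(1);          // now recorded
  provider.OnControl(false, 0, 0);   // controller detaches
  provider.RunInference(1);          // dropped, inferences keep running
  assert(!provider.IsEnabled());
  return provider.Events();
}

}  // namespace etw_sketch
```

Calling `etw_sketch::TraceLateAttach()` returns only the events emitted while the controller was attached, which mirrors what the test steps check for in the captured trace.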

Related:
#18882
#19428
2024-06-06 21:11:14 -07:00

76 lines
2.6 KiB
C++

// Copyright (c) Microsoft Corporation. All rights reserved.
// Licensed under the MIT License.
#pragma once
#include <cstdint>
#include <string>
#include <vector>
#include <unordered_map>
#include "core/common/status.h"
#include "core/common/common.h"
struct _LUID;
using LUID = _LUID;
namespace onnxruntime {
/**
 * An interface used by the onnxruntime implementation to
 * access operating system functionality for telemetry.
 *
 * See env.h and the Env object, which is the activation factory
 * for telemetry instances.
 *
 * All Telemetry implementations are safe for concurrent access from
 * multiple threads without any external synchronization.
 */
class Telemetry {
 public:
  // don't create these, use Env::GetTelemetryProvider() instead
  // this constructor is made public so that other platform Env providers can
  // use this base class as a "stub" implementation
  Telemetry() = default;
  virtual ~Telemetry() = default;

  virtual void EnableTelemetryEvents() const;
  virtual void DisableTelemetryEvents() const;
  virtual void SetLanguageProjection(uint32_t projection) const;

  virtual bool IsEnabled() const;

  // Get the current logging level
  virtual unsigned char Level() const;

  // Get the current keyword
  virtual uint64_t Keyword() const;

  virtual void LogProcessInfo() const;

  virtual void LogSessionCreationStart() const;

  virtual void LogEvaluationStop() const;

  virtual void LogEvaluationStart() const;

  virtual void LogSessionCreation(uint32_t session_id, int64_t ir_version, const std::string& model_producer_name,
                                  const std::string& model_producer_version, const std::string& model_domain,
                                  const std::unordered_map<std::string, int>& domain_to_version_map,
                                  const std::string& model_graph_name,
                                  const std::unordered_map<std::string, std::string>& model_metadata,
                                  const std::string& loadedFrom, const std::vector<std::string>& execution_provider_ids,
                                  bool use_fp16, bool captureState) const;

  virtual void LogRuntimeError(uint32_t session_id, const common::Status& status, const char* file,
                               const char* function, uint32_t line) const;

  virtual void LogRuntimePerf(uint32_t session_id, uint32_t total_runs_since_last,
                              int64_t total_run_duration_since_last) const;

  virtual void LogExecutionProviderEvent(LUID* adapterLuid) const;

 private:
  ORT_DISALLOW_COPY_ASSIGNMENT_AND_MOVE(Telemetry);
};
} // namespace onnxruntime
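
The comment on the constructor describes the base class as a "stub" that platform providers can use directly or derive from. Here is a minimal, self-contained sketch of that pattern. Everything in `namespace sketch` is hypothetical: the interface is trimmed to five methods, and the do-nothing defaults (`false`, `0`, empty bodies) are an assumption, since this header only declares the methods and does not show their definitions. Because the virtual methods are `const`, an override that records anything needs `mutable` state.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

namespace sketch {

// Trimmed, hypothetical copy of the interface shape. The default bodies below
// are assumed for illustration; the real definitions are not in the header.
class Telemetry {
 public:
  Telemetry() = default;
  virtual ~Telemetry() = default;

  virtual bool IsEnabled() const { return false; }
  virtual unsigned char Level() const { return 0; }
  virtual uint64_t Keyword() const { return 0; }
  virtual void LogEvaluationStart() const {}
  virtual void LogEvaluationStop() const {}
};

// Hypothetical platform implementation that counts evaluation events.
class CountingTelemetry : public Telemetry {
 public:
  bool IsEnabled() const override { return true; }
  void LogEvaluationStart() const override { ++starts_; }
  void LogEvaluationStop() const override { ++stops_; }

  int Starts() const { return starts_.load(); }
  int Stops() const { return stops_.load(); }

 private:
  // mutable + atomic: the overrides are const and may be called concurrently.
  mutable std::atomic<int> starts_{0};
  mutable std::atomic<int> stops_{0};
};

// Caller code is written against the base interface only.
inline void RunOnce(const Telemetry& telemetry) {
  telemetry.LogEvaluationStart();
  telemetry.LogEvaluationStop();
}

// Runs n evaluations against the counting implementation and returns the
// number of start events it saw.
inline int CountedRuns(int n) {
  CountingTelemetry telemetry;
  for (int i = 0; i < n; ++i) RunOnce(telemetry);
  assert(telemetry.Starts() == telemetry.Stops());
  return telemetry.Starts();
}

}  // namespace sketch
```

Passing a plain `sketch::Telemetry` to `RunOnce` is a no-op, so callers never need to check which platform they are on before logging.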