pytorch/tools/stats
Catherine Lee 235f7e06f4 [CI] upload_metrics function to upload to s3 instead of dynamo (#136799)
* Upload_metrics function to upload to ossci-raw-job-status bucket instead of dynamo
* Moves all added metrics to a field called "info" so ingesting into database table with a strict schema is easier
* Removes the dynamo_key field since it is no longer needed
* Removes the concept of reserved metrics, since they cannot be overwritten by user added metrics anymore
* Moves s3 resource initialization behind a function so import is faster
---
Tested by emitting a metric during run_test and seeing that documents got added to s3
Pull Request resolved: https://github.com/pytorch/pytorch/pull/136799
Approved by: https://github.com/ZainRizvi
2024-10-02 23:19:28 +00:00
..
__init__.py
check_disabled_tests.py
export_test_times.py
import_test_stats.py
monitor.py
README.md
test_dashboard.py
upload_artifacts.py
upload_dynamo_perf_stats.py
upload_external_contrib_stats.py
upload_metrics.py [CI] upload_metrics function to upload to s3 instead of dynamo (#136799) 2024-10-02 23:19:28 +00:00
upload_sccache_stats.py
upload_stats_lib.py [CI] upload_metrics function to upload to s3 instead of dynamo (#136799) 2024-10-02 23:19:28 +00:00
upload_test_stat_aggregates.py
upload_test_stats.py
upload_test_stats_intermediate.py

PyTorch CI Stats

We track various stats about each CI job.

  1. Jobs upload their artifacts to an intermediate data store (either GitHub Actions artifacts or S3, depending on what permissions the job has). Example: a9f6a35a33/.github/workflows/_linux-build.yml (L144-L151)
  2. When a workflow completes, a workflow_run event triggers upload-test-stats.yml.
  3. upload-test-stats downloads the raw stats from the intermediate data store and uploads them as JSON to Rockset, our metrics backend.
graph LR
    J1[Job with AWS creds<br>e.g. linux, win] --raw stats--> S3[(AWS S3)]
    J2[Job w/o AWS creds<br>e.g. mac] --raw stats--> GHA[(GH artifacts)]

    S3 --> uts[upload-test-stats.yml]
    GHA --> uts

    uts --json--> R[(Rockset)]

Why this weird indirection? Because writing to Rockset requires special permissions which, for security reasons, we do not want to give to pull request CI. Instead, we implemented GitHub's recommended pattern for cases like this.

For more details about what stats we export, check out upload-test-stats.yml