
PyTorch CI Stats

We track various stats about each CI job.

  1. Each job uploads its raw stats as artifacts to an intermediate data store: S3 if the job has AWS credentials, GitHub Actions artifacts otherwise. Example: a9f6a35a33/.github/workflows/_linux-build.yml (L144-L151)
  2. When a workflow completes, a workflow_run event triggers upload-test-stats.yml.
  3. upload-test-stats downloads the raw stats from the intermediate data store and uploads them as JSON to Rockset, our metrics backend.

```mermaid
graph LR
    J1[Job with AWS creds<br>e.g. linux, win] --raw stats--> S3[(AWS S3)]
    J2[Job w/o AWS creds<br>e.g. mac] --raw stats--> GHA[(GH artifacts)]

    S3 --> uts[upload-test-stats.yml]
    GHA --> uts

    uts --json--> R[(Rockset)]
```
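
The last step of the flow above (read the downloaded raw stats, then upload them as JSON) can be sketched roughly as follows. This is only an illustration, not the code in this directory: `collect_raw_stats`, `upload_as_json`, the `source_file` field, and the assumption that raw stats arrive as `*.json` files are all hypothetical, and the `upload` callable stands in for whatever client actually talks to the metrics backend.

```python
from __future__ import annotations

import json
from pathlib import Path
from typing import Any, Callable


def collect_raw_stats(download_dir: Path) -> list[dict[str, Any]]:
    """Gather records from every *.json file under download_dir.

    Assumption: each file holds either one JSON object or a list of objects.
    """
    records: list[dict[str, Any]] = []
    for path in sorted(download_dir.rglob("*.json")):
        with path.open() as f:
            data = json.load(f)
        entries = data if isinstance(data, list) else [data]
        for entry in entries:
            # Remember which artifact the record came from (hypothetical field).
            records.append({**entry, "source_file": path.name})
    return records


def upload_as_json(
    records: list[dict[str, Any]], upload: Callable[[str], None]
) -> int:
    """Serialize records to one JSON payload and pass it to `upload`.

    `upload` is a placeholder for the real backend client.
    """
    upload(json.dumps(records))
    return len(records)
```

Passing the uploader in as a callable keeps the sketch independent of any particular backend, so the same collection step would work whether the artifacts came from S3 or from GitHub Actions.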

Why the indirection? Writing to Rockset requires special permissions that, for security reasons, we do not want to give to pull request CI. Instead, we follow GitHub's recommended pattern for cases like this: jobs only write to the intermediate data store, and a separate workflow triggered by workflow_run performs the privileged upload.
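
As a toy model of that split, the sketch below separates the job that produces stats from the uploader that holds the backend credential. Every name here (`IntermediateStore`, `Job`, `run_untrusted_job`, `run_trusted_uploader`, `backend_token`) is hypothetical and exists only to show the shape of the pattern; the real workflows are defined in YAML, not Python.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class IntermediateStore:
    """Stand-in for S3 or GitHub Actions artifacts."""

    name: str
    blobs: list[str] = field(default_factory=list)


@dataclass
class Job:
    name: str
    has_aws_creds: bool


def run_untrusted_job(job: Job, s3: IntermediateStore, gha: IntermediateStore) -> str:
    """A CI job writes raw stats to a store; it never sees the backend token."""
    store = s3 if job.has_aws_creds else gha
    store.blobs.append(f"raw stats from {job.name}")
    return store.name


def run_trusted_uploader(
    stores: list[IntermediateStore], backend_token: str, backend: list[str]
) -> int:
    """Only this step holds the credential and writes to the backend."""
    if not backend_token:
        raise PermissionError("uploader requires the backend credential")
    uploaded = 0
    for store in stores:
        for blob in store.blobs:
            backend.append(blob)
            uploaded += 1
    return uploaded
```

The point of the model is that `run_untrusted_job` has no parameter through which a credential could reach it, so code from a pull request cannot write to the backend directly.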

For more details about which stats we export, see upload-test-stats.yml.