Measure, attribute, and act on energy consumption across GPU infrastructure — turning raw hardware telemetry into the operational intelligence that infrastructure, finance, and sustainability teams need.
Aitra Meter reads GPU power directly via NVML or the Zeus sidecar, correlates it with token output from your inference server, and computes J/token continuously — per workload, per model, per hardware tier.
Aitra Meter connects GPU hardware energy to AI output volume at the token level. It runs entirely inside a single Kubernetes cluster — a DaemonSet agent, an aggregation service, a SQLite store, and a dashboard. One Helm install, no changes to your inference server code.
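As a rough illustration of the J/token idea described above, the sketch below integrates sampled GPU power over a time window and divides by the tokens generated in that window. It is a minimal sketch under stated assumptions, not Aitra Meter's implementation: the function names, the trapezoidal integration, and the sample values are all hypothetical.

```python
from typing import Optional, Sequence


def energy_joules(timestamps_s: Sequence[float], power_w: Sequence[float]) -> float:
    """Integrate power samples (watts) over time (seconds) with the trapezoidal rule.

    Hypothetical helper: watts x seconds = joules.
    """
    if len(timestamps_s) != len(power_w):
        raise ValueError("timestamps and power samples must have the same length")
    total = 0.0
    for i in range(1, len(timestamps_s)):
        dt = timestamps_s[i] - timestamps_s[i - 1]
        total += 0.5 * (power_w[i] + power_w[i - 1]) * dt
    return total


def joules_per_token(energy_j: float, tokens: int) -> Optional[float]:
    """Return energy per generated token, or None when no tokens were produced."""
    if tokens <= 0:
        return None  # J/token is undefined for an idle window
    return energy_j / tokens


# Made-up samples: one power reading per second over a 4-second window.
timestamps = [0.0, 1.0, 2.0, 3.0, 4.0]
power = [300.0, 320.0, 310.0, 330.0, 300.0]

energy = energy_joules(timestamps, power)  # 1260.0 J
tokens_generated = 2520                    # assumed token count for the same window
print(joules_per_token(energy, tokens_generated))  # 0.5
```

Returning None for a window with no tokens keeps idle periods from producing a division error or a misleading zero.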
aitra_j_per_token
Joules per output token — workload × model × hardware. The energy cost of a single generated token.
aitra_co2_per_token_grams
gCO₂ per token — J/token × grid carbon intensity. Track and report your inference carbon footprint.
aitra_cost_per_million_tokens_usd
$/M tokens — J/token × electricity cost. The per-token cost visible to finance and platform teams.
aitra_idle_time_ratio
Fraction of the last hour spent idle per node. Surface over-provisioned GPU capacity instantly.
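The carbon and cost metrics above are both described as J/token multiplied by a rate. Grid carbon intensity and electricity prices are usually quoted per kWh, so a joule-to-kWh conversion (1 kWh = 3.6 million joules) is implied. The sketch below works through that arithmetic. The function names and the example rates (400 gCO₂/kWh, $0.12/kWh) are illustrative assumptions, not values or APIs from Aitra Meter.

```python
JOULES_PER_KWH = 3.6e6  # 1 kWh = 3.6 million joules


def co2_grams_per_token(j_per_token: float, grid_gco2_per_kwh: float) -> float:
    """gCO2 per token = (J/token converted to kWh/token) x grid intensity in gCO2/kWh."""
    return j_per_token / JOULES_PER_KWH * grid_gco2_per_kwh


def cost_usd_per_million_tokens(j_per_token: float, usd_per_kwh: float) -> float:
    """$/M tokens = kWh consumed by one million tokens x electricity price in $/kWh."""
    kwh_per_million_tokens = j_per_token * 1_000_000 / JOULES_PER_KWH
    return kwh_per_million_tokens * usd_per_kwh


# Assumed inputs, for illustration only.
j_per_token = 0.5
co2 = co2_grams_per_token(j_per_token, grid_gco2_per_kwh=400.0)
cost = cost_usd_per_million_tokens(j_per_token, usd_per_kwh=0.12)

print(f"{co2:.3e} gCO2 per token")
print(f"${cost:.4f} per million tokens")
```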
Designed to drop into your existing cloud native stack — not replace it. Works alongside the tools you already run on Kubernetes.
aitra_* metrics available instantly in PromQL.
gen_ai.infrastructure.energy.* metrics to any OTel collector. Opt-in via a single Helm value.
aitra_j_per_token and aitra_idle_time_ratio. Reference ScaledObjects included in examples/keda/.
examples/grafana/.
/metrics endpoints. Extensible InferenceMetricsProvider interface for custom servers.
Aitra Meter is community-driven. All components are available under the Apache 2.0 License on GitHub. There are three high-value contribution paths: a new inference server, a new energy backend, or a new measurement agent in any language.
Aitra Meter is a SODA Foundation project with public governance, a published roadmap, and open maintainership.