Overview
Offset AI estimates the carbon emissions and water consumption associated with your usage of major AI assistants — including ChatGPT, Gemini, Claude, and Perplexity. Every time you send a prompt, the servers powering that response consume electricity and water. Our methodology translates your prompt count into tangible environmental impact using the best available research from peer-reviewed studies, government data, and independent analysis.
We believe in transparency. Below, we describe exactly how we arrive at our estimates, what assumptions we make, and where the underlying data comes from — so you can evaluate our approach and hold us accountable.
Carbon Emissions Per Prompt
Energy Consumption
Every AI prompt requires a GPU server in a data center to process your request and generate a response. The energy consumed depends on the model size, the length and complexity of the prompt, and the hardware running it.
The most widely cited early estimate placed energy consumption at approximately 3 watt-hours (Wh) per ChatGPT query, based on a 2023 analysis by Alex de Vries published in Joule. However, a more recent and detailed analysis by Epoch AI (February 2025) revised this figure significantly downward, estimating that a typical ChatGPT query using GPT-4o consumes approximately 0.3 Wh — ten times less than the original estimate. This reduction reflects more efficient models, improved hardware (NVIDIA H100 vs. A100 GPUs), and more realistic assumptions about typical prompt and response length.
OpenAI CEO Sam Altman subsequently stated that a standard ChatGPT text query uses approximately 0.34 Wh, and Google's own assessment found that a median Gemini text query uses 0.24 Wh. These independent data points converge on a range of 0.24–0.4 Wh for a standard text-based AI prompt.
Offset AI's approach: We use a baseline estimate of 0.34 Wh per standard prompt, consistent with OpenAI's stated figure and corroborated by Epoch AI's independent analysis. We apply a Power Usage Effectiveness (PUE) multiplier to account for data center overhead (cooling, networking, storage) beyond GPU energy alone.
Power Usage Effectiveness (PUE)
Raw GPU energy consumption doesn't capture the full energy picture. Data centers consume additional electricity for cooling systems, networking equipment, lighting, and power distribution. The industry-standard metric for this overhead is Power Usage Effectiveness (PUE), defined as total facility energy divided by IT equipment energy.
According to the International Energy Agency (IEA), the average global data center PUE is approximately 1.3, meaning for every watt consumed by computing hardware, an additional 0.3 watts is consumed by supporting infrastructure. Hyperscale facilities operated by companies like Google and Microsoft often achieve PUEs between 1.1 and 1.2, while older or less optimized facilities may run higher.
Grid Carbon Intensity
The carbon emitted per unit of electricity depends on the energy mix powering the grid where the data center is located. We reference the following sources for grid carbon intensity:
| Source | Coverage | Value Used |
|---|---|---|
| U.S. Energy Information Administration (EIA) | United States | ~367 g CO₂/kWh (2023) |
| International Energy Agency (IEA) | Global average | ~445 g CO₂/kWh (2024) |
Because most major chat assistants are primarily served from U.S.-based data centers (Microsoft Azure for ChatGPT, Google Cloud for Gemini, AWS for Claude and Perplexity), we use the U.S. average grid intensity of approximately 367 g CO₂/kWh as our default. We additionally apply a slightly higher cloud-weighted intensity (~462 g CO₂/kWh) for AWS-hosted services to reflect the published carbon intensity of AWS U.S. regions reported in the Jegham et al. (2025) "How Hungry is AI?" benchmarks.
The Calculation
Combining the baseline energy figure, the PUE multiplier, and grid carbon intensity: 0.34 Wh × 1.3 (PUE) × 367 g CO₂/kWh ÷ 1,000 Wh/kWh ≈ 0.16 g CO₂. This yields an estimate of approximately 0.16 grams of CO₂ per standard text prompt, consistent with the range reported by Carbon Credits, which cites research by Jegham et al. (2025) estimating approximately 0.15 g CO₂ per standard ChatGPT response when accounting for grid mix and data center efficiency.
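This calculation (baseline energy × PUE × grid carbon intensity) can be sketched in a few lines. The constants are the baseline figures from the preceding sections; the function name is ours, for illustration:

```python
# Carbon per prompt = energy (kWh) x PUE x grid carbon intensity (g CO2/kWh).
# Constants are this methodology's baseline figures, not universal values.

ENERGY_WH = 0.34       # OpenAI's stated figure for a standard text prompt
PUE = 1.3              # IEA average data center Power Usage Effectiveness
GRID_G_PER_KWH = 367   # EIA U.S. average grid carbon intensity (2023)

def carbon_g_per_prompt(energy_wh: float, pue: float, grid_g_per_kwh: float) -> float:
    """Estimated grams of CO2 emitted for one prompt."""
    return (energy_wh / 1000) * pue * grid_g_per_kwh  # Wh -> kWh, then g CO2

print(round(carbon_g_per_prompt(ENERGY_WH, PUE, GRID_G_PER_KWH), 2))  # ~0.16
```

Swapping in a different regional grid intensity or a facility-specific PUE changes only the last two arguments.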
Important context: Estimates across published research range widely, from approximately 0.03 g CO₂ (Google's Gemini assessment) to over 4 g CO₂ per query (Smartly.AI), depending on methodology, model, and what's included in the calculation. Earlier studies that assumed the original 3 Wh figure or amortized full training costs across queries produce significantly higher numbers. Our approach uses the most current, corroborated energy data and applies standard data center overhead — arriving at a moderate, defensible estimate. As DesignWhine's analysis notes, estimates vary by over 175x due to methodological differences.
What's Included (and Not Included)
- Included: Inference energy (GPU computation to generate a response), data center overhead via PUE, and grid-average carbon intensity.
- Not currently included: Amortized training costs, embodied carbon of hardware manufacturing, network transmission energy, and end-user device energy. A 2024 study published in Scientific Reports estimates that amortized training adds roughly 1.8 g CO₂ per query for GPT-3, though this figure decreases as total query volume increases over the model's lifetime.
We choose to focus on inference-phase emissions because they represent the direct, per-prompt environmental cost that scales with your usage. We are actively evaluating whether to incorporate amortized training and embodied hardware emissions in future versions.
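For readers who want to gauge the excluded training component, amortization is a simple division of one-time training emissions over lifetime query volume. The inputs below are illustrative assumptions, not measured values; the Scientific Reports figure cited above (~1.8 g/query for GPT-3) corresponds to one particular assumed volume:

```python
# Amortized training emissions per query = total training CO2 / lifetime queries.
# 500 t CO2 total and the query counts below are illustrative assumptions only.

def amortized_training_g(total_training_kg: float, lifetime_queries: float) -> float:
    """Grams of CO2 per query attributable to one-time model training."""
    return total_training_kg * 1000 / lifetime_queries

# The per-query share shrinks as cumulative query volume grows:
for queries in (1e8, 1e9, 1e10):
    print(f"{queries:.0e} queries -> {amortized_training_g(500_000, queries):.2f} g/query")
```

This is why the amortized figure "decreases as total query volume increases over the model's lifetime."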
Water Consumption Per Prompt
Why AI Uses Water
Data centers generate enormous heat from running thousands of GPUs simultaneously. Most facilities use water-based cooling systems — typically evaporative cooling towers — to dissipate this heat. Additionally, the power plants generating electricity for data centers consume water through steam cycles and cooling processes. These two streams are referred to as Scope 1 (on-site cooling) and Scope 2 (electricity generation) water consumption, as described by the OECD AI Policy Observatory.
Research Basis
The foundational research on AI water consumption comes from Li et al. (2023), "Making AI Less Thirsty", published by researchers at the University of California, Riverside. This study estimates that running GPT-3 inference for 10–50 queries consumes approximately 500 milliliters of water (combining on-site cooling and off-site electricity generation water use), which translates to roughly 10–25 mL per prompt.
Subsequent analysis suggests the original 500 mL figure may overestimate water use for a typical conversation, partly because it assumed longer responses than most users generate. Independent analysis suggests the figure is closer to ~5 mL per prompt for modern models, accounting for efficiency improvements. OpenAI CEO Sam Altman has stated the average query uses approximately 0.3 mL, though this likely reflects only direct on-site water use without Scope 2 electricity-generation water.
Water Usage Effectiveness (WUE)
The industry metric for data center water efficiency is Water Usage Effectiveness (WUE), measured in liters per kilowatt-hour. According to the Environmental and Energy Study Institute (EESI), the average WUE across data centers is 1.9 liters per kWh. This figure varies significantly by geography, season, and cooling technology — from near zero in air-cooled Nordic facilities to over 5 L/kWh in hot, dry climates.
The Calculation
On-site cooling water alone works out to 0.34 Wh × 1.3 (PUE) × 1.9 L/kWh ÷ 1,000 ≈ 0.8 mL per prompt. When including Scope 2 water (the water consumed by power plants generating the electricity), estimates roughly double or triple the on-site figure, depending on the energy source mix. The EESI reports that indirect water consumption from electricity generation adds approximately 4,540 liters per MWh (4.54 L/kWh) on average, bringing the combined water intensity to roughly 6.4 L/kWh.
Offset AI's approach: We report an estimated ~3 mL of total water per prompt, encompassing both on-site cooling and upstream electricity generation water use. This figure is consistent with the range supported by the EESI data, the Li et al. study framework, and subsequent independent analyses.
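The water calculation mirrors the carbon one, substituting combined Water Usage Effectiveness for grid carbon intensity. A minimal sketch using this section's figures (the function name is ours):

```python
# Water per prompt = energy (kWh) x PUE x combined WUE (on-site + electricity).
# Constants are this methodology's baseline figures.

ENERGY_WH = 0.34   # standard text prompt
PUE = 1.3          # data center overhead multiplier
WUE_ONSITE = 1.9   # EESI average on-site cooling water, L/kWh
WUE_SCOPE2 = 4.54  # EESI electricity-generation water, L/kWh

def water_ml_per_prompt(energy_wh: float, pue: float, wue_l_per_kwh: float) -> float:
    """Estimated milliliters of water consumed for one prompt."""
    return (energy_wh / 1000) * pue * wue_l_per_kwh * 1000  # Wh -> kWh, L -> mL

print(round(water_ml_per_prompt(ENERGY_WH, PUE, WUE_ONSITE + WUE_SCOPE2), 1))  # ~2.8
```

The result, roughly 2.8 mL, is the basis for the ~3 mL per-prompt figure reported above.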
Per-Tool Emissions Factors
Offset AI tracks usage across the major consumer AI assistants. Each tool runs on different infrastructure (different cloud providers, regions, hardware generations, and model sizes), so the per-prompt energy figure varies. We apply the formulas described above using the per-tool baseline values shown below.
Our reference energy figures are anchored in first-party disclosures from the AI providers themselves where available (OpenAI's Sam Altman for ChatGPT and Google's environmental disclosures for Gemini), and in the peer-reviewed Jegham et al. (2025) "How Hungry is AI? Benchmarking Energy, Water, and Carbon Footprint of LLM Inference" for Claude and Perplexity, which provides per-model energy benchmarks measured on production-class GPU hardware.
| Tool | Energy (Wh / prompt) | CO₂e (kg / prompt) | Water (L / prompt) |
|---|---|---|---|
| ChatGPT | 0.34 | 0.000162 | 0.00285 |
| Gemini | 0.24 | 0.000114 | 0.00201 |
| Claude | 0.35 | 0.000210 | 0.00360 |
| Perplexity | 0.43 | 0.000260 | 0.00470 |
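The CO₂e column can be reproduced from the energy column, a PUE of 1.3, and the regional grid intensity for each tool's host cloud (a sketch; values match the table to within rounding):

```python
# Reproduce the per-tool CO2e column: energy (kWh) x PUE x regional grid intensity.
# Energy figures and grid intensities are the ones cited in the sources above.

PUE = 1.3
TOOLS = {
    # tool:       (Wh / prompt, grid g CO2/kWh)
    "ChatGPT":    (0.34, 367),  # U.S. average grid (Microsoft Azure)
    "Gemini":     (0.24, 367),  # U.S. average grid (Google Cloud)
    "Claude":     (0.35, 462),  # AWS U.S. regions (Jegham et al. 2025)
    "Perplexity": (0.43, 462),  # AWS U.S. regions, retrieval overhead included
}

for tool, (wh, grid_g_per_kwh) in TOOLS.items():
    co2e_g = (wh / 1000) * PUE * grid_g_per_kwh
    print(f"{tool:<11} {co2e_g:.4f} g CO2e / prompt")
```

The water column follows the same pattern with combined WUE in place of grid intensity, using a higher cooling-water intensity for the AWS-hosted tools, as described below.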
Sources for Each Tool
| Tool | Energy Source | Notes |
|---|---|---|
| ChatGPT | Sam Altman, "The Gentle Singularity" (June 2025); corroborated by Epoch AI (Feb 2025) | Median GPT-4o text query, served on Microsoft Azure (U.S.) |
| Gemini | Google, "Measuring the environmental impact of AI inference" (2025) | Google's reported median Gemini text prompt, served on Google Cloud |
| Claude | Jegham et al., "How Hungry is AI?" (2025) | Benchmarked Claude 3.5 Sonnet inference on AWS U.S. regions; AWS grid intensity (~462 g CO₂/kWh) applied |
| Perplexity | Jegham et al., "How Hungry is AI?" (2025) | Benchmarked retrieval-augmented query (model inference + web retrieval overhead) on AWS U.S. regions |
Why Perplexity is higher: Perplexity is a retrieval-augmented assistant — each query triggers both a language-model inference and a web search/indexing step, which adds compute and network overhead. The Jegham et al. benchmarks capture this combined workload, which is why Perplexity's per-prompt footprint sits above the pure-inference assistants.
The CO₂e column for each tool is derived from Energy × PUE × Grid Carbon Intensity using PUE = 1.3 and the grid intensity appropriate to that tool's hosting region (367 g CO₂/kWh for ChatGPT and Gemini, ~462 g CO₂/kWh for AWS-hosted Claude and Perplexity). The water column uses the same energy figure × PUE × combined WUE (on-site cooling + electricity-generation water), with the AWS-hosted tools weighted toward higher cooling-water intensity in the southern U.S. regions where those workloads concentrate.
These factors are recalculated whenever a provider publishes new disclosures or new peer-reviewed benchmarks become available. The values shown reflect our most recent update.
Scaling to Your Usage
Offset AI's browser extension counts the number of prompts you send to each supported AI tool. We multiply your prompt count by the per-tool emission and water factors described above:
| Metric | Per Prompt (ChatGPT) | 10 prompts/day (annual) |
|---|---|---|
| Carbon emissions | ~0.16 g CO₂ | ~0.58 kg CO₂/year |
| Water consumption | ~3 mL | ~10.95 L/year |
While an individual's AI footprint is small relative to activities like driving or air travel, these numbers become meaningful at organizational scale. A team of 100 employees each sending 20 prompts per day generates approximately 117 kg CO₂ and 2,190 liters of water consumption per year from AI usage alone.
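Scaling is straightforward multiplication. A sketch using the ChatGPT factors from this methodology (~0.16 g CO₂ and ~3 mL water per prompt; the function name is ours):

```python
# Annual footprint = prompts/day x 365 x per-prompt factors x headcount.
# Per-prompt factors are this methodology's ChatGPT estimates.

CO2_G_PER_PROMPT = 0.16
WATER_ML_PER_PROMPT = 3.0

def annual_footprint(prompts_per_day: float, people: int = 1) -> tuple[float, float]:
    """Returns (kg CO2 per year, liters of water per year)."""
    prompts = prompts_per_day * 365 * people
    return prompts * CO2_G_PER_PROMPT / 1000, prompts * WATER_ML_PER_PROMPT / 1000

print(annual_footprint(10))              # individual: ~0.58 kg CO2, ~10.95 L water
print(annual_footprint(20, people=100))  # 100-person team: ~117 kg, ~2,190 L
```

The two calls reproduce the individual and organizational figures quoted above.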
Limitations & Assumptions
We want to be upfront about what our methodology captures and where uncertainty remains:
- Prompt variability: A short yes/no question uses far less energy than a long, complex prompt with attached documents. Our per-prompt figure reflects a "standard" text interaction. Research by Arbor notes that advanced reasoning models (like o1) can consume 50–100x more energy than standard queries.
- Model differences: Different ChatGPT models (GPT-4o, GPT-4o mini, o1, etc.) have different computational requirements. We currently use a single average figure.
- Data center location: Grid carbon intensity and water usage vary dramatically by location and season. We use U.S. averages, but a query processed in a renewably-powered data center in Iowa will have a different footprint than one processed in Virginia.
- Evolving efficiency: AI models and data center hardware are rapidly becoming more efficient. We commit to updating our estimates as new research and disclosures become available.
- Training costs excluded: We do not currently amortize the significant one-time cost of model training across queries. Including training would increase per-query estimates.
Key Sources
- Epoch AI, analysis of per-query ChatGPT energy use (February 2025)
- Sam Altman, "The Gentle Singularity" (June 2025)
- Google, "Measuring the environmental impact of AI inference" (2025)
- Jegham et al., "How Hungry is AI? Benchmarking Energy, Water, and Carbon Footprint of LLM Inference" (2025)
- Alex de Vries, analysis of AI energy consumption, Joule (2023)
- Li et al., "Making AI Less Thirsty", UC Riverside (2023)
- International Energy Agency (IEA): data center PUE and global grid carbon intensity
- U.S. Energy Information Administration (EIA): U.S. grid carbon intensity
- Environmental and Energy Study Institute (EESI): data center water consumption
- OECD AI Policy Observatory: Scope 1 / Scope 2 water consumption framework
Our Commitment to Accuracy
The field of AI environmental impact measurement is evolving rapidly. New research, hardware improvements, and increased transparency from AI companies will continue to refine our understanding. We commit to:
- Updating our estimates as new peer-reviewed research and first-party disclosures are published.
- Documenting all changes to this methodology with version history and rationale.
- Erring on the side of transparency — showing our work, citing our sources, and acknowledging uncertainty.
- Incorporating model-specific factors as data becomes available for reasoning models, image generation, and other non-standard interactions.
If you have questions about our methodology or suggestions for improvement, please reach out at support@offsetai.app.