ZeroGPU quota shows 0s left despite active Pro subscription

Hi,

I subscribed to the Pro plan today and my billing page correctly shows 1.0 / 25 minutes of ZeroGPU usage. However, every ZeroGPU Space I try returns:

You have exceeded your GPU quota (60s requested vs. 0s left). Try again in 0:00:00

This happens on all ZeroGPU Spaces (Qwen/Qwen3-ASR, hlevring/Qwen3-ASR, black-forest-labs/FLUX.1-schnell), with both fine-grained and classic (Read) tokens.

Account: rapportAX
isPro: true (confirmed via /api/whoami-v2)
Billing page: Shows “1.0 / 25 minutes” correctly
Actual behavior: “0s left” on every request, “Try again in 0:00:00”

Could you please check if the ZeroGPU quota has been properly synced to my account?

Thank you.


The steps below are general troubleshooting, but in my experience the most frequent cause of this symptom is either a browser cookie setting that changed unintentionally after an update or, for HTTP requests, an incorrect header being sent.


Background: what ZeroGPU is measuring

ZeroGPU quota is time-based GPU runtime, and it’s enforced per tier:

  • Unauthenticated: 2 min/day
  • Free: 3.5 min/day
  • Pro: 25 min/day
  • Reset: exactly 24 hours after your first GPU usage (rolling reset, not “midnight UTC”). (Hugging Face)
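
To make the rolling reset concrete, here is a minimal sketch that computes when a quota window would reset, assuming the 24-hour window starts at your first GPU use as described above. The `DAILY_QUOTA` table, the `next_reset` helper, and the example timestamp are illustrative only; none of them is part of any Hugging Face API.

```python
from datetime import datetime, timedelta, timezone

# Daily ZeroGPU allowances quoted above, stored as timedeltas.
DAILY_QUOTA = {
    "unauthenticated": timedelta(minutes=2),
    "free": timedelta(minutes=3, seconds=30),
    "pro": timedelta(minutes=25),
}


def next_reset(first_gpu_use: datetime) -> datetime:
    """Return the reset time, assuming a rolling 24-hour window
    that starts at the first GPU use (not at midnight UTC)."""
    return first_gpu_use + timedelta(hours=24)


# Hypothetical timestamp of the first GPU request in the current window.
first_use = datetime(2025, 1, 10, 15, 30, tzinfo=timezone.utc)
print("Window resets at:", next_reset(first_use).isoformat())
print("Pro allowance in seconds:", DAILY_QUOTA["pro"].total_seconds())
```

The point of the sketch is only that the reset moves with your own usage, so two accounts on the same tier can have different reset times.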

Separately, ZeroGPU requests are also rate-limited using an identity token that Hugging Face adds to requests (X-IP-Token). If a request is missing the user-specific identity signal, it can be treated as unauthenticated/IP-based, which has a lower quota. (Gradio)

That framing matters because your symptoms (“billing shows Pro minutes exist” but requests say “0s left” everywhere) almost always mean the Space runtime is not attributing your requests to your Pro identity.


What your symptoms most strongly imply

You have:

  • Pro is active (/api/whoami-v2 says isPro: true)
  • Billing page shows 1.0 / 25 minutes (so entitlement exists)
  • Yet every ZeroGPU Space returns “0s left”, and “Try again in 0:00:00”

The best-fitting explanation is:

Your requests are being counted under the wrong identity bucket (guest/IP-based), not under your Pro account

This has been repeatedly observed when:

  • using incognito, or
  • using the direct *.hf.space URL, or
  • browser privacy settings block the login context/cookies needed for the Space runtime to recognize your account

A closely matching report: the same Space works while logged in, but in incognito or via the .hf.space direct URL it instantly says “0s left” (treated as anonymous). (Hugging Face Forums)
Hugging Face staff have also diagnosed Pro quota issues as simply “you’re not logged in” at the point the Space is being used. (Hugging Face Forums)


Fixes to try (in priority order)

1) Browser usage: force the Space to see your logged-in session

This resolves the majority of “0s left despite Pro” cases.

  1. Open the Space from the Hub page:

    • https://huggingface.co/spaces/<owner>/<space>
    • (avoid opening https://<owner>-<space>.hf.space directly)
  2. Ensure you are actually logged in (avatar/top-right).

  3. Test in a fresh browser profile (no extensions) or a different browser.

Why: direct .hf.space access and incognito frequently behave as unauthenticated (no Pro quota). (Hugging Face Forums)

If it works in a clean profile but not your main profile: your main profile is blocking the auth context.

  • Allow cookies/site data for Hugging Face / hf.space
  • Temporarily disable tracking blockers/privacy extensions for that test

(One thread explicitly points at cookie settings as the culprit.) (Hugging Face Forums)


2) API / client usage: ensure requests carry the right identity signal

If you’re calling Spaces programmatically, it’s possible to “prove” you’re Pro via whoami, but still have the actual inference call treated as unauthenticated if the runtime request isn’t authenticated the way the rate-limiter expects.

Key concept: ZeroGPU limiting is tied to an identity token (X-IP-Token) Hugging Face injects; if requests are missing the “logged-in user” identity, they’re treated as unauthenticated. (Gradio)

Concrete checks:

  • Use a standard Read token for debugging (fine-grained tokens can be scoped). (Hugging Face)

  • If using Python gradio_client, pass the token as documented: (Gradio)

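    from gradio_client import Client

    # Pass the token to the client so the Space call itself is authenticated.
    # If your gradio_client version rejects hf_token=, check its docs for the current parameter name.
    client = Client("black-forest-labs/FLUX.1-schnell", hf_token="YOUR_TOKEN")
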
  • If you’re doing raw HTTP/curl against a Space, ensure the Space request (not just whoami) includes Authorization: Bearer ... (Gradio documents this pattern for private Spaces). (Gradio)
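
For the raw HTTP case, a minimal sketch of attaching the token to the Space request itself might look like the following. It only builds the request and does not send it. The URL, the token, and the payload shape are placeholders: the real endpoint path depends on the Space and its Gradio version, so treat them as assumptions rather than a documented API.

```python
import json
import urllib.request


def build_space_request(url: str, token: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request that carries the HF token."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            # This header is what ties the call to your account.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical endpoint, token, and payload, for illustration only.
req = build_space_request(
    "https://example-owner-example-space.hf.space/placeholder-endpoint",
    "hf_xxx",
    {"data": ["hello"]},
)
print(req.get_method(), req.get_header("Authorization"))
```

If your existing script sends the token to whoami but builds the Space request separately, this is the spot to compare: the same `Authorization` header needs to be on both.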

Special case (common pitfall): if you are calling a ZeroGPU Space from another Space or a proxy app, you may need to forward/extract the user’s X-IP-Token, otherwise downstream calls look unauthenticated and exhaust the low quota. (Gradio)
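
A rough sketch of that forwarding step, assuming your proxy app receives the caller's headers as a mapping (in a Gradio handler this would be `request.headers` on a `gr.Request` argument). The helper name `forward_identity_headers` is made up for this example.

```python
from typing import Mapping


def forward_identity_headers(incoming: Mapping[str, str]) -> dict:
    """Copy the caller's X-IP-Token (if present) into headers for a downstream call.

    Header names are matched case-insensitively; an empty dict is returned
    when the token is absent, in which case the downstream call will most
    likely be counted against the low unauthenticated/IP-based quota.
    """
    for name, value in incoming.items():
        if name.lower() == "x-ip-token":
            return {"x-ip-token": value}
    return {}


# Inside a Gradio app this would be used roughly like so (not executed here):
#   def handler(prompt, request: gr.Request):
#       headers = forward_identity_headers(request.headers)
#       client = Client("owner/space", headers=headers)  # placeholder Space id
#       return client.predict(prompt, api_name="/predict")  # api_name depends on the Space

print(forward_identity_headers({"X-IP-Token": "abc", "accept": "*/*"}))
```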


3) Space-specific bug: some apps fall back to IP-based quotas (Pro not applied)

This is less consistent with “all Spaces fail,” but it’s important context:

HF staff noted a Gradio bug where some ZeroGPU + non-SSR apps would “always fall back to IP-based” quotas, which does not take Pro into account, and the fix was upgrading Gradio. (Hugging Face)

This matters if:

  • the Space you’re using is on an older/broken Gradio setup, or
  • you duplicated a Space (your copy may have older dependencies)

Community guidance in another Pro/quota thread also points to “bug suspected” with older Gradio versions and suggests upgrading or duplicating/fixing the Space. (Hugging Face Forums)


4) Quota state “stuck” / not refreshing properly (rarer, but real)

There are reports where Pro quota tracking stops updating and then “recovers all at once” later. (Hugging Face Forums)
If your quota backend is stuck, you can see nonsensical timers (“Try again in 0:00:00”) while still being blocked.

This is the scenario where collecting a clean repro and asking HF staff to resync/fix is appropriate.
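
When collecting that repro, it can help to record the numbers from the error text rather than a paraphrase. The sketch below parses the message format quoted in the original post; the regular expression is inferred from that single example, so it is an assumption and may not match other variants of the message.

```python
import re
from datetime import timedelta

# Pattern inferred from the one error message quoted in this thread.
QUOTA_ERROR = re.compile(
    r"\((\d+)s requested vs\. (\d+)s left\)\. Try again in (\d+):(\d{2}):(\d{2})"
)


def parse_quota_error(message: str):
    """Extract requested/left seconds and the wait time, or None if no match."""
    match = QUOTA_ERROR.search(message)
    if match is None:
        return None
    requested, left, hours, minutes, seconds = (int(g) for g in match.groups())
    wait = timedelta(hours=hours, minutes=minutes, seconds=seconds)
    return {
        "requested_s": requested,
        "left_s": left,
        "wait": wait,
        # Blocked, yet told to retry immediately: the odd state described above.
        "blocked_with_zero_wait": left < requested and wait == timedelta(0),
    }


msg = "You have exceeded your GPU quota (60s requested vs. 0s left). Try again in 0:00:00"
print(parse_quota_error(msg))
```

The flag on its own does not tell you whether the cause is a stuck backend or a misattributed identity; it just makes the report precise.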


A fast decision tree for your exact case

  1. Does it work in a clean browser profile when opened via huggingface.co/spaces/...?
  • Yes → it’s almost certainly cookie/tracking/privacy settings causing .hf.space to see you as logged out.
  • No → go to (2).
  2. Does a minimal gradio_client call with a Read token still show “0s left”?
  • No → your previous method wasn’t authenticating the actual inference requests.
  • Yes → likely a backend quota state issue (or widespread incident), and you should escalate with a minimal repro.
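
The same decision tree written out as a small function, in case it is easier to follow as code. The function name and the returned strings are just an illustration of the two questions above.

```python
def diagnose(works_in_clean_profile: bool, client_call_shows_zero_left: bool) -> str:
    """Map the two checks from the decision tree to a likely cause."""
    # Question 1: does it work in a clean profile via huggingface.co/spaces/...?
    if works_in_clean_profile:
        return "browser: cookie/tracking/privacy settings make the Space see you as logged out"
    # Question 2: does a minimal gradio_client call with a Read token still show 0s left?
    if not client_call_shows_zero_left:
        return "client: your previous method was not authenticating the inference requests"
    return "backend: likely a quota state issue or wider incident; escalate with a minimal repro"


print(diagnose(works_in_clean_profile=False, client_call_shows_zero_left=True))
```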

If you escalate, include these specifics (so it gets fixed faster)

  • Username: rapportAX
  • Confirmation: whoami-v2 shows isPro: true
  • Billing screenshot showing remaining ZeroGPU minutes
  • Repro: one Space URL opened via huggingface.co/spaces/... in a clean profile
  • Whether direct .hf.space behaves differently than the hub page
  • Whether you’re behind VPN/corporate NAT (relevant if falling back to IP-based quotas)