These are the common troubleshooting steps. In my experience, though, the most frequent cause of this symptom is either a browser update that silently changed your cookie settings or, for HTTP requests, an incorrect header being sent.
Background: what ZeroGPU is measuring
ZeroGPU quota is time-based GPU runtime, and it’s enforced per tier:
- Unauthenticated: 2 min/day
- Free: 3.5 min/day
- Pro: 25 min/day
- Reset: exactly 24 hours after your first GPU usage (rolling reset, not “midnight UTC”). (Hugging Face)
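To make the rolling reset concrete, here is a small sketch that computes when a quota window would reset from the time of first GPU use. The tier values come from the list above; the names QUOTA_SECONDS and next_reset and the example timestamp are made up for illustration, and the real enforcement happens server-side.

```python
from datetime import datetime, timedelta, timezone

# Daily ZeroGPU quota per tier, in seconds (values from the list above).
QUOTA_SECONDS = {
    "unauthenticated": 2 * 60,
    "free": int(3.5 * 60),
    "pro": 25 * 60,
}

def next_reset(first_gpu_use: datetime) -> datetime:
    """Rolling reset: 24 hours after the first GPU usage, not midnight UTC."""
    return first_gpu_use + timedelta(hours=24)

# Hypothetical first use at 15:30 UTC; the window resets at 15:30 UTC the next day.
first_use = datetime(2025, 1, 10, 15, 30, tzinfo=timezone.utc)
print(QUOTA_SECONDS["pro"], next_reset(first_use).isoformat())
```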
Separately, ZeroGPU requests are also rate-limited using an identity token Hugging Face adds to requests (X-IP-Token). If a request is missing the user-specific identity signal, it can be treated as unauthenticated/IP-based, which has lower quota. (Gradio)
That framing matters because your symptoms (billing shows Pro minutes exist, but requests say “0s left” everywhere) almost always mean the Space runtime is not attributing your requests to your Pro identity.
What your symptoms most strongly imply
You have:
- Pro is active (/api/whoami-v2 says isPro: true)
- Billing page shows 1.0 / 25 minutes (so entitlement exists)
- Yet every ZeroGPU Space returns “0s left”, and “Try again in 0:00:00”
The best-fitting explanation is:
Your requests are being counted under the wrong identity bucket (guest/IP-based), not under your Pro account.
This has been repeatedly observed when:
- using incognito, or
- using the direct *.hf.space URL, or
- browser privacy settings block the login context/cookies needed for the Space runtime to recognize your account
A closely matching report: the same Space works while logged in, but in incognito or via the .hf.space direct URL it instantly says “0s left” (treated as anonymous). (Hugging Face Forums)
Hugging Face staff have also diagnosed Pro quota issues as simply “you’re not logged in” at the point the Space is being used. (Hugging Face Forums)
Fixes to try (in priority order)
1) Browser usage: force the Space to see your logged-in session
This resolves the majority of “0s left despite Pro” cases.
- Open the Space from the Hub page: https://huggingface.co/spaces/<owner>/<space>
  - (avoid opening https://<owner>-<space>.hf.space directly)
- Ensure you are actually logged in (avatar/top-right).
- Test in a fresh browser profile (no extensions) or a different browser.
Why: direct .hf.space access and incognito frequently behave as unauthenticated (no Pro quota). (Hugging Face Forums)
If it works in a clean profile but not your main profile: your main profile is blocking the auth context.
- Allow cookies/site data for Hugging Face / hf.space
- Temporarily disable tracking blockers/privacy extensions for that test
(One thread explicitly points at cookie settings as the culprit.) (Hugging Face Forums)
2) API / client usage: ensure requests carry the right identity signal
If you’re calling Spaces programmatically, it’s possible to “prove” you’re Pro via whoami, but still have the actual inference call treated as unauthenticated if the runtime request isn’t authenticated the way the rate-limiter expects.
Key concept: ZeroGPU limiting is tied to an identity token (X-IP-Token) Hugging Face injects; if requests are missing the “logged-in user” identity, they’re treated as unauthenticated. (Gradio)
Concrete checks:
- Use a standard Read token for debugging (fine-grained tokens can be scoped). (Hugging Face)
- If using Python gradio_client, pass the token as documented: (Gradio)
    from gradio_client import Client
    client = Client("black-forest-labs/FLUX.1-schnell", hf_token="YOUR_TOKEN")
- If you’re doing raw HTTP/curl against a Space, ensure the Space request (not just whoami) includes Authorization: Bearer ... (Gradio documents this pattern for private Spaces). (Gradio)
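For the raw-HTTP case, here is a minimal sketch using only the standard library. It builds (but does not send) a whoami request and a Space request so you can confirm both carry the same Authorization header. The whoami URL is the one mentioned in this thread. The Space host and endpoint path are assumptions, since the path depends on the Space's Gradio version and API name, and build_request is a helper name invented here.

```python
import json
import urllib.request

HF_TOKEN = "YOUR_TOKEN"  # placeholder; substitute a Read token

def build_request(url, token, payload=None):
    """Build a request carrying the Bearer token. Nothing is sent here."""
    data = json.dumps(payload).encode() if payload is not None else None
    req = urllib.request.Request(url, data=data)
    req.add_header("Authorization", f"Bearer {token}")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req

# Identity check (GET): this is the request that reports isPro.
whoami_req = build_request("https://huggingface.co/api/whoami-v2", HF_TOKEN)

# ASSUMPTION: placeholder host and endpoint path; check the Space's
# "Use via API" panel for the real URL and payload shape.
space_req = build_request(
    "https://owner-space.hf.space/gradio_api/call/predict",
    HF_TOKEN,
    payload={"data": ["a test prompt"]},
)

# Both requests should show the same Authorization header.
for req in (whoami_req, space_req):
    print(req.get_method(), req.full_url, req.get_header("Authorization"))
```

To actually send either request, pass it to urllib.request.urlopen. If only the whoami request carries the header, you have reproduced the "Pro according to whoami, anonymous according to the Space" situation.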
Special case (common pitfall): if you are calling a ZeroGPU Space from another Space or a proxy app, you may need to forward/extract the user’s X-IP-Token, otherwise downstream calls look unauthenticated and exhaust the low quota. (Gradio)
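The forwarding step can be sketched roughly as below. Passing headers to Client follows the pattern in the Gradio client docs for ZeroGPU Spaces. The Space ID, the api_name, and both helper names are placeholders for illustration, not a real endpoint.

```python
from typing import Mapping

def forward_identity_headers(incoming: Mapping[str, str]) -> dict:
    """Pick the caller's X-IP-Token (case-insensitive) to pass downstream."""
    for name, value in incoming.items():
        if name.lower() == "x-ip-token":
            return {"x-ip-token": value}
    # No token found: the downstream call will look unauthenticated.
    return {}

def call_zerogpu_space(prompt: str, incoming: Mapping[str, str]):
    # Imported lazily so the helper above works without gradio_client installed.
    from gradio_client import Client

    # "<owner>/<space>" and "/predict" are placeholders, not a real endpoint.
    client = Client("<owner>/<space>", headers=forward_identity_headers(incoming))
    return client.predict(prompt, api_name="/predict")
```

Inside a Gradio app, the incoming headers would come from request.headers on a handler that declares a request: gr.Request parameter.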
3) Space-specific bug: some apps fall back to IP-based quotas (Pro not applied)
This is less consistent with “all Spaces fail,” but it’s important context:
HF staff noted a Gradio bug where some ZeroGPU + non-SSR apps would “always fall back to IP-based” quotas, which does not take Pro into account, and the fix was upgrading Gradio. (Hugging Face)
This matters if:
- the Space you’re using is on an older/broken Gradio setup, or
- you duplicated a Space (your copy may have older dependencies)
Community guidance in another Pro/quota thread also points to “bug suspected” with older Gradio versions and suggests upgrading or duplicating/fixing the Space. (Hugging Face Forums)
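One way to spot an outdated copy is to read the sdk_version field from the Space's README.md front matter and compare it with the version you plan to upgrade to. This is only a sketch: the threads do not say which Gradio release contains the fix, so the minimum you pass in is yours to determine, the version numbers below are purely illustrative, and the helper names are made up.

```python
import re

def sdk_version(readme_text: str):
    """Return the first sdk_version value in a Space README, or None.

    The field normally lives in the YAML front matter at the top of README.md.
    """
    match = re.search(
        r"^sdk_version:\s*['\"]?([\w.]+)['\"]?\s*$", readme_text, re.MULTILINE
    )
    return match.group(1) if match else None

def is_older(version: str, minimum: str) -> bool:
    """Compare plain dotted numeric versions (pre-release tags are not handled)."""
    def as_tuple(v):
        return tuple(int(part) for part in v.split("."))
    return as_tuple(version) < as_tuple(minimum)

# Sample front matter; the version here is illustrative only.
readme = """---
title: Demo
sdk: gradio
sdk_version: 4.44.1
---
"""

print(sdk_version(readme), is_older(sdk_version(readme), "5.0.0"))
```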
4) Quota state “stuck” / not refreshing properly (rarer, but real)
There are reports where Pro quota tracking stops updating and then “recovers all at once” later. (Hugging Face Forums)
If your quota backend is stuck, you can see nonsensical timers (“Try again in 0:00:00”) while still being blocked.
This is the scenario where collecting a clean repro and asking HF staff to resync/fix is appropriate.
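If you want to flag that state while collecting a repro, something like the following parses the two fragments quoted in this thread ("0s left" and "Try again in 0:00:00"). The full wording of the sample message is an assumption stitched together from those fragments, so the real error text may differ. Note that this only detects the symptom; it does not tell you whether the cause is a stuck backend or a misattributed identity.

```python
import re
from datetime import timedelta

def parse_quota_error(message: str):
    """Extract (seconds_left, wait) from a quota error message, or None."""
    left = re.search(r"(\d+)s left", message)
    wait = re.search(r"Try again in (\d+):(\d{2}):(\d{2})", message)
    if not (left and wait):
        return None
    hours, minutes, seconds = (int(group) for group in wait.groups())
    return int(left.group(1)), timedelta(hours=hours, minutes=minutes, seconds=seconds)

def has_nonsensical_timer(message: str) -> bool:
    """True when you are blocked with 0s left yet told to wait zero time."""
    return parse_quota_error(message) == (0, timedelta(0))

# Hypothetical wording built from the fragments quoted in this thread.
sample = "GPU quota exceeded (0s left). Try again in 0:00:00"
print(has_nonsensical_timer(sample))
```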
A fast decision tree for your exact case
- Does it work in a clean browser profile when opened via huggingface.co/spaces/...?
  - Yes → it’s almost certainly cookie/tracking/privacy settings causing .hf.space to see you as logged out.
  - No → go to (2).
- Does a minimal gradio_client call with a Read token still show “0s left”?
  - No → your previous method wasn’t authenticating the actual inference requests.
  - Yes → likely a backend quota state issue (or widespread incident), and you should escalate with a minimal repro.
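The same two questions, written out as a small function in case you want to keep notes on which branch you landed in. The function name and the returned strings are just for illustration; the logic is only what the tree above says.

```python
from typing import Optional

def diagnose(works_in_clean_profile: bool,
             client_call_shows_zero_left: Optional[bool] = None) -> str:
    """Encode the two questions above; the second only matters if the first is 'No'."""
    if works_in_clean_profile:
        return "browser: cookie/tracking/privacy settings make .hf.space see you as logged out"
    if client_call_shows_zero_left is None:
        return "next: run a minimal gradio_client call with a Read token"
    if not client_call_shows_zero_left:
        return "client: your previous method was not authenticating the inference requests"
    return "backend: likely quota state issue or incident; escalate with a minimal repro"

# Example: fails in a clean profile AND the token-authenticated client still shows 0s left.
print(diagnose(works_in_clean_profile=False, client_call_shows_zero_left=True))
```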
If you escalate, include these specifics (so it gets fixed faster)
- Username: rapportAX
- Confirmation: whoami-v2 shows isPro: true
- Billing screenshot showing remaining ZeroGPU minutes
- Repro: one Space URL opened via huggingface.co/spaces/... in a clean profile
- Whether direct .hf.space behaves differently than the hub page
- Whether you’re behind VPN/corporate NAT (relevant if falling back to IP-based quotas)