Gateway
Status
Active
Date
2026-04-28
Owners
- Platform Backend
Last Verified Commit
7a579730
Runtime
API only
Purpose
gateway is the player-facing HTTP edge for servers_v2.
It keeps old player route families externally stable while routing requests to
the appropriate owner service.
Primary Entry Points
- player route families under /api/v1/*
- legacy root-level player aliases such as /user/*, /game/*, /finance/*
- provider pass-through routes handled by provider_routes.py
Dependencies
- Redis
- JWT secret
- downstream URLs for:
  - player_service
  - wallet_service
  - game_service
  - rolling_service
  - promotion_service

admin_service and agent_service are not gateway downstreams; their frontends connect directly to those services.
Background Work
None.
Owned Data
None.
gateway is an edge and compatibility layer, not a domain owner.
Events
Emits:
- none
Consumes:
- none
Health
- exposes /health
- exposes /metrics in Prometheus text format through the shared observability middleware
- binds or generates X-Request-ID, echoes it on responses, and forwards it to downstream services
- emits structured JSON logs with service=gateway and request_id
- no database dependency
- request correctness relies on downstream health and middleware behavior
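The X-Request-ID behavior listed above can be sketched as follows. This is a minimal illustration, not the shared observability middleware itself: resolve_request_id and downstream_headers are hypothetical helper names, and generating a UUID4 hex string when no inbound ID exists is an assumption.

```python
import uuid
from typing import Mapping

REQUEST_ID_HEADER = "X-Request-ID"


def resolve_request_id(headers: Mapping[str, str]) -> str:
    """Bind the inbound X-Request-ID if present, otherwise generate one."""
    # HTTP header names are case-insensitive, so compare in lower case.
    for name, value in headers.items():
        if name.lower() == REQUEST_ID_HEADER.lower() and value.strip():
            return value.strip()
    # Assumption: generated IDs are random UUID4 hex strings.
    return uuid.uuid4().hex


def downstream_headers(request_id: str) -> dict[str, str]:
    """Header to echo on the response and forward on every downstream call."""
    return {REQUEST_ID_HEADER: request_id}
```

The same value would also be attached to each structured log record as request_id, so a single ID ties together the edge log line and the downstream service logs.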
Key Env Vars
- REDIS_URL
- SECRET_KEY
- MULTI_BRAND_ENFORCEMENT: required; one of off / observe / enforce. Production target is enforce post-Phase-16. Drives the multi_brand_enforcement_mode{service="gateway"} gauge and the JWT-vs-domain reject decision in BrandEnforcementMiddleware.
- BRAND_SIGNING_KEY: required in production. HMAC-SHA256 secret used to sign X-Brand-Signature on outbound brand-scoped wallet writes (gateway is one of the 6 wallet write callers).
- INTERNAL_SERVICE_TOKEN_GATEWAY: required in production; per-caller token presented to every downstream service. Required once PER_CALLER_TOKEN_REQUIRED=on on the consumer.
- JWT_PUBLIC_KEYS: required in production; JSON map of kid -> public_key for RS256 verification. Hot-reloadable via the jwt_public_keys_changed Redis pub/sub channel (T4-D-I4); spikes in gateway_jwt_unknown_kid_total indicate stale verifier caches.
- JWT_KID: currently-active kid; used for diagnostic /health reporting. Verifier accepts every kid in JWT_PUBLIC_KEYS.
- PER_CALLER_TOKEN_REQUIRED: on is the Phase 16 target on the consumer side; gateway is a caller, so this env var is consulted only by the downstream services it calls.
- INTERNAL_SERVICE_TOKEN: legacy single-shared-token; deprecated. Phase 16 release gate requires the bare variant to be absent.
- PLAYER_SERVICE_URL
- WALLET_SERVICE_URL
- GAME_SERVICE_URL
- ROLLING_SERVICE_URL
- PROMOTION_SERVICE_URL
- RATE_LIMIT_PER_MINUTE
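As an illustration of the two structured variables above, the following sketch validates MULTI_BRAND_ENFORCEMENT against its documented values and parses JWT_PUBLIC_KEYS as a JSON map. The function names are hypothetical, and failing fast with ValueError (as well as treating the mode as case-sensitive) is an assumption rather than a description of the actual settings loader.

```python
import json
from typing import Mapping

ALLOWED_ENFORCEMENT_MODES = {"off", "observe", "enforce"}


def load_enforcement_mode(env: Mapping[str, str]) -> str:
    """Read MULTI_BRAND_ENFORCEMENT and reject anything outside the documented set."""
    mode = env.get("MULTI_BRAND_ENFORCEMENT")
    if mode not in ALLOWED_ENFORCEMENT_MODES:
        raise ValueError(
            "MULTI_BRAND_ENFORCEMENT must be one of off/observe/enforce, "
            f"got {mode!r}"
        )
    return mode


def load_public_keys(env: Mapping[str, str]) -> dict[str, str]:
    """Parse JWT_PUBLIC_KEYS as a JSON object mapping kid -> public key."""
    raw = env.get("JWT_PUBLIC_KEYS", "")
    keys = json.loads(raw) if raw else {}
    if not isinstance(keys, dict) or not keys:
        raise ValueError("JWT_PUBLIC_KEYS must be a non-empty JSON object")
    return {str(kid): str(pem) for kid, pem in keys.items()}
```

During a key rotation the map would carry two entries, for example {"old-kid": "...", "new-kid": "..."}, so that tokens signed under either kid verify.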
Multi-Brand Constraints
Per ADR-009:
- player-facing routes resolve brand_id from the request domain via the existing _extract_domain helper plus the Redis maps domain:agent:{host} and domain:level:{host}; both maps now resolve to a value carrying brand_id
- a request whose domain does not resolve to a brand is rejected at the edge
- the resolved brand is attached to request.state.brand_id and forwarded to all downstream services as the X-Brand-Id header
- after login, JWT carries a brand_id claim; gateway checks jwt.brand_id == request.state.brand_id and behaves per MULTI_BRAND_ENFORCEMENT: observe logs + counts mismatches, enforce hard-rejects with the documented error envelope. The current mode is included in /health response payload
- gateway does not accept a brand override from request body, query, or inbound X-Brand-Id header from external clients (any inbound X-Brand-Id is stripped before the resolved value is injected)
- X-Brand-Id must be > 0 (T6-E1): the BrandContext builder rejects values <= 0 (and any non-numeric value) by treating them as missing. Previously a 0 or negative value silently passed through and downstream services treated it as "brand 0" -- the legacy default. Callers that need to operate without a brand must omit the header entirely; they must NOT send X-Brand-Id: 0
- _extract_domain precedence: Origin first, then Host fallback; trust assumes TLS-terminating LB upstream
- JWT signing uses RS256 with public key bundled in gateway for verification; verifier rejects unknown kid, alg=none, and any HS-family algorithm. Private signing key is held by player_service, agent_service, and (post-T6-B1) admin_service
- internal calls to downstream services use per-caller tokens (INTERNAL_SERVICE_TOKEN_GATEWAY); brand-scoped wallet write commands carry X-Brand-Signature HMAC over (gateway, brand_id, request_id, timestamp) using BRAND_SIGNING_KEY
- legacy player routes inject domain into the body the same way as today; the brand projection is layered on top, not replaced
- POST /api/v1/game/launch performs a best-effort provider allow-list precheck from Redis key brand:provider_allowlist:{brand_id} when present. game_service remains authoritative and rechecks brand_provider_config from PostgreSQL before minting any provider launch URL
- POST /api/v1/game/launch is JWT-protected and is the only supported player launch entrypoint. The gateway overwrites any client-supplied player_id with the verified JWT player id, forwards the resolved X-Brand-Id, and adds internal-service headers to the backing game_service /integration/launch call; direct account-only or caller-supplied-player-id launch requests are not part of the public contract.
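Two of the constraints above lend themselves to a short sketch: the T6-E1 rule that X-Brand-Id must be a positive integer, and the X-Brand-Signature HMAC over (gateway, brand_id, request_id, timestamp). The function names are hypothetical, and the wire format of the signed message (newline-joined fields, hex digest) is an assumption; the real canonical form is defined by the services and may differ.

```python
import hashlib
import hmac
from typing import Optional


def parse_brand_id(raw: Optional[str]) -> Optional[int]:
    """Return a positive brand id, or None when the header is missing or invalid."""
    if raw is None:
        return None
    text = raw.strip()
    # Non-numeric values (including "-3" and "") are treated as missing.
    if not (text.isascii() and text.isdigit()):
        return None
    value = int(text)
    # 0 is treated as missing rather than as "brand 0".
    return value if value > 0 else None


def sign_brand_write(key: bytes, brand_id: int, request_id: str, timestamp: int) -> str:
    """HMAC-SHA256 over (caller, brand_id, request_id, timestamp), hex-encoded."""
    # Assumption: fields are joined with "\n"; the actual encoding is not documented here.
    message = "\n".join(["gateway", str(brand_id), request_id, str(timestamp)])
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).hexdigest()


def verify_brand_write(key: bytes, brand_id: int, request_id: str,
                       timestamp: int, signature: str) -> bool:
    """Recompute the signature and compare in constant time."""
    expected = sign_brand_write(key, brand_id, request_id, timestamp)
    return hmac.compare_digest(expected, signature)
```

Binding the caller name and brand_id into the signed message means a signature minted for one brand cannot be replayed against another, and including request_id and timestamp lets the receiver bound replay of the same command.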
Security
Player session cross-check (T1-D-C1, post Phase 4E)
The gateway validates every authenticated request against the active player
session row written by player_service at login. The lookup contract:
- The brand-prefixed key user-session:brand:{brand_id}:{account} (matching player_service.app.services.cache_keys.session_key) is the only session key that gateway reads. The brand_id source is, in order:
  - request.state.brand_id (set by BrandResolutionMiddleware)
  - the verified JWT brand_id claim (Phase 5B and later tokens)
- If neither source carries a brand_id, gateway returns the v1 AUTH_REQUIRED envelope. It does not fall back to the pre-Phase-4E user-session-{account} key.
- If the brand-prefixed lookup misses, the gateway returns the v1 AUTH_REQUIRED envelope (HTTP 200 + {"status":3,...}) AND increments gateway_jwt_session_missing_total so dashboards can split "JWT-valid-but-session-gone" from "JWT-bad/expired" (the latter returns from the verifier earlier and never reaches the session lookup).
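The lookup contract above can be sketched as follows. This is an illustration under stated assumptions, not the gateway's code: a plain dict stands in for Redis, a Counter stands in for the Prometheus counter, the function names are hypothetical, and the envelope shows only the documented status field because the remaining fields are elided in this document.

```python
from collections import Counter
from typing import Any, Mapping, Optional


def session_key(brand_id: int, account: str) -> str:
    """Brand-prefixed session key; the only key shape the gateway reads."""
    return f"user-session:brand:{brand_id}:{account}"


def resolve_brand_id(state_brand_id: Optional[int],
                     jwt_claims: Mapping[str, Any]) -> Optional[int]:
    """Prefer request.state.brand_id, then the verified JWT brand_id claim."""
    if state_brand_id:
        return state_brand_id
    claim = jwt_claims.get("brand_id")
    return claim if isinstance(claim, int) and claim > 0 else None


def check_session(store: Mapping[str, Any], account: str,
                  state_brand_id: Optional[int], jwt_claims: Mapping[str, Any],
                  metrics: Counter) -> Optional[dict]:
    """Return None when the session exists, else an AUTH_REQUIRED envelope."""
    brand_id = resolve_brand_id(state_brand_id, jwt_claims)
    if brand_id is None:
        # No brand from either source: reject, with no legacy-key fallback.
        return {"status": 3}
    if session_key(brand_id, account) not in store:
        # JWT was valid but the session row is gone: count it separately.
        metrics["gateway_jwt_session_missing_total"] += 1
        return {"status": 3}
    return None
```

Note that the counter is incremented only on a lookup miss, not when no brand_id could be resolved, which mirrors the split described in the list above.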
The legacy fallback and its
gateway_session_legacy_key_used_total cutover metric were removed by
the Phase 4E dual-read sunset cleanup. Player sessions minted before
brand-prefixed session keys are no longer accepted.
JWT verifier hot-reload (T4-D-I4)
Each gateway pod caches a JwtVerifier on app.state.jwt_verifier so the
per-request hot path does not re-parse JWT_PUBLIC_KEYS on every call.
That cache is invalidated by Redis pub/sub on the
jwt_public_keys_changed channel:
- Operator distributes the new public key to every pod (env-var refresh + SIGHUP / process-manager reload). The new JWT_PUBLIC_KEYS map should contain BOTH the old and new kid for the duration of the overlap window.
- Operator runs tools/jwt_kid/rotate_jwt_kid.sh (or redis-cli PUBLISH jwt_public_keys_changed <ts>) once. Every gateway pod is subscribed at startup; on receiving the message it sets app.state.jwt_verifier = None and the next request rebuilds with the freshly-loaded settings.
- After the overlap window, drop the old kid from the env and re-run the rotate script.
The old "rolling restart per kid rotation" workflow is no longer required.
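The cache-and-invalidate cycle can be modeled without Redis, as in the sketch below. The JwtVerifier here is a stand-in that only tracks accepted kids, AppState and its method names are hypothetical, and the pub/sub delivery is reduced to a direct call of the message handler.

```python
import json
from typing import Callable, Mapping, Optional


class JwtVerifier:
    """Stand-in verifier that only tracks which kids are accepted."""

    def __init__(self, public_keys: Mapping[str, str]) -> None:
        self.public_keys = dict(public_keys)

    def knows(self, kid: str) -> bool:
        return kid in self.public_keys


class AppState:
    """Model of app.state: caches one verifier until it is invalidated."""

    def __init__(self, load_env: Callable[[], Mapping[str, str]]) -> None:
        self._load_env = load_env
        self.jwt_verifier: Optional[JwtVerifier] = None
        self.rebuilds = 0

    def get_verifier(self) -> JwtVerifier:
        # Hot path: parse JWT_PUBLIC_KEYS only when the cache is empty.
        if self.jwt_verifier is None:
            keys = json.loads(self._load_env()["JWT_PUBLIC_KEYS"])
            self.jwt_verifier = JwtVerifier(keys)
            self.rebuilds += 1
        return self.jwt_verifier

    def on_keys_changed(self, _message: str) -> None:
        """Handler for a jwt_public_keys_changed message: drop the cache."""
        self.jwt_verifier = None
```

In this model, refreshing the environment alone changes nothing for a pod that already holds a cached verifier; the new kid is accepted only after on_keys_changed runs and the next get_verifier call rebuilds from the refreshed settings.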
gateway_jwt_unknown_kid_total{service="gateway"} is emitted whenever
the verifier rejects a token because its kid header is not in the
active key set. A spike during rotation indicates pods that haven't seen
the pub/sub message yet (transient -- should drain within the cache-
invalidation latency); sustained non-zero indicates either a stale
signer or attempted forgery.
Middleware-order structural assertion (T7-B5 / T8-D2)
Starlette executes middleware in LIFO order relative to add_middleware
calls — middleware added LATER runs EARLIER on the request. The brand
resolver MUST run before JWT auth so JWT can read the brand-prefixed
Redis session key (otherwise JWT silently falls back to the legacy
unprefixed key, defeating cross-brand session isolation).
tests/test_middleware_order.py::test_brand_resolution_runs_before_jwt_auth
asserts the contract structurally by introspecting
app.user_middleware after import: BrandResolutionMiddleware must
appear at an earlier index than JWTAuthMiddleware (= registered
later = runs first). A future PR that reorders add_middleware calls
in app/main.py and breaks the contract fails this test immediately
rather than at runtime via mysterious cross-brand session leaks.
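The ordering rule can be demonstrated with a small stdlib-only model. MiniApp below is hypothetical and is not Starlette; it only mimics the behavior described above, where each add_middleware call places the new entry at index 0 of user_middleware and the entry at index 0 becomes the outermost wrapper.

```python
from typing import Callable, List

Handler = Callable[[List[str]], None]


class MiniApp:
    """Tiny model of LIFO middleware registration; not Starlette itself."""

    def __init__(self) -> None:
        self.user_middleware: List[str] = []

    def add_middleware(self, name: str) -> None:
        # The newest registration goes to index 0 and becomes the outermost layer.
        self.user_middleware.insert(0, name)

    def build(self, endpoint: Handler) -> Handler:
        handler = endpoint
        # Wrap from the innermost layer (last index) outwards.
        for name in reversed(self.user_middleware):
            handler = self._wrap(name, handler)
        return handler

    @staticmethod
    def _wrap(name: str, inner: Handler) -> Handler:
        def handler(trace: List[str]) -> None:
            trace.append(name)  # record that this layer saw the request
            inner(trace)
        return handler


def runs_before(app: MiniApp, first: str, second: str) -> bool:
    """Structural check: an earlier index means earlier on the request path."""
    order = app.user_middleware
    return order.index(first) < order.index(second)


app = MiniApp()
app.add_middleware("JWTAuthMiddleware")          # registered first, runs second
app.add_middleware("BrandResolutionMiddleware")  # registered later, runs first

trace: List[str] = []
app.build(lambda t: t.append("endpoint"))(trace)
# trace == ["BrandResolutionMiddleware", "JWTAuthMiddleware", "endpoint"]
```

A structural check such as runs_before catches a reordering at import time, without needing a request that happens to exercise the brand-prefixed session path.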
Tests
- cd servers_v2/gateway && uv run pytest
- key suites:
  - tests/test_player_route_contracts.py
  - tests/test_legacy_routes.py
  - tests/test_middleware.py
  - tests/test_internal_auth_headers.py
  - tests/test_brand_session_lookup.py -- T1-D-C1 brand-aware session cross-check + dual-read fallback + cross-brand isolation