Player Service

Status

Active

Date

2026-04-28

Owners

Platform Backend

Last Verified Commit

Use git log -- <this file> for current last-touch history; this field is intentionally not pinned to a static hash so it does not become stale after unrelated commits.

Runtime

API + dedicated player_worker (the outbox loop runs in the worker; the API process runs with ENABLE_OUTBOX_POLLER: "false")

Purpose

player_service owns player authentication, registration helpers, profile data, messages, and public/common player-side data that back the gateway player flows.

Primary Entry Points

/internal/players/*
/internal/common/*

Internal player lookups use /internal/players/lookup with a mandatory X-Brand-Id. The only headerless-brand exception is /internal/players/relay-lookup, which is internal-token protected and requires both a globally stable player_id and an Aggregator-signed canonical brand_code. It joins the authoritative player/brand rows and returns data only when the brand is enabled and owns that player. There is no generic brandless lookup fallback.

Main route groups:

auth and registration helpers
player profile and account info
player messages
common notices, banners, captcha, and site settings

Protection model:

/internal/players/* and /internal/common/* require X-Internal-Service-Token
gateway is the intended caller for player-web traffic
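
As an illustration of the header contract above, the following is a minimal sketch of how a caller such as gateway might assemble headers for an internal request. The header names X-Internal-Service-Token and X-Brand-Id come from this document; the helper names build_internal_headers and headers_for_lookup are hypothetical, and the HTTP method and query shape of the lookup route are not specified here, so only header construction is shown.

```python
from typing import Dict, Optional


def build_internal_headers(service_token: str, brand_id: Optional[str] = None) -> Dict[str, str]:
    """Build headers for a call to /internal/players/* or /internal/common/*.

    Hypothetical helper: only the header names are taken from the document.
    """
    if not service_token or not service_token.strip():
        raise ValueError("internal service token must not be empty")
    headers = {"X-Internal-Service-Token": service_token}
    if brand_id is not None:
        headers["X-Brand-Id"] = str(brand_id)
    return headers


def headers_for_lookup(service_token: str, brand_id: Optional[str]) -> Dict[str, str]:
    """Headers for /internal/players/lookup, where X-Brand-Id is mandatory."""
    if brand_id is None or not str(brand_id).strip():
        raise ValueError("X-Brand-Id is mandatory for /internal/players/lookup")
    return build_internal_headers(service_token, brand_id)
```

The relay lookup would use build_internal_headers without a brand_id, since it relies on the signed brand_code rather than X-Brand-Id.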

Dependencies

PostgreSQL
Redis
JWT secret
AES and password-salt config
optional SMS provider config

Background Work

player_service runs its outbox poller in a dedicated player_worker.

That means:

the API process runs with ENABLE_OUTBOX_POLLER: "false"; the standalone player_worker runs with ENABLE_OUTBOX_POLLER: "true"
the API /health checks DB and Redis; player_worker exposes its own heartbeat/health (WORKER_HEARTBEAT_FILE)
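
The split between the API process and player_worker can be sketched as follows. This is an illustrative sketch only, not the service's actual code: the flag name ENABLE_OUTBOX_POLLER and the WORKER_HEARTBEAT_FILE concept come from this document, while the default of "false" when the flag is unset, the heartbeat file format (a Unix timestamp on one line), and all function names are assumptions.

```python
import os
import time
from pathlib import Path
from typing import Mapping, Optional


def outbox_poller_enabled(env: Mapping[str, str]) -> bool:
    """True only when ENABLE_OUTBOX_POLLER is the string "true" (assumed default: off)."""
    return env.get("ENABLE_OUTBOX_POLLER", "false").strip().lower() == "true"


def write_heartbeat(path: Path, now: Optional[float] = None) -> None:
    """Write the current timestamp atomically so readers never see a partial file."""
    ts = time.time() if now is None else now
    tmp = path.with_suffix(path.suffix + ".tmp")
    tmp.write_text(f"{ts:.3f}\n")
    os.replace(tmp, path)  # atomic rename when both paths are on the same filesystem


def heartbeat_is_fresh(path: Path, max_age_seconds: float, now: Optional[float] = None) -> bool:
    """Report whether the worker wrote a heartbeat within max_age_seconds."""
    try:
        last = float(path.read_text().strip())
    except (OSError, ValueError):
        return False  # missing or corrupt heartbeat counts as unhealthy
    current = time.time() if now is None else now
    return 0 <= current - last <= max_age_seconds
```

In this sketch the worker would call write_heartbeat once per poll iteration, and a liveness probe would call heartbeat_is_fresh with a threshold a few times larger than the poll interval.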

Owned Data

player auth and profile state
player registration semantics
player-owned outbox rows
message/common projections served to gateway

Events

Emits:

player-domain outbox events such as PLAYER_REGISTERED

Consumes:

no cross-service stream is currently documented as a required input

Health

API /health checks:
- DB
- Redis
- outbox loop status

Key Env Vars

DATABASE_URL
REDIS_URL
SECRET_KEY
PASSWORD_SALT — required in production; protects the password-hash pipeline. Empty in a prod-grade env crashes boot via assert_aes_pii_keys_configured.
AES_KEY — required in production; 32-byte AES-256 key for phone_num/bank_num PII columns. New writes use versioned AES-256-GCM with a random nonce (aesgcm:v1: prefix), and reads keep legacy AES-256-CBC compatibility for historical rows. Boot guard refuses to start if empty in RGB_ENV/APP_ENV/ENVIRONMENT ∈ {staging,production,prod}. Without this guard the _encrypt_pii helpers historically wrote plaintext (T1-D-C3).
AES_IV — required in production until the legacy CBC backfill is complete; 16-byte IV used only to decrypt historical AES-256-CBC rows. New writes do not reuse a static IV.
MULTI_BRAND_ENFORCEMENT — required; one of off / observe / enforce. Production target is enforce post-Phase-16. Drives multi_brand_enforcement_mode{service="player_service"} gauge.
JWT_PRIVATE_KEY — required in production; RS256 private key. Held only by player_service and agent_service (not on any verifier-only service). Rotation procedure documented in the rollout runbook under "JWT RS256 key rotation". app/main.py fails fast in production-grade runtimes when this is empty.
JWT_KID — required in production; currently-active kid stamped onto every freshly-minted JWT. app/main.py fails fast in production-grade runtimes when this is empty.
RECOVERY_TOKEN_TTL_MINUTES — TTL for recovery short codes / reset tokens.
RECOVERY_EXPOSE_TOKEN_FOR_TESTS — test-only escape hatch. When on, recovery routes log raw tel=<...> / code=<...> lines for integration-test inspection. assert_recovery_token_exposure_safe (Codex P1-#5) refuses to allow this when ANY of RGB_ENV / APP_ENV / ENVIRONMENT resolves to staging / production / prod. Production deploy templates MUST leave this unset.
SMS_URL
SMS_PLATFORM
COOLSMS_API_KEY
COOLSMS_API_SECRET
COOLSMS_FROM
PER_CALLER_TOKEN_REQUIRED — on is the Phase 16 target; activates the legacy-token hard-reject (T4-D-I2) for inbound /internal/players/* calls.
INTERNAL_SERVICE_TOKEN_GATEWAY — required in production; per-caller token accepted on inbound calls from gateway.
INTERNAL_SERVICE_TOKEN_GAME — required in production; per-caller token accepted on inbound lookup calls from game_service.
INTERNAL_SERVICE_TOKEN — legacy single-shared-token; deprecated. Phase 16 release gate requires the bare variant to be absent.

When PER_CALLER_TOKEN_REQUIRED=on, player_service fails fast at boot in production-grade runtimes unless the active gateway and game caller tokens above are configured. _PREV variants are accepted only during request-time rotation overlap and do not satisfy startup readiness.
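
The startup rule above can be sketched roughly as follows. This is a hedged illustration rather than the service's real boot code: the environment variable names come from this document, but the function names, the RuntimeError type, and the exact spelling of the _PREV variants (for example INTERNAL_SERVICE_TOKEN_GATEWAY_PREV) are assumptions.

```python
from typing import List, Mapping

PRODUCTION_ENVS = {"staging", "production", "prod"}
RUNTIME_ENV_VARS = ("RGB_ENV", "APP_ENV", "ENVIRONMENT")
# Only the active tokens count for readiness; _PREV variants are deliberately absent.
REQUIRED_CALLER_TOKENS = (
    "INTERNAL_SERVICE_TOKEN_GATEWAY",
    "INTERNAL_SERVICE_TOKEN_GAME",
)


def is_production_runtime(env: Mapping[str, str]) -> bool:
    """True when any of the runtime env vars names a production-grade environment."""
    return any(env.get(name, "").strip().lower() in PRODUCTION_ENVS for name in RUNTIME_ENV_VARS)


def missing_caller_tokens(env: Mapping[str, str]) -> List[str]:
    """Names of required per-caller tokens that are unset or blank."""
    return [name for name in REQUIRED_CALLER_TOKENS if not env.get(name, "").strip()]


def assert_caller_tokens_ready(env: Mapping[str, str]) -> None:
    """Fail fast when per-caller tokens are required in production but not configured."""
    if env.get("PER_CALLER_TOKEN_REQUIRED", "").strip().lower() != "on":
        return
    if not is_production_runtime(env):
        return
    missing = missing_caller_tokens(env)
    if missing:
        raise RuntimeError("missing per-caller tokens: " + ", ".join(missing))
```

Because the check looks only at the active variable names, an environment that carries just a _PREV token for a caller still fails readiness, matching the rotation-overlap rule described above.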

Boot guard: AES PII keys

app/main.py calls rgb_contracts.infra.aes_guard.assert_aes_pii_keys_configured() before the FastAPI app starts. In a production-grade runtime env (per is_production_runtime_env — any of RGB_ENV/APP_ENV/ENVIRONMENT set to staging/production/prod), an empty or whitespace-only AES_KEY, AES_IV, or PASSWORD_SALT raises AesPiiKeysMisconfigured and the service refuses to start. The same guard validates AES_KEY is 32 bytes and AES_IV is 16 bytes. Outside production the guard is a no-op so dev fixtures with empty keys keep working; the runtime helpers still emit pii_aes_unconfigured_total{service,op} + WARN logs in that branch so an accidental misconfiguration is visible on dashboards. Versioned GCM ciphertext that fails AEAD authentication raises in production regardless of AES_STRICT_DECRYPT; the AES_STRICT_DECRYPT flag now applies only to ambiguous legacy base64-shaped CBC/plaintext compatibility reads.
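
To make the guard's decision table concrete, here is a minimal sketch of the checks described above. It is not the rgb_contracts implementation: the function names carry a _sketch suffix to mark them as stand-ins, the exception class is a local stand-in for AesPiiKeysMisconfigured, and measuring key length as the UTF-8 byte length of the raw env value is an assumption (the real guard may decode hex or base64 first).

```python
from typing import Mapping

PRODUCTION_ENVS = {"staging", "production", "prod"}
RUNTIME_ENV_VARS = ("RGB_ENV", "APP_ENV", "ENVIRONMENT")


class AesPiiKeysMisconfigured(RuntimeError):
    """Local stand-in for the exception named in the document."""


def is_production_runtime_env_sketch(env: Mapping[str, str]) -> bool:
    """True when any of RGB_ENV / APP_ENV / ENVIRONMENT is staging, production, or prod."""
    return any(env.get(name, "").strip().lower() in PRODUCTION_ENVS for name in RUNTIME_ENV_VARS)


def assert_aes_pii_keys_configured_sketch(env: Mapping[str, str]) -> None:
    """Refuse to boot in production-grade envs when PII key material is missing or mis-sized."""
    if not is_production_runtime_env_sketch(env):
        return  # no-op outside production so dev fixtures with empty keys keep working
    for name in ("AES_KEY", "AES_IV", "PASSWORD_SALT"):
        if not env.get(name, "").strip():
            raise AesPiiKeysMisconfigured(f"{name} is empty or whitespace-only")
    # Assumption: lengths are checked on the UTF-8 bytes of the raw value.
    if len(env["AES_KEY"].encode("utf-8")) != 32:
        raise AesPiiKeysMisconfigured("AES_KEY must be 32 bytes")
    if len(env["AES_IV"].encode("utf-8")) != 16:
        raise AesPiiKeysMisconfigured("AES_IV must be 16 bytes")
```

Calling a guard like this before the application object is created means a misconfigured deployment fails at boot rather than at the first PII write.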

Multi-Brand Constraints

Per ADR-009:

every player aggregate row carries brand_id; uniqueness on player becomes (brand_id, account) so the same account string may register independently in two brands
registration resolves brand_id from the request domain and rejects any client-supplied brand override
agent_setting is keyed by (agent_id, brand_id) so a single agent may carry different registration modes per brand
agent_domain keeps its surrogate guid BIGINT PK; a composite UNIQUE(agent_id, brand_id, domain) is added on top of the existing global UNIQUE(domain) constraint, so a domain value cannot appear twice across any (agent_id, brand_id) combination: a domain belongs to exactly one brand and one agent
the domain:agent:{host} and domain:level:{host} Redis maps now resolve to values carrying brand_id
normal internal player routes require X-Brand-Id; mismatch behavior is gated by MULTI_BRAND_ENFORCEMENT (the same flag as gateway): observe logs + counts via brand_resolution_failed_total{reason="player_brand_mismatch",service="player_service"} and proceeds using the request brand; enforce rejects with the documented error envelope. The relay lookup does not consume X-Brand-Id; its signed brand_code and player join are the trust boundary.
player-owned outbox events include brand_id in payload
recovery flow brand precedence (per the spec section "Recovery flow brand precedence"): recovery tokens issued after Phase 5 carry brand_id as a signed claim. A token-embedded brand_id wins over the request domain; if the token lacks brand_id (issued before Phase 5), the flow falls back to the request domain; if the resolved brand does not match the underlying player's brand_id, the request is rejected hard regardless of MULTI_BRAND_ENFORCEMENT mode (recovery is too sensitive for observe-mode tolerance). Recovery responses for unknown accounts have the same shape as responses for known accounts to prevent existence-oracle leakage. The same email or phone registered in two brands recovers each brand independently with brand-scoped tokens
recovery flows write recovery_audit (alembic 0033) per ADR-009 audit guarantees: every observable outcome (REQUEST / VERIFY / RESET success or rejection, plus RATE_LIMITED / TOKEN_INVALID / BRAND_MISMATCH typed rejections) inserts one append-only row carrying brand_id, SHA-256 hash of contact, SHA-256 hash of source IP, and a per-action JSONB metadata blob (NEVER the contact, token, or password). The audit row commits in the same DB transaction as the side effect being audited (e.g. password UPDATE on /reset) so an audit-side failure rolls back the recovery write
recovery code & raw contact are NEVER logged in production (Codex P1-#6): the SMS-success / SMS-not-sent / SMS-exception / email-only paths log contact_hash=<sha256_first_8_hex> only, and emit recovery_sms_delivery_failed_total{service="player_service"} or recovery_email_unprovisioned_total{service="player_service"} for operator visibility. The raw tel=<...> code=<...> lines remain ONLY behind RECOVERY_EXPOSE_TOKEN_FOR_TESTS=on, an integration-test escape hatch that assert_recovery_token_exposure_safe refuses to allow when ANY of RGB_ENV / APP_ENV / ENVIRONMENT resolves to staging / production / prod (Codex P1-#5: previously the guard checked ENVIRONMENT only, so APP_ENV=production could silently bypass it)
the same no-PII-in-logs invariant covers the legacy registration SMS path /internal/players/register/sendSMS (Codex T1-D-C4). Previously the unprovisioned-provider fallback wrote logger.info(f"SMS code for {tel}: {sms_code}") and the hard-failure path wrote logger.error(f"SMS send failed for {tel}"), both leaking the registration OTP + raw phone into centralised logs. Redacted to contact_hash=<sha256_first_8_hex>; the unprovisioned + hard-failure branches both increment recovery_sms_delivery_failed_total{service="player_service"}; raw values still log behind RECOVERY_EXPOSE_TOKEN_FOR_TESTS=on
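
Two of the recovery rules above, brand precedence and contact redaction in logs, can be sketched as follows. This is an illustrative sketch under stated assumptions, not the service's code: the function and exception names are hypothetical, brand ids are modelled as integers, and the hash is computed over the raw UTF-8 contact string with no salt or normalisation, which the document does not confirm.

```python
import hashlib
from typing import Optional


class RecoveryBrandMismatch(Exception):
    """Hypothetical error for a recovery request whose brand does not own the player."""


def resolve_recovery_brand(
    token_brand_id: Optional[int],
    domain_brand_id: int,
    player_brand_id: int,
) -> int:
    """Apply the precedence: token claim first, request domain as fallback, then hard check."""
    # Tokens issued before Phase 5 carry no brand_id, modelled here as None.
    resolved = token_brand_id if token_brand_id is not None else domain_brand_id
    # The mismatch is rejected regardless of MULTI_BRAND_ENFORCEMENT mode.
    if resolved != player_brand_id:
        raise RecoveryBrandMismatch("resolved brand does not match the player's brand")
    return resolved


def contact_hash(contact: str) -> str:
    """First 8 hex characters of SHA-256, the only contact-derived value that is logged."""
    return hashlib.sha256(contact.encode("utf-8")).hexdigest()[:8]


def sms_failure_log_line(contact: str) -> str:
    """Build a log line that carries the hash and never the raw contact or code."""
    return f"recovery sms not sent contact_hash={contact_hash(contact)}"
```

Note that the token claim wins even when the request arrives on another brand's domain; the final comparison against the player's own brand_id is what turns a mismatch into a rejection.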

Tests

cd servers_v2/player_service && uv run pytest
key suites:
- tests/test_auth_contracts.py
- tests/test_profile_contracts.py
- tests/test_message_contracts.py
- tests/test_outbox_poller.py
- tests/test_internal_auth.py
