Player Service

Status

Active

Date

2026-04-28

Owners

  • Platform Backend

Last Verified Commit

56362a7a

Runtime

API + dedicated player_worker (the outbox loop runs in the worker; the API process runs with ENABLE_OUTBOX_POLLER: "false")

Purpose

player_service owns player authentication, registration helpers, profile data, messages, and public/common player-side data that back the gateway player flows.

Primary Entry Points

  • /internal/players/*
  • /internal/common/*

Main route groups:

  • auth and registration helpers
  • player profile and account info
  • player messages
  • common notices, banners, captcha, and site settings

Protection model:

  • /internal/players/* and /internal/common/* require X-Internal-Service-Token
  • gateway is the intended caller for player-web traffic
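
A minimal sketch of how the inbound token check could look, assuming the per-caller and legacy token variables listed under Key Env Vars. The helper name check_internal_token, the plain-mapping inputs, and the constant-time comparison are illustrative assumptions, not taken from the service code.

```python
import hmac
from typing import Mapping, Optional


def _same(a: str, b: str) -> bool:
    # Constant-time comparison; encoding to bytes avoids the ASCII-only
    # restriction hmac.compare_digest places on str arguments.
    return hmac.compare_digest(a.encode("utf-8"), b.encode("utf-8"))


def check_internal_token(
    headers: Mapping[str, str],
    env: Mapping[str, str],
) -> Optional[str]:
    """Return a caller label if the token is accepted, else None.

    Hypothetical helper: header and variable names follow this page,
    the matching logic itself is an assumption.
    """
    presented = headers.get("X-Internal-Service-Token", "")
    if not presented:
        return None

    # Per-caller token: identifies gateway specifically.
    gateway_token = env.get("INTERNAL_SERVICE_TOKEN_GATEWAY", "")
    if gateway_token and _same(presented, gateway_token):
        return "gateway"

    # Legacy shared token: honoured only while per-caller tokens are optional.
    per_caller_required = (
        env.get("PER_CALLER_TOKEN_REQUIRED", "").strip().lower() == "on"
    )
    legacy_token = env.get("INTERNAL_SERVICE_TOKEN", "")
    if not per_caller_required and legacy_token and _same(presented, legacy_token):
        return "legacy"

    return None
```

A route dependency would map a None result to a rejection. The lookup here is case-sensitive on the header name, which a real framework would normally handle for you.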

Dependencies

  • PostgreSQL
  • Redis
  • JWT secret
  • AES and password-salt config
  • optional SMS provider config

Background Work

player_service runs its outbox poller in a dedicated player_worker.

That means:

  • the API process runs with ENABLE_OUTBOX_POLLER: "false"; the standalone player_worker runs with ENABLE_OUTBOX_POLLER: "true"
  • the API /health checks DB and Redis; player_worker exposes its own heartbeat/health (WORKER_HEARTBEAT_FILE)
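
The split above can be sketched as follows. This is an illustration only: the page does not specify how ENABLE_OUTBOX_POLLER is parsed or what player_worker writes to WORKER_HEARTBEAT_FILE, so the "true"-only parsing, the timestamp file content, and all function names are assumptions.

```python
import time
from pathlib import Path
from typing import Mapping


def outbox_poller_enabled(env: Mapping[str, str]) -> bool:
    """Interpret ENABLE_OUTBOX_POLLER; only "true" enables the loop (assumed)."""
    return env.get("ENABLE_OUTBOX_POLLER", "false").strip().lower() == "true"


def touch_heartbeat(path: Path) -> None:
    """Record liveness by writing the current Unix time to the heartbeat file."""
    path.write_text(str(int(time.time())))


def heartbeat_is_fresh(path: Path, max_age_seconds: float) -> bool:
    """Report whether the heartbeat file was modified recently enough."""
    try:
        age = time.time() - path.stat().st_mtime
    except FileNotFoundError:
        # No file yet means the worker has never reported in.
        return False
    return age <= max_age_seconds
```

Under these assumptions the worker would call touch_heartbeat once per poll iteration, and an external probe would call heartbeat_is_fresh with a threshold a few times larger than the poll interval.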

Owned Data

  • player auth and profile state
  • player registration semantics
  • player-owned outbox rows
  • message/common projections served to gateway

Events

Emits:

  • player-domain outbox events such as PLAYER_REGISTERED

Consumes:

  • no cross-service stream is currently documented as a required input

Health

  • API /health checks:
    • DB
    • Redis
    • outbox loop status

Key Env Vars

  • DATABASE_URL
  • REDIS_URL
  • SECRET_KEY
  • PASSWORD_SALT: required in production; protects the password-hash pipeline. An empty value in a production-grade env crashes boot via assert_aes_pii_keys_configured.
  • AES_KEY: required in production; 32-byte AES-256 key for the phone_num/bank_num PII columns. New writes use versioned AES-256-GCM with a random nonce (aesgcm:v1: prefix), and reads keep legacy AES-256-CBC compatibility for historical rows. The boot guard refuses to start if the key is empty while any of RGB_ENV/APP_ENV/ENVIRONMENT ∈ {staging,production,prod}. Without this guard the _encrypt_pii helpers historically wrote plaintext (T1-D-C3).
  • AES_IV: required in production until the legacy CBC backfill is complete; 16-byte IV used only to decrypt historical AES-256-CBC rows. New writes do not reuse a static IV.
  • MULTI_BRAND_ENFORCEMENT: required; one of off / observe / enforce. Production target is enforce post-Phase-16. Drives the multi_brand_enforcement_mode{service="player_service"} gauge.
  • JWT_PRIVATE_KEY: required in production; RS256 private key. Held only by player_service and agent_service (not on any verifier-only service). The rotation procedure is documented in the rollout runbook under "JWT RS256 key rotation".
  • JWT_KID: currently active kid stamped onto every freshly minted JWT.
  • RECOVERY_TOKEN_TTL_MINUTES: TTL for recovery short codes / reset tokens.
  • RECOVERY_EXPOSE_TOKEN_FOR_TESTS: test-only escape hatch. When on, recovery routes log raw tel=<...> / code=<...> lines for integration-test inspection. assert_recovery_token_exposure_safe (Codex P1-#5) refuses to allow this when ANY of RGB_ENV / APP_ENV / ENVIRONMENT resolves to staging / production / prod. Production deploy templates MUST leave this unset.
  • SMS_URL
  • SMS_PLATFORM
  • COOLSMS_API_KEY
  • COOLSMS_API_SECRET
  • COOLSMS_FROM
  • PER_CALLER_TOKEN_REQUIRED: on is the Phase 16 target; activates the legacy-token hard-reject (T4-D-I2) for inbound /internal/players/* calls.
  • INTERNAL_SERVICE_TOKEN_GATEWAY: required in production; per-caller token accepted on inbound calls from gateway.
  • INTERNAL_SERVICE_TOKEN: legacy single shared token; deprecated. The Phase 16 release gate requires the bare variant to be absent.
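
The AES_KEY and AES_IV entries describe two stored formats: versioned GCM values carrying the aesgcm:v1: prefix, and legacy CBC rows. A read path has to tell them apart before decrypting. The sketch below shows only that dispatch step, with no actual decryption. The payload layout after the prefix (base64 of a 12-byte nonce followed by ciphertext and tag) and the names StoredPii and classify_stored_value are assumptions; the page documents the prefix only.

```python
import base64
from typing import NamedTuple, Optional

GCM_V1_PREFIX = "aesgcm:v1:"  # prefix named on this page
GCM_NONCE_BYTES = 12          # assumption: conventional 96-bit GCM nonce


class StoredPii(NamedTuple):
    scheme: str             # "aesgcm:v1" or "legacy"
    nonce: Optional[bytes]  # set only for the GCM scheme
    body: bytes             # ciphertext+tag (GCM) or the raw legacy value


def classify_stored_value(value: str) -> StoredPii:
    """Split a stored column value into scheme, nonce, and body."""
    if value.startswith(GCM_V1_PREFIX):
        # validate=True makes malformed base64 raise instead of being skipped.
        raw = base64.b64decode(value[len(GCM_V1_PREFIX):], validate=True)
        if len(raw) <= GCM_NONCE_BYTES:
            raise ValueError("aesgcm:v1 payload too short to contain a nonce")
        return StoredPii("aesgcm:v1", raw[:GCM_NONCE_BYTES], raw[GCM_NONCE_BYTES:])
    # Anything without the prefix goes to the legacy CBC/plaintext path.
    return StoredPii("legacy", None, value.encode("utf-8"))
```

Keying the dispatch on an explicit prefix means a GCM value can never be mistaken for a legacy row, which is what allows authentication failures on versioned values to be treated as hard errors.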

Boot guard: AES PII keys

app/main.py calls rgb_contracts.infra.aes_guard.assert_aes_pii_keys_configured() before the FastAPI app starts. In a production-grade runtime env (per is_production_runtime_env: any of RGB_ENV/APP_ENV/ENVIRONMENT set to staging/production/prod), an empty or whitespace-only AES_KEY, AES_IV, or PASSWORD_SALT raises AesPiiKeysMisconfigured and the service refuses to start. The same guard validates that AES_KEY is 32 bytes and AES_IV is 16 bytes. Outside production the guard is a no-op so dev fixtures with empty keys keep working; the runtime helpers still emit pii_aes_unconfigured_total{service,op} + WARN logs in that branch so an accidental misconfiguration is visible on dashboards.

Versioned GCM ciphertext that fails AEAD authentication raises in production regardless of AES_STRICT_DECRYPT; the AES_STRICT_DECRYPT flag now applies only to ambiguous legacy base64-shaped CBC/plaintext compatibility reads.
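
The guard's documented behaviour can be sketched as below. This is not the rgb_contracts implementation: the function and exception names are reused from this page for readability, but the explicit env parameter, the case-insensitive matching of environment values, and measuring key length on the UTF-8 encoded string are all assumptions.

```python
from typing import Mapping

PROD_ENV_VALUES = {"staging", "production", "prod"}
RUNTIME_ENV_VARS = ("RGB_ENV", "APP_ENV", "ENVIRONMENT")


class AesPiiKeysMisconfigured(RuntimeError):
    """Raised when required PII key material is missing or malformed."""


def is_production_runtime_env(env: Mapping[str, str]) -> bool:
    # Any one of the three variables is enough to count as production-grade.
    return any(
        env.get(name, "").strip().lower() in PROD_ENV_VALUES
        for name in RUNTIME_ENV_VARS
    )


def assert_aes_pii_keys_configured(env: Mapping[str, str]) -> None:
    if not is_production_runtime_env(env):
        return  # no-op outside production so dev fixtures keep working

    for name in ("AES_KEY", "AES_IV", "PASSWORD_SALT"):
        if not env.get(name, "").strip():
            raise AesPiiKeysMisconfigured(f"{name} is empty or whitespace-only")

    # Assumption: lengths are checked on the UTF-8 encoded value.
    if len(env["AES_KEY"].encode("utf-8")) != 32:
        raise AesPiiKeysMisconfigured("AES_KEY must be 32 bytes")
    if len(env["AES_IV"].encode("utf-8")) != 16:
        raise AesPiiKeysMisconfigured("AES_IV must be 16 bytes")
```

Calling the guard before the app object is created means a misconfigured deploy fails at startup instead of on the first PII write.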

Multi-Brand Constraints

Per ADR-009:

  • every player aggregate row carries brand_id; uniqueness on player becomes (brand_id, account) so the same account string may register independently in two brands
  • registration resolves brand_id from the request domain and rejects any client-supplied brand override
  • agent_setting is keyed by (agent_id, brand_id) so a single agent may carry different registration modes per brand
  • agent_domain keeps its surrogate guid BIGINT PK; a composite UNIQUE(agent_id, brand_id, domain) is added on top of the existing UNIQUE(domain) global constraint, so a domain value cannot appear twice across any (agent_id, brand_id) combination -- a domain belongs to exactly one brand and one agent
  • the domain:agent:{host} and domain:level:{host} Redis maps now resolve to values carrying brand_id
  • internal player routes require X-Brand-Id; mismatch behavior is gated by MULTI_BRAND_ENFORCEMENT (the same flag as gateway): observe logs + counts via brand_resolution_failed_total{reason="player_brand_mismatch",service="player_service"} and proceeds using the request brand; enforce rejects with the documented error envelope
  • player-owned outbox events include brand_id in payload
  • recovery flow brand precedence (per the spec section "Recovery flow brand precedence"): recovery tokens issued after Phase 5 carry brand_id as a signed claim. Token-embedded brand_id wins over the request domain; if the token lacks brand_id (issued before Phase 5), fall back to the request domain; if the resolved brand does not match the underlying player's brand_id, the request is rejected hard regardless of MULTI_BRAND_ENFORCEMENT mode (recovery is too sensitive for observe-mode tolerance). Recovery responses for unknown accounts have the same shape as those for known accounts to prevent existence-oracle leakage; the same email or phone registered in two brands recovers each brand independently with brand-scoped tokens
  • recovery flows write recovery_audit (alembic 0033) per ADR-009 audit guarantees: every observable outcome (REQUEST / VERIFY / RESET success or rejection, plus RATE_LIMITED / TOKEN_INVALID / BRAND_MISMATCH typed rejections) inserts one append-only row carrying brand_id, SHA-256 hash of contact, SHA-256 hash of source IP, and a per-action JSONB metadata blob (NEVER the contact, token, or password). The audit row commits in the same DB transaction as the side effect being audited (e.g. password UPDATE on /reset) so an audit-side failure rolls back the recovery write
  • recovery code & raw contact are NEVER logged in production (Codex P1-#6): the SMS-success / SMS-not-sent / SMS-exception / email-only paths log contact_hash=<sha256_first_8_hex> only, and emit recovery_sms_delivery_failed_total{service="player_service"} or recovery_email_unprovisioned_total{service="player_service"} for operator visibility. The raw tel=<...> code=<...> lines remain ONLY behind RECOVERY_EXPOSE_TOKEN_FOR_TESTS=on, an integration-test escape hatch that assert_recovery_token_exposure_safe refuses to allow when ANY of RGB_ENV / APP_ENV / ENVIRONMENT resolves to staging / production / prod (Codex P1-#5: previously the guard checked ENVIRONMENT only, so APP_ENV=production could silently bypass it)
  • the same no-PII-in-logs invariant covers the legacy registration SMS path /internal/players/register/sendSMS (Codex T1-D-C4). Previously the unprovisioned-provider fallback wrote logger.info(f"SMS code for {tel}: {sms_code}") and the hard-failure path wrote logger.error(f"SMS send failed for {tel}"), both leaking the registration OTP + raw phone into centralised logs. Both lines are now redacted to contact_hash=<sha256_first_8_hex>; the unprovisioned + hard-failure branches both increment recovery_sms_delivery_failed_total{service="player_service"}; raw values still log behind RECOVERY_EXPOSE_TOKEN_FOR_TESTS=on
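
Two of the recovery rules above are small enough to sketch: the brand precedence order and the contact_hash value used in log lines. The function names, the BrandMismatch exception, and the integer brand_id type are hypothetical; the page does not say whether the contact is normalised before hashing, so none is applied here.

```python
import hashlib
from typing import Optional


class BrandMismatch(Exception):
    """Resolved brand differs from the player's brand; always a hard reject."""


def contact_hash(contact: str) -> str:
    """First 8 hex characters of SHA-256 over the contact, for log lines."""
    return hashlib.sha256(contact.encode("utf-8")).hexdigest()[:8]


def resolve_recovery_brand(
    token_brand_id: Optional[int],
    domain_brand_id: int,
    player_brand_id: int,
) -> int:
    """Apply the precedence: token claim first, request domain as fallback."""
    # Tokens issued before Phase 5 carry no brand_id claim.
    if token_brand_id is not None:
        resolved = token_brand_id
    else:
        resolved = domain_brand_id

    # Rejected regardless of MULTI_BRAND_ENFORCEMENT mode.
    if resolved != player_brand_id:
        raise BrandMismatch("resolved brand does not match the player's brand")
    return resolved
```

A log line would then carry f"contact_hash={contact_hash(tel)}" in place of the raw number. A caller catching BrandMismatch would be expected to write the BRAND_MISMATCH audit row before returning the rejection.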

Tests

  • cd servers_v2/player_service && uv run pytest
  • key suites:
    • tests/test_auth_contracts.py
    • tests/test_profile_contracts.py
    • tests/test_message_contracts.py
    • tests/test_outbox_poller.py
    • tests/test_internal_auth.py