Agent Service

Status: Active

Date: 2026-04-28

Owners: Platform Backend

Last Verified Commit: 56362a7a

Runtime: API + standalone worker

Purpose

agent_service owns the agent-facing portal APIs and the legacy compatibility aliases still required by older agent frontends.

Primary Entry Points

agent_service has an empty api_prefix (servers_v2/agent_service/app/core/settings/app.py), so routes are mounted at the top level. There is no /api/v1/ namespace.

New-shape routes (mounted via app/api/routes/api.py):

  • /agent/* — auth (/agent/login, /agent/forget, …)
  • /agent/home/* — dashboard
  • /agent/withdraw/* — withdrawals (/agent/withdraw/init, /agent/withdraw/add)
  • /agent/player/* — downstream-player views
  • /agent/sub/* — sub-agent management
  • /agent/bank/* — bank cards
  • /agent/messages/* — messages
  • /agent/coupon/* — coupons (/agent/coupon/grant, /agent/coupon/recycle)
  • /agent/provider/* — game providers
  • /agent/common/* — shared endpoints
  • /agent/recovery/* — SMS-driven password reset

Legacy aliases (mounted from app/api/routes/legacy_routes.py, include_in_schema=False) preserve top-level paths the older agent frontend already calls:

  • /user/login, /user/refresh_token, /user/forget, /user/top_info, /user/info, /user/edit/password, /user/edit/bank/*, /user/edit/phone/*
  • /home/*
  • /withdraw/*
  • /player/*
  • /provider/* (list / percentage / sub/percentage / income)
  • /messages/* (list / {id} / read / unread/count)
  • /bank/list, /common/agent_min_withdrawal_limit
  • /coupon/* (grant / recycle / logs / recycle/logs)
  • /agent/* legacy reporting endpoints (/agent/agentStatistic, /agent/agentStatisticDaily, /agent/statisticDaily) — separate from the new /agent/sub/* namespace above, distinguished by suffix

For the authoritative current list, regenerate the OpenAPI snapshot with python3 tools/openapi/export_openapi.py and inspect docs/reference/openapi/agent_service.json.
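
Once the snapshot has been regenerated, a few lines of Python are enough to list what it contains. The sketch below is illustrative only: load_spec, list_routes and routes_under are hypothetical helper names, not part of tools/openapi/, and the only assumption about the file is that it is a standard OpenAPI JSON document with a top-level "paths" object.

```python
import json
from pathlib import Path

# HTTP method keys allowed on an OpenAPI path item. Other keys such as
# "parameters" or "summary" are not operations and are skipped.
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch", "trace"}


def load_spec(path: str) -> dict:
    """Read an exported OpenAPI JSON document from disk."""
    return json.loads(Path(path).read_text(encoding="utf-8"))


def list_routes(spec: dict) -> list:
    """Return (METHOD, path) pairs sorted by path, then by method."""
    routes = []
    for path, item in spec.get("paths", {}).items():
        for key in item:
            if key.lower() in HTTP_METHODS:
                routes.append((key.upper(), path))
    return sorted(routes, key=lambda r: (r[1], r[0]))


def routes_under(spec: dict, prefix: str) -> list:
    """Filter the route list to one namespace, e.g. "/agent/withdraw/"."""
    return [r for r in list_routes(spec) if r[1].startswith(prefix)]
```

Typical use would be routes_under(load_spec("docs/reference/openapi/agent_service.json"), "/agent/withdraw/"). Note that FastAPI omits routes registered with include_in_schema=False from the generated schema, so unless the export script handles them separately, the legacy aliases are not expected to appear in this listing.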

Dependencies

  • PostgreSQL
  • Redis
  • JWT secret
  • AES and password-salt config

Reserved configuration

  • ADMIN_SERVER_HOST — Reserved (no current runtime calls; placeholder for future cross-service hooks). agent_service does not issue HTTP requests to admin_service today; see servers_v2/agent_service/CLAUDE.md.

Background Work

agent_worker publishes agent outbox rows to Redis Streams.
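
For readers unfamiliar with the outbox pattern, the following is a minimal, generic sketch of one polling pass, not the agent_worker implementation. OutboxRow, publish_pending and the injected publish callable are all hypothetical; in a real worker the callable would wrap a Redis Streams XADD and the rows would come from PostgreSQL.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass
class OutboxRow:
    """Hypothetical in-memory stand-in for one outbox table row."""
    id: int
    event_type: str
    payload: dict
    published: bool = False


def publish_pending(rows: Iterable[OutboxRow],
                    publish: Callable[[str, dict], None]) -> int:
    """Publish unpublished rows in id order and return how many were sent.

    The pass stops at the first failure so later rows are not published
    ahead of an earlier one; the failed row stays unpublished and is
    retried on the next pass.
    """
    sent = 0
    for row in sorted(rows, key=lambda r: r.id):
        if row.published:
            continue
        try:
            publish(row.event_type, row.payload)
        except Exception:
            break
        row.published = True
        sent += 1
    return sent
```

Because a row is marked published only after the publish call returns, a crash between the two steps would re-send the row on the next pass. A design like this is at-least-once, so consumers should tolerate duplicates.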

Owned Data

  • agent portal operational flows
  • agent-owned outbox rows
  • agent compatibility semantics for legacy clients

Events

Emits:

  • agent-domain outbox events published by agent_worker

Consumes:

  • no required upstream stream consumer is currently documented

Health

  • API /health checks DB and Redis
  • agent_worker uses heartbeat-backed supervision for outbox publishing

Key Env Vars

  • DATABASE_URL
  • REDIS_URL
  • SECRET_KEY
  • PASSWORD_SALT: required in production; protects the password-hash pipeline. Empty in a prod-grade env crashes boot via assert_aes_pii_keys_configured.
  • AES_KEY: required in production; 32-byte AES-256 key for phone_num/bank_num/card_number PII columns on agent and agent_withdraw. New writes use versioned AES-256-GCM with a random nonce (aesgcm:v1: prefix), and reads keep legacy AES-256-CBC compatibility for historical rows. Boot guard refuses to start with an empty value in a prod-grade env. T1-D-C3: previously the _encrypt_pii fallback wrote plaintext to agent.phone_num / agent.bank_num when this was unset.
  • AES_IV: required in production until the legacy CBC backfill is complete; 16-byte IV used only to decrypt historical AES-256-CBC rows. New writes do not reuse a static IV.
  • MULTI_BRAND_ENFORCEMENT — required; one of off / observe / enforce. Production target is enforce post-Phase-16. Drives multi_brand_enforcement_mode{service="agent_service"} gauge and the agent_brand allow-list reject decision in AgentBrandResolutionMiddleware.
  • JWT_PRIVATE_KEY: required in production; RS256 private key. Held only by player_service and agent_service.
  • JWT_KID — currently-active kid.
  • RECOVERY_TOKEN_TTL_MINUTES — TTL for agent recovery short codes / reset tokens.
  • RECOVERY_EXPOSE_TOKEN_FOR_TESTS: test-only escape hatch. Same boot-guard semantics as on player_service (Codex P1-#5).
  • BRAND_SIGNING_KEY: not used by agent_service. agent_service has no outbound wallet_client and signs no wallet writes, so it does not load a brand-signing key. The 6 signing callers are gateway, admin_service, game_service, promotion_service, recon_service, rolling_service (see docs/runbooks/multi-brand/multi-brand-isolation-rollout.md "Stage A" caller list and the BRAND_SIGNING_KEY rotation section).
  • INTERNAL_SERVICE_TOKEN_AGENT — shipped in the shared &internal-service-env compose anchor and therefore visible to every consumer (wallet/rolling/promotion/etc.) as the inbound credential they would accept from a future caller identifying itself as X-Caller-Service: agent. agent_service itself does NOT emit this header today (no outbound /internal/* HTTP client), so the env var exists as future-proofing rather than as an active runtime requirement.
  • PER_CALLER_TOKEN_REQUIRED: on is the Phase 16 target; it activates the legacy-token hard-reject (T4-D-I2) on inbound internal calls.
  • SMS_URL
  • ENABLE_OUTBOX_POLLER
  • INTERNAL_SERVICE_TOKEN — legacy single-shared-token; deprecated. Phase 16 release gate requires the bare variant to be absent.
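
The three MULTI_BRAND_ENFORCEMENT modes listed above can be illustrated with a small decision function. This is a sketch under stated assumptions, not the AgentBrandResolutionMiddleware code: EnforcementMode, parse_mode, allow_request and mismatch_counts are hypothetical names, the Counter stands in for a real metric, and treating off as "skip the check entirely" is an assumption, since this page only spells out observe and enforce.

```python
import enum
import logging
from collections import Counter
from typing import Optional, Set

logger = logging.getLogger("agent_service.brand")
mismatch_counts = Counter()  # stand-in for a real metrics counter


class EnforcementMode(enum.Enum):
    OFF = "off"
    OBSERVE = "observe"
    ENFORCE = "enforce"


def parse_mode(raw: Optional[str]) -> EnforcementMode:
    """Parse the env value; the variable is required, so empty is an error."""
    if raw is None or not raw.strip():
        raise ValueError("MULTI_BRAND_ENFORCEMENT is required")
    # Enum lookup raises ValueError for anything other than the three modes.
    return EnforcementMode(raw.strip())


def allow_request(mode: EnforcementMode, resolved_brand: int,
                  allowed_brands: Set[int]) -> bool:
    """Return True if the request may proceed under the given mode."""
    if mode is EnforcementMode.OFF:
        return True  # assumption: "off" performs no allow-list check
    if resolved_brand in allowed_brands:
        return True
    # Mismatch: observe logs and counts but lets the request through,
    # enforce logs, counts, and rejects.
    mismatch_counts[mode.value] += 1
    logger.warning("brand %s not in agent allow list (mode=%s)",
                   resolved_brand, mode.value)
    return mode is EnforcementMode.OBSERVE
```

In a service the mode would be parsed once at startup, for example parse_mode(os.environ.get("MULTI_BRAND_ENFORCEMENT")), so that a missing or misspelled value fails fast instead of silently disabling enforcement.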

Boot guard: AES PII keys

app/main.py calls rgb_contracts.infra.aes_guard.assert_aes_pii_keys_configured() before the FastAPI app starts. Same shape as player_service — production-grade env with any of AES_KEY/AES_IV/PASSWORD_SALT empty raises AesPiiKeysMisconfigured, and the guard validates AES_KEY is 32 bytes plus AES_IV is 16 bytes. Non-prod runs are a no-op; runtime fallbacks emit pii_aes_unconfigured_total{service="agent_service",op} + WARN logs. Versioned GCM ciphertext that fails AEAD authentication raises in production regardless of AES_STRICT_DECRYPT; the flag now applies only to ambiguous legacy base64-shaped CBC/plaintext compatibility reads.
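
The checks described above can be pictured roughly as follows. This is a hedged sketch, not the rgb_contracts source: check_pii_keys and is_prod_grade are hypothetical names, the exception class is a local stand-in for AesPiiKeysMisconfigured, the prod-grade detection (RGB_ENV / APP_ENV / ENVIRONMENT set to staging / production / prod) is borrowed from the recovery-token guard described under Multi-Brand Constraints, and measuring key length on the UTF-8 bytes of the env value is an assumption about encoding.

```python
from typing import Mapping

ENV_NAMES = ("RGB_ENV", "APP_ENV", "ENVIRONMENT")  # assumed detection inputs
PROD_GRADE = {"staging", "production", "prod"}     # assumed prod-grade values


class AesPiiKeysMisconfigured(RuntimeError):
    """Local stand-in for the exception raised by the real guard."""


def is_prod_grade(env: Mapping[str, str]) -> bool:
    return any(env.get(name, "").strip().lower() in PROD_GRADE
               for name in ENV_NAMES)


def check_pii_keys(env: Mapping[str, str]) -> None:
    """Raise if a prod-grade env has empty or wrongly sized PII key material."""
    if not is_prod_grade(env):
        return  # non-prod runs are a no-op
    empty = [n for n in ("AES_KEY", "AES_IV", "PASSWORD_SALT") if not env.get(n)]
    if empty:
        raise AesPiiKeysMisconfigured("empty in prod-grade env: " + ", ".join(empty))
    # Assumption: lengths are checked on the UTF-8 bytes of the raw value.
    if len(env["AES_KEY"].encode("utf-8")) != 32:
        raise AesPiiKeysMisconfigured("AES_KEY must be 32 bytes")
    if len(env["AES_IV"].encode("utf-8")) != 16:
        raise AesPiiKeysMisconfigured("AES_IV must be 16 bytes")
```

Passing the environment in as a mapping (for example check_pii_keys(os.environ) at startup) keeps the check easy to unit-test without mutating process state.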

Multi-Brand Constraints

Per ADR-009:

  • the agent aggregate is brand-global; agent rows do not carry brand_id
  • agent_brand is a join aggregate of (agent_id, brand_id, status) and is the only place where brand membership for an agent is recorded
  • admin_service writes agent_brand (which brands an agent may serve); agent_service reads it and enforces it on every agent-facing route
  • agent_setting is keyed per (agent_id, brand_id) so a single agent can hold different registration modes per brand
  • agent_domain keeps its surrogate guid BIGINT PK; a composite UNIQUE(agent_id, brand_id, domain) is added on top of the existing UNIQUE(domain); a domain cannot be reused across brands or across agents
  • the agent frontend resolves brand from its request domain (per ADR-009); a request whose resolved brand is not in the authenticated agent's agent_brand allow list is checked per MULTI_BRAND_ENFORCEMENT (observe logs + counts; enforce rejects)
  • agent_service strips any inbound X-Brand-Id header on its agent-frontend edge before injecting the resolved value (mirrors the gateway strip; prevents external header spoofing)
  • agent-domain outbox events include brand_id in payload (envelope field on DomainEvent, schema_version=2)
  • agent_brand reads are cached per process with a 60s TTL and invalidated via the AGENT_BRAND_CHANGED Redis pub/sub channel published by admin_service. An active JWT survives agent_brand revocation, but the next request after revocation is rejected by the allow-list check
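  • agent_service makes no outbound /internal/* calls today (see INTERNAL_SERVICE_TOKEN_AGENT and BRAND_SIGNING_KEY under Key Env Vars); any internal calls to downstream services added later use the per-caller token INTERNAL_SERVICE_TOKEN_AGENT, and brand-scoped wallet write commands carry X-Brand-Signature per ADR-009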
  • agent recovery flow follows the same brand-precedence contract as player recovery (per spec ### Recovery flow brand precedence): token-embedded brand_id wins over request domain, fall back to domain when token lacks the claim, hard-reject when the resolved brand is not enabled in agent_brand regardless of MULTI_BRAND_ENFORCEMENT mode, and unknown-account responses are shape-identical to known-account responses
  • agent recovery flows write recovery_audit (alembic 0033) per ADR-009 audit guarantees with subject_type='agent': every observable outcome (REQUEST / VERIFY / RESET success or rejection, plus RATE_LIMITED / TOKEN_INVALID / BRAND_MISMATCH typed rejections) inserts one append-only row carrying brand_id, SHA-256 hash of contact, SHA-256 hash of source IP, and a per-action JSONB metadata blob (NEVER the contact, token, or password). RESET writes the audit row and the password UPDATE in one transaction; an audit failure rolls back the password change
  • agent recovery code & raw contact are NEVER logged in production (Codex P1-#6): the SMS-success / SMS-not-sent / SMS-exception / email-only paths log contact_hash=<sha256_first_8_hex> only, and emit recovery_sms_delivery_failed_total{service="agent_service"} or recovery_email_unprovisioned_total{service="agent_service"} for operator visibility. The raw tel=<...> code=<...> lines remain ONLY behind RECOVERY_EXPOSE_TOKEN_FOR_TESTS=on, an integration-test escape hatch that assert_recovery_token_exposure_safe refuses to allow when ANY of RGB_ENV / APP_ENV / ENVIRONMENT resolves to staging / production / prod (Codex P1-#5: previously the guard checked ENVIRONMENT only)
  • legacy auth.py agent flows (/agent/forget, /agent/edit/phone, /agent/edit/bank, and the matching send-sms endpoints) follow the same no-PII-in-logs invariant (Codex T1-D-C4):
    • Previously these wrote logger.warning(f"... SMS code for agent {agent_id}: {code}") whenever the SMS provider was unprovisioned (very common in staging / dev). That exposed the OTPs that gate /edit/phone and /edit/bank, so anyone with log read access could fully take over an agent, including changing payout banks. The log lines are now redacted to contact_hash=<sha256_first_8_hex> only, and failures are surfaced via recovery_sms_delivery_failed_total{service="agent_service"}
    • Audit coverage for /agent/forget: the flow is brand-scoped through the resolved request brand plus agent_brand; every post-resolution recovery_audit row carries that resolved brand_id (only the missing-brand-context branch uses brand_id=0 because no tenant was resolved)
    • Audit coverage for the edit flows: /agent/edit/phone, /agent/edit/bank and their send-sms siblings write an admin_audit row with operator_id='agent:<guid>', resource='agent', and action one of AGENT_EDIT_PHONE / AGENT_EDIT_BANK / AGENT_EDIT_PHONE_SEND_SMS / AGENT_EDIT_BANK_SEND_SMS. The UPDATE-side audit rows commit in the SAME transaction as the agent-row UPDATE so a payout-account swap cannot proceed unaudited
    • All SMS OTPs in the active agent auth and legacy compatibility surfaces are generated with a CSPRNG (secrets.randbelow) rather than random.randint; new-value payloads carry contact_hash / bank_id only, never the raw phone, raw card, or OTP
    • The test escape hatch (RECOVERY_EXPOSE_TOKEN_FOR_TESTS=on) is identical to the recovery-path semantics
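
The log-redaction and OTP rules above reduce to two small standard-library helpers. The sketch below is illustrative only: contact_hash, generate_otp and sms_failure_log_line are hypothetical names, hashing the UTF-8 bytes of the contact without normalization is an assumption, and so is the 6-digit OTP length.

```python
import hashlib
import secrets


def contact_hash(contact: str) -> str:
    """First 8 hex characters of the SHA-256 of the contact string.

    Assumption: the contact is hashed as UTF-8 with no normalization.
    """
    return hashlib.sha256(contact.encode("utf-8")).hexdigest()[:8]


def generate_otp(digits: int = 6) -> str:
    """Zero-padded numeric OTP drawn from the OS CSPRNG.

    secrets.randbelow replaces random.randint, whose generator is not
    suitable for security-sensitive values. The default length is an
    assumption.
    """
    return f"{secrets.randbelow(10 ** digits):0{digits}d}"


def sms_failure_log_line(contact: str) -> str:
    """Build a log message that carries only the redacted contact hash."""
    return f"agent SMS not delivered contact_hash={contact_hash(contact)}"
```

The point of the design is that neither the raw contact nor the OTP is ever passed to the logger, so there is nothing for a log reader to recover; the truncated hash still lets operators correlate repeated failures for the same contact.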

Tests

  • cd servers_v2/agent_service && uv run pytest
  • key suites:
    • tests/test_agent_integration.py
    • tests/test_outbox_poller.py
    • tests/test_events_health.py