Agent Service

Status

Active

Date

2026-04-28

Owners

Platform Backend

Last Verified Commit

Use git log -- <this file> for current last-touch history; this field is intentionally not pinned to a static hash so it does not become stale after unrelated commits.

Runtime

API + standalone worker

Purpose

agent_service owns the agent-facing portal APIs and the legacy compatibility aliases still required by older agent frontends.

Primary Entry Points

agent_service has an empty api_prefix (servers_v2/agent_service/app/core/settings/app.py), so routes are mounted at the top level. There is no /api/v1/ namespace.

New-shape routes (mounted via app/api/routes/api.py):

/agent/* — auth (/agent/login, /agent/forget, …)
/agent/home/* — dashboard
/agent/withdraw/* — withdrawals (/agent/withdraw/init, /agent/withdraw/add)
/agent/player/* — downstream-player views
/agent/sub/* — sub-agent management
/agent/bank/* — bank cards
/agent/messages/* — messages
/agent/coupon/* — coupons (/agent/coupon/grant, /agent/coupon/recycle)
/agent/provider/* — game providers
/agent/common/* — shared endpoints
/agent/recovery/* — SMS-driven password reset

Legacy aliases (mounted from app/api/routes/legacy_routes.py, include_in_schema=False) preserve top-level paths the older agent frontend already calls:

/user/login, /user/refresh_token, /user/forget, /user/top_info, /user/info, /user/edit/password, /user/edit/bank/*, /user/edit/phone/*
/home/*
/withdraw/*
/player/*
/provider/* (list / percentage / sub/percentage / income)
/messages/* (list / {id} / read / unread/count)
/bank/list, /common/agent_min_withdrawal_limit
/coupon/* (grant / recycle / logs / recycle/logs)
/agent/* legacy reporting endpoints (/agent/agentStatistic, /agent/agentStatisticDaily, /agent/statisticDaily) — separate from the new /agent/sub/* namespace above, distinguished by suffix

For the authoritative current list, regenerate the OpenAPI snapshot with python3 tools/openapi/export_openapi.py and inspect docs/reference/openapi/agent_service.json.
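
As a minimal sketch of inspecting the regenerated snapshot, the following lists the route paths it declares. It assumes the snapshot uses the standard OpenAPI layout with a top-level paths object; list_paths and main are hypothetical helper names, not part of the repository.

```python
import json
from pathlib import Path

# Path taken from the text above; adjust if the repository layout differs.
SNAPSHOT = Path("docs/reference/openapi/agent_service.json")


def list_paths(spec: dict) -> list[str]:
    """Return the sorted route paths declared in an OpenAPI document."""
    return sorted(spec.get("paths", {}))


def main() -> None:
    # Run from the repository root after regenerating the snapshot.
    if not SNAPSHOT.exists():
        print(f"snapshot not found: {SNAPSHOT}")
        return
    spec = json.loads(SNAPSHOT.read_text(encoding="utf-8"))
    for path in list_paths(spec):
        print(path)


if __name__ == "__main__":
    main()
```

FastAPI omits routes registered with include_in_schema=False from the schema it generates, so if the snapshot is exported from the app's own schema, the legacy aliases listed above would not be expected to appear in this output; app/api/routes/legacy_routes.py remains the reference for those.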

Dependencies

PostgreSQL
Redis
JWT secret
AES and password-salt config

Cross-Service Calls

WALLET_SERVICE_URL — used by the durable coupon wallet saga worker to call wallet_service /internal/wallet/v2/coupons/grant and /internal/wallet/v2/coupons/recycle.
ADMIN_SERVER_HOST — Reserved (no current runtime calls; placeholder for future cross-service hooks). agent_service does not issue HTTP requests to admin_service today; its active cross-service HTTP dependency is wallet_service for the coupon wallet saga.

Background Work

agent_worker owns all agent background loops:

publishes agent outbox rows to Redis Streams
runs hourly agent outbox retention cleanup
dispatches the agent coupon wallet saga

The API process does not run the coupon wallet saga. Coupon grant/recycle requests commit local audit rows plus agent_coupon_wallet_saga; the worker then calls wallet_service with a brand-signed idempotent request. Grant messages and coupon events are emitted only after wallet settlement succeeds. Recycle credits agent.coupon (the legacy column exposed to clients as coupon_balance) only after wallet has debited the player's topology coupon grant. This is a settlement-safety scaffold, not final Single Money Writer closure: agent.coupon (exposed as coupon_balance) remains a legacy-compatible direct write until ADR-010 moves agent-money commands behind the wallet-owned path and agent_balance_legacy_write_total stays at zero for the required observation window.
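
The ordering described above can be sketched with an in-memory model. This is an illustration under stated assumptions, not the service's implementation: SagaRow, Store, request_coupon_op, dispatch_pending, the status values, and the event name are all hypothetical, and the wallet call is injected as a plain callable instead of a brand-signed HTTP request.

```python
import uuid
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class SagaRow:
    # Hypothetical shape; the real agent_coupon_wallet_saga columns may differ.
    idempotency_key: str
    kind: str  # "grant" or "recycle"
    agent_id: int
    player_id: int
    amount: int
    status: str = "pending"


@dataclass
class Store:
    audit_rows: list = field(default_factory=list)
    saga_rows: list = field(default_factory=list)
    events: list = field(default_factory=list)
    agent_coupon: dict = field(default_factory=dict)  # agent_id -> agent.coupon


def request_coupon_op(store: Store, kind: str, agent_id: int,
                      player_id: int, amount: int) -> SagaRow:
    """API side: record intent only. No wallet call, no events, no credit."""
    row = SagaRow(uuid.uuid4().hex, kind, agent_id, player_id, amount)
    store.audit_rows.append((kind, agent_id, player_id, amount))
    store.saga_rows.append(row)
    return row


def dispatch_pending(store: Store, call_wallet: Callable[[SagaRow], bool]) -> None:
    """Worker side: settle each pending row via the wallet, then apply effects."""
    for row in store.saga_rows:
        if row.status != "pending":
            continue  # already settled; the idempotency key guards wallet-side replays
        if not call_wallet(row):
            continue  # wallet rejected or unreachable: leave pending for a later pass
        row.status = "settled"
        if row.kind == "recycle":
            # Credit agent.coupon only after the wallet has debited the player.
            store.agent_coupon[row.agent_id] = (
                store.agent_coupon.get(row.agent_id, 0) + row.amount
            )
        else:
            # Grant events are emitted only after wallet settlement succeeds.
            store.events.append(("coupon_grant", row.idempotency_key))
```

In this sketch a failed wallet call has no local side effects, which is the property the settlement-safety scaffold is meant to provide.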

Owned Data

agent portal operational flows
agent-owned outbox rows
agent compatibility semantics for legacy clients

Events

Emits:

agent-domain outbox events published by agent_worker

Consumes:

no required upstream stream consumer is currently documented

Health

API /health checks DB and Redis
agent_worker uses heartbeat-backed supervision for outbox publishing, outbox cleanup, and coupon wallet saga dispatch
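
One way heartbeat-backed supervision can work is sketched below: each loop records a timestamp on every iteration, and a supervisor flags loops whose last beat is missing or too old. The class, the loop names, and the staleness threshold are assumptions for illustration; the actual agent_worker mechanism is not documented here.

```python
import time
from typing import Optional

# Hypothetical loop names for the three agent_worker responsibilities.
LOOPS = ["outbox_publisher", "outbox_cleanup", "coupon_wallet_saga"]


class HeartbeatRegistry:
    """Tracks the last heartbeat per loop and reports stale loops."""

    def __init__(self, max_age_seconds: float) -> None:
        self.max_age_seconds = max_age_seconds
        self._last_beat: dict[str, float] = {}

    def beat(self, loop_name: str, now: Optional[float] = None) -> None:
        # Loops call this once per iteration; `now` is injectable for tests.
        self._last_beat[loop_name] = time.monotonic() if now is None else now

    def stale_loops(self, expected: list[str],
                    now: Optional[float] = None) -> list[str]:
        # A loop is stale if it never beat or its last beat is too old.
        current = time.monotonic() if now is None else now
        return [
            name for name in expected
            if name not in self._last_beat
            or current - self._last_beat[name] > self.max_age_seconds
        ]
```

A supervisor could call stale_loops(LOOPS) periodically and restart or alert on anything it returns.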

Key Env Vars

DATABASE_URL
REDIS_URL
SECRET_KEY
PASSWORD_SALT — required in production; protects the password-hash pipeline. Empty in a prod-grade env crashes boot via assert_aes_pii_keys_configured.
AES_KEY — required in production; 32-byte AES-256 key for phone_num/bank_num/card_number PII columns on agent and agent_withdraw. New writes use versioned AES-256-GCM with a random nonce (aesgcm:v1: prefix), and reads keep legacy AES-256-CBC compatibility for historical rows. Boot guard refuses to start with an empty value in a prod-grade env. T1-D-C3: previously the _encrypt_pii fallback wrote plaintext to agent.phone_num / agent.bank_num when this was unset.
AES_IV — required in production until the legacy CBC backfill is complete; 16-byte IV used only to decrypt historical AES-256-CBC rows. New writes do not reuse a static IV.
MULTI_BRAND_ENFORCEMENT — required; one of off / observe / enforce. Production target is enforce post-Phase-16. Drives multi_brand_enforcement_mode{service="agent_service"} gauge and the agent_brand allow-list reject decision in AgentBrandResolutionMiddleware.
JWT_PRIVATE_KEY — required in production; RS256 private key. Held only by player_service and agent_service. app/main.py fails fast in production-grade runtimes when this is empty.
JWT_KID — required in production; currently-active kid stamped onto every freshly-minted agent JWT. app/main.py fails fast in production-grade runtimes when this is empty.
RECOVERY_TOKEN_TTL_MINUTES — TTL for agent recovery short codes / reset tokens.
RECOVERY_EXPOSE_TOKEN_FOR_TESTS — test-only escape hatch. Same boot-guard semantics as on player_service (Codex P1-#5).
WALLET_SERVICE_URL — wallet base URL used by the coupon wallet saga.
BRAND_SIGNING_KEY — required for brand-signed outbound wallet writes when wallet brand-signature enforcement is on.
ENABLE_COUPON_WALLET_SAGA_WORKER — enabled on agent_worker and disabled on the API container in compose; disables only the coupon wallet saga dispatcher, not the API routes.
INTERNAL_SERVICE_TOKEN_AGENT — per-caller token accepted by wallet when agent_service calls wallet with X-Caller-Service: agent.
INTERNAL_SERVICE_TOKEN_GATEWAY — required in production on agent_service; per-caller token accepted by internal recovery routes from gateway.
INTERNAL_SERVICE_TOKEN_ADMIN — required in production on agent_service; per-caller token accepted by internal recovery/admin-tooling routes from admin_service.
PER_CALLER_TOKEN_REQUIRED — on is the Phase 16 target; activates the legacy-token hard-reject (T4-D-I2) on inbound internal calls.
SMS_URL
ENABLE_OUTBOX_POLLER
INTERNAL_SERVICE_TOKEN — legacy single-shared-token; deprecated. Phase 16 release gate requires the bare variant to be absent.

When PER_CALLER_TOKEN_REQUIRED=on, agent_service fails fast at boot in production-grade runtimes unless the active gateway and admin caller tokens above are configured. _PREV variants are rotation overlap only.
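
The fail-fast rule above can be sketched as a small boot check. The function name, the error type, and the prod_grade parameter are hypothetical; how the service decides that a runtime is production-grade is left to the caller in this sketch.

```python
from typing import Mapping

# Active per-caller tokens named in the list above.
REQUIRED_CALLER_TOKENS = (
    "INTERNAL_SERVICE_TOKEN_GATEWAY",
    "INTERNAL_SERVICE_TOKEN_ADMIN",
)


class CallerTokensMisconfigured(RuntimeError):
    """Hypothetical error type used only by this sketch."""


def check_caller_tokens(env: Mapping[str, str], prod_grade: bool) -> None:
    """Raise at boot if per-caller tokens are required but not configured."""
    required = env.get("PER_CALLER_TOKEN_REQUIRED", "").lower() == "on"
    if not prod_grade or not required:
        return
    # Only the active names count; a _PREV variant is rotation overlap and
    # does not satisfy the requirement on its own.
    missing = [n for n in REQUIRED_CALLER_TOKENS if not env.get(n, "").strip()]
    if missing:
        raise CallerTokensMisconfigured(
            "missing per-caller tokens: " + ", ".join(missing)
        )
```

At startup this would be called as check_caller_tokens(os.environ, prod_grade=...) before the app begins serving.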

Boot guard: AES PII keys

app/main.py calls rgb_contracts.infra.aes_guard.assert_aes_pii_keys_configured() before the FastAPI app starts. Same shape as player_service — production-grade env with any of AES_KEY/AES_IV/PASSWORD_SALT empty raises AesPiiKeysMisconfigured, and the guard validates AES_KEY is 32 bytes plus AES_IV is 16 bytes. Non-prod runs are a no-op; runtime fallbacks emit pii_aes_unconfigured_total{service="agent_service",op} + WARN logs. Versioned GCM ciphertext that fails AEAD authentication raises in production regardless of AES_STRICT_DECRYPT; the flag now applies only to ambiguous legacy base64-shaped CBC/plaintext compatibility reads.
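
A rough stand-in for the guard's checks is shown below. It is not the rgb_contracts implementation: check_aes_pii_keys and KeysMisconfigured are hypothetical names, prod_grade is supplied by the caller, and the sketch assumes key material is a raw string whose UTF-8 byte length is the key length (the real guard may expect hex or base64).

```python
from typing import Mapping


class KeysMisconfigured(RuntimeError):
    """Hypothetical stand-in for AesPiiKeysMisconfigured."""


def check_aes_pii_keys(env: Mapping[str, str], prod_grade: bool) -> None:
    """Validate presence and length of the PII key material at boot."""
    if not prod_grade:
        return  # non-prod runs are a no-op for the boot guard
    for name in ("AES_KEY", "AES_IV", "PASSWORD_SALT"):
        if not env.get(name):
            raise KeysMisconfigured(f"{name} is empty")
    # Assumption: values are raw strings measured as UTF-8 bytes.
    if len(env["AES_KEY"].encode("utf-8")) != 32:
        raise KeysMisconfigured("AES_KEY must be 32 bytes (AES-256)")
    if len(env["AES_IV"].encode("utf-8")) != 16:
        raise KeysMisconfigured("AES_IV must be 16 bytes")
```

Checking presence before length keeps the error message specific about which variable is at fault.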

Multi-Brand Constraints

Per ADR-009:

the agent aggregate is brand-global; agent rows do not carry brand_id
agent_brand is a join aggregate of (agent_id, brand_id, status) and is the only place where brand membership for an agent is recorded
admin_service writes agent_brand (which brands an agent may serve); agent_service reads it and enforces it on every agent-facing route
agent_setting is keyed per (agent_id, brand_id) so a single agent can hold different registration modes per brand
agent_domain keeps its surrogate guid BIGINT PK; a composite UNIQUE(agent_id, brand_id, domain) is added on top of the existing UNIQUE(domain); a domain cannot be reused across brands or across agents
the agent frontend resolves brand from its request domain (per ADR-009); a request whose resolved brand is not in the authenticated agent's agent_brand allow list is checked per MULTI_BRAND_ENFORCEMENT (observe logs + counts; enforce rejects)
agent_service strips any inbound X-Brand-Id header on its agent-frontend edge before injecting the resolved value (mirrors the gateway strip; prevents external header spoofing)
agent-domain outbox events include brand_id in payload (envelope field on DomainEvent, schema_version=2)
agent_brand reads are cached per process with a 60s TTL and invalidated via the AGENT_BRAND_CHANGED Redis pub/sub channel published by admin_service. An active JWT survives agent_brand revocation, but the next request after revocation is rejected by the allow-list check
internal calls to downstream services use per-caller token INTERNAL_SERVICE_TOKEN_AGENT; brand-scoped wallet write commands carry X-Brand-Signature per ADR-009
agent recovery flow follows the same brand-precedence contract as player recovery (per the spec section "Recovery flow brand precedence"): the token-embedded brand_id wins over the request domain; the flow falls back to the domain when the token lacks the claim; it hard-rejects when the resolved brand is not enabled in agent_brand, regardless of MULTI_BRAND_ENFORCEMENT mode; and unknown-account responses are shape-identical to known-account responses
agent recovery flows write recovery_audit (alembic 0033) per ADR-009 audit guarantees with subject_type='agent': every observable outcome (REQUEST / VERIFY / RESET success or rejection, plus RATE_LIMITED / TOKEN_INVALID / BRAND_MISMATCH typed rejections) inserts one append-only row carrying brand_id, SHA-256 hash of contact, SHA-256 hash of source IP, and a per-action JSONB metadata blob (NEVER the contact, token, or password). RESET writes the audit row and the password UPDATE in one transaction; an audit failure rolls back the password change
agent recovery code & raw contact are NEVER logged in production (Codex P1-#6): the SMS-success / SMS-not-sent / SMS-exception / email-only paths log contact_hash=<sha256_first_8_hex> only, and emit recovery_sms_delivery_failed_total{service="agent_service"} or recovery_email_unprovisioned_total{service="agent_service"} for operator visibility. The raw tel=<...> code=<...> lines remain ONLY behind RECOVERY_EXPOSE_TOKEN_FOR_TESTS=on, an integration-test escape hatch that assert_recovery_token_exposure_safe refuses to allow when ANY of RGB_ENV / APP_ENV / ENVIRONMENT resolves to staging / production / prod (Codex P1-#5: previously the guard checked ENVIRONMENT only)
legacy auth.py agent flows (/agent/forget, /agent/edit/phone, /agent/edit/bank, and the matching send-sms endpoints) follow the same no-PII-in-logs invariant (Codex T1-D-C4). Previously these called logger.warning(f"... SMS code for agent {agent_id}: {code}") whenever the SMS provider was unprovisioned (very common in staging / dev), exposing the OTPs that gate /edit/phone and /edit/bank; anyone with log read access could take over an agent account, including changing its payout bank. These log lines are now redacted to contact_hash=<sha256_first_8_hex> only, and failures are surfaced via recovery_sms_delivery_failed_total{service="agent_service"}. Audit coverage: /agent/forget is brand-scoped through the resolved request brand plus agent_brand; every post-resolution recovery_audit row carries that resolved brand_id (only the missing-brand-context branch uses brand_id=0, because no tenant was resolved). /agent/edit/phone, /agent/edit/bank and their send-sms siblings write an admin_audit row with operator_id='agent:<guid>', resource='agent', and action set to one of AGENT_EDIT_PHONE / AGENT_EDIT_BANK / AGENT_EDIT_PHONE_SEND_SMS / AGENT_EDIT_BANK_SEND_SMS. The UPDATE-side audit rows commit in the SAME transaction as the agent-row UPDATE, so a payout-account swap cannot proceed unaudited. All SMS OTPs in the active agent auth and legacy compatibility surfaces are generated with a CSPRNG (secrets.randbelow) rather than random.randint; new-value payloads carry contact_hash / bank_id only, never the raw phone, raw card, or OTP. The test escape hatch (RECOVERY_EXPOSE_TOKEN_FOR_TESTS=on) has the same semantics as on the recovery path
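
The allow-list decision described in this list can be sketched as a pure function. This is a simplified model, not AgentBrandResolutionMiddleware itself: Mode, Decision, and decide are hypothetical names, the behaviour of off (allow without logging) and the logging of enforce-mode rejections are assumptions, and allowed_brands stands for the brand_ids enabled for the agent in agent_brand.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(str, Enum):
    OFF = "off"
    OBSERVE = "observe"
    ENFORCE = "enforce"


@dataclass(frozen=True)
class Decision:
    allow: bool
    log_violation: bool


def decide(mode: Mode, resolved_brand: int, allowed_brands: set[int],
           recovery_flow: bool = False) -> Decision:
    """Decide whether a request for `resolved_brand` may proceed."""
    if resolved_brand in allowed_brands:
        return Decision(allow=True, log_violation=False)
    if recovery_flow:
        # Recovery hard-rejects a brand that is not enabled, whatever the mode.
        return Decision(allow=False, log_violation=True)
    if mode is Mode.ENFORCE:
        return Decision(allow=False, log_violation=True)  # logging assumed
    if mode is Mode.OBSERVE:
        return Decision(allow=True, log_violation=True)  # logs + counts only
    return Decision(allow=True, log_violation=False)  # off: assumed pass-through
```

For example, decide(Mode("observe"), 3, {1, 2}) allows the request but flags it for logging, while the same call with recovery_flow=True rejects it.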

Tests

cd servers_v2/agent_service && uv run pytest
key suites:
- tests/test_agent_integration.py
- tests/test_outbox_poller.py
- tests/test_events_health.py
