Admin Service

Status

Active

Date

2026-04-28

Owners

Platform Backend

Last Verified Commit

Use git log -- <this file> for current last-touch history; this field is intentionally not pinned to a static hash so it does not become stale after unrelated commits.

T16-A logs.py brand-scope contract

servers_v2/admin_service/app/api/routes/logs.py::_query_log_table is the shared paginator for every operator log read. Pre-T16-A the helper built WHERE 1=1 and appended user-supplied filters with no brand predicate, so any operator could read every brand's player_balance_log / agent_balance_log / player_login_log / agent_login_log history by guessing pagination params (CR9-D-Crit-1).

Post-T16-A:

_query_log_table(session, table, body, *, brand_id, ...) — the brand_id argument is required.
The helper appends its first predicate unconditionally via conditions.append("brand_id = :brand_id"), so the brand_id pin is always present BEFORE any user-supplied filter.
The 4 real-table endpoints (player_balance_logs, agent_balance_logs, player_login_logs, agent_login_logs) call resolve_brand_id and brand_required_envelope, and refuse the request when X-Brand-Id is missing.
The 9 endpoints whose target tables are not yet provisioned (rebate_log, cashback_log, etc.) get the same brand-required plumbing for future safety.
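
The brand pin described above can be sketched as follows. This is a minimal illustration, not the actual _query_log_table body: the function name build_log_where, the allow-list of filterable columns, and the error message are assumptions made for the example.

```python
from typing import Any, Dict, Tuple


def build_log_where(filters: Dict[str, Any], *, brand_id: int) -> Tuple[str, Dict[str, Any]]:
    """Build a WHERE clause whose first predicate always pins the brand."""
    if brand_id <= 0:
        raise ValueError("brand context required")

    # The brand pin is appended unconditionally, before any caller filter.
    conditions = ["brand_id = :brand_id"]
    params: Dict[str, Any] = {"brand_id": brand_id}

    # Hypothetical allow-list: column names never come straight from user input,
    # and a caller-supplied brand_id filter cannot override the pin.
    allowed = {"player_id", "change_type"}
    for column, value in filters.items():
        if column in allowed and value is not None:
            conditions.append(f"{column} = :{column}")
            params[column] = value

    return " AND ".join(conditions), params


where, params = build_log_where({"player_id": 42, "brand_id": 999}, brand_id=7)
print(where)   # brand_id = :brand_id AND player_id = :player_id
print(params)  # {'brand_id': 7, 'player_id': 42}
```

Making brand_id a required keyword-only argument means a caller that forgets it fails at call time instead of silently reading every brand.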

The static RW classifier could not have caught this helper because the SQL was built dynamically (f"SELECT ... FROM {table} WHERE {where} ..."). T16-C added a FIX-DYNAMIC class to the classifier that scans for the WHERE 1=1 builder + f-string template patterns and flags them when the surrounding scope has no recognised conditions.append("brand_id ...") idiom — so a regression in this contract (or any sibling helper that crops up later) breaks CI the same way a static missing-predicate would.
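
A rough sketch of the kind of check a FIX-DYNAMIC rule performs is shown below. The real classifier's implementation is not documented here; the regular expressions, the function name needs_fix_dynamic, and the choice to scan raw source text are all assumptions for illustration.

```python
import re

# Patterns that suggest SQL is assembled at runtime (assumed heuristics).
DYNAMIC_SQL = re.compile(r"WHERE 1=1|f[\"'].*\bFROM \{", re.IGNORECASE)
# The recognised brand-pin idiom: conditions.append("brand_id ...").
BRAND_PIN = re.compile(r"conditions\.append\(\s*[\"']brand_id\b")


def needs_fix_dynamic(scope_source: str) -> bool:
    """Return True when a scope builds SQL dynamically without a brand pin."""
    has_dynamic_sql = bool(DYNAMIC_SQL.search(scope_source))
    has_brand_pin = bool(BRAND_PIN.search(scope_source))
    return has_dynamic_sql and not has_brand_pin


unsafe = 'conditions = []\nsql = f"SELECT * FROM {table} WHERE {where}"'
safe = 'conditions.append("brand_id = :brand_id")\nsql = f"SELECT * FROM {table} WHERE {where}"'
print(needs_fix_dynamic(unsafe))  # True
print(needs_fix_dynamic(safe))    # False
```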

Runtime

API + standalone worker

Purpose

admin_service is the only intended back-office write/action HTTP edge in servers_v2. Normal list/detail/report reads are Supabase-owned; the public admin_service API exposes state-changing operations and narrowly-scoped PII reveal actions for fields that require server-side AES decryption.

It also owns the admin-facing wallet topology and wallet policy configuration edge, while delegating all money mutation execution to wallet_service.

Primary Entry Points

Public: /api/v1/*
Public wallet control plane: /api/v1/wallet/*
Internal-only operational sync: /internal/meta/* (not mounted on public_router)

Removed by ADR-009 (now respond 404):

/v2/* compatibility slice for legacy admin_server cutover (served by deleted legacy_admin_v2.py)
/api/admin/user/* (served by deleted legacy_auth.py)
/api/admin/pushbullet/* and /api/admin/shooter/* (served by deleted legacy_recon.py)
/api/admin/web/* (served by deleted legacy_web_content.py)

Dependencies

PostgreSQL
Redis
JWT secret
AES and password-salt config
wallet_service
rolling_service
promotion_service
recon_service
Cloudflare R2-compatible object storage
internal service token for trusted internal sync paths

Background Work

admin_worker runs scheduler-backed operational jobs, including:

complete cumulative-stat materialization (one row per brand/player plus the platform-wide singleton)
Asia/Seoul daily materialization: provider, agent-by-provider and agent rows are replaced atomically at 00:10; the complete platform row is published at 00:15 only after their keyspace, source totals and post-midnight freshness validate. Business-key uniqueness and a date advisory lock make reruns safe. Before replacement, the worker fails closed if the per-player and per-provider betting projections disagree, the Provider catalog/type is invalid, or an enabled agent hierarchy cannot be attributed to the brand. Agent coupon grants and back-office balance adjustments are recovered from the immutable agent_balance_log when the agent and platform rows are built.
level checks
stateful coupon/register/attendance expiration jobs; public promotions use their start/end window and do not have a status column
attendance processing
brand-scoped tag tasks. The daily worker reads automatic definitions from player_tag, writes idempotent associations to player_tag_mapping, and evaluates activity only inside the same brand. A brand with no configured automatic tags is a valid no-op, not a worker failure.
optional Vera sync

Scheduled jobs propagate failures to APScheduler. The worker health registry therefore becomes unhealthy when a job fails instead of treating a logged exception as a successful execution.
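
The difference between swallowing and propagating a job failure can be sketched in plain Python. JobHealthRegistry and run_job are hypothetical names for this example; the actual worker wires this through APScheduler, which is not reproduced here.

```python
import logging
from typing import Callable, Dict

logger = logging.getLogger("admin_worker")


class JobHealthRegistry:
    """Hypothetical in-memory registry that tracks the last outcome per job."""

    def __init__(self) -> None:
        self._failed: Dict[str, str] = {}

    def record_success(self, job_id: str) -> None:
        self._failed.pop(job_id, None)

    def record_failure(self, job_id: str, error: Exception) -> None:
        self._failed[job_id] = repr(error)

    @property
    def healthy(self) -> bool:
        return not self._failed


def run_job(registry: JobHealthRegistry, job_id: str, job: Callable[[], None]) -> None:
    """Run a job, record the outcome, and re-raise so the scheduler sees the failure."""
    try:
        job()
    except Exception as exc:
        logger.exception("job %s failed", job_id)
        registry.record_failure(job_id, exc)
        raise  # a logged exception must not count as a successful execution
    registry.record_success(job_id)


def broken_job() -> None:
    raise RuntimeError("projection mismatch")


registry = JobHealthRegistry()
try:
    run_job(registry, "agent_stat_day", broken_job)
except RuntimeError:
    pass
print(registry.healthy)  # False
```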

Migration 0065 removes only the four persisted daily Cron rows (agent_stat_day, admin_stat_day, tag_daily_task, process_attendance) so the first upgraded worker boot recreates their serialized triggers in Asia/Seoul. Interval jobs and their persisted next-run timestamps are left untouched.

Migration 0066 fails closed on duplicate/cross-brand tag data, repairs the tag-definition and tag-mapping GUID sequences, and installs the unique keys used by the daily task's idempotent conflict handling.

Owned Data

admin edge compatibility semantics
admin-facing operational control surfaces
narrowly-scoped PII reveal actions for encrypted fields

Read boundary:

Frontend list/detail/report reads go to Supabase.
admin_service must not expose broad query/list/log/stat routes publicly.
Exceptions are explicit POST actions that reveal selected encrypted PII fields for a single brand-scoped player or agent.
Public API naming and migration rules are documented in docs/services/admin-service-api-standard.md.
External OpenAPI must be generated from public_router; the internal router still carries compatibility handlers and is intentionally broader.

Important boundary:

admin_service orchestrates and approves
admin_service does not become a second money writer
wallet config writes (/api/v1/wallet/topologies/*, /api/v1/wallet/policies/*) are control-plane operations only; money movement stays in wallet_service

Legacy migration note:

/v2/* compatibility is being restored incrementally for no-diff admin cutover
the current source of truth for that work is docs/specs/migrations/2026-04-22-admin-v2-route-compatibility.md

Events

Emits:

no primary Redis stream ownership is documented

Consumes:

no primary Redis stream consumption is documented

Internal signaling:

internal websocket sync hooks are used to refresh top-info after meaningful state changes

Health

API health endpoints are exposed without auth
admin_worker health depends on heartbeat plus scheduler and job-health readiness

Key Env Vars

DATABASE_URL
REDIS_URL
SECRET_KEY — legacy session-cookie material only post-T6-B1; admin JWTs no longer use HS256, so this value is consulted by the legacy session cookie path (and a small set of pre-RS256 fixtures) but never by decode_admin_token. Keep set in dev/test for fixture compatibility; production deployments may leave it unset once all session-cookie consumers are retired.
JWT_PRIVATE_KEY — required only where this service/helper mints admin JWTs; RSA private key (PEM) used by create_admin_token to mint RS256 admin JWTs. For third-party backend integrations, provision a partner-specific keypair and configure only the public key on admin_service; do not reuse a platform master private key.
JWT_PUBLIC_KEYS — required in production (T6-B1); JSON {kid: pem} map of accepted RSA public keys used by decode_admin_token to verify inbound admin JWTs. Multiple kids allow per-partner keys and rotation without downtime. T7-B4 added assert_jwt_public_keys_configured boot guard: empty value in a production-grade runtime refuses to boot — otherwise the in-process test-keypair singleton fallback would silently accept attacker-forged tokens.
JWT_KID — required in production (T6-B1); the kid value create_admin_token stamps into the JWT header so verifiers know which entry of JWT_PUBLIC_KEYS to use.
PASSWORD_SALT
AES_KEY — required in production for admin-owned PII helpers; 32-byte AES-256 key. New writes use versioned AES-256-GCM with a random nonce (aesgcm:v1: prefix), and reads keep legacy AES-256-CBC compatibility for historical rows.
AES_IV — required in production until the legacy CBC backfill is complete; 16-byte IV used only to decrypt historical AES-256-CBC rows. New writes do not reuse a static IV.
MULTI_BRAND_ENFORCEMENT — required; one of off / observe / enforce. Production target is enforce post-Phase-16. Drives multi_brand_enforcement_mode{service="admin_service"} gauge. Admin JWTs authenticate the operator but do not by themselves authorize a tenant; brand-scoped admin routes resolve an explicit operating brand from path/query/body/request state and fail loud when it is missing. This flag primarily governs downstream enforcement/metrics posture, not a silent accept path for missing brand context.
BRAND_SIGNING_KEY — required in production; HMAC-SHA256 secret used to sign X-Brand-Signature on outbound brand-scoped wallet writes from admin-side flows (one of the 7 wallet write callers).
INTERNAL_SERVICE_TOKEN_ADMIN — required in production; per-caller token presented to wallet/rolling/promotion/recon when admin calls them.
INTERNAL_SERVICE_TOKEN_RECON — required in production on admin_service; per-caller token accepted from recon_service for /internal/meta/ws/sync.
PER_CALLER_TOKEN_REQUIRED — on is the Phase 16 target. Admin is both a caller and the /internal/meta/ws/sync consumer; in production-grade runtimes the admin boot guard refuses to start unless the active INTERNAL_SERVICE_TOKEN_RECON is configured. _PREV is rotation overlap only.
WALLET_SERVICE_URL
ROLLING_SERVICE_URL
PROMOTION_SERVICE_URL
RECON_SERVICE_URL
INTERNAL_SERVICE_TOKEN — legacy single-shared-token; deprecated. Phase 16 release gate requires the bare variant to be absent.
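
As an illustration of the JWT_PUBLIC_KEYS contract above, a boot guard in the spirit of assert_jwt_public_keys_configured might look like the sketch below. The function name load_jwt_public_keys, the production flag, and the error messages are assumptions; only the JSON {kid: pem} shape and the refuse-to-boot behaviour come from this document.

```python
import json
from typing import Dict, Optional


def load_jwt_public_keys(raw: Optional[str], *, production: bool) -> Dict[str, str]:
    """Parse a JSON {kid: pem} map and refuse an empty map in production."""
    keys = json.loads(raw) if raw else {}
    if not isinstance(keys, dict) or not all(
        isinstance(kid, str) and isinstance(pem, str) for kid, pem in keys.items()
    ):
        raise ValueError("JWT_PUBLIC_KEYS must be a JSON object of kid -> PEM string")
    if production and not keys:
        # Fail closed: falling back to a test keypair would accept forged tokens.
        raise RuntimeError("JWT_PUBLIC_KEYS is empty; refusing to boot")
    return keys


# Two kids allow rotation: both keys verify until the old one is removed.
keys = load_jwt_public_keys('{"partner-a": "PEM-A", "partner-a-next": "PEM-B"}', production=True)
print(sorted(keys))  # ['partner-a', 'partner-a-next']
```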

Known Migration Gap

admin_service is strong on core admin flows, but it is not yet the full replacement for all legacy middle_server modules.

Still unresolved for a full legacy cutover:

/api/admin/role/*
/api/admin/menu/*
/api/admin/config/*
/api/admin/i18n/*
/api/admin/bi/*
/coin/*
legacy POST /{path:path} catch-all forwarding behavior

Those prefixes are planning-only in docs/runbooks/legacy-middle-server-retirement.md; admin_service must not be described as a full middle_server replacement until that runbook becomes executable and then closes.

Multi-Brand Constraints

Per ADR-009:

admin_service runs behind VPN ingress with an IP allow-list; every write resolves the immutable operator username from the verified admin JWT (username, see app/services/auth.py::create_admin_token). Clients do not send operator identity in request bodies or operator headers. The username string is captured on every audit row. Network isolation plus JWT-bound operator username replaces the deleted staff identity layer
admin_service owns the brand catalog, the brand_config per-brand configuration aggregate, and brand_provider_config provider allow-list policy; brand-create retains the historical prefix-collision check for already-persisted provider identities, although the current Aggregator relay resolves brand from signed RGB player context. The first single-provider PATCH /api/v1/brands/{brand_id}/providers/{provider_id} on a brand with no provider rows materializes the current globally-visible provider set before applying the edit, so that brand stops inheriting future provider additions until the policy is cleared or replaced
brand_config now has an operator-facing schema registry at GET /api/v1/brands/config/schema and a bulk import route at POST /api/v1/brands/{brand_id}/config/import. The import accepts {config: {...}}, {entries: [...]}, or the static server-config export shape under brand.config; known keys are normalized and validated before write, while dry-run returns the normalized payload without mutating data. The first seven operational customizations are implemented as brand-scoped policy keys: frontend_config, registration_policy, feature_policy, payment_policy, sms_policy, provider allow-list (brand_provider_config), and the existing reward/rolling scalar config (rebate_rate, cashback_rate, lossback_rate, rolling_completion_ratio, valid_odds_threshold). Runtime consumers are intentionally split by ownership: player_service exposes public brand settings and enforces registration/login/SMS policy; wallet_service enforces payment channel, limits, and maintenance policy; game_service enforces game list/launch switches in addition to provider allow-list; promotion_service enforces promotion feature switches and brand-scoped event/coupon reads/writes
Policy precedence is deliberate: feature_policy.*_enabled is the maintenance/kill-switch layer, while registration_policy and payment_policy are product/channel-level gates. The effective runtime decision is an AND across the relevant layers. For payment limits and Plisio, the payment_policy object wins over legacy scalar compatibility keys (deposit_min_amount, payment_channel_plisio_enabled, etc.); new server-config exports should emit only the policy object. payment_policy.deposit_maintenance_enabled and its maintenance time fields are tri-state: omitted/null falls back to legacy global_var, false or empty string is an explicit brand override
admin_service owns the write surface for agent_brand (which brands an agent may serve)
Public content rows (player_notice, banner, promotion, promotion_event) are brand-scoped in runtime reads. When enabling a new brand or migrating old global content, operators must confirm each existing row has the intended brand_id; one announcement or promotion no longer fans out to every brand automatically
Release note for frontend/ops: site_setting now also returns public brand config objects, Plisio support endpoints require X-Brand-Id, and gate_create/coin_create reject amounts outside the effective brand deposit range. Existing JS clients that ignore unknown fields continue to work, but strict clients and direct internal health checks should be updated before rollout. See docs/runbooks/multi-brand/2026-05-08-brand-customization-release-notes.md
the admin entry surface is brand-global only for catalog/global-control routes. Routes that read or write brand-scoped operational tables (player, player_deposit, player_withdraw, player_message, coupon, coupon_log, coupon_recycle_log, etc.) must resolve one operating brand from request.state.brand_id and apply it as a SQL predicate before touching tenant data. Callers should provide X-Brand-Id through the gateway/operator UI; if a brand-scoped route cannot resolve a positive brand id it returns the standard status=54 / brand context required envelope instead of defaulting to all brands. Downstream services still apply MULTI_BRAND_ENFORCEMENT semantics on the brand-id admin forwards, so a stale or wrong brand_id from admin is caught at the wallet/rolling/promotion edge via *_cross_brand_rejected_total. The audit row written for every money-mutating admin route captures brand_id so cross-brand admin actions are observable post-hoc
per-brand configuration writes are audited (timestamp, payload, prior value); per-brand topology and policy writes flow through wallet_service topology/policy CRUD with the brand pinned in the request
between Phase 2 and Phase 12, a temporary Postgres BEFORE INSERT trigger on agent auto-creates the matching agent_brand(agent_id, default_brand_id, 'enabled') row; from Phase 12 onward admin_service (and agent_service's agent-create path, if any) writes agent_brand explicitly, and the trigger is dropped by 0028_remove_agent_brand_autoseed_trigger.py
after every agent_brand write admin_service publishes AGENT_BRAND_CHANGED so agent_service allow-list caches invalidate; after every brand_provider_config write it publishes BRAND_PROVIDER_CHANGED and refreshes the Redis brand:provider_allowlist:{brand_id} cache used by gateway and game_service prechecks. Brand catalog changes no longer publish BRAND_CATALOG_CHANGED because Aggregator callbacks resolve the brand directly from the signed API key and no reverse account-prefix cache exists. The provider snapshot stores configured provider ids; admin read APIs return effective ids after applying global provider.is_show
seven admin_service legacy route files are deleted by ADR-009: legacy_admin_v2.py, legacy_auth.py, legacy_agents_v2.py, legacy_agent_withdrawals_v2.py, legacy_meta_v2.py, legacy_recon.py, legacy_web_content.py; the supporting staff identity helpers are removed with them; previously served paths under /api/admin/user/*, /api/admin/pushbullet/*, /api/admin/shooter/*, /api/admin/web/rules/*, /api/admin/web/faq/*, and /api/admin/web/config/* now return 404; back-office routes that survive operate without per-staff identity for the duration of this change
BrandResolutionMiddleware (T6-B / T7-B6 / T8-D1). A pure-ASGI middleware (app/middleware/brand.py) resolves request.state.brand_id once per request so admin routes can read the brand consistently without re-parsing the header. Resolution order is (1) X-Brand-Id header — set by the gateway after T6-B2 or by the operator UI; (2) Redis domain map (domain:agent:{host} / domain:level:{host}) when admin_service is hit directly. The middleware itself is best-effort: a missing header AND Redis miss leaves brand_id = None. Brand-scoped route handlers are not best-effort; they call brand_required_envelope() and refuse the operation when brand_id is missing. Two T8-D1 hardenings: (a) the middleware skips /health, /healthz, /docs, /openapi.json, and /redoc so a tight liveness probe schedule does not amplify a Redis brownout into route latency; /auth/login is intentionally NOT skipped because login may legitimately need a brand mapping; (b) the previously-silent except Exception Redis path now emits brand_resolution_failed_total{service="admin_service",reason="redis_error"} so a Redis incident is alarmable instead of degrading silently.
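
The split between best-effort resolution in the middleware and strict refusal in brand-scoped handlers can be sketched as below. This is not the code in app/middleware/brand.py: the function names, the plain-dict stand-in for Redis, the assumption that the domain-map values hold a brand id, the order between the agent and level keys, and the envelope's message field are all illustrative.

```python
from typing import Mapping, Optional


def sketch_resolve_brand(
    headers: Mapping[str, str],
    host: str,
    domain_map: Mapping[str, str],
) -> Optional[int]:
    """Best-effort resolution: header first, then the domain map, else None."""
    candidates = (
        headers.get("x-brand-id"),  # header keys assumed lower-cased
        domain_map.get(f"domain:agent:{host}"),
        domain_map.get(f"domain:level:{host}"),
    )
    for raw in candidates:
        if raw is None:
            continue
        try:
            value = int(raw)
        except ValueError:
            continue
        if value > 0:
            return value
    return None


def brand_required(brand_id: Optional[int]) -> Optional[dict]:
    """Return a refusal envelope when no positive brand id was resolved."""
    if brand_id is None or brand_id <= 0:
        return {"status": 54, "message": "brand context required"}
    return None


brand_id = sketch_resolve_brand({}, "bo.example.test", {})
print(brand_id)                  # None
print(brand_required(brand_id))  # {'status': 54, 'message': 'brand context required'}
```

Keeping the refusal in the handler rather than the middleware lets brand-global routes (catalog, health) pass through with brand_id = None while tenant routes still fail loud.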

Security

Operator identity (P1-3)

require_operator_id (app/api/deps.py) is retained as a compatibility import name, but it now returns the immutable operator username captured on every audit row. It enforces:

JWT-only source. Reads the username claim from the verified admin JWT. Caller-supplied operator identity is ignored and is not part of the public API contract.
Minimal payload contract. Admin JWT payloads must contain only the caller username and an exp timestamp; sub, role, brand_id, iat, and jti are not required and are not part of the contract. Missing username or exp is rejected as an invalid token.
Business persistence. Money, approval, PII, and state-transition writes persist the username string; downstream operator strings use admin:{username} when a prefixed value is required.
Permission boundary. admin_service does not evaluate operator role/menu/button/brand permissions. The calling BO system must complete those checks before invoking the API.
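
The payload contract above can be illustrated with a small check over claims that have already passed signature verification. The names operator_username, downstream_operator, and InvalidToken are hypothetical, and this sketch assumes extra claims are simply ignored; it is not the body of require_operator_id.

```python
import time
from typing import Any, Mapping, Optional


class InvalidToken(Exception):
    """Raised when the verified claims do not satisfy the payload contract."""


def operator_username(claims: Mapping[str, Any], *, now: Optional[float] = None) -> str:
    """Return the operator username from already-verified JWT claims."""
    username = claims.get("username")
    exp = claims.get("exp")
    if not isinstance(username, str) or not username:
        raise InvalidToken("missing username")
    if isinstance(exp, bool) or not isinstance(exp, (int, float)):
        raise InvalidToken("missing exp")
    if exp <= (time.time() if now is None else now):
        raise InvalidToken("token expired")
    return username


def downstream_operator(username: str) -> str:
    """Prefixed form used when a downstream service expects an operator string."""
    return f"admin:{username}"


name = operator_username({"username": "alice", "exp": 200}, now=100)
print(name)                       # alice
print(downstream_operator(name))  # admin:alice
```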

Money-write audit guarantees (P1-4)

Every admin endpoint that mutates a player or agent balance (directly or via wallet_service) writes an admin_audit row after the wallet/state mutation returns successfully. Routes covered:

player.py: POST /players/{id}/adjust → ADJUST_BALANCE
player_finance.py: POST /players/deposits/{id}/agree → APPROVE_DEPOSIT; POST /players/deposits/{id}/decline → DECLINE_DEPOSIT; POST /players/deposits/{id}/recycle → RECYCLE_DEPOSIT; POST /players/deposits/{id}/coin-agree and POST /player/deposit/coin/agree → APPROVE_COIN_DEPOSIT; POST /players/deposits/shooter-agree → APPROVE_DEPOSIT (per-item); POST /players/withdrawals/{id}/review-agree → REVIEW_AGREE_WITHDRAW; POST /players/withdrawals/{id}/review-decline → REVIEW_DECLINE_WITHDRAW; POST /players/withdrawals/{id}/pay-agree → PAY_AGREE_WITHDRAW; POST /players/withdrawals/{id}/pay-decline → PAY_DECLINE_WITHDRAW
agent_finance.py: POST /agents/withdrawals/{id}/agree → AGREE_AGENT_WITHDRAW; POST /agents/withdrawals/{id}/decline → DECLINE_AGENT_WITHDRAW; POST /agents/withdrawals/{id}/pay → PAY_AGENT_WITHDRAW; POST /agents/withdrawals/{id}/pay-decline → PAY_DECLINE_AGENT_WITHDRAW
agent.py (T4-D-I3): POST /agents/{id}/reset-password → RESET_AGENT_PASSWORD; POST /agents/{id}/lock → LOCK_AGENT; POST /agents/{id}/unlock → UNLOCK_AGENT; POST /agents/{id}/update-commission → UPDATE_AGENT_COMMISSION; POST /agents/{id}/adjust-balance → ADJUST_AGENT_BALANCE
player.py (T4-D-I3): POST /players/{id}/edit-password → EDIT_PLAYER_PASSWORD; POST /players/{id}/edit-phone → EDIT_PLAYER_PHONE

Audit rows capture operator_id (from the resolver above), target_type (player/agent), target_id, brand_id (looked up from the affected row), before/after deltas (status, amount, transaction_id only — never full rows), reason (from the request body when present), and the request X-Request-Id. The shared writer is app/services/admin_audit.py::write_money_admin_audit.

If the wallet_service call fails, the audit row is not written — the action did not happen. If the audit insert fails after the money write committed (a partition or DB outage), the route still returns the wallet response (the money write cannot be rolled back) but logs at ERROR level and increments admin_audit_write_failed_total{route=<route>}. On-call must reconcile the audit gap manually.
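
The ordering described above (mutate first, audit second, never fail the response on an audit error) can be sketched as follows. The helper name mutate_then_audit and the in-process Counter standing in for admin_audit_write_failed_total are assumptions for the example.

```python
import logging
from collections import Counter
from typing import Any, Callable

logger = logging.getLogger("admin_service.audit")

# Stand-in for the admin_audit_write_failed_total{route=...} metric.
audit_write_failed_total: Counter = Counter()


def mutate_then_audit(
    route: str,
    wallet_call: Callable[[], Any],
    audit_write: Callable[[Any], None],
) -> Any:
    """Run the money mutation, then write the audit row without masking the result."""
    # A wallet failure propagates here: nothing happened, so no audit row is written.
    result = wallet_call()
    try:
        audit_write(result)
    except Exception:
        # The money write is committed and cannot be rolled back; surface the gap.
        logger.error("audit write failed for %s", route, exc_info=True)
        audit_write_failed_total[route] += 1
    return result


def failing_audit(_result: Any) -> None:
    raise ConnectionError("audit table unavailable")


response = mutate_then_audit("adjust_balance", lambda: {"transaction_id": "t-1"}, failing_audit)
print(response)                                    # {'transaction_id': 't-1'}
print(audit_write_failed_total["adjust_balance"])  # 1
```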

Sensitive payload handling. Password routes capture {"changed": true} only — never the plaintext password, hash, or salt. The phone route captures sha256-first-8 hashes for old + new values so reviewers can detect "did the number actually change" without storing PII in the audit table.
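
A short sketch of the phone-change audit payload follows. It assumes "sha256-first-8" means the first 8 hex characters of the SHA-256 digest of the UTF-8 value; the function names and the old/new/changed keys are hypothetical.

```python
import hashlib
from typing import Dict, Union


def short_hash(value: str) -> str:
    """First 8 hex characters of SHA-256 (assumed reading of sha256-first-8)."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()[:8]


def phone_change_audit(old_phone: str, new_phone: str) -> Dict[str, Union[str, bool]]:
    """Audit payload that shows whether the number changed without storing it."""
    old_hash, new_hash = short_hash(old_phone), short_hash(new_phone)
    return {"old": old_hash, "new": new_hash, "changed": old_hash != new_hash}


def password_change_audit() -> Dict[str, bool]:
    """Password routes record only that a change happened."""
    return {"changed": True}


payload = phone_change_audit("010-0000-0000", "010-0000-0000")
print(payload["changed"])  # False
```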

Legacy direct write. adjust_agent_balance still mutates agent.balance directly (bypassing wallet_service) and increments agent_balance_legacy_write_total{change_type} on every call. add_agent_coupon still mutates agent.coupon directly and emits the same counter with change_type="add". agent_finance refund compensation paths (decline and pay-decline) also increment the same counter with change_type="add" when they refund agent.balance. The counter must hit zero before the agent-money write domain can be considered closed behind wallet_service.

Admin WebSocket auth (T4-D-I7)

GET /api/v1/ws (the back-office real-time notifications channel; the WebSocket opening handshake is an HTTP GET upgrade) accepts the admin JWT via:

Preferred — Sec-WebSocket-Protocol subprotocol. Browser usage: new WebSocket(url, ["bearer", "<jwt>"]). The server echoes Sec-WebSocket-Protocol: bearer on accept. Tokens delivered this way do not appear in nginx / ALB access logs, browser history, proxy logs, or Referer headers.
Legacy — ?token=<jwt> query string. Allowed only when admin_ws_allow_query_token=True AND the runtime env is non-production. Production-runtime requests carrying a query-string token are closed with code 4003. Each fallback emits a loud WARN log + admin_ws_legacy_query_token_total so cutover progress is visible. Once the counter is silent, set admin_ws_allow_query_token=False to harden production permanently.
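
On the server side, the two token sources above could be handled roughly as below. A browser calling new WebSocket(url, ["bearer", "<jwt>"]) sends the header Sec-WebSocket-Protocol: bearer, <jwt>; the function names here are hypothetical and JWT verification itself is out of scope for the sketch.

```python
from typing import Optional


def bearer_from_subprotocols(header_value: str) -> Optional[str]:
    """Extract the JWT from a 'bearer, <jwt>' Sec-WebSocket-Protocol header."""
    parts = [part.strip() for part in header_value.split(",") if part.strip()]
    if len(parts) == 2 and parts[0].lower() == "bearer":
        return parts[1]
    return None


def query_token_allowed(*, allow_query_token: bool, production: bool) -> bool:
    """Legacy ?token= is accepted only when enabled AND outside production."""
    return allow_query_token and not production


print(bearer_from_subprotocols("bearer, aaa.bbb.ccc"))  # aaa.bbb.ccc
print(query_token_allowed(allow_query_token=True, production=True))  # False
```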

Additional gates:

Per-IP rate limit. admin_ws_rate_limit_per_min (default 20) caps accept() calls per IP per 60s window using a Redis counter. Excess connections are closed with policy-violation (1008) and admin_ws_rate_limited_total{reason="ip"} is emitted. When Redis is unavailable, prod-runtime fails closed (reason="redis_outage"); dev fails open.
Lifetime bound to JWT exp. A timer scheduled at accept time closes the socket with policy-violation (1008) when the token's exp claim passes — a long-lived connection cannot outlive its auth.
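
The per-IP gate can be illustrated with an in-memory fixed-window counter. This is only a stand-in for the Redis counter: the class name, the window-bucketing scheme, and the injectable clock are assumptions, and the fail-closed behaviour on a Redis outage is not modelled.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowLimiter:
    """Count accepts per (ip, window) and reject once the limit is exceeded."""

    def __init__(
        self,
        limit: int = 20,
        window_s: int = 60,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._limit = limit
        self._window_s = window_s
        self._clock = clock
        self._counts: Dict[Tuple[str, int], int] = {}

    def allow(self, ip: str) -> bool:
        window = int(self._clock() // self._window_s)
        # Drop counters from earlier windows (a Redis key TTL would do this).
        self._counts = {k: v for k, v in self._counts.items() if k[1] == window}
        key = (ip, window)
        self._counts[key] = self._counts.get(key, 0) + 1
        return self._counts[key] <= self._limit


limiter = FixedWindowLimiter(limit=2, window_s=60, clock=lambda: 0.0)
print([limiter.allow("10.0.0.1") for _ in range(3)])  # [True, True, False]
```

A rejected attempt would then be closed with code 1008, as described above.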

Tests

cd servers_v2/admin_service && uv run pytest
key suites:
- tests/test_admin_integration.py
- tests/test_client_urls.py
