Multi-Brand Isolation

Status

Approved

Current game-boundary addendum (2026-07-17): The original native provider callback and account-namespacing design in this dated specification has been superseded. servers_v2 now calls only Aggregator's Brand API and accepts only Aggregator's signed Provider-protocol relay below /aggregator/provider/{team_code}/{provider_code}/{operation}. Aggregator authenticates the provider and resolves player identity; RGB owns the original provider protocol, wallet and ledger effects. Native public prefixes, normalized wallet callbacks, RGB-held native credentials, reverse parsing, DLQ replay, and verification bypasses no longer exist. This addendum overrides every historical game-boundary statement below; see docs/services/game-service.md for the live contract.

Date

2026-04-27

Owners

Platform Backend
Player Domain
Wallet Domain
Agent Domain

Affected Services

gateway
player_service
wallet_service
rolling_service
promotion_service
game_service
agent_service
admin_service
recon_service

docs/adr/ADR-009-multi-brand-domain-routed-isolation.md
docs/adr/ADR-005-wallet-topology-bucket-ledger-model.md
docs/adr/ADR-001-document-driven-backend-change-workflow.md

docs/architecture/data-ownership.md
docs/architecture/domain-ownership.md
docs/architecture/system-overview.md
docs/architecture/http-entrypoints.md
docs/services/gateway.md
docs/services/player-service.md
docs/services/wallet-service.md
docs/services/rolling-service.md
docs/services/promotion-service.md
docs/services/game-service.md
docs/services/agent-service.md
docs/services/admin-service.md
docs/services/recon-service.md

docs/runbooks/multi-brand/multi-brand-isolation-rollout.md

Goal

Make servers_v2 capable of hosting multiple distinct player-facing brands on the same runtime, with brand-scoped player identity, money state, rolling and settlement, and configuration, while keeping a single PostgreSQL database, a single Redis, and a single Docker stack.

A "brand" is a self-contained operating product: its own player base, its own wallet topology and policy, its own promotion rules, its own configuration, its own domain footprint. Brands are identified at the request edge from the request domain. After login, brand identity is also carried in the JWT.

Scope

In scope:

A new brand aggregate and a new brand_config aggregate.
A new agent_brand join aggregate (brand membership for an agent).
Adding non-nullable brand_id to every brand-scoped table in servers_v2, with backfill into a single default brand.
Brand-scoped uniqueness constraints on player, wallet_topology, wallet_policy, agent_setting, agent_domain, and any other table whose current uniqueness must change.
Brand resolution in gateway from the request domain via the existing domain:agent:{host} and domain:level:{host} Redis maps, extended to carry brand_id.
Forwarding brand context downstream as X-Brand-Id.
Adding a brand_id claim to JWT and rejecting JWT/domain mismatches.
Brand-scoped query filtering in every domain service.
Wallet command rejection on cross-brand targets.
Brand-scoped wallet topology and wallet policy resolution.
Per-brand promotion configuration, coupon definitions, and rebate/cashback/lossback rates.
Per-brand rolling completion ratios.
Per-brand payment channel selection and per-brand operational config.
Aggregator Brand API credential isolation and signed callback relay identity.
Admin write surfaces for brand, brand_config, and agent_brand.
Agent allow-list enforcement at the agent edge.
Brand-aware structured logs and a brand label on player-, wallet-, rolling-, promotion-, and game-scoped Prometheus metrics.
Brand-aware outbox and cross-service event payloads.
Removal of the legacy_admin_v2 and legacy_auth staff routes and the supporting staff identity logic from admin_service.

Out of scope:

Schema-per-brand or database-per-brand isolation.
Direct or per-brand native game-provider credentials in RGB.
A new back-office staff identity model. Staff is removed in this change and is not replaced.
Per-brand infrastructure (separate Redis, separate Postgres, separate Docker stack).
Cross-brand player identity migration (e.g. promoting a player from one brand to another).
Cross-brand wallet transfer.
Per-brand SLA, billing, or quota management.
Per-brand i18n and theming for any frontend that lives outside this repo (the brand_config slots are defined here; rendering is owned by the consuming frontend).

Background

servers_v2 was designed under a single-brand assumption. The only existing tenant-shaped artifact is the hard-coded project integer in admin_service/app/tasks/tag.py, which switches between two product labels and is not a real isolation mechanism. The platform now needs to host more than one brand at the same time on the same runtime.

gateway and player_service already carry a domain-routing mechanism: gateway._extract_domain reads Host or Origin and injects a domain field into the request body for legacy routes; player_service resolves agent_id and level_id from domain:agent:{host} and domain:level:{host} Redis maps. That mechanism is the natural extension point for brand routing and is preserved.

A no-staff posture is intentional. Back-office staff identity, role/menu management, and the legacy_admin_v2 and legacy_auth staff compatibility paths are being retired together with this change.

Compatibility

Required to remain stable:

The status/msg/data envelope on every external response is unchanged.
Existing player route paths under /api/v1/* and the legacy root aliases in gateway are unchanged.
Existing internal route paths under /internal/* are unchanged; only the required X-Brand-Id header is added.
The Plisio public callback paths (/internal/wallet/plisio/callback and the legacy /player/deposit/plisio/callback) remain public; brand is resolved from the deposit record, not from the caller.
The player-facing /api/v1/game/* surface remains compatible. Provider callbacks move to the signed Aggregator relay namespace; old native prefixes and normalized Aggregator wallet routes return 404.
The agent frontend route surface in agent_service is unchanged.
The wallet topology contracts from ADR-005 are unchanged in shape; the only changes are brand scoping of uniqueness and resolution.
Existing JWT consumers that only read player_id continue to work; they must additionally read brand_id after this change.

Required to break by design:

The following admin_service route files are deleted: legacy_admin_v2.py, legacy_auth.py, legacy_agents_v2.py, legacy_agent_withdrawals_v2.py, legacy_meta_v2.py, legacy_recon.py, legacy_web_content.py. The supporting staff identity helpers are removed with them. The legacy admin paths under /api/admin/user/*, /api/admin/pushbullet/*, /api/admin/shooter/*, /api/admin/web/rules/*, /api/admin/web/faq/*, and /api/admin/web/config/* (and any other route served exclusively by the deleted files) are removed and will respond 404.
Any code path that assumes player.account is globally unique is changed; uniqueness is now (brand_id, account).
Any code path that assumes the active wallet topology is global is changed; topology and policy resolution is per brand.
Native provider callback targets move to Aggregator. For each configured RGB team, Aggregator must relay every supported provider callback operation to RGB; partial per-provider fall-through is forbidden.

Ownership

Aggregate	Owner	Notes
`brand`	`admin_service`	brand catalog
`brand_config`	`admin_service`	per-brand configuration values
`agent_brand`	written by `admin_service`, read by `agent_service`	agent-to-brand allow list
`agent`	`agent_service`	brand-global; no `brand_id` column
`agent_setting`, `agent_domain`	`player_service` (already today)	now scoped per `(agent_id, brand_id)`
`player` and player-owned tables	`player_service`	brand-scoped
wallet aggregates (`wallet_account`, `wallet_bucket`, `wallet_coupon_grant`, `wallet_bet_authorization`, `wallet_ledger`, transfers, idempotency rows, outbox)	`wallet_service`	brand-scoped; ADR-005 unchanged
`wallet_topology`, `wallet_policy`	`wallet_service`	brand-scoped uniqueness
rolling aggregates	`rolling_service`	brand-scoped
coupon, promotion, settlement, saga aggregates	`promotion_service`	brand-scoped
game provider state	`game_service`	brand-scoped rows; native credentials absent from RGB
recon aggregates (`shooter_*`)	`recon_service`	brand-scoped
Aggregator Brand API and relay credentials	`game_service`	explicit per-brand/team mapping; native provider credentials are Aggregator-owned

Requirements

Brand entity

A brand row has:
- brand_id BIGINT autoincrement surrogate primary key (matches existing surrogate convention used by agent.guid and player.guid)
- brand_code VARCHAR(16) constrained by CHECK (brand_code ~ '^[a-z][a-z0-9]{1,15}$') (lowercase alphanumeric, leading letter, 2-16 chars; safe for every supported game provider's account format and for use in Prometheus labels)
- name VARCHAR(64)
- default_currency CHAR(3) (ISO 4217)
- status enum (enabled, disabled)
- created_at, updated_at TIMESTAMP
brand_code is unique and immutable after creation.
brand is read by every domain service. It is written only through admin_service.
The default brand seed has brand_code = 'default', name = 'Default Brand', status = enabled, default_currency matching the current single-brand environment's documented currency.
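
The brand_code charset rule can be mirrored in application code before a write reaches the CHECK constraint. The following is a minimal sketch, not the admin_service implementation; the names BRAND_CODE_RE and is_valid_brand_code are hypothetical. It uses fullmatch because Python's `$` would also accept a trailing newline.

```python
import re

# Same pattern as the CHECK constraint: a leading lowercase letter followed
# by 1-15 lowercase alphanumerics (2-16 characters in total).
BRAND_CODE_RE = re.compile(r"[a-z][a-z0-9]{1,15}")


def is_valid_brand_code(code: str) -> bool:
    """Return True when `code` satisfies the brand_code charset rule."""
    # fullmatch anchors both ends, so "ruby\n" is rejected.
    return BRAND_CODE_RE.fullmatch(code) is not None
```

Under this sketch the seed value 'default' passes, while a single-letter code, a code with uppercase letters, or a code starting with a digit is rejected.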

Brand configuration

A brand_config(brand_id, key, value) row stores per-brand override values. Values are JSON.
Resolution order for any configurable value is: per-brand brand_config[brand_id][key], then a documented global default. The global default may live in code or in a documented global config table.
The configurable keys include:
- rolling completion ratios per provider type
- cashback rate, rebate rate, and lossback rate
- payment channel selection (Plisio enabled, manual bank channels)
- withdrawal min/max amounts and fees
- risk thresholds for deposit and withdrawal
- i18n override token, theme token, customer-service link, email template overrides
Values that must remain global (native provider configuration in Aggregator, infrastructure URLs, shared rate limits) are explicitly listed in the plan and must not be moved into brand_config.
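
The resolution order above (per-brand override first, then a documented global default) can be sketched as follows. This is an illustration only: GLOBAL_DEFAULTS, its single entry, and resolve_config are hypothetical names and values, and the in-memory mapping stands in for the brand_config table.

```python
from typing import Any, Mapping

# Hypothetical global defaults. The real keys and values live in code or in
# a documented global config table.
GLOBAL_DEFAULTS: dict[str, Any] = {"cashback_rate": 0.01}


def resolve_config(
    brand_config: Mapping[int, Mapping[str, Any]], brand_id: int, key: str
) -> Any:
    """Resolve `key` for a brand: brand override first, then global default."""
    overrides = brand_config.get(brand_id, {})
    if key in overrides:
        return overrides[key]
    if key in GLOBAL_DEFAULTS:
        return GLOBAL_DEFAULTS[key]
    # Assumption: a key with neither an override nor a default is an error.
    raise KeyError(f"no brand override or global default for {key!r}")
```

For example, `resolve_config({2: {"cashback_rate": 0.05}}, 2, "cashback_rate")` returns the brand value, while the same call for a brand without an override falls back to the global default.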

Brand resolution at the edge

gateway resolves brand_id for every external player request from the request domain via the existing _extract_domain helper plus the Redis maps domain:agent:{host} and domain:level:{host}. Both maps now resolve to a value carrying brand_id.
_extract_domain precedence is Origin first (used when the player is on the canonical web origin), then the Host header (used as a fallback for callers that omit Origin). Both values are looked up in the same Redis map; Origin is trusted only because TLS termination enforces hostname authenticity at the load balancer (CORS does not validate brand binding). If a deployment ever exposes the gateway without a TLS-terminating load balancer, the precedence must be reversed; this is documented in gateway's deployment notes.
gateway MUST strip any inbound X-Brand-Id header from external requests before injecting the resolved value. agent_service MUST do the same on its agent-frontend edge. Internal callers may legitimately send X-Brand-Id to internal handlers (the header is trusted only on internal routes and only when paired with a valid X-Internal-Service-Token).
A request whose domain does not resolve to a brand is rejected at the edge with a stable error envelope.
The resolved brand_id is attached to request.state.brand_id and forwarded to all downstream services as the X-Brand-Id header.
agent_service resolves brand from the agent frontend domain the same way and validates the resolved brand against the authenticated agent's agent_brand allow list. Brand resolution applies uniformly to the agent's primary /api/v1/agent/* routes and to its legacy aliases (e.g. /api/v1/user/login); the legacy alias surface is not exempt.
admin_service is brand-global at the entry surface; per-brand operational routes accept an explicit brand_id parameter and forward it as X-Brand-Id on internal calls.
game_service accepts only a valid Aggregator relay signature and resolves brand_id from its signed RGB external-player context. It verifies that player and brand through player_service before invoking provider protocol code.
The Plisio public callback resolves brand_id from the matching deposit record (which carries brand_id from creation time).
recon_service does not have a public HTTP callback surface. Inbound reconciliation signals (SMS, pushbullet) reach recon_service through internal collection paths, and brand is resolved from the matched deposit record, not from message text. There is no edge brand resolution to perform on those signals.
Internal handlers reject brand-scoped operations that arrive without a brand context.
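
The edge rules in this section (Origin before Host, strip any inbound X-Brand-Id, reject unknown domains) can be sketched as below. This is not the gateway's actual _extract_domain; the function names, the UnknownBrandDomain exception, and the plain dict standing in for the Redis maps are all assumptions made for illustration.

```python
from typing import Mapping, Optional
from urllib.parse import urlsplit


class UnknownBrandDomain(Exception):
    """Raised when the request domain does not resolve to a brand."""


def extract_domain(headers: Mapping[str, str]) -> Optional[str]:
    """Return the request hostname, preferring Origin over Host."""
    lowered = {name.lower(): value for name, value in headers.items()}
    origin = lowered.get("origin")
    if origin:
        host = urlsplit(origin).hostname  # None for values such as "null"
        if host:
            return host
    host_header = lowered.get("host")
    if host_header:
        # Prefix "//" so urlsplit parses host[:port] as a network location.
        return urlsplit("//" + host_header).hostname
    return None


def build_forward_headers(
    headers: Mapping[str, str], domain_to_brand: Mapping[str, int]
) -> dict[str, str]:
    """Drop any client-supplied X-Brand-Id and inject the resolved one."""
    domain = extract_domain(headers)
    if domain is None or domain not in domain_to_brand:
        raise UnknownBrandDomain(domain)
    forwarded = {
        name: value
        for name, value in headers.items()
        if name.lower() != "x-brand-id"
    }
    forwarded["x-brand-id"] = str(domain_to_brand[domain])
    return forwarded
```

Stripping before injecting means a client that sends its own X-Brand-Id can never influence the value seen downstream, regardless of header casing.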

JWT and authorization

JWT issued at login carries brand_id as a claim.
gateway checks jwt.brand_id == request.state.brand_id on every authenticated request. Behavior depends on MULTI_BRAND_ENFORCEMENT (see Staged enforcement below): in observe, mismatches are logged and counted; in enforce, mismatches are rejected with the documented error envelope.
A JWT that lacks brand_id is treated the same way (logged in observe, rejected in enforce); JWTs issued before brand-aware login goes live drain naturally over JWT_EXPIRE_MIN.
Internal services trust the X-Brand-Id header attached by the edge or by another internal caller; they do not re-derive brand from JWT (which is verified at the edge and not always propagated downstream).
Logout, refresh, and impersonation flows preserve brand_id.
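
A minimal sketch of the gateway-side JWT brand check follows, assuming the claims have already been signature-verified and decoded into a mapping. BrandCheck and check_jwt_brand are hypothetical names; the reason strings are taken from the brand_resolution_failed_total enum in the Observability section.

```python
from dataclasses import dataclass
from typing import Any, Mapping, Optional

MODES = ("off", "observe", "enforce")


@dataclass(frozen=True)
class BrandCheck:
    allowed: bool
    reason: Optional[str]  # None when the JWT brand matches the request brand


def check_jwt_brand(
    claims: Mapping[str, Any], request_brand_id: int, mode: str
) -> BrandCheck:
    """Compare the JWT brand_id claim with the domain-resolved brand."""
    if mode not in MODES:
        raise ValueError(f"unknown MULTI_BRAND_ENFORCEMENT mode: {mode!r}")
    if mode == "off":
        return BrandCheck(allowed=True, reason=None)
    jwt_brand = claims.get("brand_id")
    if jwt_brand is None:
        reason = "jwt_missing_brand"
    elif jwt_brand != request_brand_id:
        reason = "jwt_domain_mismatch"
    else:
        return BrandCheck(allowed=True, reason=None)
    # observe: the caller logs and counts `reason` but lets the request pass.
    # enforce: the caller rejects with the documented error envelope.
    return BrandCheck(allowed=(mode != "enforce"), reason=reason)
```

Returning the reason even when the request is allowed lets the caller emit the log line and counter in observe mode without duplicating the comparison.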

Player identity

player uniqueness becomes (brand_id, account). The same account string may exist in two brands as two distinct player_id rows.
A MAX_PLAYER_ACCOUNT_LEN = 32 constant is introduced (codified in rgb_contracts) and enforced at registration time. The DB column remains VARCHAR(255) for legacy compatibility; the runtime cap preserves the player contract and leaves room for Aggregator/provider-side account projections. A backfill audit confirms no existing player.account exceeds 32 chars before the cap is enforced; any non-conforming legacy rows are flagged for manual remediation.
Registration resolves brand_id from the request domain and rejects any client-supplied brand override.
Recovery flows resolve brand_id from the request domain or from the recovery token; cross-brand recovery is forbidden.
Player-owned tables (player_deposit, player_withdraw, player_bank, message rows, common projections, player-owned outbox) carry brand_id.

Agent

agent is brand-global. The agent table carries no brand_id.
agent_brand is the only place where brand membership for an agent is recorded. Each row has the shape (agent_id, brand_id, status, created_at).
agent_setting becomes keyed by (agent_id, brand_id). A single agent may carry different registration modes per brand.
agent_domain keeps its existing surrogate guid BIGINT primary key (preserving any FK references to agent_domain.guid); a composite UNIQUE(agent_id, brand_id, domain) constraint is added on top of the existing UNIQUE(domain) global constraint. Together they guarantee a domain value cannot exist twice under any (agent_id, brand_id) combination, and a domain belongs to exactly one brand and exactly one agent.
agent_service rejects every authenticated request whose resolved brand is not in the agent's agent_brand allow list.
agent_brand reads in agent_service are cached per process with a 60-second TTL AND invalidated by the AGENT_BRAND_CHANGED Redis pub/sub channel published by admin_service after every agent_brand write (insert, update, delete). Each agent_service process subscribes on startup; processes that miss a message refresh on TTL expiry.
agent_brand revocation does NOT invalidate the active JWT (JWT brand validation is gateway-side). Every request after revocation is rejected by the allow-list check, so the existing session keeps receiving 4xx responses until the agent re-authenticates against a brand they still hold. JWT revocation is a documented follow-up (it requires a JWT denylist or short-lived access tokens with refresh).
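
The per-process allow-list cache (60-second TTL plus explicit invalidation) can be sketched as below. The class name, the loader callback, and the injectable clock are hypothetical; the pub/sub subscription itself is omitted, and the sketch assumes that the AGENT_BRAND_CHANGED handler can derive an agent_id from the message and call invalidate, since the message payload is not specified here.

```python
import time
from typing import Callable, Dict, FrozenSet, Iterable, Tuple


class AgentBrandCache:
    """Per-process cache of the brand allow list for each agent."""

    def __init__(
        self,
        loader: Callable[[int], Iterable[int]],
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._loader = loader  # reads agent_brand rows for one agent
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[int, Tuple[float, FrozenSet[int]]] = {}

    def allowed_brands(self, agent_id: int) -> FrozenSet[int]:
        now = self._clock()
        entry = self._entries.get(agent_id)
        if entry is not None and now - entry[0] < self._ttl:
            return entry[1]
        # Missing or expired: reload and restart the TTL window.
        brands = frozenset(self._loader(agent_id))
        self._entries[agent_id] = (now, brands)
        return brands

    def invalidate(self, agent_id: int) -> None:
        """Called from the AGENT_BRAND_CHANGED subscriber."""
        self._entries.pop(agent_id, None)
```

A process that misses an invalidation message still converges, because the stale entry expires when its TTL window ends.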

Wallet

Every wallet-owned aggregate carries brand_id: wallet_account, wallet_bucket, wallet_bucket_type, wallet_coupon_grant, wallet_bet_authorization, wallet_ledger, wallet_transfer, wallet_idempotency, wallet outbox, wallet_inbox, wallet_dead_letter.
wallet_topology and wallet_policy documents are brand-scoped: uniqueness becomes (brand_id, code, version) for wallet_topology and (brand_id, topology_code, version, policy_key) for wallet_policy. The existing partial unique active indexes (uq_wallet_topology_single_active and uq_wallet_policy_active_key, both WHERE status = 'ACTIVE') are rebuilt to include brand_id so every brand may have its own active topology and active policy simultaneously without colliding on the global "exactly one ACTIVE" constraint.
wallet_bucket_type uniqueness becomes (brand_id, topology_code, topology_version, code) so two brands may legitimately share the same topology_code (e.g. both inherit RUBY_SPLIT_V1) without bucket-type collisions.
Active topology and policy resolution are per brand. Topology activation safety checks (per ADR-005: blocked when unresolved balances, coupon grants, rollings, unsettled bets, or transfers would become unreachable) are scoped to the activating brand only; cross-brand state never blocks activation.
wallet_idempotency uniqueness becomes (brand_id, idempotency_key) so the same key may legitimately appear in two brands; two brands' players sharing the same account string can independently retry without a phantom-duplicate rejection.
wallet_bet_authorization provider-bet uniqueness becomes (brand_id, provider_type, provider_id, bet_id) because the same provider may legitimately reuse a bet_id across two brands when outbound calls are namespaced.
wallet_service checks every command's target row brand against the request brand before any money mutation. Behavior depends on MULTI_BRAND_ENFORCEMENT (the same flag as gateway): in observe, mismatches log and increment wallet_cross_brand_rejected_total{command,mode="observe"} and the command is allowed to proceed using the request brand as the authoritative scope (so a missing or wrong X-Brand-Id cannot cause a money write to land in the wrong brand); in enforce, mismatches hard-reject. Money mutations never use a brand inferred from the target row alone.
ADR-005's player-wallet writer rule is unchanged.
Wallet outbox events carry brand_id as a field on the shared DomainEvent envelope (rgb_contracts/events/base.py), not on per-payload schemas. Producers and consumers read/write it from the envelope; per-payload schemas are not modified.
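
The cross-brand command check described above can be sketched as follows. The function and exception names are hypothetical, and a collections.Counter stands in for the wallet_cross_brand_rejected_total{command,mode} Prometheus counter. The key property illustrated is that the returned scope is always the request brand, never the target row's brand.

```python
from collections import Counter

MODES = ("off", "observe", "enforce")


class CrossBrandRejected(Exception):
    """Raised in enforce mode when the target row is in another brand."""


# Stand-in for wallet_cross_brand_rejected_total{command,mode}.
cross_brand_rejected_total: Counter = Counter()


def authoritative_brand(
    command: str, request_brand_id: int, target_row_brand_id: int, mode: str
) -> int:
    """Return the brand scope a money mutation must use, or reject."""
    if mode not in MODES:
        raise ValueError(f"unknown MULTI_BRAND_ENFORCEMENT mode: {mode!r}")
    if mode != "off" and target_row_brand_id != request_brand_id:
        cross_brand_rejected_total[(command, mode)] += 1
        if mode == "enforce":
            raise CrossBrandRejected(
                f"{command}: request brand {request_brand_id} "
                f"!= target brand {target_row_brand_id}"
            )
    # The request brand is authoritative; the row brand is never used alone.
    return request_brand_id
```

In observe mode the mismatch is counted and the command proceeds scoped to the request brand, so a brand-filtered write simply finds no matching row in the wrong brand.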

Rolling

Rolling records, rolling inbox/outbox, and completion/cancel retry state carry brand_id.
The rolling event consumer reads brand_id from inbound payloads and persists it.
Per-brand rolling completion ratios resolve from brand_config.

Promotion

Coupon usage rows, promotion configs, settlement projections, and promotion_coupon_saga rows carry brand_id.
Coupon definitions, event configs, and rebate/cashback/lossback rates are per-brand.
Settlement schedulers iterate brand by brand in brand_id ascending order, sequentially (one brand at a time). Cross-brand aggregation in settlement is forbidden. Parallel-per-brand execution may be introduced later as an explicit follow-up; the day-0 contract is sequential to keep timing deterministic across environments.

Game integration

Native game-provider credentials, endpoints, authentication and session lookup are Aggregator-owned; RGB has no direct provider client or fallback.
RGB outbound game operations are restricted to Aggregator Brand API games, launch and event endpoints with one credential explicitly mapped per brand.
Native provider callbacks first reach Aggregator. A configured relay team is all-or-nothing: Aggregator authenticates the provider, resolves its stored player/session identity, and forwards the exact method/query/body to RGB.
The relay signature covers timestamp, method, RGB target path, raw query, signed context and raw body. Context v2 includes contract id, stable event id, exact request hash, team, provider, operation, provider-authenticated state, RGB external player ID and brand code.
Gateway and game_service expose only the same static provider-operation matrix. RGB verifies signature, route/context equality, clock skew, body limit and player brand ownership before provider-specific wallet work.
RGB owns the provider response envelope, wallet commands, transaction records, replay idempotency, rollback and resettlement semantics. Before handler execution, RGB claims a durable Inbox event. Each committed ORM mutation is emitted to a transactional outbox, then returned with the exact Provider response in a canonical RGB-signed result envelope. Aggregator verifies and applies an append-only platform projection before releasing that Provider response; it does not translate callbacks into a normalized debit/credit model or execute a second wallet command.
Provider callback transaction state and provider-specific records (e.g. bti_*, ho_*, mg_*, wc_*, digitain_*) carry brand_id.

Reconciliation

The shooter_* table family is split into operator-infrastructure rows (no brand_id) and brand-scoped event rows:
- Operator-global (no brand_id): shooter_device (physical phone-receiver hardware bindings), shooter_pushbullet (Pushbullet API token bindings), shooter_phone (phone whitelist), shooter_template_recharge (regex parsing rules). These are operator-owned infrastructure and are not duplicated per brand.
- Brand-scoped (carries brand_id): shooter_sms (inbound SMS rows; brand_id is populated when the SMS is matched against a player_deposit row, not parsed from message text), and any recon review-state rows that reference a specific player or deposit.
Recon match decisions resolve brand_id from the matched deposit record (player_deposit.brand_id).
Approvals continue to flow through wallet_service; cross-brand rejection in wallet_service blocks any approval that would touch a different brand's deposit.

Admin surfaces

admin_service exposes brand catalog CRUD: create, enable, disable, edit metadata. brand_code is immutable after creation.
admin_service exposes brand_config CRUD with audited writes (timestamp, payload, prior value).
admin_service exposes agent_brand CRUD with audited writes.
Per-brand topology and policy writes flow through wallet_service CRUD with brand pinned in the request.
The legacy_admin_v2 and legacy_auth staff routes and the supporting staff identity logic are removed. Any route that survived the staff removal must keep its status/msg/data envelope shape.

Staged enforcement

Every brand-scoped enforcement point (gateway JWT/domain mismatch, wallet_service cross-brand command rejection, agent_service brand allow-list rejection, internal-handler X-Brand-Id requirement) is gated by a single environment-variable flag, MULTI_BRAND_ENFORCEMENT, with three modes:
- off: resolve and forward brand context, but never reject.
- observe (default): check, log, and increment brand_resolution_failed_total{reason,service} on mismatch; do not reject.
- enforce: hard-reject on mismatch with the documented error envelope.
The flag is read by gateway, wallet_service, agent_service, and every internal handler that adds brand-scoped enforcement.
The flag is set as a process environment variable, so a mode change requires a service restart. The current mode is included in each service's /health response payload.
The "request brand wins" behavior in observe applies symmetrically to read paths and write paths: a read query whose request brand differs from the JWT brand still uses the request brand for its WHERE brand_id = ? filter. This prevents a stale or wrong-brand JWT from disclosing another brand's data during the soak window; enforce mode then rejects the same scenario hard.
A second brand cannot be enabled in any environment until that environment is in enforce mode.
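
A minimal sketch of reading the flag and deriving the multi_brand_enforcement_mode gauge value is shown below. The names EnforcementMode, load_mode, and can_enable_another_brand are hypothetical, and failing fast on an unrecognized value is an assumption rather than documented behavior.

```python
import enum
from typing import Mapping


class EnforcementMode(enum.Enum):
    OFF = "off"
    OBSERVE = "observe"
    ENFORCE = "enforce"


# Gauge encoding from the Observability section: 0=off, 1=observe, 2=enforce.
GAUGE_VALUE = {
    EnforcementMode.OFF: 0,
    EnforcementMode.OBSERVE: 1,
    EnforcementMode.ENFORCE: 2,
}


def load_mode(env: Mapping[str, str]) -> EnforcementMode:
    """Read MULTI_BRAND_ENFORCEMENT once at startup; default is observe."""
    raw = env.get("MULTI_BRAND_ENFORCEMENT", "observe").strip().lower()
    # Assumption: an unknown value raises ValueError and aborts startup.
    return EnforcementMode(raw)


def can_enable_another_brand(mode: EnforcementMode, enabled_brands: int) -> bool:
    """A second (or later) brand may only be enabled in enforce mode."""
    return enabled_brands == 0 or mode is EnforcementMode.ENFORCE
```

In a service, `load_mode(os.environ)` would be called once during startup, which matches the restart requirement for mode changes.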

Authentication and integrity

Per ADR-009 "Authentication and integrity posture":

admin_service runs behind VPN ingress with an IP allow-list. Every brand, brand_config, and agent_brand write requires an immutable operator account in the verified admin JWT; admin rejects writes whose token has no account claim. Audit rows include operator_id (historical column name storing the account string), request IP, request_id, and timestamp.
Internal-service authentication uses per-caller-service tokens (INTERNAL_SERVICE_TOKEN_GATEWAY, INTERNAL_SERVICE_TOKEN_AGENT, etc.). A consumer service knows the set of caller tokens it accepts. A compromise of one caller's token cannot impersonate another.
Brand-scoped wallet write commands carry X-Brand-Signature: HMAC_SHA256(BRAND_SIGNING_KEY, caller_service|brand_id|request_id|timestamp). wallet_service rejects missing or invalid signatures in enforce mode (logged in observe).
JWT signing uses RS256. The private key is held by player_service and agent_service; other services verify with the public key. The JWT header carries kid; the verifier rejects unknown kids and rejects alg=none and HS-family algorithms unconditionally. The key rotation procedure is documented in the runbook.
brand_code must match ^[a-z][a-z0-9]{1,15}$ and must also satisfy legacy prefix-disjointness: admin_service rejects any new brand_code that is a prefix of an existing brand_code, has an existing brand_code as a prefix, or collides with the prefix of any existing player.account value that has been namespaced and sent to a game provider. This preserves historical identities but is not used to authenticate Aggregator relay callbacks.
Brand-policy caches invalidate on the BRAND_CATALOG_CHANGED Redis pub/sub channel published by admin_service after every brand create/disable. The same contract applies to AGENT_BRAND_CHANGED, consumed by agent_service. Redis pub/sub is best-effort: if Redis is down or a subscriber is briefly partitioned, invalidation messages are lost and correctness reverts to the 60-second TTL fallback. This fallback is documented in the runbook's diagnosis playbook for stale-cache scenarios.
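
The X-Brand-Signature construction can be sketched with the standard library as below. The pipe-joined field order follows the formula in this section; the hex encoding of the digest, the integer timestamp, and the function names are assumptions. Timestamp-window and replay checks are out of scope for this sketch.

```python
import hashlib
import hmac


def brand_signature(
    key: bytes, caller_service: str, brand_id: int, request_id: str, timestamp: int
) -> str:
    """HMAC-SHA256 over caller_service|brand_id|request_id|timestamp."""
    if not key:
        # Fail closed rather than sign with an empty BRAND_SIGNING_KEY.
        raise ValueError("BRAND_SIGNING_KEY is empty")
    message = f"{caller_service}|{brand_id}|{request_id}|{timestamp}".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_brand_signature(
    key: bytes,
    signature: str,
    caller_service: str,
    brand_id: int,
    request_id: str,
    timestamp: int,
) -> bool:
    """Return True only for a present, well-formed, matching signature."""
    if not key or not signature:
        return False
    expected = brand_signature(key, caller_service, brand_id, request_id, timestamp)
    # Compare as bytes in constant time; bytes also tolerate non-ASCII input.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

Because brand_id is part of the signed message, a caller cannot reuse a signature issued for one brand on a command that targets another.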

Recovery flow brand precedence

Recovery tokens issued after Phase 5 carry brand_id as a signed claim. Recovery flow precedence: token-embedded brand_id wins; if the token lacks brand_id (issued before Phase 5), fall back to the request domain's brand. If the resolved brand does not match the underlying player's brand_id, the recovery request is rejected hard regardless of MULTI_BRAND_ENFORCEMENT mode (recovery is too sensitive for observe behavior).
Recovery responses must not leak the existence of an account in another brand. Same email or phone in two brands recovers each independently with brand-scoped tokens.
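
The precedence rule for recovery can be sketched as below, assuming the recovery token has already been verified and decoded. The function and exception names are hypothetical. Note that there is no mode parameter: the mismatch is rejected regardless of MULTI_BRAND_ENFORCEMENT.

```python
from typing import Any, Mapping


class RecoveryBrandMismatch(Exception):
    """Raised when the resolved brand differs from the player's brand."""


def resolve_recovery_brand(
    token_claims: Mapping[str, Any], domain_brand_id: int, player_brand_id: int
) -> int:
    """Token-embedded brand wins; fall back to the request domain's brand."""
    token_brand = token_claims.get("brand_id")
    resolved = token_brand if token_brand is not None else domain_brand_id
    if resolved != player_brand_id:
        # Hard reject in every enforcement mode. The caller should answer
        # with a generic error that does not reveal the other brand's account.
        raise RecoveryBrandMismatch()
    return resolved
```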

Idempotency cross-brand collision detection

The wallet_idempotency constraint is (brand_id, idempotency_key). An inbound idempotency key that matches an existing key in a different brand increments wallet_idempotency_brand_split_total. Phase 16 release gate requires this counter to be zero across the soak window.
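
The composite constraint and the split detection can be illustrated with an in-memory SQLite table standing in for the PostgreSQL wallet_idempotency table. The table has only the two constrained columns, and make_db and claim_key are hypothetical names; a real claim would also check for the split inside the same transaction as the insert.

```python
import sqlite3
from typing import Tuple


def make_db() -> sqlite3.Connection:
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE wallet_idempotency ("
        " brand_id INTEGER NOT NULL,"
        " idempotency_key TEXT NOT NULL,"
        " UNIQUE (brand_id, idempotency_key))"
    )
    return conn


def claim_key(conn: sqlite3.Connection, brand_id: int, key: str) -> Tuple[bool, bool]:
    """Return (claimed, brand_split) for an inbound idempotency key."""
    # brand_split is the condition that would increment
    # wallet_idempotency_brand_split_total: same key, different brand.
    brand_split = conn.execute(
        "SELECT 1 FROM wallet_idempotency"
        " WHERE idempotency_key = ? AND brand_id != ? LIMIT 1",
        (key, brand_id),
    ).fetchone() is not None
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO wallet_idempotency (brand_id, idempotency_key)"
                " VALUES (?, ?)",
                (brand_id, key),
            )
    except sqlite3.IntegrityError:
        return False, brand_split  # duplicate within the same brand
    return True, brand_split
```

The same key is accepted once per brand, and a retry within a brand is reported as a duplicate; the second brand's first use is accepted but flagged as a split.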

Configuration consolidation

The hard-coded project integer in admin_service/app/tasks/tag.py is removed; behavior previously gated by project resolves from brand_config.
Any global config constant or global_var row that is now duplicated by per-brand brand_config entries (rolling ratios, cashback / rebate / lossback rates, payment channel selection, withdrawal min/max, risk thresholds) is either deleted or explicitly demoted to a "documented global default" used only when brand_config has no brand-specific override. The deletion list is enumerated in plan Phase 12.
Staff-related globals that become dead code with the removal of the seven admin_service legacy route files (legacy admin auth secret env vars, legacy session config, supporting helper modules) are removed at the same time. Any unused entries in servers_v2/.env.compose.example are removed.

Migration

A default brand is created at migration time. All existing rows backfill into this brand.
Every existing agent row gets an agent_brand row of the form (agent_id, default_brand_id, status='enabled') seeded at migration time so existing agents retain access through Phase 11 allow-list enforcement and Phase 16 hard-flip.
Backfill is reversible in non-production environments by dropping the added columns and constraints.
Domain-to-brand bindings for the default brand are seeded at migration time so the existing single-brand environment continues to resolve.
Production cutover does not enable a second brand until the runbook's release gate is satisfied.

Observability

Required logs:

structured log fields include brand_id and brand_code
edge logs include the resolved domain, the resolved brand, and whether brand resolution succeeded or failed
wallet command logs include the request brand and the target row brand; cross-brand rejections log both
game callback logs include relay team_code, provider_code, operation, verified brand_id and RGB external player identity without secrets

Required metrics:

player-, wallet-, rolling-, promotion-, and game-scoped counters and histograms carry a brand label whose value is brand_code
a brand_resolution_failed_total{reason,service} counter exists at the edge AND on every internal handler that performs brand-scoped checks, with reason constrained to a documented enum (jwt_domain_mismatch, missing_header, unknown_domain, jwt_missing_brand, player_brand_mismatch, agent_brand_not_allowed, recovery_brand_mismatch)
a wallet_cross_brand_rejected_total{command,mode} counter exists in wallet_service
a game_callback_brand_unresolved_total{provider} counter exists in game_service
a wallet_idempotency_brand_split_total counter exists in wallet_service (inbound key matches existing key in different brand)
a multi_brand_enforcement_mode{service} gauge exists in every service that reads the flag (value: 0=off, 1=observe, 2=enforce)
a brand_resolution_latency_seconds{service} histogram covers the Redis lookup cost on the edge brand-resolution hot path
a request_total{brand_code,service} counter exists (per-brand legitimate traffic) so that a brand receiving zero traffic is detectable
an event_legacy_schema_total{stream,consumer} counter increments when a consumer accepts a schema_version=1 event during the soak
a security_downgrade_total{service} counter increments when a service starts in MULTI_BRAND_ENFORCEMENT != 'enforce' while more than one enabled brand exists
per-service *_cross_brand_rejected_total{command,mode} counters exist in agent_service, game_service, promotion_service, recon_service, and rolling_service -- mirroring the wallet variant so each owner surface can alarm independently
a wallet_brand_signature_failed_total{caller_service,reason,mode} counter exists in wallet_service covering the missing_headers:*, invalid_timestamp, signature_mismatch, and signature_replay reasons; signature_replay is also surfaced as wallet_brand_signature_replay_blocked_total for dashboarding convenience
a wallet_brand_signature_missing_total{caller_service} counter exists in wallet_service so operators can see which callers still need to migrate before flipping WALLET_BRAND_SIGNATURE_REQUIRE=on
a wallet_brand_signature_misconfigured_total{mode} counter exists in wallet_service (P1-4) for the runtime fail-closed branch when BRAND_SIGNING_KEY is empty
a wallet_brand_signature_replay_redis_outage_total{caller_service,mode} counter exists in wallet_service (P2-δ) when the replay-cache Redis SETNX raises
a wallet_topology_default_brand_unprimed_total counter exists in wallet_service (P2-δ) when a topology / policy lookup arrives without a brand id and the default-brand cache is unprimed
a wallet_topology_default_brand_fallback_total{caller_origin} counter exists in wallet_service for primed-cache default-brand fallbacks (per-call-site breakdown for brand-onboarding)
a game_callback_verification_bypassed_total{provider} counter exists in game_service (T1-D-C2) for any callback that reached the VERIFY_CALLBACKS=False bypass branch -- MUST be zero in production
a game_callback_brand_unresolved_total{provider,reason?} counter exists in game_service; reason="raw_account_in_enforce" and reason="raw_guid_in_enforce" (P1-3) are hard-blocking on the Phase 16 flip
an internal_caller_token_legacy_total{consumer,caller} counter exists in every internal-token-accepting service for the per-caller rollout (Stage A→B→C signal)
an internal_caller_token_legacy_rejected_total{consumer,caller} counter exists in every internal-token-accepting service (T4-D-I2) to record requests rejected once PER_CALLER_TOKEN_REQUIRED=on
a gateway_session_legacy_key_used_total{reason} counter exists in gateway (T1-D-C1) for the Phase 4E session-key cutover signal, with reason ∈ {missing_brand, fallback_after_miss}
a gateway_jwt_session_missing_total counter exists in gateway (T1-D-C1) for "JWT-valid-but-session-gone" requests
a gateway_jwt_unknown_kid_total counter exists in gateway (T4-D-I4) for JWT verifier rejections by unknown kid; spikes during RS256 kid rotation indicate stale verifier caches
an agent_brand_not_allowed_total{reason,service} counter exists in agent_service for allow-list rejections (the spec-canonical brand_resolution_failed_total{reason="agent_brand_not_allowed"} is also emitted alongside it for cross-service dashboards)
an admin_operator_id_mismatch_total{reason} counter exists in admin_service (P1-3) for the JWT-vs-header cross-check rejections
an admin_audit_write_failed_total{route} counter exists in admin_service (P1-4) for audit-row INSERT failures observed AFTER the money write committed
an admin_ws_legacy_query_token_total counter exists in admin_service (T4-D-I7) for WebSocket auth via the legacy ?token= query string fallback
an admin_ws_rate_limited_total{reason} counter exists in admin_service (T4-D-I7) for per-IP rate-limit rejections, with reason ∈ {ip, redis_outage}
an agent_balance_legacy_write_total{change_type} counter exists in admin_service (T4-D-I3) for the legacy direct-write surface; goal is zero post-migration
a pii_aes_unconfigured_total{service,op} counter exists in player_service and agent_service (T1-D-C3) when the AES PII helpers hit the non-production fallback; MUST be zero in production
a plisio_callback_replay_blocked_total{status} counter exists in wallet_service (T1-D-C5/I6) for Plisio webhook replays caught by the SETNX guard
a plisio_callback_replay_redis_outage_total{env} counter exists in wallet_service (T1-D-C5/I6) when the Plisio replay-cache SETNX raises
a recovery_sms_delivery_failed_total{service} counter exists in player_service and agent_service (Codex P1-#6) when the SMS branch of /recovery/request could not deliver
a recovery_email_unprovisioned_total{service} counter exists in player_service and agent_service (Codex P1-#6) when the email recovery branch is selected but no transactional provider is wired
a recovery_rate_limit_redis_outage_total{bucket} counter exists in player_service and agent_service (T4-D-I5) for fail-closed per-contact rate-limit rejections caused by Redis exceptions in production
all metric labels are bounded enums; no attacker-controlled value (host, account, raw IP, brand_code beyond the 16-char regex bound) appears as a label
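
The bounded-label rule above can be sketched as a small guard applied before any value reaches a metric label. This is a minimal illustration, not the servers_v2 implementation: the regex pattern, the fallback strings, and the helper names bounded_label and brand_code_label are assumptions, and the reason set shown is only a subset copied from the wallet signature counter above.

```python
import re

# Assumed pattern: the document only states a "16-char regex bound" for
# brand_code; the real character class is not given here.
BRAND_CODE_RE = re.compile(r"[a-z0-9_]{1,16}")

# Illustrative closed set taken from the wallet signature counter reasons.
SIGNATURE_FAILURE_REASONS = frozenset(
    {"invalid_timestamp", "signature_mismatch", "signature_replay"}
)


def bounded_label(value: str, allowed: frozenset, fallback: str = "other") -> str:
    """Return value only if it belongs to the closed set, else the fallback."""
    return value if value in allowed else fallback


def brand_code_label(raw: str, fallback: str = "invalid") -> str:
    """Return raw only if it fully matches the bounded brand_code pattern."""
    return raw if BRAND_CODE_RE.fullmatch(raw) else fallback
```

With a guard of this shape, a host name or raw account string supplied by a client collapses into a single fallback label value instead of creating a new time series per request.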

Required alerts (with explicit thresholds):

brand_resolution_failed_total{reason="jwt_domain_mismatch"} rate above 0.1/s sustained over 5 minutes (page on-call). Below threshold is expected baseline noise from scanners and stale clients.
brand_resolution_failed_total{reason="missing_header"} non-zero rate sustained over 5 minutes during enforce (page; in observe it is informational only).
wallet_cross_brand_rejected_total{mode="enforce"} non-zero rate sustained over 1 minute (page; potential cross-brand bug or caller-token compromise).
wallet_cross_brand_rejected_total{mode="observe"} non-zero rate during the Phase-16 soak window (warning only; gates the flip but does not page outside the soak window).
game_callback_brand_unresolved_total non-zero rate sustained over 1 minute (page; real money is stuck on the provider side and callbacks are entering the dead-letter queue).
wallet_idempotency_brand_split_total non-zero rate (warning; indicates client misrouting or replay).
security_downgrade_total non-zero in production (page immediately; a service downgraded enforcement mode after enforcement was live).
per-brand wallet outbox publication freshness alert (existing wallet outbox alert, scoped per brand label).
per-brand request_total rate dropping to zero for a previously active brand sustained over 15 minutes (warning; brand outage or domain misconfiguration).
silence procedure: alerts may be silenced by deploy lead for a documented rollout window; silence must be acknowledged in #incidents.
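
The "rate above threshold sustained over N minutes" wording in the alerts above can be illustrated with a small evaluator over rate samples. This is a hedged sketch of the semantics only: the function name sustained_above, the sample shape, and the scrape_interval_s coverage tolerance are assumptions, and a production alert would normally be expressed in the alerting system's own rule language rather than in application code.

```python
from typing import Sequence, Tuple

Sample = Tuple[float, float]  # (unix_seconds, rate_per_second)


def sustained_above(
    samples: Sequence[Sample],
    threshold: float,
    window_s: float,
    now: float,
    scrape_interval_s: float = 15.0,  # assumed sampling cadence
) -> bool:
    """True when samples cover the whole window and every one exceeds threshold."""
    start = now - window_s
    in_window = [(t, r) for t, r in samples if start <= t <= now]
    if not in_window:
        return False
    # Require history reaching back to the start of the window, so that a
    # short burst of recent samples does not count as "sustained".
    oldest = min(t for t, _ in in_window)
    if oldest > start + scrape_interval_s:
        return False
    return all(r > threshold for _, r in in_window)
```

Under this reading, the jwt_domain_mismatch alert corresponds to threshold=0.1 with window_s=300, and the "non-zero rate sustained over 1 minute" alerts correspond to threshold=0.0 with window_s=60.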

Rollout Notes

Migrations run before any default brand backfill. Backfill runs before any service starts requiring X-Brand-Id.
gateway enforcement of JWT/domain brand mismatch is enabled only after every downstream service has been deployed with brand-aware filtering.
A second brand is enabled only after the runbook release gate passes.
The hard-coded project integer is removed only after brand_config has the equivalent values for the default brand.

Acceptance Criteria

A migration on a fresh database creates brand, brand_config, agent_brand, and adds brand_id to every brand-scoped table with brand-scoped uniqueness.
Backfill on an existing database moves every existing row into a default brand without data loss; reversibility is verified in a non-production environment.
Two brands can be configured (e.g. default and brand2) with distinct domains. A registration on brand2's domain creates a player row in brand2. The same account string may also register on the default domain and create an independent player row in default.
A request on one brand's domain that presents a JWT issued for the other brand is rejected at gateway.
A wallet command targeting a player whose brand differs from the request brand is rejected before any money mutation.
Wallet topology and policy can be configured independently per brand; active topology resolution returns the per-brand active document.
A bet authorized through game_service for a player in brand2 launches through brand2's Aggregator credential; the later provider callback resolves in Aggregator and is accepted by RGB only when its signed external player is verified as belonging to brand2.
Per-brand rolling completion ratios, cashback/rebate/lossback rates, and payment channel selection resolve from brand_config.
All seven listed admin_service legacy route files and the supporting staff identity logic are deleted; previously served paths return 404; the test suite reflects the removal.
Structured logs and Prometheus metrics carry brand labels as specified.
ADR-005's player-wallet writer rule is unchanged after this change.
Player registration rejects any account longer than MAX_PLAYER_ACCOUNT_LEN = 32 chars with the documented validation error.
MULTI_BRAND_ENFORCEMENT=enforce is exercised in at least one shared environment after the soak window; gateway, wallet_service, agent_service, and the brand-aware internal handlers all report enforce mode in /health.
An agent_brand row of the form (agent_id, default_brand_id, status='enabled') exists for every pre-migration agent row after 0023_seed_default_brand.py runs. The auto-seed trigger (0023b_install_agent_brand_autoseed_trigger.py) covers any new agent rows created during the Phase 2-12 window; the trigger is removed by 0028_remove_agent_brand_autoseed_trigger.py once admin_service writes agent_brand explicitly at agent creation in Phase 12.
Settlement schedulers iterate brands in brand_id ascending order, sequentially; observable in worker logs.
External-facing gateway and agent_service strip any inbound X-Brand-Id header before injecting the resolved value; verified by a synthetic header-spoofing test.
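
The header-stripping criterion above can be sketched as a pure function over the inbound header map. This is a minimal sketch under stated assumptions: the function name with_resolved_brand and the plain-dict header representation are hypothetical, and the real gateway may use a framework-specific header type.

```python
from typing import Dict, Mapping

BRAND_HEADER = "X-Brand-Id"


def with_resolved_brand(
    inbound: Mapping[str, str], resolved_brand_id: int
) -> Dict[str, str]:
    """Drop any client-supplied X-Brand-Id and set the edge-resolved value.

    HTTP header names are case-insensitive, so the comparison is done on
    lowercased names to catch variants such as "x-brand-id".
    """
    forwarded = {
        name: value
        for name, value in inbound.items()
        if name.lower() != BRAND_HEADER.lower()
    }
    forwarded[BRAND_HEADER] = str(resolved_brand_id)
    return forwarded
```

Stripping before injecting, rather than overwriting a single key, matters because a spoofed header in a different letter case would otherwise survive next to the resolved one.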

Required Tests

migration tests for brand, brand_config, agent_brand, and brand_id column additions
backfill tests proving default brand assignment for every existing row family
gateway brand resolution tests for valid domain, unknown domain, and domain bound to another agent
gateway JWT brand validation tests covering: cross-brand JWT replay (agent A's token on a domain bound to brand B), cross-agent JWT replay within the same brand (agent A's token on a domain bound to agent B in same brand), JWT missing brand_id claim, body/query attempts to override brand
gateway enforcement-mode tests: same scenarios under observe (logged + counted, not rejected) and under enforce (rejected with documented envelope)
player_service registration tests proving cross-brand account independence
agent_service allow-list rejection tests
wallet_service cross-brand command rejection tests for every command family (deposit approve, withdrawal create, bet authorize, settle, rollback, transfer, points credit, coupon grant, reversal, adjustment)
wallet_service per-brand topology and policy resolution tests
rolling_service brand-scoped consumption and progress tests
promotion_service per-brand coupon, rebate, cashback, and lossback resolution tests
game_service Aggregator credential-isolation tests, relay HMAC/context tests and protocol/idempotency tests for every supported provider family
recon_service brand-scoped match and approval tests
admin_service brand catalog, brand_config, and agent_brand CRUD tests
admin_service staff removal tests confirming all seven deleted route files no longer mount any router and that all previously served paths return 404 (legacy admin auth, legacy agents v2, legacy agent withdrawals v2, legacy meta v2, legacy recon compatibility, legacy web content)
observability tests confirming log fields and metric labels
local Docker end-to-end flows for two simultaneous brands covering: registration, login, deposit, bet, settlement, rolling, withdrawal, promotion settlement, coupon issue and use, cross-brand isolation
MAX_PLAYER_ACCOUNT_LEN validation tests at registration (boundary, oversize, exact-32-char accounts)
backfill audit test confirming no existing player.account exceeds 32 chars before the cap is enforced
migration test asserting an agent_brand row exists for every pre-migration agent after 0023 runs, AND that the 0023b autoseed trigger creates a row for any agent row inserted between 0023b (install) and 0028 (remove)
migration test asserting agent_setting, player_wallet_limit, daebak_email, mg_account, wc_account, digitain_token, i18n PK rewrites land cleanly with rows seeded pre-migration under both default and a second brand
gateway synthetic test: external request with inbound X-Brand-Id: 99 is stripped; downstream sees only the resolved brand
Redis cache key prefix tests confirming sms_captcha:{brand_code}:{phone} and login-throttle keys are brand-prefixed; same phone in two brands does not collide
per-provider outbound-account length audit results recorded as a test fixture in game-service tests
JWT signing tests: alg=none rejected, alg=HS256 rejected when RS256 is the configured algorithm, JWT with unknown kid rejected, JWT signed with old kid accepted during the documented rotation overlap, JWT signed with old kid rejected after overlap expires
admin_service writes without a JWT operator account are rejected; the audit row for every write contains the operator identity
brand_code prefix-disjointness validator: brand-create with prefix collision against existing brand_code rejected; brand-create whose prefix matches an existing namespaced player.account rejected
internal-service-token-spoof test: caller using INTERNAL_SERVICE_TOKEN_GATEWAY cannot impersonate INTERNAL_SERVICE_TOKEN_AGENT against any consumer
brand-signature verification test: brand-scoped wallet write without X-Brand-Signature is rejected in enforce, logged in observe; signature with wrong BRAND_SIGNING_KEY is rejected in enforce
recovery flow: same email registered in two brands; recovery on brand A's domain returns a token bound to brand A only; recovery response on a non-existent account at brand A produces the same response shape as a known account (no existence oracle)
observe-mode read-path test: GET endpoint with brand-A JWT against brand-B domain returns either auth-error or empty data, never cross-brand data
event schema versioning: pre-Phase-6 event in stream is rejected by consumer in enforce mode and accepted as default brand in observe; event_legacy_schema_total increments accordingly
security_downgrade_total alert fires on synthetic boot in observe while >1 brand is enabled
agent_brand cache invalidation: admin update to agent_brand publishes AGENT_BRAND_CHANGED; agent_service processes invalidate within 1 second; processes that miss the message refresh on TTL expiry within 60 seconds
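
The brand-signature verification test above exercises a check that could look roughly like the following. This is an illustrative sketch only: the canonical message layout, the MAX_SKEW_S tolerance, the function names, and the in-memory seen set (standing in for the Redis SETNX replay cache) are all assumptions not specified in this document. The returned reason strings mirror the wallet_brand_signature_failed_total reasons listed under Observability.

```python
import hashlib
import hmac
import time
from typing import Optional, Set

MAX_SKEW_S = 300  # assumed clock-skew tolerance; not stated in this document


def sign(key: bytes, brand_id: int, timestamp: int, body: bytes) -> str:
    # Assumed canonical form: "<brand_id>.<timestamp>." followed by the raw body.
    message = f"{brand_id}.{timestamp}.".encode() + body
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def check_signature(
    key: bytes,
    brand_id: int,
    timestamp: int,
    body: bytes,
    signature: Optional[str],
    seen: Set[str],
    now: Optional[int] = None,
) -> Optional[str]:
    """Return None when the signature is accepted, else a failure reason."""
    if not key:
        return "misconfigured"  # fail closed when the signing key is empty
    if signature is None:
        return "missing_headers"
    now = int(time.time()) if now is None else now
    if abs(now - timestamp) > MAX_SKEW_S:
        return "invalid_timestamp"
    expected = sign(key, brand_id, timestamp, body)
    # Constant-time comparison on bytes avoids timing leaks.
    if not hmac.compare_digest(expected.encode(), signature.encode()):
        return "signature_mismatch"
    if signature in seen:
        return "signature_replay"
    seen.add(signature)
    return None


def should_reject(reason: Optional[str], mode: str) -> bool:
    """enforce rejects on any failure; observe only logs and counts it."""
    return reason is not None and mode == "enforce"
```

Separating the reason from the reject decision keeps the observe and enforce paths on the same verification code, so the soak window measures exactly what enforcement will later block.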
