Multi-Brand Isolation Implementation Plan

Status

Done — Phases 1–15 delivered on main (see ADR-009 "Implementation" section for the per-phase commit list).

2026-07-17 note: Native provider callback, account-namespacing, and callback-DLQ items below are retained as implementation history only. The live game boundary is signed provider-protocol relay-backed Seamless; use docs/services/game-service.md and the current Phase 16 checklist for operations.

In-repo Phase 16 closure (2026-05-06). The servers_v2/docker-compose.yml defaults that controlled the hard-flip are now production-grade out of the box:

MULTI_BRAND_ENFORCEMENT defaults to enforce (was observe).
WALLET_BRAND_SIGNATURE_REQUIRE defaults to on (was off).
PER_CALLER_TOKEN_REQUIRED defaults to on (was off).
Game integration is structurally Aggregator-only; native callback routes and the former callback-verification bypass switch are removed.

Companion fail-closed code paths are merged:

Wallet boot guards refuse to start when the enforce-mode combination would silently degrade (assert_brand_signing_key_configured).
Consumer fail-closed for missing envelope brand_id regardless of mode (rolling, promotion).
Producer-side schema-required brand_id on every DomainEvent envelope.
Phase 4E session dual-read fallback removed (gateway and player_service).
Wallet write entry points fail-loud on missing X-Brand-Id — every bucket_wallet.py v2 route, every brand-scoped wallet.py legacy route, every topology.py topology/policy CRUD route, and the queries.py transfer_withdrawal legacy alias all envelope-reject upfront with _missing_brand_envelope (or the topology.py equivalent) instead of letting request_brand_id=None propagate down. The brand_signature_verifier itself fail-closes in enforce mode when brand_id is None as defence-in-depth, so a route-layer regression cannot bypass signature verification.
Settlement / rollback events derive brand_id from the authorization row when the caller does not forward it. The wallet_bet_authorization SELECT now returns brand_id and _write_topology_bet_settled_event / _write_topology_bet_rollback_event fall back to that value, so the DomainEvent envelope can never publish without a numeric brand_id. Pydantic already enforces this at the schema level; the fallback ensures that a single route failing to forward brand_id cannot crash settlement.
Topology / policy store fallback retained as defence-in-depth. The wallet_topology_store._resolve_brand_id retains its default-brand resolution path for the case a future internal helper accidentally reaches the store with brand_id=None; the wallet_topology_default_brand_fallback_total counter exists to verify the fallback never fires in production soak. With every public route now envelope-rejecting upfront, that counter must stay at zero.
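The two fail-closed layers described above (route-level envelope rejection, then a verifier that refuses a missing brand in enforce mode) can be sketched roughly as follows. This is a minimal illustration, not the wallet_service code: the Rejection type, the 400 status, the "missing_brand_envelope" code string, the positive-integer rule, and both function names are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Mapping, Optional, Tuple

BRAND_HEADER = "X-Brand-Id"


@dataclass(frozen=True)
class Rejection:
    """Hypothetical error envelope returned by a route guard."""
    status: int
    code: str


def parse_brand_id(headers: Mapping[str, str]) -> Optional[int]:
    """Return a positive integer brand id, or None if absent or malformed.

    Assumes the caller has already normalised header names.
    """
    raw = headers.get(BRAND_HEADER)
    if raw is None:
        return None
    try:
        value = int(raw)
    except ValueError:
        return None
    return value if value > 0 else None


def require_brand(headers: Mapping[str, str]) -> Tuple[Optional[int], Optional[Rejection]]:
    """Layer 1: reject at the route before brand_id=None can propagate."""
    brand_id = parse_brand_id(headers)
    if brand_id is None:
        return None, Rejection(status=400, code="missing_brand_envelope")
    return brand_id, None


def signature_gate_allows(brand_id: Optional[int], mode: str) -> bool:
    """Layer 2: in enforce mode a missing brand never reaches verification."""
    if brand_id is None:
        return mode != "enforce"
    return True
```

Keeping the second check even though every public route already rejects upfront mirrors the defence-in-depth intent: a route-layer regression would still be stopped at the verifier.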

Operator-side Phase 16 (production rollout). The remaining checkboxes in the "Phase 16" section below are deployment-time soak/sequencing activities, not code gaps. They are tracked by docs/runbooks/multi-brand/phase-16-hard-flip-checklist.md and run when the platform is promoted from staging to production. The in-repo posture is already final before that deployment begins.

Companion Plans

Two follow-up plans landed alongside this one and are also marked done:

2026-05-05-event-brand-scoping.md — schema-required brand_id on every domain event; consumers fail-closed regardless of mode.
2026-05-05-phase-4e-dual-read-sunset.md — removal of the legacy user-session-{account} Redis key plus the gateway / player_service dual-read fallback.

Date

2026-04-28

Owners

Platform Backend
Player Domain
Wallet Domain
Agent Domain

Affected Services

gateway
player_service
wallet_service
rolling_service
promotion_service
game_service
agent_service
admin_service
recon_service

Related ADRs

docs/adr/ADR-009-multi-brand-domain-routed-isolation.md
docs/adr/ADR-005-wallet-topology-bucket-ledger-model.md

Related Spec

docs/specs/multi-brand/2026-04-27-multi-brand-isolation-spec.md

Related Runbooks

docs/runbooks/multi-brand/multi-brand-isolation-rollout.md

Goal

Implement domain-routed, single-database, brand-scoped multi-brand isolation across servers_v2 per ADR-009 and the related spec, with a backfilled default brand so the existing single-brand environment continues to operate through every step of the migration.

Success Signals

Two brands can run simultaneously on the same servers_v2 runtime with fully independent player identity, wallet state, rolling, settlement, and configuration.
gateway resolves brand from the request domain; JWT and request brand must match.
wallet_service rejects every cross-brand command with a stable error and a counter increment.
Game provider integrations work for two brands using namespaced outbound accounts and reverse-parsed inbound callbacks.
Every existing test suite continues to pass; new brand-aware suites cover every command, query, and event family.
Local Docker end-to-end runs cover two brands at the same time without cross-brand bleed.
Removed staff routes return 404; no other route depends on staff identity.
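As an illustration of the second signal (brand resolved from the request domain, with the JWT brand required to match), a gateway-side check might look like the sketch below. The domain map, the decision strings, and the function names are hypothetical; only the observe / enforce split and the brand_id claim come from this plan.

```python
from typing import Mapping, Optional

# Hypothetical domain-to-brand map; a real deployment would load this from
# the brand catalog rather than hard-coding it.
DOMAIN_TO_BRAND = {"brand-a.example": 1, "brand-b.example": 2}


def resolve_brand(host: str) -> Optional[int]:
    """Map a Host header value to a brand id (port stripped, case-folded).

    IPv6 literals are not handled in this sketch.
    """
    hostname = host.split(":", 1)[0].strip().lower().rstrip(".")
    return DOMAIN_TO_BRAND.get(hostname)


def check_request(host: str, jwt_claims: Mapping[str, object], mode: str) -> str:
    """Compare the domain-resolved brand with the JWT brand_id claim."""
    request_brand = resolve_brand(host)
    if request_brand is None:
        return "reject_unknown_domain"
    if jwt_claims.get("brand_id") == request_brand:
        return "allow"
    # Mismatch, or a legacy token that carries no brand_id claim at all.
    if mode == "enforce":
        return "reject_brand_mismatch"
    return "allow_logged"  # observe mode: log and count, do not reject
```

For example, check_request("brand-b.example", {"brand_id": 1}, "enforce") returns "reject_brand_mismatch", while the same call in observe mode returns "allow_logged".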

Preconditions

Spec and ADR-009 are accepted.
Wallet topology / policy work in ADR-005 is in its current state and not in mid-flight refactor.
Local Docker stack is green on main.
Product confirms the brand catalog seed: at least default for backfill, plus the second brand the rollout will validate.
Product confirms brand_code for each seeded brand and confirms that the chosen brand_code characters are accepted by every game provider's account format.

Implementation Target Map

Primary code areas to add or rewrite:

servers_v2/shared/contracts/src/rgb_contracts/ -- brand DTO, X-Brand-Id header constant, brand context helpers, brand-aware event payload schemas
servers_v2/shared/rgb_db/src/rgb_db/models/ -- brand and brand_config models; brand_id columns on every brand-scoped model; brand-aware unique constraints
servers_v2/admin_service/app/api/routes/brand.py (new) -- brand catalog CRUD
servers_v2/admin_service/app/api/routes/brand_config.py (new) -- per-brand configuration CRUD
servers_v2/admin_service/app/api/routes/agent_brand.py (new) -- agent-to-brand allow-list CRUD
servers_v2/admin_service/app/api/routes/legacy_admin_v2.py -- remove
servers_v2/admin_service/app/api/routes/legacy_auth.py -- remove
servers_v2/admin_service/app/tasks/tag.py -- replace project switch with brand_config resolution
servers_v2/gateway/app/api/routes/player_routes.py -- brand resolution and X-Brand-Id propagation
servers_v2/gateway/app/services/proxy.py -- forward X-Brand-Id
servers_v2/gateway/app/middleware/ -- JWT-vs-domain brand check
servers_v2/player_service/app/services/domain_cache.py -- brand-bearing cache values for domain:agent:* and domain:level:*
servers_v2/player_service/app/api/routes/common.py, servers_v2/player_service/app/api/routes/player.py -- brand-aware registration and lookups
servers_v2/player_service/app/api/helpers.py -- brand resolution helpers; preserve error code 54 semantics
servers_v2/wallet_service/app/services/wallet_topology_store.py -- per-brand topology and policy resolution
servers_v2/wallet_service/app/services/wallet_bucket_commands.py -- cross-brand command rejection
servers_v2/wallet_service/app/api/routes/topology.py, servers_v2/wallet_service/app/api/routes/wallet.py, servers_v2/wallet_service/app/api/routes/bucket_wallet.py, servers_v2/wallet_service/app/api/routes/queries.py -- X-Brand-Id enforcement
servers_v2/rolling_service/app/services/rolling_ops.py, servers_v2/rolling_service/app/services/event_consumer.py -- brand-scoped progress and consumption
servers_v2/promotion_service/app/api/routes/coupon.py, servers_v2/promotion_service/app/tasks/settlement.py -- brand-scoped coupon, settlement, saga
servers_v2/game_service/ -- outbound account namespacing helper; inbound callback reverse-parse for every provider under /HO/*, /mg/*, /wc/*, /bti/*, /splus/*, /bt1/*, /digitain/*, /integration/*
servers_v2/agent_service/app/api/ -- agent_brand enforcement at the agent edge; per-brand agent settings and domains
servers_v2/recon_service/ -- brand_id on shooter_* rows; brand resolved from matched deposit
alembic migrations under the centralized shared migrations directory servers_v2/shared/rgb_db/migrations/versions/ (latest existing is 0021_wallet_bet_settlement_metadata.py). All multi-brand migrations are sequenced from 0022_* upward in this single directory; there are no per-service migration directories in this repo.
backfill verification script under servers_v2/tools/multi_brand_backfill/ (new)

Primary tests to add or rewrite:

contract tests for brand DTO and X-Brand-Id propagation
migration and backfill tests
gateway brand resolution and JWT mismatch tests
player_service cross-brand registration and recovery tests
agent_service allow-list enforcement tests
wallet_service cross-brand command rejection tests for every command
wallet_service per-brand topology / policy resolution tests
rolling_service brand-scoped consumption tests
promotion_service per-brand resolution tests
game_service outbound namespacing and inbound reverse-parse tests per provider
recon_service brand-scoped match and approval tests
admin_service brand catalog, brand_config, agent_brand CRUD tests
admin_service staff removal tests
observability tests for log fields and metric labels
local Docker two-brand end-to-end flows

Tasks

Phase 1 -- Shared Contracts And Models

Delivered by: 53639f23, 1eca859c, bc258a05

Phase 2 -- Migrations And Default Brand Backfill

Delivered by: 304deec6, d01f85a9, c5d39d52, d7d0e134, 69c5d5a0

All migrations live in servers_v2/shared/rgb_db/migrations/versions/ (centralized). Sequence from 0022_* upward. No per-service migration directories exist in this repo.

Production Postgres assumption: PostgreSQL 14+. Every migration in this phase runs with lock_timeout = '5s' and statement_timeout = '30min' set per session; if lock_timeout fires, the migration aborts cleanly instead of queueing connections behind the blocked lock to the point of an outage. The migrations are designed to favor metadata-only or non-blocking changes (ALTER TABLE ... ADD COLUMN nullable, CREATE INDEX CONCURRENTLY) over full table rewrites. Per-table runtime estimates against a prod-sized snapshot are recorded in the Local Docker Validation step of docs/runbooks/multi-brand/multi-brand-isolation-rollout.md before any environment beyond local is migrated.
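A rough sketch of the statement sequence such a migration would issue is shown below. It only builds SQL strings so it can be read without a database; the table name, index name, BIGINT column type, and the helper itself are assumptions, not the contents of any 0022_* migration. Note that PostgreSQL does not allow CREATE INDEX CONCURRENTLY inside a transaction block, so a real Alembic migration would need to run that statement outside the migration transaction.

```python
import re
from typing import List

_IDENTIFIER = re.compile(r"^[a-z_][a-z0-9_]*$")


def brand_column_statements(table: str) -> List[str]:
    """Return the per-session settings plus low-lock DDL for one table."""
    if not _IDENTIFIER.match(table):
        raise ValueError(f"unexpected table name: {table!r}")
    return [
        # Abort quickly instead of queueing behind a long-held lock.
        "SET lock_timeout = '5s'",
        "SET statement_timeout = '30min'",
        # Nullable column with no default: a catalog-only change.
        f'ALTER TABLE "{table}" ADD COLUMN brand_id BIGINT NULL',
        # Builds the index without blocking writes; must run outside
        # a transaction block.
        f'CREATE INDEX CONCURRENTLY IF NOT EXISTS "ix_{table}_brand_id" '
        f'ON "{table}" (brand_id)',
    ]
```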

Phase 3 -- Domain-To-Brand Map And Edge Resolution (Soft-Fail)

Delivered by: 8011f2d6, 030383d5, 586b6588

Phase 4 -- Player Service Brand Scoping

Delivered by: 288a5f18, 5730f74e, a410cbce

Phase 5 -- Issue Brand-Aware JWTs

Delivered by: 910ab341, 5c77fc81

Add brand_id to the JWT claim schema used by player_service and agent_service. Existing tokens without brand_id remain valid until natural expiry; the MULTI_BRAND_ENFORCEMENT=observe flag from Phase 3 guarantees this.
Update login, refresh, and logout flows in player_service and agent_service to include and preserve brand_id.
Refresh-token rotation only mints brand-aware access tokens. A refresh from a brand-unaware refresh token (issued before Phase 5) forces re-login by returning the documented re-auth-required error envelope rather than minting a token that lacks brand_id. Confirm refresh-token TTL: if longer than 2 * JWT_EXPIRE_MIN, the soak window in Phase 16 must be extended to 2 * REFRESH_TOKEN_TTL so every brand-unaware refresh token has expired before hard-flip.
Switch JWT signing from HS-family to RS256. Generate JWT_PRIVATE_KEY and JWT_PUBLIC_KEY per environment; private key is provisioned only into player_service and agent_service; public key is provisioned into gateway and any other verifier. JWT header carries kid (current key ID); verifier rejects unknown kid and rejects any alg other than RS256 (no alg=none, no HS-family fallback).
Update internal handlers to read X-Brand-Id from the request, not from JWT, so internal callers carry the brand in headers.
Add tests for: login issues a token with the correct brand_id, refresh preserves it, observe-mode logs JWT/domain mismatches without rejecting, cross-brand JWT replay is logged in observe mode (and will be rejected after Phase 16).
Run player_service and agent_service JWT tests.
Commit Phase 5.
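The verifier rule above (accept only RS256, reject unknown kid) can be illustrated with a header pre-check written against the standard library alone. This is a sketch under stated assumptions: the function names and the known_kids set are hypothetical, and the check only inspects the unverified header. It does not replace signature verification against JWT_PUBLIC_KEY, which a real verifier must still perform.

```python
import base64
import json
from typing import Set


def _b64url_decode(segment: str) -> bytes:
    """Decode an unpadded base64url segment as used in compact JWTs."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def header_is_acceptable(token: str, known_kids: Set[str]) -> bool:
    """Return True only for a three-part token with alg=RS256 and a known kid."""
    parts = token.split(".")
    if len(parts) != 3:
        return False
    try:
        header = json.loads(_b64url_decode(parts[0]))
    except ValueError:  # covers bad base64, bad UTF-8, and bad JSON
        return False
    if not isinstance(header, dict):
        return False
    kid = header.get("kid")
    if not isinstance(kid, str) or kid not in known_kids:
        return False
    # Exact match: rejects "none", the HS family, and anything else.
    return header.get("alg") == "RS256"
```

Pinning the algorithm on the verifier side, rather than trusting the token's own alg value to select one, is what closes the alg=none and HS-family downgrade paths named in the task.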

Phase 6 -- Wallet Service Brand Scoping

Delivered by: 63db9df4, c218e91d

Phase 7 -- Rolling Service Brand Scoping

Delivered by: e16d0974

Update rolling event consumer so it reads brand_id from the DomainEvent envelope and persists it. If the envelope brand_id does not match the target rolling row's brand, apply MULTI_BRAND_ENFORCEMENT semantics: observe logs + counts and proceeds using the envelope brand; enforce rejects the event and surfaces the mismatch to supervision.
Update rolling progress and completion routes to require X-Brand-Id and reject mismatched player rows.
Resolve per-brand rolling completion ratios from brand_config (with documented global default fallback).
Update rolling outbox payloads to include brand_id.
At Phase 7 producer-deploy time, rolling_service emits one BRAND_AWARE_PUBLISHING_BEGINS sentinel event on the rolling stream.
Add tests for: brand-scoped consumption, brand-scoped completion, per-brand ratio resolution.
Run rolling_service tests.
Commit Phase 7.
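The consumer-side decision described in this phase, combined with the mode-independent fail-closed rule for a missing envelope brand_id, might be expressed as a small pure function. The Decision type, the reason strings, and the in-process counter are illustrative assumptions; the real service would use its own metrics and supervision hooks.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional

# Stand-in for a metrics counter keyed by enforcement mode.
mismatch_counter: Counter = Counter()


@dataclass(frozen=True)
class Decision:
    accepted: bool
    brand_id: Optional[int]
    reason: str


def decide(envelope_brand_id: Optional[int],
           row_brand_id: Optional[int],
           mode: str) -> Decision:
    """Apply MULTI_BRAND_ENFORCEMENT semantics to one rolling event."""
    if envelope_brand_id is None:
        # Missing envelope brand is rejected regardless of mode.
        return Decision(False, None, "missing_envelope_brand")
    if row_brand_id is None or row_brand_id == envelope_brand_id:
        return Decision(True, envelope_brand_id, "ok")
    mismatch_counter[mode] += 1
    if mode == "enforce":
        return Decision(False, None, "brand_mismatch")
    # observe: count the mismatch, then proceed with the envelope brand.
    return Decision(True, envelope_brand_id, "brand_mismatch_observed")
```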

Phase 8 -- Promotion Service Brand Scoping

Delivered by: c56438b5

Update coupon definitions, event configs, and rebate / cashback / lossback rates to be per-brand.
Update settlement schedulers to iterate in brand_id ascending order, sequentially (one brand at a time). Each brand iteration is observable in worker logs (structured field brand_id + brand start/end timestamps).
Update coupon saga and points credit calls to wallet_service to forward X-Brand-Id. Saga consumers handling envelope brand_id mismatches apply MULTI_BRAND_ENFORCEMENT semantics (observe: log + count + proceed using envelope brand; enforce: reject and surface to supervision).
Update promotion outbox payloads to include brand_id.
At Phase 8 producer-deploy time, IF a promotion-owned outbound stream exists at that point (per event-catalog.md it is currently a Known Gap), emit one BRAND_AWARE_PUBLISHING_BEGINS sentinel on it. If no promotion-owned stream exists, skip this task and update event-catalog.md Known Gaps to note that promotion remains stream-less in the multi-brand rollout (so the Phase 16 release gate's event_legacy_schema_total zero-check does not require a non-existent stream).
Add tests for: per-brand coupon resolution, per-brand rebate / cashback / lossback resolution, brand-scoped settlement, brand-aware saga.
Run promotion_service tests.
Commit Phase 8.
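The sequential, ascending-brand iteration for settlement schedulers could be sketched as below. The event names, the log field names other than brand_id, and the run_settlement signature are assumptions for illustration; settle_one stands in for whatever per-brand settlement routine the scheduler actually calls.

```python
import json
import time
from typing import Callable, Iterable, List


def run_settlement(brand_ids: Iterable[int],
                   settle_one: Callable[[int], None],
                   log: Callable[[str], None] = print) -> List[int]:
    """Settle one brand at a time in ascending brand_id order."""
    processed: List[int] = []
    for brand_id in sorted(set(brand_ids)):
        log(json.dumps({"event": "brand_settlement_start",
                        "brand_id": brand_id, "ts": time.time()}))
        status = "failed"
        try:
            settle_one(brand_id)
            status = "ok"
            processed.append(brand_id)
        finally:
            # The end record is emitted even if settle_one raises.
            log(json.dumps({"event": "brand_settlement_end",
                            "brand_id": brand_id, "status": status,
                            "ts": time.time()}))
    return processed
```

In this sketch an exception from settle_one propagates and stops the remaining brands; whether a real scheduler should instead continue with the next brand is a design choice this plan does not specify.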

Phase 9 -- Game Service Brand Scoping

Delivered by: 8da5848a

Phase 10 -- Recon Service Brand Scoping

Delivered by: 2a6e9728

Add brand_id to the shooter_* table family.
Update recon match logic to resolve brand_id from the matched deposit record.
Confirm approvals continue to flow through wallet_service and exercise the wallet_service cross-brand rejection on a synthetic brand-mismatch case.
Add tests for: brand-scoped recon match, brand-mismatch rejection on approval.
Run recon_service tests.
Commit Phase 10.

Phase 11 -- Agent Service: Global Agent + Brand Allow List

Delivered by: 06e853f2

Phase 12 -- Admin Service: Brand Catalog, Configuration, Allow List, And Staff Removal

Delivered by: dd8a0007

Phase 13 -- Observability Wiring

Delivered by: e2d25f0f

Phase 14 -- Local Docker Two-Brand Validation

Delivered by: b73cc22c

Phase 15 -- Stable Doc Refresh And Spec Promotion

Delivered by: (this commit)

Re-verify the following stable docs reflect the implemented behavior: docs/architecture/data-ownership.md, docs/architecture/domain-ownership.md, docs/architecture/system-overview.md, docs/architecture/http-entrypoints.md, docs/architecture/event-catalog.md, docs/architecture/service-catalog.md, docs/architecture/deployment-topology.md, docs/architecture/migration-readiness.md, and every docs/services/*.md.
Update Last Verified Commit markers on each touched stable doc.
Promote docs/specs/multi-brand/2026-04-27-multi-brand-isolation-spec.md status to Approved.
Promote docs/runbooks/multi-brand/multi-brand-isolation-rollout.md status to Ready only after all preceding phases are green.
Commit Phase 15.

Phase 16 -- Flip JWT/Domain Brand Enforcement To Hard-Reject

Risks

A missing brand_id filter on a query is a data-leak risk between brands. Repository helpers and SQLAlchemy mixins must make brand filtering hard to forget; reviews must focus on raw SQL paths.
A wallet command path that bypasses brand validation is a cross-brand money mutation. Wallet command tests must cover every command family.
JWT/domain mismatch enforcement enabled before every downstream service is deployed with brand awareness will lock players out. Mitigation: the MULTI_BRAND_ENFORCEMENT flag introduced in Phase 3 defaulted to observe (log-only) during the rollout and is flipped to enforce only after Phases 4 through 14 are deployed and the soak window shows zero legitimate rejections. The repository compose default is now enforce after the 2026-05-06 in-repo closure.
Game provider account namespacing changes the outbound account string. Provider-side reconciliation must be reviewed before cutover.
Backfill on a large existing database can be slow; migrations must be designed to default new columns at the schema level rather than rewriting every row in a single transaction.
Removing staff routes also removes any in-flight back-office identity capability; consumers of those routes must be confirmed inactive before deletion.
Per-brand topology and policy resolution adds an extra brand dimension to ADR-005's activation safety; activation must remain blocked when unresolved balances or active records would become unreachable, now scoped per brand.

Done Definition

The plan is done when every acceptance criterion in docs/specs/multi-brand/2026-04-27-multi-brand-isolation-spec.md (the ## Acceptance Criteria section) is verifiably satisfied AND every release-gate item in docs/runbooks/multi-brand/multi-brand-isolation-rollout.md (the ## Release Gate section) is satisfied for the target environment, plus the following plan-specific gates:

Spec status is Approved.
ADR-009 is Accepted.
Phases 1 through 16 are complete.
Local Docker two-brand validation passes end to end (see runbook).
All service test suites pass.
The migration verification script reports zero NULL brand_id rows across every brand-scoped table.
MULTI_BRAND_ENFORCEMENT=enforce is active in every environment that has cleared its soak window (per Phase 16).
No brand-scoped query in any service is missing a brand_id filter (verified by code review of the merged PRs).
The seven listed admin_service legacy route files are deleted and their previously served paths return 404.
ADR-005's player-wallet writer rule is unchanged after this work.
Stable docs (architecture and services) reflect the implemented behavior and have updated Last Verified Commit markers.
Runbook is promoted to Ready and is followed for staging validation before any production enablement of a second brand.
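The "zero NULL brand_id rows" gate lends itself to a simple count per brand-scoped table. The sketch below uses an in-memory SQLite connection purely as a stand-in so it runs anywhere; the actual script under servers_v2/tools/multi_brand_backfill/ targets PostgreSQL, and the function name and table list here are hypothetical.

```python
import sqlite3
from typing import Dict, Iterable


def count_null_brand_rows(conn: sqlite3.Connection,
                          tables: Iterable[str]) -> Dict[str, int]:
    """Return the number of rows with brand_id IS NULL for each table."""
    counts: Dict[str, int] = {}
    for table in tables:
        # Table names cannot be bound as parameters, so validate them.
        if not table.isidentifier():
            raise ValueError(f"unexpected table name: {table!r}")
        row = conn.execute(
            f'SELECT COUNT(*) FROM "{table}" WHERE brand_id IS NULL'
        ).fetchone()
        counts[table] = row[0]
    return counts


def backfill_is_clean(counts: Dict[str, int]) -> bool:
    """The gate passes only when every table reports zero NULL rows."""
    return all(n == 0 for n in counts.values())
```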
