Multi-Brand Isolation Implementation Plan

Status

Done — Phases 1–15 delivered on main (see ADR-009 "Implementation" section for the per-phase commit list).

In-repo Phase 16 closure (2026-05-06). The servers_v2/docker-compose.yml defaults that controlled the hard-flip are now production-grade out of the box:

  • MULTI_BRAND_ENFORCEMENT defaults to enforce (was observe).
  • WALLET_BRAND_SIGNATURE_REQUIRE defaults to on (was off).
  • PER_CALLER_TOKEN_REQUIRED defaults to on (was off).
  • VERIFY_CALLBACKS already defaulted to true.

Companion fail-closed code paths are merged:

  • Boot guards in wallet_service and game_service refuse to start when the enforce-mode combination would silently degrade (assert_brand_signing_key_configured, assert_verify_callbacks_safe_for_runtime).
  • Consumers (rolling, promotion) fail closed on a missing envelope brand_id regardless of mode.
  • Producer-side schema-required brand_id on every DomainEvent envelope.
  • Phase 4E session dual-read fallback removed (gateway and player_service).
  • Wallet write entry points fail-loud on missing X-Brand-Id — every bucket_wallet.py v2 route, every brand-scoped wallet.py legacy route, every topology.py topology/policy CRUD route, and the queries.py transfer_withdrawal legacy alias all envelope-reject upfront with _missing_brand_envelope (or the topology.py equivalent) instead of letting request_brand_id=None propagate down. The brand_signature_verifier itself fail-closes in enforce mode when brand_id is None as defence-in-depth, so a route-layer regression cannot bypass signature verification.
  • Settlement / rollback events derive brand_id from the authorization row when the caller does not forward it. The wallet_bet_authorization SELECT now returns brand_id, and _write_topology_bet_settled_event / _write_topology_bet_rollback_event fall back to that value, so the DomainEvent envelope can never publish without a numeric brand_id (Pydantic enforces the field, but the fallback prevents a single route that fails to forward brand_id from crashing settlement).
  • Topology / policy store fallback retained as defence-in-depth. The wallet_topology_store._resolve_brand_id retains its default-brand resolution path for the case where a future internal helper accidentally reaches the store with brand_id=None; the wallet_topology_default_brand_fallback_total counter exists to verify that the fallback never fires during production soak. With every public route now envelope-rejecting upfront, that counter must stay at zero.
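
A boot guard of the kind listed above could look roughly like the following. This is a minimal sketch, not the merged implementation: only the function name assert_brand_signing_key_configured and the environment variable names come from this plan, while the signature, the BootGuardError class, and the in-code defaults are assumptions made for illustration.

```python
import os


class BootGuardError(RuntimeError):
    """Raised at startup when the configuration would silently degrade."""


def assert_brand_signing_key_configured(env=None):
    # Hypothetical signature and defaults; only the guard's name and the
    # environment variable names are taken from the plan.
    env = os.environ if env is None else env
    mode = env.get("MULTI_BRAND_ENFORCEMENT", "enforce")
    require_signature = env.get("WALLET_BRAND_SIGNATURE_REQUIRE", "on")
    signing_key = env.get("BRAND_SIGNING_KEY", "")

    # Enforce mode with required signatures but no key cannot verify anything,
    # so refuse to start instead of running in a degraded state.
    if mode == "enforce" and require_signature == "on" and not signing_key:
        raise BootGuardError(
            "BRAND_SIGNING_KEY is empty while enforce mode requires brand signatures"
        )
```

Calling the guard once during service startup, before any route is registered, keeps the failure visible at deploy time rather than at the first wallet write.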

Operator-side Phase 16 (production rollout). The remaining checkboxes in the "Phase 16" section below are deployment-time soak/sequencing activities, not code gaps. They are tracked by docs/runbooks/multi-brand/phase-16-hard-flip-checklist.md and run when the platform is promoted from staging to production. The in-repo posture is already final before that deployment.

Companion Plans

Two follow-up plans landed alongside this one and are also marked done:

  • 2026-05-05-event-brand-scoping.md — schema-required brand_id on every domain event; consumers fail-closed regardless of mode.
  • 2026-05-05-phase-4e-dual-read-sunset.md — removal of the legacy user-session-{account} Redis key plus the gateway / player_service dual-read fallback.

Date

2026-04-28

Owners

  • Platform Backend
  • Player Domain
  • Wallet Domain
  • Agent Domain

Affected Services

  • gateway
  • player_service
  • wallet_service
  • rolling_service
  • promotion_service
  • game_service
  • agent_service
  • admin_service
  • recon_service
  • docs/adr/ADR-009-multi-brand-domain-routed-isolation.md
  • docs/adr/ADR-005-wallet-topology-bucket-ledger-model.md
  • docs/specs/multi-brand/2026-04-27-multi-brand-isolation-spec.md
  • docs/runbooks/multi-brand/multi-brand-isolation-rollout.md

Goal

Implement domain-routed, single-database, brand-scoped multi-brand isolation across servers_v2 per ADR-009 and the related spec, with a backfilled default brand so the existing single-brand environment continues to operate through every step of the migration.

Success Signals

  • Two brands can run simultaneously on the same servers_v2 runtime with fully independent player identity, wallet state, rolling, settlement, and configuration.
  • gateway resolves brand from the request domain; JWT and request brand must match.
  • wallet_service rejects every cross-brand command with a stable error and a counter increment.
  • Game provider integrations work for two brands using namespaced outbound accounts and reverse-parsed inbound callbacks.
  • Every existing test suite continues to pass; new brand-aware suites cover every command, query, and event family.
  • Local Docker end-to-end runs cover two brands at the same time without cross-brand bleed.
  • Removed staff routes return 404; no other route depends on staff identity.

Preconditions

  • Spec and ADR-009 are accepted.
  • Wallet topology / policy work in ADR-005 is in its current state and not in mid-flight refactor.
  • Local Docker stack is green on main.
  • Product confirms the brand catalog seed: at least default for backfill, plus the second brand the rollout will validate.
  • Product confirms brand_code for each seeded brand and confirms that the chosen brand_code characters are accepted by every game provider's account format.

Implementation Target Map

Primary code areas to add or rewrite:

  • servers_v2/shared/contracts/src/rgb_contracts/ -- brand DTO, X-Brand-Id header constant, brand context helpers, brand-aware event payload schemas
  • servers_v2/shared/rgb_db/src/rgb_db/models/ -- brand and brand_config models; brand_id columns on every brand-scoped model; brand-aware unique constraints
  • servers_v2/admin_service/app/api/routes/brand.py (new) -- brand catalog CRUD
  • servers_v2/admin_service/app/api/routes/brand_config.py (new) -- per-brand configuration CRUD
  • servers_v2/admin_service/app/api/routes/agent_brand.py (new) -- agent-to-brand allow-list CRUD
  • servers_v2/admin_service/app/api/routes/legacy_admin_v2.py -- remove
  • servers_v2/admin_service/app/api/routes/legacy_auth.py -- remove
  • servers_v2/admin_service/app/tasks/tag.py -- replace project switch with brand_config resolution
  • servers_v2/gateway/app/api/routes/player_routes.py -- brand resolution and X-Brand-Id propagation
  • servers_v2/gateway/app/services/proxy.py -- forward X-Brand-Id
  • servers_v2/gateway/app/middleware/ -- JWT-vs-domain brand check
  • servers_v2/player_service/app/services/domain_cache.py -- brand-bearing cache values for domain:agent:* and domain:level:*
  • servers_v2/player_service/app/api/routes/common.py, servers_v2/player_service/app/api/routes/player.py -- brand-aware registration and lookups
  • servers_v2/player_service/app/api/helpers.py -- brand resolution helpers; preserve error code 54 semantics
  • servers_v2/wallet_service/app/services/wallet_topology_store.py -- per-brand topology and policy resolution
  • servers_v2/wallet_service/app/services/wallet_bucket_commands.py -- cross-brand command rejection
  • servers_v2/wallet_service/app/api/routes/topology.py, servers_v2/wallet_service/app/api/routes/wallet.py, servers_v2/wallet_service/app/api/routes/bucket_wallet.py, servers_v2/wallet_service/app/api/routes/queries.py -- X-Brand-Id enforcement
  • servers_v2/rolling_service/app/services/rolling_ops.py, servers_v2/rolling_service/app/services/event_consumer.py -- brand-scoped progress and consumption
  • servers_v2/promotion_service/app/api/routes/coupon.py, servers_v2/promotion_service/app/tasks/settlement.py -- brand-scoped coupon, settlement, saga
  • servers_v2/game_service/ -- outbound account namespacing helper; inbound callback reverse-parse for every provider under /HO/*, /mg/*, /wc/*, /bti/*, /splus/*, /bt1/*, /digitain/*, /integration/*
  • servers_v2/agent_service/app/api/ -- agent_brand enforcement at the agent edge; per-brand agent settings and domains
  • servers_v2/recon_service/ -- brand_id on shooter_* rows; brand resolved from matched deposit
  • alembic migrations under the centralized shared migrations directory servers_v2/shared/rgb_db/migrations/versions/ (latest existing is 0021_wallet_bet_settlement_metadata.py). All multi-brand migrations are sequenced from 0022_* upward in this single directory; there are no per-service migration directories in this repo.
  • backfill verification script under servers_v2/tools/multi_brand_backfill/ (new)

Primary tests to add or rewrite:

  • contract tests for brand DTO and X-Brand-Id propagation
  • migration and backfill tests
  • gateway brand resolution and JWT mismatch tests
  • player_service cross-brand registration and recovery tests
  • agent_service allow-list enforcement tests
  • wallet_service cross-brand command rejection tests for every command
  • wallet_service per-brand topology / policy resolution tests
  • rolling_service brand-scoped consumption tests
  • promotion_service per-brand resolution tests
  • game_service outbound namespacing and inbound reverse-parse tests per provider
  • recon_service brand-scoped match and approval tests
  • admin_service brand catalog, brand_config, agent_brand CRUD tests
  • admin_service staff removal tests
  • observability tests for log fields and metric labels
  • local Docker two-brand end-to-end flows

Tasks

Phase 1 -- Shared Contracts And Models

Delivered by: 53639f23, 1eca859c, bc258a05

  • Add a brand SQLAlchemy model with brand_id (PK), brand_code (unique, immutable, charset-constrained), name, default_currency, status, created_at, updated_at.
  • Add a brand_config model with (brand_id, key) PK and a JSON value column.
  • Add an agent_brand model with (agent_id, brand_id) PK, status, and created_at.
  • Add a shared BrandContext DTO and a shared X-Brand-Id header constant in rgb_contracts.
  • Add a shared X-Brand-Signature header constant and a shared sign_brand_assertion(caller_service, brand_id, request_id, timestamp, key) helper in rgb_contracts (HMAC-SHA256). Add a shared verify_brand_assertion(...) helper. Both consume the BRAND_SIGNING_KEY env var.
  • Add a shared internal-token enum InternalCallerService in rgb_contracts listing every legitimate caller (gateway, agent, admin, recon, game, promotion, rolling). Consumer-service middleware reads the relevant INTERNAL_SERVICE_TOKEN_<CALLER> env vars and rejects unknown or mismatched tokens. Document the token-rotation procedure in the runbook.
  • Add a brand_id: int field to the shared DomainEvent envelope in servers_v2/shared/contracts/src/rgb_contracts/events/base.py. Every outbox event automatically gains brand_id through the envelope; per-payload schemas (e.g. events/wallet.py) are not modified individually. Producers populate brand_id from the owning row; consumers read brand_id from the envelope and persist it into the consumer-side projection.
  • Add a shared brand resolver helper that reads X-Brand-Id from a FastAPI request and returns a typed BrandContext; reject missing header for brand-scoped operations.
  • Write contract tests for the brand DTO, the header constant, the resolver, and the DomainEvent envelope's new brand_id field.
  • Run shared contract tests.
  • Commit Phase 1.
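
The signing and verification helpers described in this phase can be sketched as below. The function names and the HMAC-SHA256 choice come from the task list; the canonical message layout (newline-joined fields), the hex encoding, the extra signature argument on verify, and the clock-skew window are assumptions, since the plan does not specify them.

```python
import hashlib
import hmac
import time


def _canonical_message(caller_service, brand_id, request_id, timestamp):
    # Assumed layout: newline-joined fields. The real wire format is not given.
    return f"{caller_service}\n{brand_id}\n{request_id}\n{timestamp}".encode("utf-8")


def sign_brand_assertion(caller_service, brand_id, request_id, timestamp, key):
    """Return a hex HMAC-SHA256 signature; key is bytes (e.g. BRAND_SIGNING_KEY)."""
    message = _canonical_message(caller_service, brand_id, request_id, timestamp)
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify_brand_assertion(signature, caller_service, brand_id, request_id,
                           timestamp, key, now=None, max_skew_s=300):
    """Return True only for a fresh, matching signature."""
    now = time.time() if now is None else now
    # Assumed replay guard: reject assertions outside the skew window.
    if abs(now - timestamp) > max_skew_s:
        return False
    expected = sign_brand_assertion(caller_service, brand_id, request_id, timestamp, key)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))
```

Binding caller_service, request_id, and timestamp into the signed message means a captured signature cannot be reused for a different brand, caller, or request.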

Phase 2 -- Migrations And Default Brand Backfill

Delivered by: 304deec6, d01f85a9, c5d39d52, d7d0e134, 69c5d5a0

All migrations live in servers_v2/shared/rgb_db/migrations/versions/ (centralized). Sequence from 0022_* upward. No per-service migration directories exist in this repo.

Production Postgres assumption: PostgreSQL 14+. Every migration in this phase runs with lock_timeout = '5s' and statement_timeout = '30min' set per session; if lock_timeout fires, the migration aborts cleanly instead of queueing connections to the point of an outage. The migrations below are designed to favor metadata-only changes (ALTER TABLE ... ADD COLUMN nullable, CREATE INDEX CONCURRENTLY) over full table rewrites. Per-table runtime estimates against a prod-sized snapshot are recorded in the Local Docker Validation step of docs/runbooks/multi-brand/multi-brand-isolation-rollout.md before any environment beyond local is migrated.

  • 0022_create_brand_aggregates.py: create brand, brand_config, and agent_brand tables. brand.brand_id is BIGINT autoincrement. brand.brand_code is VARCHAR(16) with CHECK (brand_code ~ '^[a-z][a-z0-9]{1,15}$') (lowercase alphanumeric, leading letter, 2-16 chars).
  • 0023_seed_default_brand.py: insert one brand row with brand_code='default', name='Default Brand', status enabled, plus brand_config rows seeded with the values the system currently relies on globally (rolling ratios, cashback / rebate / lossback rates, payment channel selection). Also insert one agent_brand row per existing agent row of the form (agent_id, default_brand_id, status='enabled') so every existing agent retains brand-1 access through Phase 11 enforcement; without this seed every agent's first request after Phase 16 hard-flip would be rejected by the allow-list check.
  • 0023b_install_agent_brand_autoseed_trigger.py: install a Postgres BEFORE INSERT trigger on agent that automatically inserts a corresponding agent_brand(NEW.agent_id, default_brand_id, 'enabled') row in the same transaction. The trigger references the default_brand_id populated by 0023, so it must run after 0023. This closes the seed-race window during Phases 2-12 when admin_service does not yet write agent_brand for new agents. The trigger is removed by 0028_remove_agent_brand_autoseed_trigger.py after Phase 12 ships the application-level write.
  • 0024a_add_brand_id_nullable.py: add brand_id BIGINT NULL to every brand-scoped table enumerated in the "Brand-Scoped Table Inventory" section of docs/architecture/data-ownership.md. A nullable column with no DEFAULT is a metadata-only change on PG 12+ and avoids a full-table rewrite. Concurrent writes during this migration land with brand_id = NULL (handled by 0024b below).
  • 0024b_backfill_brand_id.py: chunked UPDATE backfill against a literal default_brand_id value resolved once at migration start (not via per-row subquery). Chunk size 10000 rows; commit between chunks; loop until WHERE brand_id IS NULL returns zero rows. This avoids a full-table rewrite under ACCESS EXCLUSIVE that the original single-statement DDL with DEFAULT (SELECT ...) would have caused.
  • 0024c_set_brand_id_not_null.py: a tail-end UPDATE catches any rows that landed with NULL during 0024b, then ALTER TABLE ... ALTER COLUMN brand_id SET NOT NULL. PG 12+ skips the full-table validation scan for SET NOT NULL when a validated CHECK (brand_id IS NOT NULL) constraint exists; the migration adds the CHECK as NOT VALID, validates it via VALIDATE CONSTRAINT (a scan that does not block reads or writes), then sets NOT NULL atomically and drops the redundant CHECK.
  • (Migration number 0025 is intentionally skipped: an earlier draft used 0025_drop_brand_id_default.py to drop a server DEFAULT that the restructured 0024a no longer sets. The number is reserved to preserve the historical ordering and to avoid renumbering downstream migrations.)
  • 0026_add_brand_id_fk.py: add a foreign key from every brand-scoped table's brand_id to brand(brand_id) with ON DELETE RESTRICT. Use NOT VALID then VALIDATE CONSTRAINT to avoid an exclusive lock long enough to fire lock_timeout.
  • 0027_brand_scoped_uniqueness.py: change uniqueness as follows. Every new index uses CREATE UNIQUE INDEX CONCURRENTLY (non-blocking) before the corresponding old index is dropped, so there is no window where both the old constraint is gone and the new one is not yet enforcing. Wallet topology and policy CRUD MUST be paused (admin freeze) for the duration of this migration step; the runbook documents the freeze procedure and confirmation step.
    • player: create CONCURRENTLY (brand_id, account) unique index, then drop the old account-only constraint
    • wallet_topology: create CONCURRENTLY (brand_id, code, version) unique index, drop old uq_wallet_topology_code_version; create CONCURRENTLY new partial active index (brand_id) WHERE status='ACTIVE', drop old uq_wallet_topology_single_active. Activation is paused for the duration; ADR-005's "single ACTIVE per brand" invariant is preserved across the swap because the new partial index exists before the old one is dropped.
    • wallet_policy: create CONCURRENTLY (brand_id, topology_code, version, policy_key) unique index, drop old uq_wallet_policy_version; create CONCURRENTLY new partial active index (brand_id, topology_code, policy_key) WHERE status='ACTIVE', drop old uq_wallet_policy_active_key. Same pause-then-swap discipline as wallet_topology.
    • wallet_bucket_type: replace uq_wallet_bucket_type_topology_code with (brand_id, topology_code, topology_version, code) unique
    • wallet_account: (brand_id, player_id) unique
    • wallet_bucket: existing constraint becomes (brand_id, player_id, topology_code, bucket_type_code) unique
    • wallet_bet_authorization: extend uq_wallet_bet_authorization_provider_bet to (brand_id, provider_type, provider_id, bet_id); uq_wallet_bet_authorization_request becomes (brand_id, request_id)
    • wallet_transfer: uq_wallet_transfer_request becomes (brand_id, request_id)
    • wallet_idempotency: uniqueness becomes (brand_id, idempotency_key) (allows the same key to legitimately appear in two brands)
    • PK rewrites (drop the existing PK, add the composite PK after 0024c sets NOT NULL). Each PK rewrite runs in its own transaction with lock_timeout = '5s' so a stuck lock aborts cleanly without queueing connections. Concurrent writes to the affected table are paused via the application-level "table freeze" mechanism (a Redis flag checked by every writer) for the duration of each rewrite; the freeze and its rollback are documented in the runbook. Sequencing in the 0027 body: agent_setting, player_wallet_limit, daebak_email, mg_account, wc_account, and digitain_token first; i18n last (most-read table; minimize its freeze window). Apply to every table in the "Single-Column PKs Requiring Rewrite In 0027" table in docs/architecture/data-ownership.md: agent_setting -> (agent_id, brand_id), player_wallet_limit -> (brand_id, player_id), daebak_email -> (brand_id, player_id), mg_account -> (brand_id, account), wc_account -> (brand_id, account), digitain_token -> (brand_id, account), i18n -> (brand_id, code). Every other brand-scoped table uses a surrogate guid BIGINT PK that remains untouched; only its uniqueness constraints change.
    • agent_domain: keep existing surrogate guid PK; add a composite UNIQUE(agent_id, brand_id, domain) constraint and keep the existing UNIQUE(domain) global constraint (so a domain still cannot appear twice in any combination). Do not drop the surrogate PK; FK references to agent_domain.guid would break.
    • any other uniqueness change required by the spec
  • Add a servers_v2/tools/multi_brand_backfill/ verification script that iterates every brand-scoped table and asserts zero rows with NULL brand_id. Exit non-zero on any failure; wire into the deployment pipeline so a non-zero count blocks rollout.
  • Add a reversibility test that runs the migrations forward, inserts seed rows under brand_id=default and brand_id=brand2, then runs the migrations backward; confirm reversal is either refused safely OR preserves enough information to re-run forward without data loss. (An empty-DB reversibility test alone is not sufficient; alembic always reverses pure DDL.)
  • Add a backfill smoke test that runs the migrations against a copy of the local Docker database and confirms row counts and uniqueness post-migration.
  • Run all migration tests.
  • Commit Phase 2.
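
The chunked backfill in 0024b follows a simple loop: update a bounded batch of NULL rows, commit, and repeat until nothing is left. The sketch below illustrates that loop using the standard-library sqlite3 module so it can run anywhere; the real migration targets PostgreSQL through alembic, and the function name backfill_brand_id is hypothetical.

```python
import sqlite3


def backfill_brand_id(conn, table, default_brand_id, chunk_size=10000):
    """Set brand_id on NULL rows in chunks, committing between chunks.

    `table` must be a trusted identifier (it is interpolated into the SQL);
    `default_brand_id` is a literal resolved once by the caller, not a
    per-row subquery.
    """
    total = 0
    while True:
        cursor = conn.execute(
            f"UPDATE {table} SET brand_id = ? "
            f"WHERE rowid IN ("
            f"SELECT rowid FROM {table} WHERE brand_id IS NULL LIMIT ?)",
            (default_brand_id, chunk_size),
        )
        # Commit per chunk so locks are held only for one batch at a time.
        conn.commit()
        if cursor.rowcount == 0:
            return total
        total += cursor.rowcount
```

Because the loop terminates only when a pass updates zero rows, rows written with NULL while the backfill is running are picked up by a later pass; the tail-end UPDATE in 0024c covers anything that lands after the loop exits.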

Phase 3 -- Domain-To-Brand Map And Edge Resolution (Soft-Fail)

Delivered by: 8011f2d6, 030383d5, 586b6588

  • Update servers_v2/player_service/app/services/domain_cache.py so that domain:agent:{host} and domain:level:{host} resolve to a value carrying brand_id (e.g. JSON with brand_id and agent_id / level_id).
  • Add a one-time Redis migration helper that rewrites existing domain:agent:* and domain:level:* values to the new shape, binding every existing domain to the default brand.
  • Update servers_v2/gateway/app/api/routes/player_routes.py so that _extract_domain is followed by a brand lookup; attach request.state.brand_id and request.state.brand_code.
  • Preserve the existing body["domain"] = _extract_domain(request) injection in legacy player routes (servers_v2/gateway/app/api/routes/player_routes.py:526 and :573). The brand projection is layered on top; do not remove the domain body injection or any caller that depends on it.
  • Update servers_v2/gateway/app/services/proxy.py so every forwarded request carries X-Brand-Id AND any inbound X-Brand-Id header from the external client is stripped before injection (so external callers cannot spoof the brand). agent_service performs the same strip on its edge.
  • Introduce a MULTI_BRAND_ENFORCEMENT runtime flag with three modes: off, observe, and enforce. Default is observe.
  • Add a gateway middleware that, on requests with a JWT, computes jwt.brand_id == request.state.brand_id. In observe mode, log the comparison and increment brand_resolution_failed_total{reason="jwt_domain_mismatch",mode="observe"} but do not reject. In enforce mode, reject mismatches with the documented error code. JWTs that lack brand_id are treated the same way (logged in observe, rejected in enforce).
  • Add gateway tests for: valid domain, unknown domain, observe-mode logging on JWT/domain mismatch, observe-mode logging on JWT missing brand_id, and request body / query attempts to override brand.
  • Run gateway tests.
  • Confirm the deployment defaults to MULTI_BRAND_ENFORCEMENT=observe during the rollout window; the 2026-05-06 in-repo closure flips the compose default to enforce.
  • Confirm scope: brand resolution attaches request.state.brand_id via a gateway middleware that fires before route dispatch, so it applies uniformly to player_routes.py, legacy_player_routes.py (which uses dispatch_to_path-style passthrough rather than body mutation), and provider_routes.py. gateway/app/api/routes/admin_routes.py and agent_routes.py are not in the brand-resolution scope because back-office and agent traffic do not enter through gateway (per docs/architecture/http-entrypoints.md); confirm both files remain unchanged or are documented as dead code if no longer wired.
  • Commit Phase 3.
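
The three-mode JWT-versus-domain check in this phase reduces to a small decision function. The sketch below is an illustration under assumptions: a collections.Counter stands in for the real brand_resolution_failed_total metric, the function name check_jwt_brand is hypothetical, the "jwt_missing_brand" reason label is invented for the example (the plan only names "jwt_domain_mismatch"), and off is assumed to skip the check entirely.

```python
from collections import Counter

# Stand-in for the brand_resolution_failed_total{reason,mode} metric.
brand_resolution_failed_total = Counter()

_MODES = ("off", "observe", "enforce")


def check_jwt_brand(mode, jwt_brand_id, request_brand_id):
    """Return True if the request may proceed, False if it must be rejected."""
    if mode not in _MODES:
        raise ValueError(f"unknown MULTI_BRAND_ENFORCEMENT mode: {mode!r}")
    if mode == "off":
        # Assumption: off disables the comparison altogether.
        return True
    if jwt_brand_id is not None and jwt_brand_id == request_brand_id:
        return True
    # A JWT without brand_id is treated like a mismatch; the label for that
    # case is hypothetical.
    reason = "jwt_missing_brand" if jwt_brand_id is None else "jwt_domain_mismatch"
    brand_resolution_failed_total[(reason, mode)] += 1
    # observe: count and let the request through; enforce: reject.
    return mode != "enforce"
```

Raising on an unknown mode keeps a typo in the flag from silently behaving like observe.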

Phase 4 -- Player Service Brand Scoping

Delivered by: 288a5f18, 5730f74e, a410cbce

  • Make player queries and writes brand-aware in player_service/app/api/routes/player.py and common.py.
  • Update registration so brand_id is taken from the resolved request brand and is rejected when supplied by the client.
  • Wire MAX_PLAYER_ACCOUNT_LEN = 32 into the registration validator in servers_v2/player_service/app/api/routes/player.py (and any other registration entry point). Reject any account longer than 32 chars with the documented validation envelope. The constant is defined in rgb_contracts per Phase 1.
  • Implement the brand-aware recovery flow per the spec section "Recovery flow brand precedence". Recovery tokens issued after the Phase 5 deploy carry brand_id as a signed claim. The token-embedded brand_id wins over the request domain; if the token lacks brand_id (issued before Phase 5), fall back to the request domain; if the resolved brand does not match the underlying player's brand_id, reject hard regardless of MULTI_BRAND_ENFORCEMENT mode (recovery is too sensitive for observe). Recovery responses for unknown accounts have the same shape as responses for known accounts (no existence oracle). Update player-service.md and agent-service.md to document this contract.
  • Make agent_setting lookups and writes keyed by (agent_id, brand_id).
  • Make agent_domain lookups and writes brand-scoped; ensure a domain cannot be bound to two brands.
  • Update player_service/app/api/helpers.py so that USE_LEVEL_DOMAIN continues to fire when a player tries to log in on a level domain bound to a different brand than their player row.
  • Update player-owned outbox payloads to include brand_id.
  • At Phase 4 producer-deploy time, player_service emits one BRAND_AWARE_PUBLISHING_BEGINS sentinel event on the player stream so consumers can log the exact stream offset where schema_version bumps from 1 to 2.
  • Audit all Redis cache keys derived from user-supplied strings (e.g. account, phone, email, domain) and prefix every shared-namespace key with brand_code. Concrete first targets: sms_captcha_{phone} -> sms_captcha:{brand_code}:{phone}, any login-throttle key keyed on account -> add {brand_code}: prefix. Keys derived from surrogate player_id need no change because player_id is brand-distinct. Document the convention in docs/services/player-service.md.
  • Add tests covering: cross-brand independent registration of the same account string, cross-brand independent agent_setting, per-brand agent_domain, and the USE_LEVEL_DOMAIN brand-mismatch case.
  • Run player_service tests.
  • Commit Phase 4.
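
The recovery-flow brand precedence described in this phase can be written as a short pure function. This is a sketch under assumptions: the names resolve_recovery_brand and RecoveryBrandMismatch are hypothetical, and the unknown-account path (which must return the same response shape as a known account) is outside what the function models.

```python
from typing import Optional


class RecoveryBrandMismatch(Exception):
    """The resolved brand does not own the player being recovered."""


def resolve_recovery_brand(token_brand_id: Optional[int],
                           domain_brand_id: int,
                           player_brand_id: int) -> int:
    # 1. A brand_id embedded in the signed token wins over the request domain.
    # 2. Tokens issued before Phase 5 carry no brand_id: fall back to the domain.
    resolved = token_brand_id if token_brand_id is not None else domain_brand_id

    # 3. Hard reject on mismatch. There is deliberately no mode parameter:
    #    recovery does not honor observe mode.
    if resolved != player_brand_id:
        raise RecoveryBrandMismatch(
            f"resolved brand {resolved} does not match player brand {player_brand_id}"
        )
    return resolved
```

Leaving MULTI_BRAND_ENFORCEMENT out of the signature makes it structurally impossible for a caller to downgrade the recovery check to observe.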

Phase 5 -- Issue Brand-Aware JWTs

Delivered by: 910ab341, 5c77fc81

  • Add brand_id to the JWT claim schema used by player_service and agent_service. Existing tokens without brand_id remain valid until natural expiry; the MULTI_BRAND_ENFORCEMENT=observe flag from Phase 3 guarantees this.
  • Update login, refresh, and logout flows in player_service and agent_service to include and preserve brand_id.
  • Refresh-token rotation only mints brand-aware access tokens. A refresh from a brand-unaware refresh token (issued before Phase 5) forces re-login by returning the documented re-auth-required error envelope rather than minting a token that lacks brand_id. Confirm refresh-token TTL: if longer than 2 * JWT_EXPIRE_MIN, the soak window in Phase 16 must be extended to 2 * REFRESH_TOKEN_TTL so every brand-unaware refresh token has expired before hard-flip.
  • Switch JWT signing from HS-family to RS256. Generate JWT_PRIVATE_KEY and JWT_PUBLIC_KEY per environment; private key is provisioned only into player_service and agent_service; public key is provisioned into gateway and any other verifier. JWT header carries kid (current key ID); verifier rejects unknown kid and rejects any alg other than RS256 (no alg=none, no HS-family fallback).
  • Update internal handlers to read X-Brand-Id from the request, not from JWT, so internal callers carry the brand in headers.
  • Add tests for: login issues a token with the correct brand_id, refresh preserves it, observe-mode logs JWT/domain mismatches without rejecting, cross-brand JWT replay is logged in observe mode (and will be rejected after Phase 16).
  • Run player_service and agent_service JWT tests.
  • Commit Phase 5.
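
The verifier rule in this phase (accept only RS256, reject unknown kid) can be applied to the JWT header before any signature work happens. The sketch below uses only the standard library and does not verify the signature itself; that step would still be delegated to a JWT library pinned to RS256. The function name precheck_jwt_header and the known_kids set are hypothetical.

```python
import base64
import json


def _b64url_decode(segment: str) -> bytes:
    # JWT segments omit base64 padding; restore it before decoding.
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def precheck_jwt_header(token: str, known_kids: set) -> str:
    """Return the kid if the header is acceptable; raise ValueError otherwise.

    This checks the header only. It does NOT verify the signature.
    """
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("malformed token")
    try:
        header = json.loads(_b64url_decode(parts[0]))
    except ValueError as exc:  # covers base64, UTF-8, and JSON decode errors
        raise ValueError("unreadable header") from exc
    if not isinstance(header, dict):
        raise ValueError("header is not a JSON object")
    # Reject alg=none and any HS-family value outright: no fallback.
    if header.get("alg") != "RS256":
        raise ValueError("alg must be RS256")
    kid = header.get("kid")
    if not isinstance(kid, str) or kid not in known_kids:
        raise ValueError("unknown kid")
    return kid
```

Checking alg against a single allowed value, rather than trusting the header to select the algorithm, is what closes the alg=none and HS-family downgrade paths.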

Phase 6 -- Wallet Service Brand Scoping

Delivered by: 63db9df4, c218e91d

  • Update wallet_topology_store.py so the active topology and active policy are resolved per brand_id.
  • Update wallet_bucket_commands.py so every command resolves brand_id from X-Brand-Id and rejects any target row whose brand differs.
  • Verify the X-Brand-Signature HMAC on every brand-scoped wallet write command. In enforce mode, missing or invalid signature is rejected with the documented error envelope; in observe mode, increment wallet_brand_signature_failed_total{caller_service,reason,mode} and proceed. Read paths and lower-stakes endpoints do not require the signature in this iteration (documented as a Phase X follow-up).
  • Add a wallet_cross_brand_rejected_total{command} counter and emit it on every rejection.
  • Update topology and policy CRUD routes (servers_v2/wallet_service/app/api/routes/topology.py) so that per-brand uniqueness is honored and activation never crosses brand boundaries.
  • Update wallet outbox event payloads to include brand_id.
  • At Phase 6 producer-deploy time, wallet_service emits one BRAND_AWARE_PUBLISHING_BEGINS sentinel event on the wallet stream so consumers can log the exact stream offset where schema_version bumps from 1 to 2 (per event-catalog.md).
  • Add tests for: cross-brand rejection on every command (deposit approve, withdraw create, withdraw refund, adjustment, bet authorize, bet settle, bet rollback, transfer, points transfer, coupon grant, coupon reverse, points credit), per-brand topology resolution, per-brand policy resolution, per-brand activation isolation.
  • Run wallet_service tests.
  • Commit Phase 6.
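
The cross-brand rejection rule for wallet commands can be sketched as a guard that runs before any row is mutated. A collections.Counter stands in for the wallet_cross_brand_rejected_total{command} counter; the function name assert_same_brand and the exception classes are hypothetical and do not reflect the actual wallet_bucket_commands.py structure.

```python
from collections import Counter

# Stand-in for the wallet_cross_brand_rejected_total{command} counter.
wallet_cross_brand_rejected_total = Counter()


class MissingBrand(Exception):
    """The command reached the guard without a request brand."""


class CrossBrandRejected(Exception):
    """The target row belongs to a different brand than the request."""


def assert_same_brand(command, request_brand_id, row_brand_id):
    # Fail closed when the brand is missing instead of defaulting it.
    if request_brand_id is None:
        raise MissingBrand(f"{command}: X-Brand-Id is required")
    if row_brand_id != request_brand_id:
        # Count every rejection, labelled by command.
        wallet_cross_brand_rejected_total[command] += 1
        raise CrossBrandRejected(
            f"{command}: row brand {row_brand_id} != request brand {request_brand_id}"
        )
```

Calling the guard with the command name (for example "bet_settle" or "transfer") gives the per-command label that the rejection tests in this phase assert on.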

Phase 7 -- Rolling Service Brand Scoping

Delivered by: e16d0974

  • Update rolling event consumer so it reads brand_id from the DomainEvent envelope and persists it. If the envelope brand_id does not match the target rolling row's brand, apply MULTI_BRAND_ENFORCEMENT semantics: observe logs + counts and proceeds using the envelope brand; enforce rejects the event and surfaces the mismatch to supervision.
  • Update rolling progress and completion routes to require X-Brand-Id and reject mismatched player rows.
  • Resolve per-brand rolling completion ratios from brand_config (with documented global default fallback).
  • Update rolling outbox payloads to include brand_id.
  • At Phase 7 producer-deploy time, rolling_service emits one BRAND_AWARE_PUBLISHING_BEGINS sentinel event on the rolling stream.
  • Add tests for: brand-scoped consumption, brand-scoped completion, per-brand ratio resolution.
  • Run rolling_service tests.
  • Commit Phase 7.
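
Per-brand ratio resolution with a global default fallback amounts to a keyed lookup. In the sketch below, a dict keyed by (brand_id, key) mirrors the brand_config primary key; the key name "rolling.completion_ratio", the default value, and the function name are all assumptions, since the plan does not list the actual config keys.

```python
from decimal import Decimal

# Hypothetical global default used when a brand has no override.
GLOBAL_DEFAULT_ROLLING_RATIO = Decimal("1.0")


def resolve_rolling_ratio(brand_config, brand_id, key="rolling.completion_ratio"):
    """Return the brand's ratio, or the global default when none is configured.

    `brand_config` maps (brand_id, key) -> value, mirroring the table's PK.
    """
    value = brand_config.get((brand_id, key))
    if value is None:
        return GLOBAL_DEFAULT_ROLLING_RATIO
    # Go through str() so float inputs do not carry binary rounding noise.
    return Decimal(str(value))
```

For example, with brand_config = {(2, "rolling.completion_ratio"): "3.5"}, brand 2 resolves to Decimal("3.5") while any other brand resolves to the global default.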

Phase 8 -- Promotion Service Brand Scoping

Delivered by: c56438b5

  • Update coupon definitions, event configs, and rebate/cashback/lossback rates to be per-brand.
  • Update settlement schedulers to iterate in brand_id ascending order, sequentially (one brand at a time). Each brand iteration is observable in worker logs (structured field brand_id + brand start/end timestamps).
  • Update coupon saga and points credit calls to wallet_service to forward X-Brand-Id. Saga consumers handling envelope brand_id mismatches apply MULTI_BRAND_ENFORCEMENT semantics (observe: log + count + proceed using envelope brand; enforce: reject and surface to supervision).
  • Update promotion outbox payloads to include brand_id.
  • At Phase 8 producer-deploy time, IF a promotion-owned outbound stream exists at that point (per event-catalog.md it is currently a Known Gap), emit one BRAND_AWARE_PUBLISHING_BEGINS sentinel on it. If no promotion-owned stream exists, skip this task and update event-catalog.md Known Gaps to note that promotion remains stream-less in the multi-brand rollout (so the Phase 16 release gate's event_legacy_schema_total zero-check does not require a non-existent stream).
  • Add tests for: per-brand coupon resolution, per-brand rebate / cashback / lossback resolution, brand-scoped settlement, brand-aware saga.
  • Run promotion_service tests.
  • Commit Phase 8.
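
The sequential, ascending-brand settlement loop with per-brand start/end log lines might look like the following. This is a sketch only: run_settlement and settle_one are hypothetical names, the JSON log shape is an assumption, and the plan does not say whether a failure in one brand should stop later brands, so this version simply lets the exception propagate after logging.

```python
import json
import logging
import time

logger = logging.getLogger("promotion.settlement")


def run_settlement(brand_ids, settle_one, log=logger):
    """Settle one brand at a time, in ascending brand_id order."""
    processed = []
    for brand_id in sorted(brand_ids):
        started = time.time()
        log.info(json.dumps({"event": "brand_settlement_start",
                             "brand_id": brand_id, "started_at": started}))
        status = "error"
        try:
            settle_one(brand_id)
            status = "ok"
            processed.append(brand_id)
        finally:
            # The end line is emitted even when settle_one raises.
            log.info(json.dumps({"event": "brand_settlement_end",
                                 "brand_id": brand_id, "status": status,
                                 "started_at": started, "ended_at": time.time()}))
    return processed
```

Sorting inside the function means callers can pass brand IDs in any order and still get the deterministic ascending sequence the task requires.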

Phase 9 -- Game Service Brand Scoping

Delivered by: 8da5848a

  • Audit every supported provider's account-field length cap and document the result in docs/services/game-service.md. For any provider whose cap is exceeded by len(brand_code) + 1 + MAX_PLAYER_ACCOUNT_LEN (worst case 16 + 1 + 32 = 49), document a per-provider fallback format (shorter separator, hash projection, or alternative encoding) before any code lands. This audit must run before Phase 4 enforces MAX_PLAYER_ACCOUNT_LEN at registration, so any account-cap concession is known before users register at the new cap.
  • Add a deposit-brand reconciliation step: scan every player_deposit row created between 0024a deploy and Phase 5 deploy. Cross-check the row's brand_id against the originating request's domain (logged by gateway at creation time, recoverable from access logs). Flag any deposit whose row brand differs from its origin domain's brand for manual review. This catches the edge case where a deposit landed during the migration window with a stale brand assignment.
  • Confirm or remove /sbo/* from the gateway public-route allowlist. Resolved 2026-05-06: sbo confirmed dead (no sbo_callback.py in game_service, no provider record, no callers). Removed /sbo/ from servers_v2/gateway/app/middleware/auth.py, dropped the /sbo/* row from servers_v2/gateway/CLAUDE.md, retired IntegrationType.SBO (value 6 reserved as a tombstone), and deleted sbo from the bounded provider-label list in servers_v2/game_service/tests/test_observability_metrics.py.
  • Add an outbound account-namespacing helper in game_service that returns {brand_code}_{account} (or the documented per-provider equivalent).
  • Update every outbound provider call to use the namespacing helper.
  • Add an inbound reverse-parse helper that performs brand_code prefix lookup against the brand table (longest matching prefix wins; cache per process). Returns (brand_id, player_account) or fails.
  • Update every callback handler under /HO/*, /mg/*, /wc/*, /bti/*, /splus/*, /bt1/*, /digitain/*, /integration/* to use the reverse-parse helper before resolving the bet, settlement, or rollback. (/sbo/* confirmed dead and removed; see above.)
  • Add a game_callback_brand_unresolved_total{provider} counter and emit it on reverse-parse failure; rejected callback returns the documented error envelope.
  • Update game provider transaction state tables to carry brand_id: ho_transaction, mg_account, mg_transaction, wc_account, bti_balance_change, bti_commit_reserve, bti_credit, bti_debit_reserve, bti_reserve, digitain_credit_batch, digitain_debit, digitain_rollback, digitain_token, digitain_transaction (and any other provider-state table found in servers_v2/shared/rgb_db/src/rgb_db/models/).
  • Add tests for: outbound namespacing per provider, inbound reverse-parse per provider including the longest-prefix algorithm with accounts containing _, cross-brand callback rejection in both observe and enforce modes, unresolved-callback counter, per-provider fallback format if any.
  • Run game_service tests.
  • Commit Phase 9.
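
The namespacing and reverse-parse tasks above can be sketched as follows. This is a minimal illustration, not the game_service implementation: the function names, the MAX_BRAND_CODE_LEN constant, and the in-memory brand_ids_by_code mapping (standing in for the cached brand table) are assumptions. It shows why longest-prefix matching matters when player accounts themselves contain an underscore.

```python
from typing import Mapping, Optional, Tuple

# Worst-case figures quoted in the audit task (16 + 1 + 32 = 49).
# MAX_BRAND_CODE_LEN is a hypothetical name for the 16-character cap.
MAX_BRAND_CODE_LEN = 16
MAX_PLAYER_ACCOUNT_LEN = 32


def worst_case_account_len() -> int:
    """Longest namespaced account: brand_code + '_' + player account."""
    return MAX_BRAND_CODE_LEN + 1 + MAX_PLAYER_ACCOUNT_LEN


def namespace_account(brand_code: str, account: str) -> str:
    """Outbound: build the provider-facing account string."""
    return f"{brand_code}_{account}"


def reverse_parse(
    provider_account: str, brand_ids_by_code: Mapping[str, int]
) -> Optional[Tuple[int, str]]:
    """Inbound: resolve (brand_id, player_account); longest prefix wins."""
    # Try longer brand codes first so a longer code is never shadowed
    # by a shorter code that happens to match the same leading characters.
    for code in sorted(brand_ids_by_code, key=len, reverse=True):
        prefix = f"{code}_"
        if provider_account.startswith(prefix) and len(provider_account) > len(prefix):
            return brand_ids_by_code[code], provider_account[len(prefix):]
    # No brand matched: the caller would increment the unresolved counter
    # and return the documented error envelope.
    return None
```

Only the brand prefix is stripped, so an account such as john_doe round-trips unchanged through namespace_account and reverse_parse.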

Phase 10 -- Recon Service Brand Scoping

Delivered by: 2a6e9728

  • Add brand_id to the shooter_* table family.
  • Update recon match logic to resolve brand_id from the matched deposit record.
  • Confirm approvals continue to flow through wallet_service and exercise the wallet_service cross-brand rejection on a synthetic brand-mismatch case.
  • Add tests for: brand-scoped recon match, brand-mismatch rejection on approval.
  • Run recon_service tests.
  • Commit Phase 10.

Phase 11 -- Agent Service: Global Agent + Brand Allow List

Delivered by: 06e853f2

  • Confirm the agent table carries no brand_id.
  • Add an agent_brand repository in agent_service that exposes read-only allow-list lookup.
  • Add agent-edge enforcement: every authenticated agent request whose resolved brand is not in the agent's agent_brand allow list is rejected with the documented error.
  • Confirm brand resolution and allow-list enforcement also apply to agent_service's legacy aliases including /api/v1/user/login (per spec, the legacy alias surface is not exempt). Add a test that exercises the legacy alias under both observe and enforce modes.
  • Strip any inbound X-Brand-Id header on agent_service's agent-frontend edge before injecting the resolved value (mirrors the gateway strip in Phase 3, per spec ### Brand resolution at the edge). Add a header-spoofing test that asserts a request with X-Brand-Id: 99 from the agent frontend is overwritten with the domain-resolved brand.
  • Update agent-side queries (player lists, settlements, withdrawals, messages) to filter by brand_id.
  • Update agent-domain outbox payloads to include brand_id.
  • At Phase 11 producer-deploy time, agent_service emits one BRAND_AWARE_PUBLISHING_BEGINS sentinel event on the agent stream.
  • Add tests for: allow-list enforcement, cross-brand query isolation, brand-scoped agent lookups.
  • Run agent_service tests.
  • Commit Phase 11.
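
The header strip and allow-list enforcement described above might look like the sketch below. It is framework-free and purely illustrative: apply_brand_edge, BrandNotAllowed, and the two lookup arguments are hypothetical names, and the real agent_service middleware, error envelope, and observe-mode behavior are not shown.

```python
from typing import Dict, Mapping, Set


class BrandNotAllowed(Exception):
    """Hypothetical error standing in for the documented rejection."""


def apply_brand_edge(
    headers: Mapping[str, str],
    host: str,
    brand_id_by_domain: Mapping[str, int],
    allowed_brand_ids: Set[int],
) -> Dict[str, str]:
    """Strip any inbound X-Brand-Id, then inject the domain-resolved brand."""
    # Header names are case-insensitive, so compare in lower case.
    cleaned = {k: v for k, v in headers.items() if k.lower() != "x-brand-id"}

    brand_id = brand_id_by_domain.get(host)
    if brand_id is None:
        raise BrandNotAllowed(f"no brand resolved for domain {host!r}")

    # The agent is global; access is governed by the agent_brand allow list.
    if brand_id not in allowed_brand_ids:
        raise BrandNotAllowed(f"brand {brand_id} is not in the agent's allow list")

    cleaned["X-Brand-Id"] = str(brand_id)
    return cleaned
```

Because the client-supplied value is dropped before resolution, a spoofed X-Brand-Id: 99 never influences the brand that downstream services see.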

Phase 12 -- Admin Service: Brand Catalog, Configuration, Allow List, And Staff Removal

Delivered by: dd8a0007

  • Add a Phase 12 re-seed verification step: run SELECT count(*) FROM agent a LEFT JOIN agent_brand ab ON a.agent_id=ab.agent_id WHERE ab.agent_id IS NULL. If non-zero, run a one-shot INSERT that adds (agent_id, default_brand_id, 'enabled') to agent_brand for every missing agent. Repeat until the count is zero. This catches any agent created in a window where the autoseed trigger was not yet installed (e.g. an environment migrated from an earlier branch).
  • After Phase 12 ships the application-level agent_brand write on agent create, run migration 0028_remove_agent_brand_autoseed_trigger.py to drop the temporary trigger.
  • Add servers_v2/admin_service/app/api/routes/brand.py with brand catalog CRUD: list, create, edit metadata, enable, disable. Reject edits to brand_code. Brand-create rejects any brand_code that is a prefix of an existing brand_code, has an existing brand_code as a prefix, or collides with the prefix of an existing player.account value already namespaced and sent to a game provider (validated by scanning player.account for any legacy account starting with <brand_code>_).
  • Add admin_service middleware that reads the X-Operator-Id header (injected by the SSO LB) and rejects every write request without it. Per-write audit row stores operator_id, request IP, request_id, timestamp, payload, and prior value. Read requests do not require X-Operator-Id but are still logged with operator identity in structured logs.
  • Publish BRAND_CATALOG_CHANGED Redis pub/sub message after every successful brand create / disable so game_service reverse-parse caches invalidate within seconds. Publish AGENT_BRAND_CHANGED after every agent_brand write so agent_service allow-list caches invalidate.
  • Add servers_v2/admin_service/app/api/routes/brand_config.py with per-brand configuration CRUD; every write is audited (timestamp, payload, prior value).
  • Add servers_v2/admin_service/app/api/routes/agent_brand.py with agent-to-brand allow-list CRUD; audited writes.
  • Update existing admin operational routes that touch a single brand's data to accept an explicit brand_id parameter and to forward it as X-Brand-Id on internal calls.
  • Replace the hard-coded project integer in admin_service/app/tasks/tag.py with brand_config resolution.
  • Audit and delete any global config constant or global_var row that is now duplicated by per-brand brand_config entries (rolling ratios, cashback / rebate / lossback rates, payment channel selection, withdrawal min/max, risk thresholds). Document the enumerated deletion list in the migration body. Values that remain globally useful become "documented global defaults" used only when brand_config has no brand-specific override.
  • Remove staff-related globals that become dead code with the removal of the seven admin_service legacy route files: legacy admin auth secret env vars, legacy session config, supporting helper modules. Trim any unused entries from servers_v2/.env.compose.example.
  • Delete the following admin_service legacy route files and their registrations in admin_service/app/api/routes/api.py (and any sub-router include in the deleted files themselves): legacy_admin_v2.py, legacy_auth.py, legacy_agents_v2.py, legacy_agent_withdrawals_v2.py, legacy_meta_v2.py, legacy_recon.py, legacy_web_content.py.
  • Delete the supporting staff identity logic that becomes unimported after the seven file deletions: _authenticate_legacy_admin, _refresh_legacy_admin_token, _json_with_token, and any service module exposed only to those routes (e.g. legacy_web_content service).
  • Run grep -rwE 'legacy_admin_v2|legacy_auth|legacy_agents_v2|legacy_agent_withdrawals_v2|legacy_meta_v2|legacy_recon|legacy_web_content' servers_v2/ (word-boundaried) to confirm no surviving import references the removed modules.
  • Update tests/test_admin_integration.py and tests/test_admin_v2_compat.py to remove staff coverage and add removal coverage: every previously served path under /api/admin/user/*, /api/admin/pushbullet/*, /api/admin/shooter/*, /api/admin/web/rules/*, /api/admin/web/faq/*, and /api/admin/web/config/* returns 404.
  • Add tests for brand catalog CRUD, brand_config CRUD, and agent_brand CRUD, including audit-log assertions.
  • Run admin_service tests.
  • Commit Phase 12.
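
The three brand-create rejection rules in the brand catalog task above can be expressed compactly. The sketch below is an assumption-laden illustration: brand_code_conflict is a hypothetical name, and the plain lists stand in for queries against the brand table and player.account.

```python
from typing import Iterable, Optional


def brand_code_conflict(
    candidate: str,
    existing_codes: Iterable[str],
    player_accounts: Iterable[str],
) -> Optional[str]:
    """Return a human-readable rejection reason, or None if the code is safe."""
    for code in existing_codes:
        # Rule 1: the candidate must not be a prefix of an existing code.
        if code.startswith(candidate):
            return f"{candidate!r} is a prefix of existing brand_code {code!r}"
        # Rule 2: no existing code may be a prefix of the candidate.
        if candidate.startswith(code):
            return f"existing brand_code {code!r} is a prefix of {candidate!r}"

    # Rule 3: no legacy account may already start with '<brand_code>_',
    # otherwise inbound reverse-parse could misattribute it to the new brand.
    prefix = f"{candidate}_"
    for account in player_accounts:
        if account.startswith(prefix):
            return f"legacy account {account!r} already starts with {prefix!r}"
    return None
```

An exact duplicate of an existing brand_code is caught by the first rule, since every string is a prefix of itself.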

Phase 13 -- Observability Wiring

Delivered by: e2d25f0f

  • Add brand_id and brand_code to the structured log enrichment used in every service.
  • Add a brand Prometheus label (value: brand_code) to player-, wallet-, rolling-, promotion-, and game-scoped counters and histograms.
  • Add brand_resolution_failed_total{reason,service} at the edge (gateway, agent_service, game_service, Plisio callback path).
  • Confirm wallet_cross_brand_rejected_total{command,mode} and game_callback_brand_unresolved_total{provider} are wired.
  • Wire the additional observability counters from spec ## Observability: brand_resolution_failed_total{reason="missing_header",service}; multi_brand_enforcement_mode{service} gauge (per service); brand_resolution_latency_seconds{service} histogram on the edge hot path; request_total{brand_code,service} for per-brand legitimate traffic; event_legacy_schema_total{stream,consumer} for stale-schema events; security_downgrade_total{service} for startup-time downgrade detection; wallet_idempotency_brand_split_total for cross-brand idempotency collisions; wallet_brand_signature_failed_total{caller_service,reason,mode} for HMAC verification failures.
  • Confirm every metric label takes its values from a bounded enum; no host, account, raw IP, or attacker-supplied value enters as a label.
  • Add alert definitions to the observability runbook layer (or to the existing per-service alert config) for the three new counters and for per-brand wallet outbox publication freshness.
  • Add tests asserting log fields and metric labels.
  • Run cross-service observability tests.
  • Commit Phase 13.
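
One way to keep label sets bounded, as the label check above requires, is to funnel every label value through an allow list before it reaches the metrics client. The sketch below is illustrative only: bounded_label, the "other" fallback bucket, and the ALLOWED_REASONS set (seeded with the two reason values this plan names) are assumptions, not the shared observability helper.

```python
from typing import FrozenSet

# Only the two reason values named in this plan; the real enum may be larger.
ALLOWED_REASONS: FrozenSet[str] = frozenset({"jwt_domain_mismatch", "missing_header"})


def bounded_label(value: str, allowed: FrozenSet[str], fallback: str = "other") -> str:
    """Map any input to a member of a fixed label set.

    Unknown or attacker-supplied values collapse into one fallback bucket,
    so label cardinality cannot grow with request content.
    """
    return value if value in allowed else fallback
```

A test asserting metric labels can then check that every emitted value is either in the allow list or equal to the fallback.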

Phase 14 -- Local Docker Two-Brand Validation

Delivered by: b73cc22c

  • Seed two brands (default and brand2) in the local Docker stack, each with a distinct domain and a distinct brand_code.
  • Configure two domains in the local Docker domain:agent:* and domain:level:* Redis maps, one per brand.
  • Run an end-to-end flow on default: register, login, deposit, bet, settle, rolling, withdraw, coupon issue and use, points credit.
  • Run the same end-to-end flow on brand2 with a player whose account string equals an existing default brand player; confirm two independent player rows and two independent wallet states.
  • Confirm a JWT issued for brand2 is rejected on default's domain at gateway.
  • Confirm a wallet command targeting a default player from a brand2 request is rejected at wallet_service and the rejection counter increments.
  • Confirm a game callback for brand2 reverse-parses correctly and routes to the brand2 player.
  • Confirm per-brand brand_config overrides take effect (e.g. rolling ratio, cashback rate) and that documented global defaults still resolve when a brand has no override.
  • Confirm structured logs and Prometheus metrics carry brand labels.
  • Adversarial test sweep: (a) every GET endpoint exercised with a no-brand-claim JWT against the wrong domain returns either auth-error or empty data, never cross-brand data; (b) recovery flow with the same email registered in two brands recovers each independently with brand-scoped tokens and never leaks the existence of an account in the other brand; (c) a synthetic service-to-service spoof (process holding INTERNAL_SERVICE_TOKEN but no BRAND_SIGNING_KEY) is rejected by wallet_service for a brand-scoped write command; (d) a JWT with alg=none or alg=HS256 is rejected by gateway; (e) brand-create with prefix-collision on brand_code is rejected by admin_service; (f) the security_downgrade_total alert fires when a service starts in observe while more than one brand is enabled.
  • Confirm every worker (wallet_worker, rolling_worker, promotion_worker, agent_worker, admin_worker, recon_worker) reports freshness-based readiness per ADR-003 after the migration with brand-aware events flowing in both brands; a payload-schema change to add brand_id must not silently break a freshness probe.
  • Commit Phase 14.
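
Adversarial case (d) above can be exercised with a small header pre-check like the one sketched here. It uses only the standard library and covers only the algorithm allow list, not signature verification. The function names are hypothetical, and the allowed algorithm is an assumption: the plan states only that alg=none and alg=HS256 must be rejected.

```python
import base64
import json
from typing import Set


def jwt_header_alg(token: str) -> str:
    """Decode the JOSE header (first dot-separated segment) and return its alg."""
    header_b64 = token.split(".", 1)[0]
    # base64url segments in JWTs omit padding; restore it before decoding.
    padded = header_b64 + "=" * (-len(header_b64) % 4)
    header = json.loads(base64.urlsafe_b64decode(padded))
    return str(header.get("alg", ""))


def alg_is_acceptable(token: str, allowed_algs: Set[str]) -> bool:
    """Allow-list check: anything not explicitly allowed is rejected."""
    try:
        return jwt_header_alg(token) in allowed_algs
    except (ValueError, AttributeError):
        # Malformed base64 or JSON, or a header that is not a JSON object.
        return False
```

An allow list is used rather than a deny list so that case variants such as "None" are rejected without being enumerated.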

Phase 15 -- Stable Doc Refresh And Spec Promotion

Delivered by: (this commit)

  • Re-verify the following stable docs reflect the implemented behavior: docs/architecture/data-ownership.md, docs/architecture/domain-ownership.md, docs/architecture/system-overview.md, docs/architecture/http-entrypoints.md, docs/architecture/event-catalog.md, docs/architecture/service-catalog.md, docs/architecture/deployment-topology.md, docs/architecture/migration-readiness.md, and every docs/services/*.md.
  • Update Last Verified Commit markers on each touched stable doc.
  • Promote docs/specs/multi-brand/2026-04-27-multi-brand-isolation-spec.md status to Approved.
  • Promote docs/runbooks/multi-brand/multi-brand-isolation-rollout.md status to Ready only after all preceding phases are green.
  • Commit Phase 15.

Phase 16 -- Flip JWT/Domain Brand Enforcement To Hard-Reject

  • Confirm the brand_resolution_failed_total{reason="jwt_domain_mismatch"} counter has been at zero across normal traffic for the runbook's documented soak window. The soak window starts at the moment Phase 5 (brand-aware JWT issuance) is deployed in the target environment, not at the Phase 3 deploy time, because Phase 5 deploy is the last moment a brand-unaware JWT could be issued. Window length: at least 2 * JWT_EXPIRE_MIN minutes; with the default JWT_EXPIRE_MIN = 60, a minimum of 120 minutes from Phase 5 deploy.
  • Confirm every player- and agent-facing JWT in active use carries brand_id (verified by sampling representative tokens and by the soak counter being zero). Tokens issued before Phase 5 are guaranteed expired by the soak window length.
  • Confirm wallet_cross_brand_rejected_total{mode="observe"} has been at zero across normal traffic for the same soak window (proves every internal caller forwards X-Brand-Id correctly).
  • Confirm event_legacy_schema_total{stream,consumer} has been at zero across the soak window (proves every legacy-schema event has been drained from every stream before consumers flip to enforce; otherwise post-flip consumers will reject leftover legacy events and surface to supervision).
  • Confirm wallet_brand_signature_failed_total{mode="observe"} has been at zero across the soak window (proves every brand-scoped wallet write caller produces a valid X-Brand-Signature).
  • Confirm brand_resolution_failed_total{reason="missing_header"} has been at zero across the soak window (proves every internal caller forwards X-Brand-Id on brand-scoped routes).
  • Confirm PER_CALLER_TOKEN_REQUIRED=on has been advanced AND internal_caller_token_legacy_rejected_total == 0 across the soak window (proves every internal caller has migrated to the per-caller token; legacy single-shared-token issuance is fully drained).
  • Confirm VERIFY_CALLBACKS=true AND game_callback_verification_bypassed_total == 0 across the soak window (proves every game-provider callback path is signature-checked and no provider is silently grandfathered into a bypass).
  • Confirm BRAND_SIGNING_KEY is non-empty in the target environment AND the boot guard reports it green at start (the wallet/admin/etc. startup checks fail closed on an empty key in production, but the precheck here catches drift in staging).
  • Confirm game_callback_brand_unresolved_total{reason="raw_*_in_enforce"} == 0 across the soak window (proves no provider is hitting the raw-payload fallback for brand resolution under enforce mode).
  • CR3-A sync: confirm agent_brand row count covers every active agent. Run: SELECT (SELECT count(*) FROM agent) AS agents, (SELECT count(DISTINCT agent_id) FROM agent_brand) AS bound; and confirm both numbers match. If they diverge, run the Phase 12 one-shot re-seed and re-check before proceeding (otherwise the affected agents would be locked out at first request post-flip).
  • CR3-A sync: confirm every internal HTTP caller sends an X-Caller-Service header. Run cd servers_v2 && uv run pytest tests/test_per_caller_header_audit.py -v and confirm 0 failures. This regression net catches any new internal client that forgets the per-caller identifier (would surface as internal_caller_token_legacy_total non-zero immediately after the flip).
  • CR3-A sync: confirm internal_caller_token_legacy_total == 0 across the soak window for every consumer service. Per-consumer query: sum by (consumer) (rate(internal_caller_token_legacy_total[15m])) == 0. A non-zero value means at least one caller is still authenticating with the shared INTERNAL_SERVICE_TOKEN; resolve before proceeding.
  • CR3-A sync: confirm Stage C (revoke shared INTERNAL_SERVICE_TOKEN) has completed in this environment. Verify by running printenv | grep INTERNAL_SERVICE_TOKEN on each consumer container: only the per-caller variants (INTERNAL_SERVICE_TOKEN_<CALLER>) should be present; the bare INTERNAL_SERVICE_TOKEN must be absent.
  • CR3-A sync: confirm WALLET_BRAND_SIGNATURE_REQUIRE=on in the target environment AND wallet_brand_signature_missing_total{caller_service=*} == 0 across the soak window. A non-zero value identifies a caller that is not signing brand-scoped writes; resolve before proceeding.
  • CR3-A sync: confirm wallet_topology_default_brand_fallback_total == 0 across the soak window. A non-zero value means the per-brand topology lookup silently fell back to the default brand for at least one request -- root-cause before the flip (likely a missing per-brand topology row).
  • CR3-A sync: on-call has paged backup, opened the incident channel placeholder, started the 30-minute wall-clock budget timer, AND has the multi-brand counter Diagnosis Playbook open in a tab (docs/runbooks/multi-brand/diagnosis-playbook.md). Every pre-flip counter check above has a per-counter playbook section there for the case where the soak window is non-zero and root-cause is needed before proceeding.
  • Flip MULTI_BRAND_ENFORCEMENT from observe to enforce in the target environment. Time budget: 30 minutes wall-clock total for the full sequence. Per-step budget: 3 min restart + readiness per service-and-worker group. Sequence (downstream first, edge last; this minimizes the window in which the edge enforces while downstream still tolerates, and accepts the inverse small-window risk in which downstream rejects a missing-header request before the edge catches it):
    1. wallet_service + wallet_worker
    2. rolling_service + rolling_worker
    3. promotion_service + promotion_worker
    4. player_service
    5. recon_service + recon_worker
    6. agent_service + agent_worker
    7. gateway
  • Between each step: confirm /health on the just-flipped service shows enforce mode AND multi_brand_enforcement_mode gauge is 2 AND wallet_cross_brand_rejected_total{mode="enforce"} rate is at baseline (zero or known background). If any check fails, abort and revert (see below).
  • Mid-flip abort criteria (any one triggers an abort): a single-service step exceeds 5 min wall-clock; a readiness check fails twice; an alert fires on a non-zero wallet_cross_brand_rejected_total{mode="enforce"} during the flip; the total budget exceeds 30 min.
  • Mid-flip revert procedure: flip every service that has reached enforce back to observe in reverse order (gateway first, wallet last). Same per-step budget. Document the revert in #incidents; root-cause the failure before re-attempting flip.
  • Final verification after the full sequence: confirm every service reports enforce via Prometheus multi_brand_enforcement_mode == 2. Confirm the security_downgrade_total counter remains zero in subsequent hours (would fire if any service later re-bootstrapped with a different mode).
  • Confirm gateway rejects synthetic cross-brand JWT replay and synthetic JWT-without-brand_id requests with the documented error.
  • Confirm normal traffic is unaffected (no spike in legitimate rejections).
  • Update the runbook's release gate to mark this environment as hard-enforce.
  • Commit Phase 16.
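
The timing arithmetic in this phase can be checked with a short sketch. The constants come from the steps above (soak window of 2 * JWT_EXPIRE_MIN, seven flip groups, a 3-minute per-step budget, and a 30-minute total); the function names are hypothetical and the code is a planning aid, not part of the rollout tooling.

```python
from datetime import datetime, timedelta
from typing import List

# Flip sequence from the list above: downstream first, edge last.
FLIP_ORDER: List[str] = [
    "wallet_service + wallet_worker",
    "rolling_service + rolling_worker",
    "promotion_service + promotion_worker",
    "player_service",
    "recon_service + recon_worker",
    "agent_service + agent_worker",
    "gateway",
]
PER_STEP_BUDGET_MIN = 3
TOTAL_BUDGET_MIN = 30


def soak_window_end(phase5_deployed_at: datetime, jwt_expire_min: int = 60) -> datetime:
    """Earliest moment the soak window can close: Phase 5 deploy + 2 * JWT lifetime."""
    return phase5_deployed_at + timedelta(minutes=2 * jwt_expire_min)


def planned_flip_minutes() -> int:
    """Planned wall-clock time if every step uses its full per-step budget."""
    return len(FLIP_ORDER) * PER_STEP_BUDGET_MIN


def revert_order(flipped: List[str]) -> List[str]:
    """Revert only the groups already in enforce, most recently flipped first."""
    return list(reversed(flipped))
```

With the default JWT_EXPIRE_MIN of 60, a Phase 5 deploy at 12:00 gives a soak window that closes no earlier than 14:00. Seven steps at 3 minutes each plan for 21 minutes, which leaves 9 minutes of headroom inside the 30-minute budget.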

Risks

  • A missing brand_id filter on a query is a data-leak risk between brands. Repository helpers and SQLAlchemy mixins must make brand filtering hard to forget; reviews must focus on raw SQL paths.
  • A wallet command path that bypasses brand validation is a cross-brand money mutation. Wallet command tests must cover every command family.
  • JWT/domain mismatch enforcement enabled before every downstream service is deployed with brand awareness will lock players out. Mitigation: the MULTI_BRAND_ENFORCEMENT flag introduced in Phase 3 defaulted to observe (log-only) during the rollout and is flipped to enforce only after Phases 4 through 14 are deployed and the soak window shows zero legitimate rejections. The repository compose default is now enforce after the 2026-05-06 in-repo closure.
  • Game provider account namespacing changes the outbound account string. Provider-side reconciliation must be reviewed before cutover.
  • Backfill on a large existing database can be slow; migrations must be designed to default new columns at the schema level rather than rewriting every row in a single transaction.
  • Removing staff routes also removes any in-flight back-office identity capability; consumers of those routes must be confirmed inactive before deletion.
  • Per-brand topology and policy resolution adds an extra brand dimension to ADR-005's activation safety; activation must remain blocked when unresolved balances or active records would become unreachable, now scoped per brand.

Done Definition

The plan is done when every acceptance criterion in docs/specs/multi-brand/2026-04-27-multi-brand-isolation-spec.md (the ## Acceptance Criteria section) is verifiably satisfied AND every release-gate item in docs/runbooks/multi-brand/multi-brand-isolation-rollout.md (the ## Release Gate section) is satisfied for the target environment, plus the following plan-specific gates:

  • Spec status is Approved.
  • ADR-009 is Accepted.
  • Phases 1 through 16 are complete.
  • Local Docker two-brand validation passes end to end (see runbook).
  • All service test suites pass.
  • The migration verification script reports zero NULL brand_id rows across every brand-scoped table.
  • MULTI_BRAND_ENFORCEMENT=enforce is active in every environment that has cleared its soak window (per Phase 16).
  • No brand-scoped query in any service is missing a brand_id filter (verified by code review of the merged PRs).
  • The seven listed admin_service legacy route files are deleted and their previously served paths return 404.
  • ADR-005's "single money writer" rule is unchanged after this work.
  • Stable docs (architecture and services) reflect the implemented behavior and have updated Last Verified Commit markers.
  • Runbook is promoted to Ready and is followed for staging validation before any production enablement of a second brand.