2026-04-29 — Exhaustive brand_id RW (SELECT/UPDATE/DELETE) Audit (T14-A baseline)
Round-14 deliverable. The T9 INSERT audit (
docs/runbooks/multi-brand/2026-04-28-brand-id-insert-audit.md) closed the write half of the brand-leak class. This document closes the symmetric read/update/delete half — the failure mode that has burned 8 prior cross-review rounds because every prior audit only looked atINSERTstatements.Codex's T14 review surfaced 8 explicit RW leaks in admin SELECT/UPDATE/DELETE statements that ALL prior cross-reviews missed because the classifier only audited INSERTs. This document expands the audit to the bounded universe of reads/mutations/deletes against brand-scoped tables and, like T9-A, makes the migration table list the canonical source of truth so future regressions are caught by CI.
Future regressions can be detected by re-running the greps in § Reproducing the audit and re-classifying via
tools/audit/brand_id_rw_classifier.py. The Make targetmake audit-brand-id-rwwraps both steps and is gated by CI.
Cross-reference
- INSERT half:
docs/runbooks/multi-brand/2026-04-28-brand-id-insert-audit.md(T9-A baseline). - This doc: T14-A baseline for SELECT/UPDATE/DELETE.
- Both halves run on every PR via
.github/workflows/brand-id-audit.yml. - Phase 16 hard-flip is now gated on both classifiers reporting
FIX=0(seetools/multi_brand_backfill/phase16_dryrun.py).
Scope
- Tree audited:
servers_v2/(active services). The legacyservers/monolith is included in the grep but every match is classifiedOOS-LEGACY. - Tables audited: 122 brand-scoped tables — 111 in the
rgbschema plus 11 in thewalletschema, sourced verbatim fromRGB_SCHEMA_TABLESandWALLET_SCHEMA_TABLESinservers_v2/shared/rgb_db/migrations/versions/0024a_add_brand_id_nullable.py. - Patterns audited:
- Raw
text("SELECT … FROM <table> …"),text("UPDATE <table> SET …"),text("DELETE FROM <table> …")and their f-string variants. - Multi-table SELECTs that JOIN brand-scoped tables — every brand-scoped
join target is classified separately so a missing
brand_idpredicate on one of them surfaces as a FIX even if a sibling join has the predicate.
- Raw
The classifier is conservative by design: ties go to FIX so a human
must explicitly mark a query OOS-INTENTIONAL (via the
# brand-global: <reason> comment marker) before the gate accepts it.
Headline numbers
Column A is the T14-A baseline (the bounded universe at audit time). Column B is post-T14-B after the 10 explicit Codex sites were closed (see § Batch status). Column T15 is the post-T15-E sweep of the deferred 316-site backlog. Column T16-C is the current state after the dynamic-SQL classifier upgrade (no new FIX-DYNAMIC sites surfaced).
| Bucket | SELECT (A → B → T15 → T16-C/current) | UPDATE (A → B → T15 → T16-C/current) | DELETE (A → B → T15 → T16-C/current) |
|---|---|---|---|
| OK (brand_id pinned) | 62 → 92 → 217 → 219 | 13 → 24 → 97 → 97 | 1 → 2 → 24 → 24 |
| FIX (production) | 235 → 209 → 0 → 0 | 88 → 81 → 0 → 0 | 27 → 26 → 0 → 0 |
| FIX-DYNAMIC | n/a → n/a → n/a → 0 | n/a → n/a → n/a → 0 | n/a → n/a → n/a → 0 |
| FIX-TEST | 3 → 3 → 0 → 0 | 0 → 0 → 0 → 0 | 0 → 0 → 0 → 0 |
| OOS-LEGACY | 3 → 3 → 3 → 3 | 0 → 0 → 0 → 0 | 4 → 4 → 4 → 4 |
| OOS-NON-SQL | 37 → 41 → 42 → 46 | 7 → 7 → 7 → 7 | 1 → 1 → 1 → 1 |
| OOS-INTENTIONAL | 0 → 0 → 91 → 91 | 0 → 0 → 18 → 17 | 0 → 0 → 5 → 5 |
| UNKNOWN | 0 → 0 → 0 → 0 | 0 → 0 → 0 → 0 | 0 → 0 → 0 → 0 |
| TOTAL matches | 340 → 348 → 353 → 359 | 108 → 112 → 122 → 121 | 33 → 33 → 34 → 34 |
T14-A baseline production FIX sites: 350 (235 SELECT + 88 UPDATE + 27 DELETE).
Post-T14-B production FIX sites: 316 (209 SELECT + 81 UPDATE + 26 DELETE).
Post-T15-E production FIX sites: 0 (the entire 316-site backlog swept).
Post-T16-C production FIX-DYNAMIC sites: 0 (the new dynamic-SQL
class did not surface any real bugs the static classifier missed —
the upgrade was prompted by the CR9 finding that pre-T16-A
logs.py::_query_log_table was invisible to the classifier; T16-A
already closed that helper).
Test-fixture sites that need a code change: 0 (T15-E swept the
last 3 FIX-TEST entries from the T14-A baseline).
Batch Status
Batch B closed every site Codex's T14 review explicitly flagged plus
their immediate sibling endpoints in the same admin_service files.
Per-route detail is in the commit message of 9287ce50.
T15 superseded the original T14-D / T14-E / T14-F deferred-backlog
plan and swept the remaining 316 production FIX sites. The current
classifier output is the release gate: FIX=0, FIX-DYNAMIC=0, and
FIX-TEST=0 for SELECT, UPDATE, and DELETE. Any new non-zero entry is
therefore a regression, not inherited backlog.
Per-batch fix manifest
The remainder of T14 closes the FIX entries below. Batch B is the fix-everything-in-one-shot batch; the audit baseline (Batch A, this document) is the deliverable that proves the universe is bounded. Batch C re-runs the classifier and updates this document's headline counts to confirm the universe was emptied.
The 350 entries are fanned out across 39 files. Two structural observations from the per-file breakdown:
- Codex's 8 explicit findings are clustered in the admin_service
back-office routes (
agent_finance.py,stats.py,player.py,player_finance.py,promotion.py). Those carry the highest exploitation risk because they are operator-driven mutations on financial state. - The wallet/promotion/player service queries account for ~30% of
the volume but most are inside helpers that are protected by an
upstream
assert_player_brand_matches_operator-style check. They are still defence-in-depth FIXes — the SQL must refuse to scope-cross even when an upstream guard is bypassed — but they do not represent currently-exploitable leaks.
Codex T14-priority sites (the 8 leaks that triggered this audit)
These are the highest-priority sites because they were explicitly exercised by Codex's review and demonstrate live cross-brand exploitability via guid guessing.
| # | Where | Fix |
|---|---|---|
| 1 | servers_v2/admin_service/app/api/routes/agent_finance.py:124,132 (agree_agent_withdrawal) | Add AND brand_id = :brand_id to BOTH the SELECT and the UPDATE; brand from resolve_brand_id(request) |
| 2 | servers_v2/admin_service/app/api/routes/agent_finance.py:186,209 (decline_agent_withdrawal) | Same pattern |
| 3 | servers_v2/admin_service/app/api/routes/agent_finance.py:259,267 (pay_agent_withdrawal) | Same pattern |
| 4 | servers_v2/admin_service/app/api/routes/stats.py:426–432 (player_provider_stat) | Table doesn't exist (runtime-500 bomb) — rename to player_provider_stat_day AND add brand-scope predicate |
| 5 | servers_v2/admin_service/app/api/routes/player.py:993–1006 (/players/registrations) | Prepend brand_id = :brand_id to the dynamic WHERE 1=1 filter; brand from resolve_brand_id(request) |
| 6 | servers_v2/admin_service/app/api/routes/player.py:1170–1192 (player_level_domain set primary) | Add brand-scope to BOTH the SELECT and the UPDATE |
| 7 | servers_v2/admin_service/app/api/routes/stats.py:190–512 (provider_stat_day, player_stat_day, agent_stat_day, promotion_stat_day, usdt_stat_day, player_provider_stat_day, rebate_stat_day, player_stat_acc) | 13 sibling stats endpoints share the same shape — add brand_id = :brand_id, fail-loud on missing brand context. admin_stat_acc / admin_stat_day are excluded (truly brand-global, see OOS-INTENTIONAL) |
| 8 | servers_v2/admin_service/app/api/routes/player_finance.py:1281–1303 (/players/rolling/list) | Prepend brand-scope to the dynamic WHERE 1=1 filter |
| 9 | servers_v2/admin_service/app/api/routes/promotion.py:985–1005 (/promotions/list) | Prepend brand-scope (Codex fixed coupons/list in 9d981cc9 but missed promotions/list in the same file) |
| 10 | servers_v2/admin_service/app/middleware/brand.py:10–11 (docstring) | Update the stale "fall back to body.brand_ids as before" line; the post-T11/T12 reality is fail-loud brand_required_envelope({}) |
Item 7 spans 14 sibling endpoints; the remaining items are 1 endpoint each. Items 1–9 sum to about 25 raw SQL statements out of the 350 total — the rest of the FIX list extends the same fix pattern across the back-office surface.
Full FIX inventory (350 sites — generated by the classifier)
The complete machine-generated list is the FAIL block of
make audit-brand-id-rw (see § Reproducing the audit).
Per-file totals:
| File | S | U | D | Total |
|---|---|---|---|---|
servers_v2/admin_service/app/api/routes/agent.py | 8 | 2 | 3 | 13 |
servers_v2/admin_service/app/api/routes/agent_finance.py | 5 | 3 | 0 | 8 |
servers_v2/admin_service/app/api/routes/banking.py | 4 | 7 | 4 | 15 |
servers_v2/admin_service/app/api/routes/brand.py | 1 | 0 | 0 | 1 |
servers_v2/admin_service/app/api/routes/finance_config.py | 3 | 3 | 3 | 9 |
servers_v2/admin_service/app/api/routes/meta.py | 6 | 0 | 0 | 6 |
servers_v2/admin_service/app/api/routes/player.py | 16 | 18 | 3 | 37 |
servers_v2/admin_service/app/api/routes/player_finance.py | 24 | 1 | 0 | 25 |
servers_v2/admin_service/app/api/routes/promotion.py | 10 | 4 | 2 | 16 |
servers_v2/admin_service/app/api/routes/stats.py | 16 | 0 | 0 | 16 |
servers_v2/admin_service/app/api/routes/system.py | 6 | 5 | 5 | 16 |
servers_v2/admin_service/app/services/brand_helpers.py | 1 | 0 | 0 | 1 |
servers_v2/admin_service/app/tasks/check_expired.py | 0 | 2 | 0 | 2 |
servers_v2/admin_service/app/tasks/expire.py | 1 | 4 | 0 | 5 |
servers_v2/admin_service/app/tasks/level.py | 2 | 2 | 0 | 4 |
servers_v2/admin_service/app/tasks/stat.py | 17 | 0 | 0 | 17 |
servers_v2/admin_service/app/tasks/tag.py | 16 | 0 | 5 | 21 |
servers_v2/admin_service/app/tasks/vera_sync.py | 7 | 0 | 0 | 7 |
servers_v2/agent_service/app/api/routes/legacy_routes.py | 4 | 0 | 0 | 4 |
servers_v2/agent_service/app/api/routes/messages.py | 1 | 0 | 0 | 1 |
servers_v2/agent_service/app/api/routes/withdraw.py | 2 | 0 | 0 | 2 |
servers_v2/agent_service/bootstrap_local_data.py | 1 | 0 | 0 | 1 |
servers_v2/game_service/app/api/routes/integration.py | 6 | 0 | 0 | 6 |
servers_v2/player_service/app/api/routes/auth.py | 2 | 1 | 0 | 3 |
servers_v2/player_service/app/api/routes/player.py | 11 | 0 | 0 | 11 |
servers_v2/player_service/app/api/routes/recovery.py | 1 | 1 | 0 | 2 |
servers_v2/player_service/app/services/domain_cache.py | 4 | 0 | 0 | 4 |
servers_v2/promotion_service/app/api/routes/coupon.py | 4 | 7 | 0 | 11 |
servers_v2/promotion_service/app/api/routes/lossback.py | 3 | 1 | 1 | 5 |
servers_v2/promotion_service/app/api/routes/rebate.py | 3 | 1 | 1 | 5 |
servers_v2/promotion_service/app/api/routes/records.py | 8 | 3 | 0 | 11 |
servers_v2/promotion_service/app/services/event_consumer.py | 2 | 0 | 0 | 2 |
servers_v2/promotion_service/app/tasks/coupon_saga_recovery.py | 0 | 6 | 0 | 6 |
servers_v2/recon_service/app/services/recon_store.py | 5 | 0 | 0 | 5 |
servers_v2/rolling_service/app/api/routes/rolling.py | 2 | 0 | 0 | 2 |
servers_v2/wallet_service/app/api/routes/plisio.py | 5 | 9 | 0 | 14 |
servers_v2/wallet_service/app/api/routes/queries.py | 18 | 3 | 0 | 21 |
servers_v2/wallet_service/app/api/routes/wallet.py | 5 | 3 | 0 | 8 |
servers_v2/wallet_service/app/services/deposit_policy.py | 4 | 0 | 0 | 4 |
servers_v2/wallet_service/app/services/wallet_bucket_commands.py | 1 | 2 | 0 | 3 |
| TOTAL | 235 | 88 | 27 | 350 |
Brand-source recommendation per service
| Service | Brand source for RW predicates |
|---|---|
admin_service/api/routes/* | resolve_brand_id(request); fail-loud brand_required_envelope({"route": ...}) when None. The middleware (app/middleware/brand.py) populates request.state.brand_id from the X-Brand-Id header or the Redis domain map. |
admin_service/tasks/* | Per-brand iteration loop (T9-C pattern): enumerate enabled brands and run the task body once per brand, binding each iteration's brand_id into the SQL params. |
admin_service/services/* | Helpers receive brand_id: int as an explicit parameter; never read it from globals or fall back to a default. |
agent_service/api/routes/* | agent_token_brand_id(agent) (helper added in 9d981cc9) — the agent JWT's brand_id claim is the authoritative source for queries scoped to a single agent's view. |
agent_service/bootstrap_local_data.py | Hard-coded 1 (default brand) with a # dev-only bootstrap comment, mirroring the T9-D pattern for INSERTs. |
wallet_service | _request_brand_id(request) already established. For player-id-keyed queries, augment with a JOIN onto player.brand_id so the SQL refuses to return cross-brand rows even when the input player_id is from another brand. |
game_service | The provider-callback brand is derived from the topology snapshot already loaded in the route; pass it as a named parameter into the SQL. |
player_service | request.state.brand_id (gateway-injected); for routes called pre-auth (login, recovery), the brand is derived from the request domain via the same Redis lookup the gateway uses. |
promotion_service | _request_brand_id(request) for routes; per-brand iteration for the saga recovery task. |
rolling_service | The rolling row's own brand_id (already loaded in the read for the cancel/complete codepath). |
recon_service | The recon match path establishes brand context on shooter_sms UPDATE; SELECTs by ID can read across brands because no PII is exposed in the read shape. Mark with # brand-global: recon ID-based read, no cross-brand PII exposure after Batch B verifies. |
OOS classifications (rationale)
OOS-LEGACY (3 SELECT + 0 UPDATE + 4 DELETE)
Same rationale as the INSERT audit: matches under the legacy
servers/ tree are out of scope for the multi-brand isolation
initiative which targets the v2 service split. See
docs/reference/architecture/2026-04-rebuild-decision.md.
OOS-NON-SQL (37 + 7 + 1)
Matches inside test assertions, regex literals, comments, or
docstrings that mention SELECT/UPDATE/DELETE for diagnostic
purposes but do not execute SQL. The classifier flags any line that
contains one of assert, in sql, in str(, re.compile|search|match,
pattern =, .count(, .split(, r"SELECT|UPDATE|DELETE, leading
#, or a numbered docstring bullet.
OOS-INTENTIONAL
Current count: 113 production sites (91 SELECT + 17 UPDATE + 5 DELETE).
Tables in BRAND_GLOBAL_BY_DESIGN (in brand_id_rw_classifier.py):
admin_stat_acc— platform-level cumulative aggregate, single row per period spanning all brands (perdocs/architecture/data-ownership.md§ Brand-Global Aggregates).admin_stat_day— same, for daily granularity.
Add to this set with care: each entry is a permanent gap in CI coverage. Document the rationale in the classifier docstring AND in this section of the audit doc.
Current OOS-INTENTIONAL breakdown
CR9-B reviewed every site flagged OOS-INTENTIONAL and confirmed each
falls into one of the categories below. The # brand-global: <reason>
comment marker is present at every site (the classifier rejects the
opt-out without it).
| Category | Rationale |
|---|---|
| Resolver helpers (look up brand from a parent row) | Helpers like _load_player_brand_id(player_id) MUST query the player row WITHOUT a brand filter — the whole point is to resolve the brand id from the parent so callers can then pin it. Adding a brand predicate would create a chicken-and-egg dependency. Each site has the marker AND a fail-closed resolve_strict_brand_id immediately downstream. |
| Scheduled tasks (cross-brand by design) | The admin_service scheduler tasks (player_stat_acc, agent_stat_day, expire_coupons, expire_promotions, process_attendance, vera_sync_all) iterate every brand in turn and write per-brand result rows. The driver SELECT is intentionally cross-brand; the per-row UPSERTs DO carry brand_id. |
Brand-global tables (admin_stat_*) | The admin_stat_acc / admin_stat_day reads + writes — the platform-level aggregates that intentionally span all brands. Marker is implicit via the BRAND_GLOBAL_BY_DESIGN allow-list, not the per-line comment. |
Cross-brand admin views (super_admin only) | Routes like super_admin.list_brands, system.list_global_vars, the audit-trail viewers, and the legacy agent table reads (which is brand-global per ADR-009 — membership lives in agent_brand). All gated by require_super_admin. |
| Outbox / event-fanout queries | The wallet/promotion/rolling outbox-poller scans outbox tables across all brands and dispatches per-row to brand-aware handlers. Per-row payloads carry the brand_id; the driver SELECT is intentionally cross-brand. |
| Topology / policy bundle reads | Updated 2026-05-06: per Phase 6B, wallet_topology and wallet_policy are now brand-scoped — uniqueness is (brand_id, code, version) and (brand_id, topology_code, version, policy_key) respectively (see docs/services/wallet-service.md "Multi-Brand Constraints"). Every active-bundle lookup binds brand_id via _brand_filter_clause. The historical "server-global" classification below dates to the pre-Phase-6B audit and is preserved for archaeology only. The store-layer _resolve_brand_id retains a default-brand fallback for defence-in-depth — production routes (topology.py, bucket_wallet.py, wallet.py, queries.py) all envelope-reject missing X-Brand-Id upfront so the fallback path is unreachable; the wallet_topology_default_brand_fallback_total counter exists to verify the fallback never fires in soak. |
| Idempotency / dedup checks (brand_id in JSON payload) | A handful of helpers read idempotency rows by idempotency_key only (the brand_id is part of the JSON payload, not a filter column). Migration 0027 made the unique constraint (brand_id, idempotency_key) — these reads still match a single row. |
| Schema migrations / catalog scans | Operator tooling that reads the migration history / brand catalog intentionally needs the cross-brand view. |
CR9-B verified these sites are sound. The bucket grows over time as new genuinely-cross-brand queries land — the marker requirement keeps it auditable.
Classifier upgrades
T16-C dynamic-SQL detection
CR9-A surfaced one classifier blind spot: _query_log_table in
servers_v2/admin_service/app/api/routes/logs.py built its SQL
dynamically (WHERE 1=1 builder + f-string template
f"SELECT ... FROM {table} WHERE {where} ..."). No static line
carried a literal brand_id token, so the static classifier never
flagged it even though the helper read every operator's log table
across every brand. T16-A closed the helper itself; T16-C extended
the classifier so the same construction pattern can never go
undetected again.
The new FIX-DYNAMIC class flags two patterns when the surrounding
~60-line scope has no recognised brand idiom (conditions.append ("brand_id = :brand_id"), sql += " AND brand_id = :brand_id",
alias-qualified variants):
- SQL strings containing
WHERE 1=1literal. - f-string SQL templates
f"SELECT ... WHERE {where} ..."where both the table AND the WHERE clause are templated.
FIX-DYNAMIC shares the same exit-code semantics as FIX (non-zero
when count > 0). Conservative bias — when uncertain, flag.
A retroactive run on the pre-T16-A tree confirms the upgrade catches
the 2 SQL templates inside _query_log_table (the count + fetch
queries). The current tree (post-T16-A) reports FIX-DYNAMIC = 0
across the repo — the upgrade did not surface any new bugs the
static classifier missed.
Reproducing the audit
mkdir -p /tmp/brand_audit_rw
grep -rniE --include='*.py' \
'SELECT[[:space:]]+' \
servers_v2 servers \
> /tmp/brand_audit_rw/all_selects.txt
grep -rniE --include='*.py' \
'UPDATE[[:space:]]+(rgb\.|wallet\.|`[^`]+`\.)?[a-z_]+[[:space:]]+SET' \
servers_v2 servers \
> /tmp/brand_audit_rw/all_updates.txt
grep -rniE --include='*.py' \
'DELETE[[:space:]]+FROM[[:space:]]+' \
servers_v2 servers \
> /tmp/brand_audit_rw/all_deletes.txt
python3 tools/audit/brand_id_rw_classifier.py
Or use the wrapper:
make audit-brand-id-rw
CI runs the same Make target on every PR (see
.github/workflows/brand-id-audit.yml); the classifier exits
non-zero when RW FIX > 0 so a regression fails the build.
After every fix batch, re-run the greps and confirm the bucket counts move toward the target steady state:
target steady state:
SELECT: FIX=0 FIX-TEST=0 OK ≥ 297 (was FIX=235, FIX-TEST=3, OK=62)
UPDATE: FIX=0 FIX-TEST=0 OK ≥ 101 (was FIX=88, FIX-TEST=0, OK=13)
DELETE: FIX=0 FIX-TEST=0 OK ≥ 28 (was FIX=27, FIX-TEST=0, OK=1)
post-Batch B intermediate state:
SELECT: FIX=209 FIX-TEST=3 OK=92
UPDATE: FIX=81 FIX-TEST=0 OK=24
DELETE: FIX=26 FIX-TEST=0 OK=2
If a new FIX entry appears that is not in the manifest above, treat it
as a regression — a brand-scoped read/mutation/delete was added
without a brand_id predicate between this baseline and the present
commit.
Why the previous rounds missed sites
Every cross-review from T6 through T13 audited INSERT INTO and
insert(<Model>) exclusively. The cross-brand exploit shape — guess a
guid for a row owned by another brand and SELECT/UPDATE/DELETE it —
never showed up in the INSERT audit because INSERTs always run with
the writer's brand context attached. The reads/mutations/deletes are
where guid guessing turns into actual cross-brand data exfiltration or
mutation.
T14-A makes the migration table list the canonical source of truth for
the RW path too: the classifier reads
RGB_SCHEMA_TABLES + WALLET_SCHEMA_TABLES directly from
0024a_add_brand_id_nullable.py so adding a new brand-scoped table to
the migration automatically extends the audit's coverage.
Last Verified Commit
934d47f4 (post-T16-C; T15-A..E swept the entire 316-site deferred
RW backlog → make audit-brand-id-rw reports FIX=0 + FIX-TEST=0;
T16-A closed the CR9 admin_service Critical+Important findings; T16-B
fail-loud-converted every conditional brand predicate in
wallet_service; T16-C extended the classifier with the new
FIX-DYNAMIC class to detect helper-style queries). Phase 16
dry-run gate (tools/multi_brand_backfill/phase16_dryrun.py) now
reports GO on the RW dimension.