Legacy middle_server Retirement Runbook
Status
Planning stage, not yet executable. Every row in §Scope below
still lists TBD as Migration owner. Operators MUST NOT treat
this as a step-by-step cut-over guide until each row is fully
assigned and a Rollback section is filled in (see §Rollback below,
which is currently a stub). This document captures the scope of
the retirement and the per-prefix design discussion; the
execution runbook for each prefix will land as a sibling document
under docs/runbooks/legacy-middle-server-retirement/ once owners
pick up individual prefixes.
Last Verified Commit
(see git log -- docs/runbooks/legacy-middle-server-retirement.md)
Goal
Track the 7 remaining middle_server route prefixes that block full
retirement of the legacy servers/ tree. Each prefix entry below
documents the migration target, dependencies, expected effort, and the
cut-over plan.
This runbook is the single source of truth for retiring the legacy
back-office sidecar surface. docs/architecture/migration-readiness.md
points here for that scope; the in-code 410 stubs in
servers_v2/admin_service/app/api/routes/legacy_sunset.py link here
via their Link: rel="sunset" header.
Scope
| Prefix | Target service | Frontend caller | Migration owner | Effort | Status |
|---|---|---|---|---|---|
| /api/admin/role/* | admin_service (new module) | bo/admin UI | TBD | 1-2 weeks | not started |
| /api/admin/menu/* | admin_service (new module) | bo/admin UI | TBD | 1 week | not started |
| /api/admin/config/* | admin_service (extend existing system.py / brand_config.py) | bo/admin UI | TBD | 1 week | not started |
| /api/admin/i18n/* | admin_service (new module) | bo/admin UI | TBD | 1-2 weeks | not started |
| /api/admin/bi/* | new bi_service or admin_service extension | bo/admin UI | TBD | 2+ weeks | not started |
| /coin/* | wallet_service or new coin_service | TBD (admin + player) | TBD | 2 weeks | not started |
| POST /{path:path} catch-all | enumerate first, then explicit routes | TBD | TBD | 2-3 weeks | not started |
Spot-check the rows above against servers_v2/admin_service/app/api/routes/
when this runbook is reopened: any prefix that has gained a partial
implementation should flip its Status column to partial and call out
which subpaths are migrated. As of 3f36e7a8 none of the seven prefixes
have an admin_service implementation — the only related code paths
are /api/v1/player/deposit/coin/* (player_finance.py, NOT under the
root /coin/* prefix) and brand-scoped config under
/api/v1/brands/{brand_id}/config/* (NOT under /api/admin/config/*).
Those are explicitly out of scope for the retirement entries below.
Per-prefix detail
/api/admin/role/*
Legacy: CRUD for the staff RBAC role table (role name, permission bit
mask, allowed menu entries). Backs the "Roles" tab in bo/admin.
Data lives in the legacy staff_role and staff_role_permission
tables. The staff table itself was deleted by ADR-009 in favour of
the operator-id-from-JWT pattern, but the role / permission catalog
was not deleted in the same wave because bo/admin still renders
permission checkboxes from it.
v2 target: new app/api/routes/role.py in admin_service, mounted
under /api/v1/roles. Authoritative permission resolution moves into
the JWT mint path in app/services/auth.py (the role claim is
already there — what is missing is a write surface to manage the
catalog). Tests: route parity + a contract test that locks the
permission-bit shape against bo/admin.
Gate: bo/admin must be ready to call the new path before legacy
delete; cut-over is a coordinated front+back deploy.
/api/admin/menu/*
Legacy: stores the bo/admin left-nav menu tree (menu_id, parent_id,
icon, path, required_permission_bit). Hot data — operators rearrange
menus during onboarding for a new brand.
Coupled to /api/admin/role/* (each menu entry references a
permission bit). Migrate together; the same admin_service module
that owns role CRUD should own menu CRUD.
v2 target: app/api/routes/menu.py next to role.py. Storage stays
in the same PostgreSQL instance under a renamed brand-scoped table
(brand_menu_entry) so each brand can carry its own nav.
Gate: same as /api/admin/role/* — frontend redeploy required.
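The coupling between menu entries and permission bits can be illustrated with a minimal sketch. It assumes required_permission_bit is a zero-based bit index into the role's permission mask and that a top-level entry has parent_id set to None; neither is confirmed by the legacy schema, and MenuRow and build_menu_tree are hypothetical names.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MenuRow:
    # Column names follow the legacy menu tree fields listed above.
    menu_id: int
    parent_id: Optional[int]  # assumption: None marks a top-level entry
    icon: str
    path: str
    required_permission_bit: int  # assumption: zero-based bit index


def build_menu_tree(rows: list[MenuRow], permission_mask: int) -> list[dict]:
    """Nest the rows a role may see; entries under a hidden parent are dropped."""
    visible = [
        row for row in rows
        if permission_mask & (1 << row.required_permission_bit)
    ]
    nodes = {
        row.menu_id: {
            "menu_id": row.menu_id,
            "icon": row.icon,
            "path": row.path,
            "children": [],
        }
        for row in visible
    }
    roots = []
    for row in visible:
        node = nodes[row.menu_id]
        if row.parent_id is None:
            roots.append(node)
        elif row.parent_id in nodes:
            nodes[row.parent_id]["children"].append(node)
    return roots
```

A contract test shared with bo/admin could pin both the bit semantics and the nested output shape, so that role and menu changes cannot drift apart during the coordinated deploy.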
/api/admin/config/*
Legacy: free-form key/value config the bo/admin "System Settings"
tab writes to. Overlaps two real things:
- per-brand operational policy (already migrated to brand_config under
  /api/v1/brands/{brand_id}/config)
- platform-global knobs (maintenance window, feature kill switches),
  which admin_service exposes under /api/v1/global-vars and
  /api/v1/settings (super_admin only).
Migration is therefore mostly a frontend re-point. Enumerate every
key bo/admin reads under /api/admin/config/*, classify each as
brand-scoped or platform-global, and route it to the existing
admin_service surface. Net new admin_service code should be minimal.
Effort estimate is "1 week" only because the enumeration step is boring and error-prone, not because the implementation is heavy.
Gate: brand vs platform classification must be reviewed by multi-brand leads before redirecting keys.
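One way to make the classification reviewable is to keep it as a small table in code. The sketch below is illustrative only: the key names are hypothetical placeholders (the real inventory has to come from the enumeration step), and sending every platform-global key to /api/v1/global-vars rather than /api/v1/settings is an assumption.

```python
from enum import Enum
from typing import Optional


class Scope(Enum):
    BRAND = "brand"
    PLATFORM = "platform"


# Hypothetical keys for illustration; replace with the enumerated inventory.
CLASSIFICATION = {
    "withdraw_daily_limit": Scope.BRAND,
    "maintenance_window": Scope.PLATFORM,
    "feature_kill_switch": Scope.PLATFORM,
}


def unclassified(keys: list[str]) -> list[str]:
    """Keys bo/admin reads that nobody has classified yet."""
    return [key for key in keys if key not in CLASSIFICATION]


def target_path(key: str, brand_id: Optional[str] = None) -> str:
    """Base path of the existing admin_service surface that should serve a key."""
    scope = CLASSIFICATION.get(key)
    if scope is None:
        raise KeyError(f"unclassified config key: {key}")
    if scope is Scope.BRAND:
        if brand_id is None:
            raise ValueError(f"{key} is brand-scoped and needs a brand_id")
        return f"/api/v1/brands/{brand_id}/config"
    # Assumption: platform-global keys go to global-vars, not settings.
    return "/api/v1/global-vars"
```

The output of unclassified() gives the multi-brand leads a concrete list to review before any key is redirected.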
/api/admin/i18n/*
Legacy: server-side translation strings for the bo/admin UI and the
player-facing notice templates. Stored in i18n_string (key, locale,
text) with admin-side CRUD.
This is the only retirement item that has a real platform-wide publish channel: changes are picked up by the player web bundle on the next reload. Cut-over must therefore keep the read path live for existing player traffic — design the new endpoint to serve the same JSON shape the player bundle currently consumes from middle_server.
v2 target: new app/api/routes/i18n.py with read + write under
/api/v1/i18n. Player-side reads should move to a fast Redis-cached
endpoint on player_service to remove the cross-service dependency.
Gate: player bundle compatibility test; verify the JSON shape parses
in the existing web/blue translation loader.
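A shape comparison like the one the gate describes could be sketched as follows. It does not assume what the player bundle's JSON looks like; it only reduces two response bodies to their keys and value types and compares the results. shape_of and same_shape are hypothetical helper names, and lists are assumed to be homogeneous.

```python
import json


def shape_of(value):
    """Reduce a JSON value to its structure: keys and type names, no data."""
    if isinstance(value, dict):
        return {key: shape_of(item) for key, item in value.items()}
    if isinstance(value, list):
        # Assumption: lists are homogeneous, so the first element describes them.
        return [shape_of(value[0])] if value else []
    return type(value).__name__


def same_shape(legacy_body: str, candidate_body: str) -> bool:
    """True when both JSON bodies have identical keys and value types."""
    return shape_of(json.loads(legacy_body)) == shape_of(json.loads(candidate_body))
```

In a compatibility test, legacy_body would be a response captured from middle_server and candidate_body the response of the new endpoint for the same locale. Because keys are compared too, a translation key missing from the new endpoint shows up as a mismatch.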
/api/admin/bi/*
Legacy: aggregated BI queries (player counts, GGR by date range,
funnel breakdown). Most queries hit admin_stat_day, agent_stat_day,
and the player ledger.
Largest item by effort: needs an aggregator design (materialised views? on-demand SQL? a thin BI service that calls admin_service read-only?) before implementation. Defer until the read-replica-or-warehouse direction is decided.
v2 target: candidate names bi_service or admin_service extension.
Don't pick until the aggregator design lands; the current stats.py
in admin_service already handles real-time counts and we should not
expand that file to BI before deciding the boundary.
Gate: design doc; performance test against production data volumes.
/coin/*
Legacy: crypto deposit / withdrawal flows specific to the legacy
coin balance bucket. Most coin functionality already exists in v2
(Plisio callback in wallet_service, coin-deposit approval in
admin_service.player_finance under /api/v1/player/deposit/coin/*),
but the root /coin/* surface is a separate caller contract used by
older player frontends.
v2 target: either fold the remaining endpoints into wallet_service
under /internal/wallet/coin/* (preferred — keeps the single-money-
writer invariant) or carve out a coin_service. The former is the
default unless a concrete reason emerges.
Gate: enumerate every player-side caller of /coin/*; some of those
calls cross into web_server legacy territory and must be migrated
along with the player gateway cut-over.
POST /{path:path} catch-all
Legacy: middle_server had a generic catch-all that forwarded any unmatched POST to an internal service based on a path-prefix table. It is by far the riskiest of the seven — implicit routing is the hardest thing to retire safely.
Plan:
- Run a 30-day request log capture against the legacy stack to enumerate every concrete path that hits the catch-all. Existing nginx access logs are the cheapest source.
- Classify each into "already migrated to an explicit v2 route" vs "still needs a target".
- For the still-needs-a-target set, design explicit routes; do not re-introduce path-based forwarding under any circumstances.
- After all explicit paths land, delete the legacy catch-all.
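The enumeration step could start from something like the sketch below. It assumes the access logs use nginx's default combined format, and it treats any POST outside the six named prefixes as a catch-all candidate; the legacy stack may have other explicit routes that would also need excluding, so the prefix tuple is a starting point rather than a complete list.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Matches the quoted request line and status of nginx's "combined" log format.
REQUEST_RE = re.compile(
    r'"(?P<method>[A-Z]+) (?P<target>\S+) HTTP/[\d.]+" (?P<status>\d{3}) '
)

# The six explicit prefixes tracked in the scope table of this runbook.
EXPLICIT_PREFIXES = (
    "/api/admin/role/",
    "/api/admin/menu/",
    "/api/admin/config/",
    "/api/admin/i18n/",
    "/api/admin/bi/",
    "/coin/",
)


def catch_all_candidates(lines) -> Counter:
    """Count POST paths that fall outside every explicit prefix."""
    counts = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match is None or match["method"] != "POST":
            continue
        path = urlsplit(match["target"]).path  # drop the query string
        if path.startswith(EXPLICIT_PREFIXES):
            continue
        counts[path] += 1
    return counts
```

Feeding the capture window through catch_all_candidates(open(log_path)) and sorting with most_common() gives the path list that the classification step then works through.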
A legacy-catchall 410 stub is exposed at
/api/admin/legacy-catchall/{path:path} so frontends that still
expect a write-able catch-all surface fail loudly during the
enumeration phase.
Gate: cannot delete the catch-all until every explicit path is
migrated AND bo/admin + player web bundle confirm zero remaining
calls in a staging-mirror traffic replay.
Cut-over Sequence
The migration must respect frontend deploy cadence. Recommended order:
1. /api/admin/config/*: lowest risk, used only at admin login and in
   the System Settings tab. Most subpaths can be redirected to the
   existing brand_config / global-vars surfaces with no new backend
   code.
2. /api/admin/i18n/*: read-mostly. The write surface has few
   operators; the read surface needs a compatibility shape test
   against the player bundle.
3. /api/admin/role/* and /api/admin/menu/*: coupled (menus reference
   role permission bits). Migrate together. Front+back coordinated
   deploy.
4. /coin/*: money-adjacent. Requires wallet_service coordination and
   an enumeration of player-side callers.
5. /api/admin/bi/*: heaviest. Defer until the aggregator direction is
   decided. Do not block earlier prefixes on this one.
6. POST /{path:path} catch-all: must be last. Cannot be retired until
   every explicit prefix above is migrated and a traffic replay
   confirms no other callers.
Stub routes
servers_v2/admin_service/app/api/routes/legacy_sunset.py returns
410 Gone with Sunset (RFC 8594) and Deprecation headers for
every prefix above plus a /api/admin/legacy-catchall/{path} stub
for the catch-all enumeration phase.
Why a stub instead of letting FastAPI 404:
- 404 is silent — a forgotten caller looks like a deployment glitch, not an intentional retirement.
- 410 + Sunset is the IETF-blessed shape for "this URL is intentionally and permanently dead". Browsers, CDNs, and integration tools surface the deprecation metadata directly.
- The stub also requires the same admin JWT every other route
  requires (Depends(get_current_admin)). It must NOT become an
  unauthenticated info-disclosure channel.
Guard test: servers_v2/admin_service/tests/test_legacy_sunset.py
locks the 410 + Sunset contract and asserts unauthenticated callers
get 401, not 410.
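The contract the guard test locks can be sketched without FastAPI as a pure function. This is not the contents of legacy_sunset.py: the sunset date and runbook URL are hypothetical placeholders, and the Deprecation value uses the @<unix-seconds> form from RFC 9745, which is an assumption about what the real stub emits.

```python
from datetime import datetime, timezone
from email.utils import format_datetime

# Hypothetical values for illustration only.
RUNBOOK_URL = "https://example.invalid/docs/runbooks/legacy-middle-server-retirement.md"
SUNSET_AT = datetime(2025, 1, 1, tzinfo=timezone.utc)


def legacy_sunset_response(has_valid_admin_jwt: bool) -> tuple[int, dict]:
    """Return (status, headers) for a request to a retired prefix."""
    # Authentication is checked first, so anonymous callers learn nothing
    # about which prefixes are retired.
    if not has_valid_admin_jwt:
        return 401, {"WWW-Authenticate": "Bearer"}
    headers = {
        # RFC 8594: Sunset carries an HTTP-date.
        "Sunset": format_datetime(SUNSET_AT, usegmt=True),
        # Assumption: RFC 9745 structured-field date form.
        "Deprecation": f"@{int(SUNSET_AT.timestamp())}",
        "Link": f'<{RUNBOOK_URL}>; rel="sunset"',
    }
    return 410, headers
```

With the placeholder date, the Sunset header renders as Wed, 01 Jan 2025 00:00:00 GMT. In the real route the authentication branch is handled by Depends(get_current_admin) before the handler body runs.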
When a real implementation lands for one of the prefixes:
- Remove the prefix entry from _RETIRED_PREFIXES in legacy_sunset.py
  in the same commit that adds the new route; otherwise callers see a
  410 → 200 oscillation.
- Update the corresponding row in this runbook from not started to
  done and link the new route.
- Update the parametrize list in test_legacy_sunset.py.
Ownership Follow-ups
Every row in the scope table currently lists TBD as Migration
owner. Before any of these migrations starts, leads must assign
named owners — the platform-backend group is the default fallback
but the work crosses into payments (coin), data (BI), and frontend
(role/menu/i18n) territory and each should sign off explicitly.
Rollback
Each per-prefix migration is independent and rolls back to "no admin_service implementation, legacy_sunset stub returns 410". The generic recipe (to be specialised per prefix as it's adopted):
- Revert the per-prefix admin_service router commit + matching
  client-side migration in bo/admin to keep calling the legacy
  middle_server host.
- Roll back the matching legacy_sunset.py 410-stub entry so callers
  don't hit Sunset: headers after the revert.
- Re-run the test_legacy_sunset.py parametrize list to confirm the
  prefix is excluded from the sunset surface.
Each individual per-prefix runbook (filed under
docs/runbooks/legacy-middle-server-retirement/<prefix>.md once
owners adopt it) must include:
- A concrete git revert SHA pointer once the migration commit lands.
- The exact bo/admin config flag (or build) to flip back to the
  legacy host.
- DB-level rollback. None is expected, since these are routing
  migrations rather than schema migrations, but each runbook should
  explicitly state "no DB changes" or document them.
- Verification step: which smoke test confirms the revert worked.
Until the per-prefix runbooks land, treat this section as the template for what they must contain, not as an executable recipe.