2026-03-21 16:28:48 +08:00

20 KiB
Raw Blame History

Pitfalls Research

Domain: Go/GORM/MySQL financial analytics — profit/loss aggregation functions Researched: 2026-03-21 Confidence: HIGH (derived directly from existing codebase evidence + confirmed Go/MySQL behavior)


Critical Pitfalls

Pitfall 1: MySQL SUM with Division Returns Decimal, Not SIGNED Integer

What goes wrong: When a SUM() expression includes any division operation (e.g., SUM(amount * draw_count / total_count)), MySQL returns the result as a Decimal type, not BIGINT. Scanning a Decimal into a Go int64 field silently returns 0. The dashboard code already hit this and left a comment documenting it.

Evidence from dashboard_activity.go:174:

// 注意: MySQL SUM()运算涉及除法时会返回Decimal类型需要Scan到float64

The fix used there: scan revenue stats into float64, then cast to int64 in Go.

Why it happens: MySQL promotes arithmetic involving division to Decimal to preserve fractional precision. GORM's Scan() does not coerce types — it matches Go field types exactly, and int64 ≠ Decimal causes a silent zero.

How to avoid: Wrap any SUM that contains division with CAST(... AS SIGNED) in the SQL itself. This forces integer rounding at the database layer and lets you scan directly into int64. The existing cost query in dashboard_activity.go:237 already uses this pattern:

CAST(SUM(...) AS SIGNED) as total_cost

Use CAST(... AS SIGNED) on every aggregated column that involves division. Never scan division-containing SUM results directly into int64 without the cast.

Warning signs:

  • Aggregated monetary fields come back as 0 even when data exists
  • Revenue stats are non-zero but cost stats are zero (or vice versa)
  • Struct fields stay at their zero values after Scan()

Phase to address: Implementation phase — apply during every query that uses proportional allocation (e.g., distributing an order's revenue across multiple activities via draw_count / total_count).


Pitfall 2: Double-Counting Revenue When One Order Spans Multiple Activities

What goes wrong: A single order can result in draw logs across multiple activities (e.g., a user plays activity A and activity B in one checkout). If you SUM(orders.actual_amount) grouped by activity without proportional allocation, the full order amount is counted in every activity it touches. The existing dashboard already experienced this and added two-level subquery attribution.

Evidence from dashboard_activity.go:197-212: the fix was to compute draw_count per (order, activity) and total_count per order in two separate subqueries, then scale the order amount by the ratio draw_count / total_count.

Why it happens: Aggregation joins orders to activity_draw_logs which is a one-to-many relationship. Without explicit proration, the order amount fans out to every matching activity row.

How to avoid: Always attribute revenue using the subquery pattern:

JOIN (
    SELECT order_id, activity_id, COUNT(*) as draw_count
    FROM activity_draw_logs JOIN activity_issues ON ...
    GROUP BY order_id, activity_id
) as order_activity_draws ON order_activity_draws.order_id = orders.id
JOIN (
    SELECT order_id, COUNT(*) as total_count
    FROM activity_draw_logs GROUP BY order_id
) as order_total_draws ON order_total_draws.order_id = orders.id

Then multiply: orders.actual_amount * order_activity_draws.draw_count / order_total_draws.total_count. For the user-dimension function, this pattern still applies if a user's order touches multiple issues.

Warning signs:

  • Total revenue across all activities exceeds the sum of all actual order payments
  • A user's computed spending is greater than what WeChat Pay received
  • Profit rates are implausibly negative across many activities

Phase to address: Implementation phase — design the user-dimension and activity-dimension query structure before writing SQL.


Pitfall 3: Mixing Game-Pass Orders into Cash Revenue (Calculation Mouth-Discrepancy)

What goes wrong: Game-pass orders (次卡) have actual_amount = 0 and source_type = 4 (or order_no LIKE 'GP%' or remark containing use_game_pass). Including them in SUM(actual_amount + discount_amount) makes their "revenue" appear as zero, understating total income. Including them in cost without crediting their imputed value makes every game-pass activity show a loss.

The codebase defines three detection conditions in internal/service/finance/profit_metrics.go:IsGamePassOrder. These must all be checked — any single condition is insufficient because historical data uses different conventions.

Why it happens: Game-pass orders are structurally identical to regular orders but have zero monetary value. Treating all orders uniformly by summing actual_amount misses the imputed value of the subscription the user already paid.

How to avoid: Use strict mutual exclusion in SQL:

  • If game-pass order: revenue = draw_count * activity.price_draw, discount = 0, cash = 0
  • If cash/coupon order: revenue = actual_amount + discount_amount, game-pass value = 0
  • Use CASE WHEN (source_type=4 OR order_no LIKE 'GP%' OR (actual_amount=0 AND remark LIKE '%use_game_pass%')) THEN ... ELSE ... in every SUM

Never add actual_amount + discount_amount + game_pass_value as if they are additive columns of the same thing. They are alternative values for the same economic event.

Warning signs:

  • Activities with many game-pass players show profit rates near -100%
  • Total platform revenue is suspiciously lower than WeChat Pay reports
  • SpendingPaidCoupon and SpendingGamePass are both non-zero for the same order

Phase to address: Implementation phase — encode the mutual-exclusion rule in query construction helpers before writing any aggregate SQL.


Pitfall 4: Silently Ignoring Scan Errors on Aggregation Queries

What goes wrong: Several existing dashboard queries call db.Table(...).Select(...).Scan(&stats) without checking the returned error. If the query fails (schema mismatch, column rename, database failover), stats remains an empty slice, downstream computations produce zero results, and no error is returned to the caller. The data looks correct (all zeros) rather than erroring.

Evidence from dashboard_activity.go:146-158drawStats scan has no .Error check. The pattern appears in multiple places throughout the dashboard handlers.

Why it happens: GORM's method chaining makes it easy to forget error handling. The pattern db.Table(...).Scan(&x) is syntactically identical whether you check .Error or not. In exploratory handler code that was never tested, errors were skipped for brevity.

How to avoid: The new internal/service/finance/ package must check every query error:

if err := db.Table(...).Scan(&result).Error; err != nil {
    return nil, fmt.Errorf("profit_loss query failed: %w", err)
}

Service functions should return error as second return value — not swallow errors internally. The existing profit_metrics.go pure functions have no DB access and are fine; the DB-querying functions must propagate errors.

Warning signs:

  • Function returns zero values with no error in tests against an empty SQLite db
  • Aggregation results are uniformly zero across all parameters
  • Schema changes (column renames, table renames) cause silent failures

Phase to address: Implementation phase — establish error-check convention in the first function written; testing phase — assert non-nil error on deliberately broken queries.


Pitfall 5: Omitting Refunded Orders from Cost Calculation

What goes wrong: Inventory items (user_inventory) awarded from a subsequently refunded order should be excluded from cost. If you compute cost by summing user_inventory.value_cents grouped by activity_id without filtering on orders.status, you count the cost of prizes from refunded orders but don't count their revenue — making the platform appear to have given away prizes for free.

The existing code in dashboard_activity.go:250-251 already had to special-case this:

Where("(orders.status = ? OR user_inventory.order_id = 0 OR user_inventory.order_id IS NULL)", 2)

Note the legacy data escape hatch: some old inventory rows have order_id = 0 or NULL and cannot be filtered by order status. This must be preserved.

Why it happens: user_inventory records are created when prizes are awarded, which happens before the refund window closes. Refunds do not delete inventory rows — they update orders.status to 4. Naive aggregation on user_inventory ignores order status entirely.

How to avoid: Always join orders to user_inventory via order_id and include the legacy escape hatch:

LEFT JOIN orders ON orders.id = user_inventory.order_id
WHERE (orders.status = 2 OR user_inventory.order_id = 0 OR user_inventory.order_id IS NULL)
AND COALESCE(user_inventory.remark, '') NOT LIKE '%void%'

The void remark filter is also required — manually voided inventory entries should never count as platform cost.

Warning signs:

  • Platform cost is higher than expected for activities with known refund activity
  • Cost-side totals don't reconcile with accounting system data
  • Test cases with a refunded order still show non-zero cost

Phase to address: Implementation phase — add a test case with a refunded order and verify cost = 0 for that order's prizes.


Pitfall 6: Using Write DB (DbW) for Analytics Queries

What goes wrong: The project has master-slave read-write splitting. Analytics queries that run on GetDbW() (master) instead of GetDbR() (replica) add latency to the write path, can block replication, and in the worst case cause master overload under concurrent analytics requests.

The CONCERNS.md already flags 113 direct GetDbW() calls in the handler layer. The pattern of bypassing the correct DB connection is established in the codebase and can propagate to new code.

Why it happens: GetDbR() and GetDbW() look identical in usage. Developers copying from handler code that was written for writes will use GetDbW() by accident. The finance service package does not yet have established conventions.

How to avoid: The new internal/service/finance/ service must accept a *gorm.DB read-only handle at construction time (inject repo.GetDbR()), not a full repository. Document in the function signatures or struct fields that only the read replica is used:

type ProfitLossService struct {
    dbR    *gorm.DB  // read replica only — never use for writes
    logger *zap.Logger
}

Never call repo.GetDbW() inside finance analytics functions.

Warning signs:

  • MySQL master replication lag increases when analytics endpoint is called
  • Write latency spikes during dashboard loads
  • GetDbW() appears in internal/service/finance/ source files

Phase to address: Implementation phase — inject read-only DB handle in constructor; testing phase — verify with a mock that only the read DB is called.


Technical Debt Patterns

Shortcut Immediate Benefit Long-term Cost When Acceptable
Scan revenue into float64 instead of fixing SQL with CAST(AS SIGNED) Avoids SQL rewrite Floating-point rounding on monetary values (e.g., 0.1 + 0.2 ≠ 0.3 in IEEE 754) Never for monetary fields — always use CAST(AS SIGNED)
In-memory sort + full table fetch for custom sort order Simpler than ORDER BY with computed columns Loads unbounded rows into Go heap when activity count grows Only acceptable if total row count is bounded by pagination elsewhere
Hardcoding game-pass detection conditions in each query Avoids abstraction overhead Three different detection conditions must stay in sync across multiple queries Never — centralize detection in IsGamePassOrder() already defined in finance package
Skip error check on Scan() Fewer lines of code Silent wrong data; impossible to distinguish "query returned zero rows" from "query failed" Never for financial data
Use AVG(multiplier) across draws as the cost multiplier One query instead of per-row Hides per-order multiplier variance; a 2x card on one draw inflates cost for all draws in the group Acceptable for summary statistics; not for per-order breakdowns

Integration Gotchas

Integration Common Mistake Correct Approach
GORM Scan into anonymous struct Forgetting to qualify column names in SELECT causes ambiguous column error when multiple tables have id, created_at, etc. Always alias computed columns explicitly: SELECT orders.user_id as user_id, not SELECT user_id
GORM raw SQL with Raw() + Scan() Parameterized values passed in wrong order cause SQL to silently use zero values Verify query with db.Statement.SQL.String() during development; test with non-trivial input values
MySQL COALESCE with nullable int columns COALESCE(NULL, 0) works but COALESCE(column, 0) on a non-nullable column with value 0 returns 0NULLIF needed to distinguish "not set" from "explicitly zero" Use COALESCE(NULLIF(value_cents, 0), fallback_1, fallback_2, 0) pattern already established in existing cost queries
Multiple ID lists in WHERE IN (?) with GORM Passing an empty slice []int64{} produces invalid SQL WHERE id IN () in some GORM versions Guard with if len(ids) == 0 { return emptyResult, nil } before building the query
Read replica lag Querying replica immediately after a write (e.g., after seeding test data) can return stale results In tests, use write DB handle or wait for sync; in production, this is acceptable for analytics

Performance Traps

Trap Symptoms Prevention When It Breaks
Fetching all activities before computing profit/loss (no predicate pushdown) 100% CPU on Find(&activities), slow response time Apply all filters (status, name, date range) in the initial query before scanning, then pass activityIDs to subsequent queries When activity count exceeds ~1,000
Correlated subquery inside SUM for every row Query time grows O(n²) with draw log volume Pre-aggregate into a derived table subquery joined once, not per-row When draw_logs table exceeds ~500K rows
No index on activity_draw_logs.order_id or user_inventory.activity_id Sequential scan on every analytics query Verify indexes exist with SHOW INDEX FROM activity_draw_logs; add composite index (issue_id, order_id) if missing From day one on tables with writes
Loading all activities into memory for in-application sort Memory spike on large result sets; no benefit if caller only wants top-10 Accept this tradeoff only when total activities < 500; add a hard cap with an error if exceeded When activity count exceeds ~500
Querying user_inventory without status IN (1, 3) filter Voided/cancelled inventory items inflate cost Always filter: WHERE user_inventory.status IN (1, 3) Immediately — even small void counts distort cost

Security Mistakes

Mistake Risk Prevention
Interpolating user-supplied user_id or activity_id into raw SQL string instead of parameterized query SQL injection — attacker can exfiltrate all financial data Always use parameterized queries: .Where("user_id IN ?", ids) not fmt.Sprintf("user_id IN (%s)", idsStr)
Exposing raw profit/loss data without admin role check Non-admin users can read platform margin data The new service functions are Service layer — callers (API handlers) must apply RequireAdminRole() middleware; document this requirement in the function's GoDoc
Logging query parameters that contain user IDs User ID lists in error logs can be correlated with financial data Log query failure with a count, not the full ID list: "profit_loss query failed for %d users: %v"

"Looks Done But Isn't" Checklist

  • Game-pass mutual exclusion: Verify that SpendingPaidCoupon and SpendingGamePass are never both non-zero for the same order. Write a test case with a mixed-type order set.
  • Refunded order exclusion: Add a test case where an order is refunded (status=4) and verify it contributes zero to both revenue and cost.
  • Legacy zero order_id: Confirm inventory rows with order_id = 0 are included in cost (not excluded by the orders JOIN). Add a test row with order_id = 0 and verify it appears in cost.
  • Empty parameter handling: Call both functions with nil/empty userIDs and nil/empty activityID — verify they return all-data aggregation, not empty results or SQL errors.
  • All five asset types covered: Points, coupons, item cards, physical products, fragments. Verify all five appear in the breakdown output. Missing one silently understates cost.
  • CAST on division SUM: Open every query with a / operator in a SUM and confirm CAST(... AS SIGNED) wraps the entire expression.
  • Read-only DB used: Grep for GetDbW inside internal/service/finance/ — result must be empty.
  • Error propagation: Every Scan() call inside finance functions must have its .Error checked and returned to the caller.

Recovery Strategies

Pitfall Recovery Cost Recovery Steps
Decimal-to-int64 silent zero LOW Add CAST(AS SIGNED) to affected SQL; rerun query — no data migration needed
Revenue double-counting discovered post-launch MEDIUM Backfill correct totals by recomputing with fixed query over historical data; notify operators of corrected figures
Wrong DB handle (write instead of read) LOW Change constructor injection; no data impact
Missing refund exclusion MEDIUM Recompute affected period's profit/loss with corrected query; mark old reports as superseded
Silently swallowed errors causing wrong zeros LOW-MEDIUM Add error checks; add alerting on zero-result aggregations where data is expected; audit logs for the affected period

Pitfall-to-Phase Mapping

Pitfall Prevention Phase Verification
Decimal/int64 scan mismatch Implementation — SQL design Integration test: query with division-containing SUM, assert non-zero int64 result
Revenue double-counting Implementation — query structure design Test: one order across two activities; assert sum of per-activity revenue equals order total
Game-pass mutual exclusion Implementation — use IsGamePassOrder() helper Unit test: game-pass order contributes to SpendingGamePass only, not SpendingPaidCoupon
Ignored Scan errors Implementation — code review gate Test: deliberately broken query (wrong table name); assert returned error is non-nil
Refunded order in cost Implementation — WHERE clause Test: refunded order inventory; assert cost contribution is zero
Write DB used Implementation — constructor injection Grep check in CI: GetDbW must not appear in internal/service/finance/
Missing LIMIT on supporting queries Implementation — query design Load test with 1000 activities; verify response time stays under 2s

Sources

  • internal/api/admin/dashboard_activity.go — direct evidence of BUG FIX comments for Decimal/int64, double-counting, game-pass misclassification (lines 173-175, 274-275, 544-545)
  • internal/api/admin/dashboard_spending.go — evidence of multi-join aggregation patterns and game-pass CASE expressions
  • internal/service/finance/profit_metrics.goIsGamePassOrder() three-condition detection; ComputeProfit() integer arithmetic; established pattern for cost multiplier
  • internal/service/finance/profit_metrics_test.go — existing test coverage confirming pure-function behavior
  • .planning/codebase/CONCERNS.md — flagged 113 GetDbW() calls in handler layer, silently swallowed errors in financial paths, missing error checks in pay_refund_admin.go
  • Go database/sql specification — Scan() does not coerce types; MySQL 8.x documentation — SUM with division promotes to Decimal

Pitfalls research for: Go/GORM/MySQL profit/loss analytics (Bindbox Game) Researched: 2026-03-21