Fix precision loss for large bigint PKs in table repair JSON roundtrip #98
Conversation
Note: Reviews paused. It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior in the settings.
📝 Walkthrough
Replaced reflect/float64-based numeric handling with json.Number-aware parsing and a precision-preserving CompareNumeric (int64 fast path, math/big fallback); enabled json.Decoder.UseNumber(); updated conversions, comparisons, parser, repair logic, and tests to preserve exact large-integer and numeric semantics.
🧹 Nitpick comments (1)
tests/integration/table_repair_test.go (1)
2215-2224: Minor: Cleanup only removes table from repset on node1.

The table is added to the repset on both nodes (lines 2211-2212), but `repset_remove_table` is only called on `Node1Pool`. For consistency, consider removing from both nodes:

Suggested cleanup improvement
```diff
 t.Cleanup(func() {
-	_, _ = pgCluster.Node1Pool.Exec(ctx, fmt.Sprintf(`SELECT spock.repset_remove_table('default', '%s');`, qualifiedTableName))
+	for _, pool := range []*pgxpool.Pool{pgCluster.Node1Pool, pgCluster.Node2Pool} {
+		_, _ = pool.Exec(ctx, fmt.Sprintf(`SELECT spock.repset_remove_table('default', '%s');`, qualifiedTableName))
+	}
 	for _, pool := range []*pgxpool.Pool{pgCluster.Node1Pool, pgCluster.Node2Pool} {
 		_, _ = pool.Exec(ctx, fmt.Sprintf("DROP TABLE IF EXISTS %s CASCADE", qualifiedTableName))
 	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/integration/table_repair_test.go` around lines 2215 - 2224, The cleanup currently only calls spock.repset_remove_table via pgCluster.Node1Pool; update the Cleanup to remove the table from the repset on both nodes by invoking spock.repset_remove_table for pgCluster.Node2Pool as well (use the same qualifiedTableName), ensuring symmetry with the earlier addition on both Node1Pool and Node2Pool and keeping the rest of the cleanup (DROP TABLE on both pools, removing diff files) unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ffbde1de-b77f-4381-8899-6dd23ebb6e19
📒 Files selected for processing (5)
- pkg/common/utils.go
- pkg/common/utils_test.go
- pkg/types/types.go
- pkg/types/types_test.go
- tests/integration/table_repair_test.go
Force-pushed from a123dee to 051f8c1 (Compare)
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/consistency/repair/plan/parser/parser.go`:
- Around line 617-633: The numeric tokenization still parses all numbers to
float64 in scanNumber, causing precision loss before CompareNumeric runs; update
scanNumber to capture the original numeric text and store it as a json.Number
token (instead of using strconv.ParseFloat -> float64) so that
utils.CompareNumeric can operate on json.Number losslessly, and ensure any token
struct/enum used by parseExpression (the token carrying numeric values) is
adjusted to hold json.Number (and preserve existing paths that expect float64 by
converting only when necessary).
In `@pkg/common/utils.go`:
- Around line 548-563: The json.Number handling branch currently falls back to
parsing Float64 and casting to int64 which silently truncates fractional values;
update the block that handles val.(json.Number) so that after parsing f, you
verify it's an exact integer (e.g., compare f to math.Trunc(f) or check
fractional part) and if it has a fractional component return an error mentioning
the original json.Number and pgType instead of casting, while still keeping the
existing safe-range check for float64-to-int64 conversion.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 4af07048-e098-403f-a789-b8c5f02eb4cb
📒 Files selected for processing (8)
- internal/consistency/repair/executor.go
- internal/consistency/repair/plan/parser/parser.go
- internal/consistency/repair/table_repair.go
- pkg/common/utils.go
- pkg/common/utils_test.go
- pkg/types/types.go
- pkg/types/types_test.go
- tests/integration/table_repair_test.go
✅ Files skipped from review due to trivial changes (1)
- pkg/types/types_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
- pkg/types/types.go
- pkg/common/utils_test.go
Up to standards ✅

🟢 Issues
| Metric | Results |
|---|---|
| Duplication | 0 |
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/common/utils.go`:
- Around line 548-571: The float64 branch currently does an unchecked conversion
("if f, ok := val.(float64); ok { return int64(f), nil }"); update it to mirror
the json.Number fallback: declare const maxSafeInt = 1 << 53, check if f >
maxSafeInt || f < -maxSafeInt and return a formatted error including n/a value
and pgType, then check if math.Trunc(f) != f and return a formatted "not an
integer" error, and only then return int64(f), nil. Use the same error
wording/format as the json.Number path and reference the existing pgType
variable for context. Ensure you import math if not already.
- Around line 1480-1483: The float32/float64 branches currently call
new(big.Float).SetPrec(bigFloatPrec).SetFloat64(...) which panics on IEEE-754
NaN; guard these cases by checking math.IsNaN(float64Value) before calling
SetFloat64 (for float32 convert to float64 first), and return (nil, false) when
NaN is detected so the repair path fails gracefully; reference the
float32/float64 case blocks, bigFloatPrec, SetFloat64 and avoid invoking
SetFloat64 when math.IsNaN would be true (thus preventing big.ErrNaN panics).
- Around line 1388-1479: CompareNumeric fails to handle
pgtype.Numeric/pgxv5type.Numeric types, causing fallback to reflect.DeepEqual;
update asInt64 and toBigFloat to detect pgtype.Numeric (and pgx/v5's Numeric
type) and convert them using NormalizeNumericString (or call its helper to get
the normalized string), then parse that normalized string into an int64 (for
asInt64) or into a *big.Float via big.ParseFloat (for toBigFloat); specifically
modify asInt64, toBigFloat, and any type switches in CompareNumeric to include
cases for pgtype.Numeric / pgxv5type.Numeric that call NormalizeNumericString
and proceed with the existing conversion logic so merkle/repair paths use
precise numeric comparisons.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: ffeff7a2-ab85-4ff4-a8b8-a553ee4fe93a
📒 Files selected for processing (2)
- internal/consistency/repair/plan/parser/parser.go
- pkg/common/utils.go
🧹 Nitpick comments (1)
pkg/common/utils.go (1)
619-625: Questionable fallback to `float64` when `pgtype.Numeric.Set` fails.

If `num.Set(n.String())` fails at line 616, falling back to `n.Float64()` could introduce silent precision loss. PostgreSQL `numeric` can represent values that `float64` cannot (>15-17 significant digits). This fallback path seems unlikely to be reached for valid numeric strings, but if it is, returning `float64` instead of `*pgtype.Numeric` could cause type inconsistency issues downstream. Consider returning an error instead of falling back:
Suggested fix
```diff
 case "numeric", "decimal":
 	if n, ok := val.(json.Number); ok {
 		num := pgtype.Numeric{}
 		if err := num.Set(n.String()); err == nil {
 			return &num, nil
 		}
-		// Fall back to float64 for simple values
-		f, err := n.Float64()
-		if err != nil {
-			return nil, fmt.Errorf("failed to parse json.Number %q for %s: %w", n.String(), pgType, err)
-		}
-		return f, nil
+		return nil, fmt.Errorf("failed to parse json.Number %q as numeric for %s", n.String(), pgType)
 	}
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/common/utils.go` around lines 619 - 625, The current fallback returns a float64 when pgtype.Numeric.Set(n.String()) fails, risking precision loss and type inconsistency; instead, when num.Set(n.String()) returns an error, propagate a formatted error (including n.String(), pgType and the original Set error) rather than calling n.Float64(); update the handling around pgtype.Numeric.Set and remove the n.Float64() fallback so callers always receive a clear error or a proper *pgtype.Numeric value.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 3b5c2e53-2d55-4d93-97b3-adbd068bf91c
📒 Files selected for processing (1)
pkg/common/utils.go
OrderedMap.UnmarshalJSON used Go's default JSON number decoding (float64), which silently truncates integers exceeding 2^53. For tables with large bigint primary keys (e.g. snowflake IDs like 415588913294348289), this caused:

- PK corruption: 415588913294348289 → 415588913294348288 (off by 1)
- PK collisions: adjacent PKs mapped to the same float64, causing rows to silently overwrite each other in repair maps
- Wrong upserts/deletes: repairs targeted wrong rows or missed them
- Growing diffs after repair: corrupted values replicated via spock

Fix: add dec.UseNumber() to OrderedMap.UnmarshalJSON so JSON numbers are preserved as json.Number strings. Update ConvertToPgxType to handle json.Number for integer types (lossless Int64 parse), numeric/decimal (exact string via pgtype.Numeric), and float types (Float64 parse). Replace the three duplicate lossy numeric helpers (toFloat64, asFloat, asNumber) with a single precision-safe CompareNumeric in pkg/common that uses an int64 fast path and falls back to math/big.Float (256-bit) for json.Number decimals, large uint64, and native floats.

Also handle json.Number in repair plan when-expression literals (scanNumber), pk_in matching, and origin timestamp extraction.

The v4 pgtype dependency will be removed in a follow-on PR.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Force-pushed from e10268f to 2eb0c0b (Compare)
Actionable comments posted: 1
🧹 Nitpick comments (1)
pkg/common/utils_test.go (1)
5-6: Assert the numeric value, not just the result type.
`TestConvertToPgxType_JsonNumberNumeric` currently passes as long as the result is some `Numeric`-typed value. A rounded value wrapped in `Numeric` would still satisfy this test, which leaves the core precision guarantee under-tested.

Suggested test tightening
```diff
 import (
 	"encoding/base64"
 	"encoding/json"
-	"fmt"
 	"testing"
 	"time"
@@
 	// Must NOT be float64 — that would lose precision
 	_, isFloat := val.(float64)
 	require.False(t, isFloat, "numeric json.Number must not convert to float64")
-	// Verify it's a *pgtype.Numeric by checking the type name
-	typeName := fmt.Sprintf("%T", val)
-	require.Contains(t, typeName, "Numeric", "expected pgtype.Numeric, got %s", typeName)
+	cmp, ok := CompareNumeric(val, n)
+	require.True(t, ok, "expected numeric result, got %T", val)
+	require.Zero(t, cmp, "numeric json.Number should preserve the exact value")
 }
```

Also applies to: 319-334
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/common/utils_test.go` around lines 5 - 6, TestConvertToPgxType_JsonNumberNumeric currently only checks the result's type (Numeric) and not its actual numeric value; update the test to assert the exact numeric value/precision returned by ConvertToPgxType when given a json.Number (e.g., convert the returned pgx Numeric to a string or float and compare to the expected precise value), and apply the same tightening to the other similar test cases referenced (the other JsonNumberNumeric checks) so they validate both type and numeric content.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@internal/consistency/repair/plan/parser/parser.go`:
- Around line 232-238: The parser currently validates numeric literals by
calling json.Number.Float64() which rejects syntactically valid but out-of-range
numbers (e.g., 1e1000); remove the Float64() check so that after creating n :=
json.Number(text) the code returns token{typ: tokNumber, lit: text, pos: start,
value: n} directly, leaving numeric comparison/handling to utils.CompareNumeric
and avoiding premature rejection in the tokenization logic that constructs
tokens in parser.go.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 97b46597-1eac-4376-b64e-0b0b334e9743
📒 Files selected for processing (8)
- internal/consistency/repair/executor.go
- internal/consistency/repair/plan/parser/parser.go
- internal/consistency/repair/table_repair.go
- pkg/common/utils.go
- pkg/common/utils_test.go
- pkg/types/types.go
- pkg/types/types_test.go
- tests/integration/table_repair_test.go
🚧 Files skipped from review as they are similar to previous changes (3)
- pkg/types/types.go
- tests/integration/table_repair_test.go
- pkg/common/utils.go
The lexer already validates numeric literal syntax character-by-character. The Float64() check rejected syntactically valid numbers that exceed float64 range, which is unnecessary since CompareNumeric handles json.Number losslessly via big.Float. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
OrderedMap.UnmarshalJSON used Go's default JSON number decoding (float64), which silently truncates integers exceeding 2^53. For tables with large bigint primary keys (e.g. snowflake IDs like 415588913294348289), this caused:

- PK corruption: 415588913294348289 → 415588913294348288 (off by 1)
- PK collisions: adjacent PKs mapped to the same float64, causing rows to silently overwrite each other in repair maps
- Wrong upserts/deletes: repairs targeted wrong rows or missed them
- Growing diffs after repair: corrupted values replicated via spock

Fix: add dec.UseNumber() to OrderedMap.UnmarshalJSON so JSON numbers are preserved as json.Number strings. Update ConvertToPgxType to handle json.Number for integer types (lossless Int64 parse), numeric/decimal (exact string via pgtype.Numeric), and float types (Float64 parse). Also update comparePKValues/toFloat64 to recognize json.Number.