HBASE-30049 RestoreSnapshotHelper creates StoreFileTracker with wrong…#8013
bjomobo wants to merge 1 commit into apache:master
Conversation
 * 3. Restore from snapshot
 * 4. Verify all regions open and data matches the snapshot
 *
 * Before the fix, step 4 would fail with FileNotFoundException because
This test, testRestoreSnapshotWithFileTrackerAfterDataChange, still passes without the fix because the global SFT config is itself FILE. The issue only occurs when the global config is DEFAULT and the table config is FILE, as here: https://github.com/apache/hbase/pull/8013/changes#diff-35d1d555129ce54d8d45094be2f76300b7cd9d29c9a109ff95d81c626b428939R77
Thanks for pointing this out. I removed the global TRACKER_IMPL=FILE setting from setupCluster(); it is now set at the table descriptor level only.
 * HFiles, not the compaction output.
 */
@Test
public void testRestoreSnapshotAfterCompaction() throws Exception {
Same for this test: since the global SFT config is itself FILE, it should pass without the fix. There also seems to be an issue with the test itself, as it fails at this assert: https://github.com/apache/hbase/pull/8013/changes#diff-66c1760bfcfc64f561b37dd18df6a41595e93d1c516a326de27ad6e9c631ebb2R266
Shouldn't it be failing here instead?
… config causing no-op filelist updates

RestoreSnapshotHelper.restoreRegion() creates a StoreFileTracker using the raw Master Configuration object, which does not contain table-level settings like hbase.store.file-tracker.impl=FILE. This causes DefaultStoreFileTracker to be instantiated, whose doSetStoreFiles() is a complete no-op. The .filelist is never updated after the restore moves HFiles to the archive and creates link files for the snapshot's HFiles.

When a region subsequently opens, the stale .filelist references HFiles that were moved to the archive, resulting in FileNotFoundException and the region getting stuck in OPENING state indefinitely.

This is a regression introduced by HBASE-28564, which refactored reference file creation to go through the StoreFileTracker interface. The cloneRegion() method in the same commit correctly merges the table descriptor config via StoreUtils.createStoreConfiguration() before creating the tracker, but restoreRegion() was missed.

The fix applies the same pattern: merge the table descriptor and column family descriptor configuration into the Configuration object before passing it to StoreFileTrackerFactory.create(). This ensures the correct StoreFileTracker implementation is resolved based on the table-level setting.

Both locations in restoreRegion() are fixed:
1. For existing families already on disk
2. For new families added from the snapshot
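The lookup problem described in this commit message can be illustrated with a small, self-contained model. This is only a sketch: plain java.util.Map stands in for HBase's Configuration, and the class TrackerConfigSketch with its merge() and resolveTracker() methods are hypothetical names, not HBase APIs. In HBase the merge itself is done by StoreUtils.createStoreConfiguration(), as the commit message states.

```java
import java.util.HashMap;
import java.util.Map;

public class TrackerConfigSketch {
  static final String TRACKER_IMPL = "hbase.store.file-tracker.impl";

  /** Hypothetical merge: later layers override earlier ones. */
  @SafeVarargs
  static Map<String, String> merge(Map<String, String>... layers) {
    Map<String, String> merged = new HashMap<>();
    for (Map<String, String> layer : layers) {
      merged.putAll(layer);
    }
    return merged;
  }

  /** Hypothetical resolution: fall back to DEFAULT when the key is absent. */
  static String resolveTracker(Map<String, String> conf) {
    return conf.getOrDefault(TRACKER_IMPL, "DEFAULT");
  }

  public static void main(String[] args) {
    // Global (Master) conf does not set the tracker implementation.
    Map<String, String> masterConf = new HashMap<>();
    // The table descriptor carries the table-level setting.
    Map<String, String> tableConf = Map.of(TRACKER_IMPL, "FILE");
    // The column family descriptor adds nothing in this example.
    Map<String, String> familyConf = Map.of();

    // Pre-fix behaviour: only the raw Master conf is consulted.
    System.out.println(resolveTracker(masterConf));
    // Post-fix behaviour: table and family settings are merged in first.
    System.out.println(resolveTracker(merge(masterConf, tableConf, familyConf)));
  }
}
```

Under these assumptions the first lookup yields DEFAULT and the second yields FILE, which mirrors why the restore path picked the wrong tracker until the table-level configuration was merged in.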
ecafaf9 to e3e3c55
Description
RestoreSnapshotHelper.restoreRegion() creates a StoreFileTracker using the raw Master conf, which lacks table-level settings like hbase.store.file-tracker.impl=FILE. This causes DefaultStoreFileTracker to be used, whose doSetStoreFiles() is a no-op. The .filelist is never updated after restore, leading to FileNotFoundException when regions try to open files that were archived.

Regression introduced by HBASE-28564.
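To make the failure mode concrete, here is a simplified model of why a no-op doSetStoreFiles() leaves the region unable to open. Everything in it is hypothetical (the StaleFileListSketch class, its Store and Tracker types, and the file names); it only imitates the behaviour described above and does not use HBase classes.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class StaleFileListSketch {
  /** Hypothetical stand-in for a store directory plus its persisted .filelist. */
  static class Store {
    final Set<String> filesOnDisk = new HashSet<>();
    final List<String> fileList = new ArrayList<>();
  }

  /** Hypothetical tracker hook invoked after a restore. */
  interface Tracker {
    void setStoreFiles(Store store, List<String> files);
  }

  /** Models a tracker whose update is a no-op: the list is left as-is. */
  static final Tracker NO_OP = (store, files) -> { };

  /** Models a file-based tracker: the list is rewritten to match the restore. */
  static final Tracker FILE_BASED = (store, files) -> {
    store.fileList.clear();
    store.fileList.addAll(files);
  };

  /** Restore: current files leave the directory, snapshot files are linked in. */
  static void restore(Store store, List<String> snapshotFiles, Tracker tracker) {
    store.filesOnDisk.clear();
    store.filesOnDisk.addAll(snapshotFiles);
    tracker.setStoreFiles(store, snapshotFiles);
  }

  /** Opening succeeds only if every listed file is actually present. */
  static boolean canOpen(Store store) {
    return store.filesOnDisk.containsAll(store.fileList);
  }

  /** A store whose list currently points at a post-snapshot file. */
  static Store storeBeforeRestore() {
    Store store = new Store();
    store.filesOnDisk.add("compacted-hfile");
    store.fileList.add("compacted-hfile");
    return store;
  }

  public static void main(String[] args) {
    List<String> snapshotFiles = List.of("hfile-a", "hfile-b");

    Store broken = storeBeforeRestore();
    restore(broken, snapshotFiles, NO_OP);
    System.out.println("no-op tracker, can open: " + canOpen(broken));

    Store fixed = storeBeforeRestore();
    restore(fixed, snapshotFiles, FILE_BASED);
    System.out.println("file-based tracker, can open: " + canOpen(fixed));
  }
}
```

In this model the no-op tracker leaves the list pointing at a file that is no longer in the directory, so the open check fails; rewriting the list during restore makes it succeed.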
Changes
- Merge the table and column family descriptor config via StoreUtils.createStoreConfiguration() before creating the tracker in restoreRegion()
- Add a snapshotFamilyFiles != null check to avoid NullPointerException on families being removed
- Add withColumnFamilyDescriptor() to the "Add families not present in the table" code path

New Tests
- TestRestoreSnapshotProcedureFileBasedSFT: end-to-end restore with FILE tracker
- TestRestoreSnapshotHelperWithFileBasedSFT: unit-level .filelist verification
- TestRestoreSnapshotFileTrackerTableLevel: table-level FILE tracker with compaction and multi-family restore

Jira: https://issues.apache.org/jira/browse/HBASE-30049