perf (alarm): fix O(N²·log N) table update in AlarmTableUI — closes #3504#3770
Merged
shroffk merged 3 commits intoControlSystemStudio:masterfrom Apr 6, 2026
Merged
Conversation
With N active alarms the update() method called copy() on each AlarmInfoRow, firing 8 property-change events per row. Each event triggered a full re-sort of the SortedList (via the CHANGING_PROPERTIES extractor), resulting in O(N²·log N) comparator work — measured as multi-second freezes at ~2000 rows (issue ControlSystemStudio#3504). Fix: temporarily unbind the SortedList comparator before the copy loop and set it to null so property-change events are no-ops. On finally rebind to table.comparatorProperty(), which performs exactly one O(N·log N) sort over all updated rows. Fixes ControlSystemStudio#3504
…ate() Three headless JUnit tests (no JavaFX display required) that: 1. testOldBehaviorTriggersExcessiveSorts — proves that with the CHANGING_PROPERTIES extractor and an active comparator, calling copy() on N rows fires O(8N * N log N) comparator invocations (152,060 calls for N=150). 2. testFixSuppressesIntermediateSortsAndPreservesOrder — proves that setting the comparator to null before the loop and restoring it once after reduces calls to O(N log N) (696 calls for N=150) and that the final sort order is correct. 3. testFixOutperformsOldCodeAndProducesSameOrder — end-to-end comparison: asserts at least 5x fewer comparisons (actual: 218x) and that both paths produce the same sorted order. Accompanies the fix in AlarmTableUI.java (issue ControlSystemStudio#3504).
- Remove 'public' modifier from class and test methods (java:S5786 — JUnit 5 does not require public visibility) - Add assertThat(sorted.size(), equalTo(N)) in testOldBehaviorTriggersExcessiveSorts to resolve the unused 'sorted' variable warning (java:S1481) while also adding a meaningful assertion on the sorted view size
|
kasemir
approved these changes
Apr 6, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.



perf (alarm): fix O(N²·log N) table update in AlarmTableUI — closes #3504
PR link: https://github.com/emilioheredia-source/phoebus/pull/new/alarm_handler_fixes
Upstream issue: #3504
Summary
Fixes the performance regression reported in #3504, which was independently observed at CLS with ~2000 active alarms. When the alarm table contained a few thousand active alarms the UI became unusable, with multi-second freezes on every update cycle. This PR identifies the root cause as a quadratic-time interaction between JavaFX's
ObservableListextractor and a continuously-boundSortedListcomparator, and fixes it with a targeted change that reduces update cost from O(N²·log N) to a single O(N·log N) sort. Three headless regression tests are included that prove both the old behaviour and the fix without requiring a live alarm server.Root cause analysis
AlarmTableUIholds twoObservableList<AlarmInfoRow>fields created with theCHANGING_PROPERTIESextractor:The extractor tells JavaFX to watch all eight
SimpleXxxPropertyfields of each row (pv,description,severity,status,time,value,pv_severity,pv_status) and fire a list-change event whenever any of them changes. This is intentional: it keeps theSortedListwrapper up to date when an alarm's severity changes in place.Each
TableViewwraps the raw list in aSortedListwhose comparator is bound to the table'scomparatorProperty:The
SortedListresponds to every list-change event — including the item-property events from the extractor — by performing a full sort.update()callsAlarmInfoRow.copy()on each existing row to move in the new data.copy()callsset()on each of the eight properties. Everyset()fires a list-change event through the extractor. Every list-change event fires a full sort.Total comparator invocations per
update()call:copy()At N = 2000 (a normal load) and a comparator cost of ~1 µs: roughly 2000² × log(2000) × 8 × 1 µs ≈ 180 seconds of comparator work per update cycle. The profiler screenshot in #3504 shows
SortedList.sort()as the dominant hot-spot, consistent with this analysis.Fix
The fix is in
AlarmTableUI.update(). Before the copy loop, theSortedListcomparator is unbound and set tonull. With a null comparator theSortedListmirrors the source in insertion order and ignores all item-property events. After the loop completes, the comparator is rebound in afinallyblock — producing exactly one O(N · log N) sort over all N updated rows.The
finallyblock is the key safety guarantee: if any exception is thrown inside thetry(e.g. anIndexOutOfBoundsExceptionfrom a concurrent modification, or a comparator throwing on malformed data), thesortedlist is always left with a live comparator bound to the table. Without it, the table would silently stop sorting for the remainder of the session.Updated complexity:
set()calls, no sort)Regression tests
Three headless JUnit 5 tests in
app/alarm/ui/src/test/…/AlarmTableUpdatePerformanceTest.javacover the fix without requiring a live alarm server or a JavaFX display (onlyjavafx.baseclasses are used):testOldBehaviorTriggersExcessiveSortsConstructs N rows with an active comparator and runs the old code path (comparator present throughout
copy()loop). Asserts that comparator invocations exceed the cost of a single sort (i.e. proves the bug). At N = 150: 152,060 comparator calls vs. an upper bound of 3,000 for one sort.testFixSuppressesIntermediateSortsAndPreservesOrderRuns the fixed code path (null → copy → rebind). Asserts that comparator invocations stay within one sort's cost and that the final list is in the correct ascending order. At N = 150: 696 comparator calls, final order verified at every index.
testFixOutperformsOldCodeAndProducesSameOrderRuns both code paths on identical input and asserts at least 5× fewer comparator invocations in the fixed path (actual: 218×), and that both paths produce identical final row order.
Files changed
app/alarm/ui/…/AlarmTableUI.javaupdate(): unbind comparator before copy loop, rebind infinallyapp/alarm/ui/…/AlarmTableUpdatePerformanceTest.javaTesting notes
What has been run
AlarmTableModelTest,AlarmTreeValidatePathTest,AlarmURITest,AreaFilterTest.javafx.baseheadless data classes.End-to-end testing
End-to-end validation against a live alarm server has not yet been performed. We are currently setting up a local test instance of the Phoebus alarm backend to run a smoke-test with synthetic alarms at scale; this PR will be updated once that test is conducted.
In the meantime, the risk of the current change introducing a regression is low: the fix is confined to a single private method, the
finallyblock guarantees theSortedListcomparator is always restored regardless of any exception, and the only externally observable side-effect is that column sort order is applied once per update cycle instead of once per property-change event — which is both correct and the intended behaviour. We therefore opted to submit the fix now with the analytical and unit-test validation already in place rather than hold it until the server setup is complete.