HDDS-14870. Allow balancing of over replicated and quasi closed containers #9964
sarvekshayr wants to merge 6 commits into apache:master
sreejasahithi left a comment
Thanks @sarvekshayr
I wanted to double check my understanding:
When includeNonStandardContainers is true, ContainerBalancerSelectionCriteria correctly allows:
- Case A: CLOSED + OVER_REPLICATED containers (with min CLOSED replicas + QUASI_CLOSED replicas)
- Case B: QUASI_CLOSED containers with all QUASI_CLOSED replicas
However, MoveManager.move() still enforces:
- Health must be HEALTHY – so OVER_REPLICATED is rejected with REPLICATION_NOT_HEALTHY_BEFORE_MOVE
- Container state must be CLOSED – so QUASI_CLOSED is rejected with REPLICATION_FAIL_CONTAINER_NOT_CLOSED
Since MoveManager does not read this config, it never sees includeNonStandardContainers. That would mean these containers get selected but then fail when we actually try to move them.
Did you intend to update MoveManager as well to honor this config, or is there another path I am missing? I want to make sure I am not misreading the flow.
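To make the concern concrete, here is a minimal sketch of the mismatch as I understand it. Everything in it is a simplified stand-in, not the actual Ozone code: the class, the enums, and the method signatures are hypothetical, replica-level checks are omitted, and the order of the two checks in move() is an assumption. Only the two result names are taken from the rejections described above.

```java
public class MoveGateSketch {
  // Hypothetical, simplified stand-ins for the real container state and health types.
  public enum State { CLOSED, QUASI_CLOSED }
  public enum Health { HEALTHY, OVER_REPLICATED }
  public enum MoveResult {
    COMPLETED,
    REPLICATION_NOT_HEALTHY_BEFORE_MOVE,
    REPLICATION_FAIL_CONTAINER_NOT_CLOSED
  }

  // Selection side: with the flag on, Case A (CLOSED + OVER_REPLICATED)
  // and Case B (QUASI_CLOSED) are accepted. Replica-level conditions are left out.
  public static boolean selectable(State state, Health health, boolean includeNonStandard) {
    if (state == State.CLOSED && health == Health.HEALTHY) {
      return true; // standard container, always eligible
    }
    if (!includeNonStandard) {
      return false;
    }
    boolean caseA = state == State.CLOSED && health == Health.OVER_REPLICATED;
    boolean caseB = state == State.QUASI_CLOSED;
    return caseA || caseB;
  }

  // Move side as described above: it takes no flag, so it cannot relax its checks.
  public static MoveResult move(State state, Health health) {
    if (state != State.CLOSED) {
      return MoveResult.REPLICATION_FAIL_CONTAINER_NOT_CLOSED;
    }
    if (health != Health.HEALTHY) {
      return MoveResult.REPLICATION_NOT_HEALTHY_BEFORE_MOVE;
    }
    return MoveResult.COMPLETED;
  }
}
```

In this sketch, a CLOSED + OVER_REPLICATED container passes selectable(...) with the flag on but move(...) still returns REPLICATION_NOT_HEALTHY_BEFORE_MOVE, which is the gap I am asking about.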
sreejasahithi left a comment
Thanks @sarvekshayr for updating MoveManager to respect includeNonStandardContainers as well.
Just wanted to double check something against the PR description, which mentions additional QUASI_CLOSED replicas in the over-replicated CLOSED case.
From the current implementation, it looks like:
- If the replica on the source datanode is CLOSED and health is OVER_REPLICATED (with the flag on), balancing is allowed without requiring that the “extra” replicas are QUASI_CLOSED (could be extra CLOSED replicas too).
- If the replica on the source is QUASI_CLOSED, we only require enough CLOSED replicas (>= required count).
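Written out as code, my reading of the two conditions is roughly the following. This is only a sketch of how I understand the behavior, not the code in the PR: the class, the enum, the method name, and its parameters are all hypothetical.

```java
import java.util.List;

public class SourceReplicaCheckSketch {
  // Hypothetical stand-in for the replica state type.
  public enum ReplicaState { CLOSED, QUASI_CLOSED }

  // Returns whether a non-standard container would be allowed for balancing
  // under the two conditions listed above. Standard containers are out of scope here.
  public static boolean allowedAsNonStandard(
      ReplicaState sourceReplica,
      boolean overReplicated,
      List<ReplicaState> allReplicas,
      int requiredCount,
      boolean includeNonStandard) {
    if (!includeNonStandard) {
      return false;
    }
    if (sourceReplica == ReplicaState.CLOSED) {
      // Condition 1: only the health is checked; the extra replicas
      // may be CLOSED or QUASI_CLOSED.
      return overReplicated;
    }
    // Condition 2: source is QUASI_CLOSED; only the number of CLOSED replicas matters.
    long closedCount = allReplicas.stream()
        .filter(s -> s == ReplicaState.CLOSED)
        .count();
    return closedCount >= requiredCount;
  }
}
```

For example, with a required count of 3, four CLOSED replicas and a CLOSED source would be allowed under condition 1 even though none of the extras is QUASI_CLOSED, which is the part that seems to differ from the PR description.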
What changes were proposed in this pull request?
Allow the container balancer to include non-standard containers, such as over-replicated and quasi-closed containers.
This is allowed only if the new config hdds.container.balancer.include.non.standard.containers is set to true.
What is the link to the Apache JIRA
HDDS-14870
How was this patch tested?
Added tests in TestContainerBalancerSelectionCriteria and TestMoveManager.