HDDS-14900. Document S3 Lifecycle Configurations - Object Expiration #9979

Open
xichen01 wants to merge 2 commits into apache:HDDS-8342 from xichen01:HDDS-14900

@xichen01
Contributor

What changes were proposed in this pull request?

Document S3 Lifecycle Configurations - Object Expiration

What is the link to the Apache JIRA?

https://issues.apache.org/jira/browse/HDDS-14900

How was this patch tested?

--

@xichen01 xichen01 requested a review from ChenSammi March 26, 2026 02:56

| Property | Description |
|----------|-------------|
| `ozone.lifecycle.service.enabled` | Whether to enable the lifecycle management service. |
Contributor

Let's show the default value of each property too.

Contributor Author

Updated.

- In OM HA mode, only the leader OM executes lifecycle evaluation tasks.
- For FSO buckets, a directory is only marked as expired and deleted if all its child files and subdirectories have expired.
- For FSO buckets using Prefix, if the Prefix does not end with `/`, it will match both the directory with the exact name and sibling directories starting with the same prefix (e.g., `dir` matches both `dir` and `dir1`).
- If an invalid lifecycle configuration exists in the OM RocksDB, the service will skip that configuration and log an error without affecting the processing of other buckets.
Contributor

With the HDDS-13548 fix, I believe no invalid lifecycle configuration can exist in the OM DB, right?

Contributor Author

Yes, this is only a theoretical possibility, so it probably does not need to be in the user documentation.
Let me delete this.
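
As an illustration of the Prefix note in the hunk above (`dir` matching both `dir` and `dir1`), here is a minimal Python sketch. It assumes the match is a plain string-prefix check on the path; the function name `prefix_matches` and the sample paths are hypothetical, and this is not the Ozone implementation. Whether the directory entry `dir` itself is covered by the prefix `dir/` in Ozone is outside the scope of this sketch.

```python
def prefix_matches(prefix: str, path: str) -> bool:
    """Return True if `path` starts with `prefix` (plain string prefix, assumed)."""
    return path.startswith(prefix)


# Hypothetical paths in an FSO bucket.
paths = ["dir", "dir/file1", "dir1", "dir1/file2", "other/file3"]

# Without a trailing slash, "dir" also matches the sibling directory "dir1".
no_slash = [p for p in paths if prefix_matches("dir", p)]

# With a trailing slash, entries under the sibling "dir1" no longer match.
with_slash = [p for p in paths if prefix_matches("dir/", p)]

print(no_slash)
print(with_slash)
```

Under this assumption, ending the Prefix with `/` is what limits a rule to a single directory tree.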


1. The lifecycle evaluation tasks running on the old leader will be interrupted.
2. After the new leader is elected, the lifecycle service restarts from the beginning. Previously evaluated buckets are not skipped, and the task starts over from the first bucket.
3. If `ozone.lifecycle.service.run.interval` is set to a large value (e.g., the default `24h`), the next round of tasks may not be scheduled until the following day after a leader transfer.
Contributor

BackgroundService schedules the task with a 0s delay, so after a leader transfer the lifecycle service on the new leader starts the task immediately.

Contributor Author

Oh, yes, that's a feature in our internal version, not a community feature. I will delete this.

In an OM HA deployment, the lifecycle service only runs on the leader OM. When a Transfer Leader operation is performed:

1. The lifecycle evaluation tasks running on the old leader will be interrupted.
2. After the new leader is elected, the lifecycle service restarts from the beginning. Previously evaluated buckets are not skipped, and the task starts over from the first bucket.
Contributor

This reminds me of one optimization we could make: introduce a finish timestamp for the bucket scan in LifeCycleConfiguration. If the OM is restarted or the OM leader is transferred, we can skip scanning a LifeCycleConfiguration if (now() - its last finish timestamp) < scan interval. What do you think?

Contributor Author

We can add a lastScanningKey field for each rule in the DB, so the new leader can list the bucket starting from lastScanningKey. This is how we implement the Transition action internally.

Contributor

We can consider adding both lastVerifyTimestamp and lastScanningKey.
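
The two ideas in this thread can be sketched together. The following Python sketch only illustrates the proposal and is not existing Ozone code: `RuleScanState`, `plan_scan`, and the return values are hypothetical, with `last_finish_timestamp` standing in for the proposed finish timestamp (lastVerifyTimestamp) and `last_scanning_key` for lastScanningKey. It also assumes that `last_scanning_key` is cleared whenever a scan completes.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class RuleScanState:
    """Hypothetical per-rule scan state persisted in the OM DB."""
    last_finish_timestamp: Optional[float] = None  # when the last full scan finished
    last_scanning_key: Optional[str] = None        # checkpoint of an unfinished scan


def plan_scan(state: RuleScanState, now: float,
              scan_interval: float) -> Tuple[str, Optional[str]]:
    """Decide what a newly elected leader does for one rule.

    Returns (action, start_key); action is "skip", "resume", or "restart".
    """
    # An interrupted scan left a checkpoint: continue from it.
    if state.last_scanning_key is not None:
        return ("resume", state.last_scanning_key)
    # The last scan finished within the interval: skip this round.
    if (state.last_finish_timestamp is not None
            and now - state.last_finish_timestamp < scan_interval):
        return ("skip", None)
    # No checkpoint and no recent finish: scan from the first key.
    return ("restart", None)
```

Checking the checkpoint before the timestamp means an interrupted scan is always completed, even if an earlier scan of the same rule finished recently; the opposite order would be an equally valid design choice.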


The compaction operation is executed asynchronously. Check the corresponding OM node's logs for completion status.

## References
Contributor

@ChenSammi ChenSammi Mar 27, 2026

Can we add a section for the supported metrics ahead of References?

Contributor Author

I'm concerned that changes to the metrics might not be synchronized to the document.

In the future, we might be able to export all metrics to the document using a plugin.

@ChenSammi
Contributor

Thanks @xichen01, the doc looks good to me overall.

@errose28
Contributor

Hi @xichen01 is this a design doc or user documentation? User docs should go to the new website in the apache/ozone-site repo. Design docs are still being committed to this repo though.

@ChenSammi
Contributor

ChenSammi commented Mar 31, 2026

> Hi @xichen01 is this a design doc or user documentation? User docs should go to the new website in the apache/ozone-site repo. Design docs are still being committed to this repo though.

@errose28, this is a user doc. But since the code is still in the feature branch and not yet merged back to master, I think it could stay in the ozone repo first. After the feature branch is merged, we can submit a new patch for it to the ozone-site repo. What do you think?

BTW, is there no automatic sync between the ozone repo and the ozone-site repo for documentation now?

@errose28
Contributor

> this is a user doc. But since the code is still in the feature branch and not yet merged back to master, I think it could stay in the ozone repo first. After the feature branch is merged, we can submit a new patch for it to the ozone-site repo. What do you think?

A better option might be to raise a draft PR in ozone-site and merge it when the feature branch is merged here. The formatting supported for pages is a little different, so it is not always a copy/paste from one to the other. Docusaurus has callouts and mdx that could be helpful, for example. ozone-site also has stricter linting rules.

> BTW, is there no automatic sync between the ozone repo and the ozone-site repo for documentation now?

For user docs we should probably delete docs and Hugo content from the Ozone repo to ensure all new user docs go to ozone-site. We have the previous docs captured as Hugo builds in the new website so removing them here would ensure they live on as read-only. The only sync process we may want to add would be for design docs.
