Set a hardcoded cache-control for /content/storage#5829
Set a hardcoded cache-control for /content/storage#5829bjester wants to merge 2 commits intolearningequality:hotfixesfrom
Conversation
977e0f4 to
05b0b65
Compare
rtibblesbot
left a comment
There was a problem hiding this comment.
Correct, focused change — cache headers are well-applied to /content/storage/ (content-addressed URLs) while leaving the general /content/ path unmodified.
CI passing. No UI files changed.
- suggestion (see inline on
studio-content.conf): Does this need to land inunstableas well? If production images are built fromunstable, the cache headers won't take effect there until this is merged forward or cherry-picked. - nitpick (
default.conf— not in diff): Line 43 still has the old incorrecttry_filescomment: "we can just pass one named route, because it fails" — should be "can't" (as corrected in this PR'sdevelop-studio-content.confblock comment).
@rtibblesbot's comments are generated by an LLM, and should be evaluated accordingly
How was this generated?
Reviewed the pull request diff checking for:
- Correctness: bugs, edge cases, undocumented behavior, resource leaks, hardcoded values
- Design: unnecessary complexity, naming, readability, comment accuracy, redundant state
- Architecture: duplicated concerns, minimal interfaces, composition over inheritance
- Testing: behavior-based assertions, mocks only at hard boundaries, accurate coverage
- Completeness: missing dependencies, unupdated usages, i18n, accessibility, security
- Principles: DRY (same reason to change), SRP, Rule of Three (no premature abstraction)
- Checked CI status and linked issue acceptance criteria
- For UI changes: inspected screenshots for layout, visual completeness, and consistency
| proxy_hide_header Cache-Control; | ||
|
|
||
| # content is md5-addressed, so cache aggressively for successful responses | ||
| add_header Cache-Control "public, max-age=31536000, immutable, no-transform"; |
There was a problem hiding this comment.
praise: Factoring the shared cache policy into a standalone include is clean DRY — one place to update if the directive set changes, and the comments explaining why each directive exists (immutable because URLs are hash-addressed, no-transform for MD5 integrity) are exactly the right level of documentation.
rtibblesbot
left a comment
There was a problem hiding this comment.
Prior findings
Acknowledged (not re-raised):
- Merge-forward to
unstable(suggestion) — author confirmed production images are not built fromunstable; will merge down eventually
1/1 prior suggestion acknowledged. No unresolved findings.
CI passing. No UI files changed.
The new commit corrects a misleading default.conf comment ("can just" → "can't just") — good catch; that wording was backwards and could have misled future readers about nginx's try_files behavior with named routes.
@rtibblesbot's comments are generated by an LLM, and should be evaluated accordingly
How was this generated?
Compared the current PR state against findings from a prior review:
- Retrieved prior bot reviews deterministically via the GitHub API
- Classified each prior finding as RESOLVED, UNADDRESSED, ACKNOWLEDGED, or CONTESTED
- Only raised NEW findings for newly introduced code
- Reviewed the pull request diff checking for correctness, design, architecture, testing, completeness, and adherence to DRY/SRP principles
- Checked CI status and linked issue acceptance criteria
DXCanas
left a comment
There was a problem hiding this comment.
My nginx experience was never very robust; calling myself "rusty" would be overstatement 💀
I think we should change the directive to work if it's not doing what we expect it to, though? I guess it's working right now but. Maybe not for the reasons we expect?
AI-Assisted Review, though:
All other nginx directives used are valid:
- ✅
proxy_hide_header Cache-Control;- valid - ✅
add_header Cache-Control "public, max-age=31536000, immutable, no-transform";- valid (all cache-control directives are standard) - ✅
proxy_intercept_errors on;- valid - ✅
recursive_error_pages on;- valid - ✅
error_page 404 = @production_storage;- valid - ✅
rewrite ^/content/(.*)$ /$1 break;- valid (break flag is correct for location context) - ✅ Location modifier
^~- valid (prevents regex matching) - ✅ All proxy settings are correctly configured
| # Note on try_files | ||
| # ----------------- | ||
| # try_files will only use one named route, and it uses the last one. Although, we can't just pass | ||
| # one named route, because it fails. | ||
|
|
||
| location ^~ /content/storage/ { | ||
| # ensure that the /content/ prefix is stripped from the request | ||
| rewrite ^/content/(.*)$ /$1 break; | ||
|
|
||
| # check staging bucket first, then fall back to production | ||
| try_files @nowhere @hotfixes_storage; | ||
| } | ||
|
|
There was a problem hiding this comment.
AI assisted review. But looking over the docs and stack overflow, it checks out. I can't vouch for the alternatives but they seem viable?:
Critical Issue: Incorrect try_files Usage
Location: New code in studio/docker/nginx/includes/content/develop-studio-content.conf:48-53
Problem:
try_files @nowhere @hotfixes_storage;According to nginx try_files documentation:
"Checks the existence of files in the specified order and uses the first found file for request processing... The last parameter can also point to a named location"
Named locations (starting with @) can only be used as the final fallback parameter, not as file check parameters.
The current usage attempts to:
- Check for a file named "@nowhere" (invalid - this is a named location, not a file path)
- Fall back to
@hotfixes_storage
Intent vs Implementation
The intent appears to be: "always proxy to the named location without checking for files"
Recommended Fix
Replace:
location ^~ /content/storage/ {
rewrite ^/content/(.*)$ /$1 break;
try_files @nowhere @hotfixes_storage;
}With either:
Option 1 (Direct proxy - simplest):
location ^~ /content/storage/ {
rewrite ^/content/(.*)$ /$1 break;
error_page 404 = @hotfixes_storage;
return 404;
}Option 2 (If you want to check for static files first):
location ^~ /content/storage/ {
rewrite ^/content/(.*)$ /$1 break;
try_files $uri @hotfixes_storage;
}Option 3 (Use internal directive):
location ^~ /content/storage/ {
rewrite ^/content/(.*)$ /$1 break;
# Direct internal redirect to named location
# (would require restructuring to use error_page or conditional logic)
}Note
This same pattern exists in the original studio/docker/nginx/includes/content/default.conf:44 (try_files @nowhere @emulator;), so it's a pre-existing issue being propagated. However, the diff introduces a new instance of this incorrect pattern.
There was a problem hiding this comment.
The note a few lines above addresses this. The @nowhere is essentially ignored since try_files only allows one named parameter, but requires at least 2, and the last one is always used. It's really no different than the first suggestion, but IMO a bit cleaner. The documentation notes that this is equivalent, but for this particular purpose option 1 is missing a directive to ignore logging it as 404 which the nginx docs demonstrate, whereas this approach does not need that. So my approach is arguably more concise and better considering it's used more than once.
So since this is shown as equivalent by the nginx docs, and this not "incorrect", combined with the fact that we've been running this without issue for many months, I'm gonna push back and say the AI is confidently incorrect about this. It does work and it is simpler. It's hacky, but so are the other options
| proxy_hide_header Cache-Control; | ||
|
|
||
| # content is md5-addressed, so cache aggressively for successful responses | ||
| add_header Cache-Control "public, max-age=31536000, immutable, no-transform"; |
Summary
Cache-controlfor/content/storagerequestsCache-controlfrom GCS if setmax-ageto 1 year-- Cloudflare's Edge limit should be 1 month, but should respect it up-to that limitimmutableflag, since the URLs contain a hash and that's the modern best practice for static contentno-transformflag, since we rely on MD5 hashes of the content and the content should not be altered in transit (unlikely given our infrastructure)References
https://learningequality.slack.com/archives/C0WHZ9FPX/p1776109009754729
Reviewer guidance
When we deploy to
hotfixes, does it return the proper cache-control headers?AI usage
I knew pretty much what needed to be done, so I talked it through AI, researched CF's TTL limit, and implemented changes. I reviewed, critiqued what it did, ran an image build, and will test on hotfixes.