Skip to content

Fix integrity error handling#536

Open
jesperhodge wants to merge 10 commits intoopenedx:mainfrom
jesperhodge:fix-integrity-error
Open

Fix integrity error handling#536
jesperhodge wants to merge 10 commits intoopenedx:mainfrom
jesperhodge:fix-integrity-error

Conversation

@jesperhodge
Copy link
Copy Markdown

@jesperhodge jesperhodge commented Apr 8, 2026

Fixes openedx/modular-learning#261

Idea right now:

  • Validate against duplicate values before they reach the database and cause an IntegrityError (this is not implemented here yet)
  • Handle any uncaught internal errors, including integrity errors, throughout content-tagging. For security reasons we do not want to expose internal error information directly so we return a generic 500 response.
  • I made a custom error handling mixin so that this is handled for the content_tagging api in general.

Question: This would be better handled at the CMS level?

Answer: There is ongoing work to fix this up, starting with a proposal. See https://openedx.slack.com/archives/CHYH0BDTR/p1775053660743269 and openedx/openedx-platform#38246 in particular.

@openedx-webhooks openedx-webhooks added the open-source-contribution PR author is not from Axim or 2U label Apr 8, 2026
@openedx-webhooks
Copy link
Copy Markdown

openedx-webhooks commented Apr 8, 2026

Thanks for the pull request, @jesperhodge!

This repository is currently maintained by @axim-engineering.

Once you've gone through the following steps feel free to tag them in a comment and let them know that your changes are ready for engineering review.

🔘 Get product approval

If you haven't already, check this list to see if your contribution needs to go through the product review process.

  • If it does, you'll need to submit a product proposal for your contribution, and have it reviewed by the Product Working Group.
    • This process (including the steps you'll need to take) is documented here.
  • If it doesn't, simply proceed with the next step.
🔘 Provide context

To help your reviewers and other members of the community understand the purpose and larger context of your changes, feel free to add as much of the following information to the PR description as you can:

  • Dependencies

    This PR must be merged before / after / at the same time as ...

  • Blockers

    This PR is waiting for OEP-1234 to be accepted.

  • Timeline information

    This PR must be merged by XX date because ...

  • Partner information

    This is for a course on edx.org.

  • Supporting documentation
  • Relevant Open edX discussion forum threads
🔘 Get a green build

If one or more checks are failing, continue working on your changes until this is no longer the case and your build turns green.

Details
Where can I find more information?

If you'd like to get more details on all aspects of the review process for open source pull requests (OSPRs), check out the following resources:

When can I expect my changes to be merged?

Our goal is to get community contributions seen and reviewed as efficiently as possible.

However, the amount of time that it takes to review and merge a PR can vary significantly based on factors such as:

  • The size and impact of the changes that it introduces
  • The need for product review
  • Maintenance status of the parent repository

💡 As a result it may take up to several weeks or months to complete a review and merge your PR.

Copy link
Copy Markdown
Member

@kdmccormick kdmccormick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jesperhodge I agree that exception details should not be passed to the end user on a production site. However, it's useful behavior for developers. I'd expect the stack trace to be passed upwards when DEBUG==True, and a generic 500 to be shown when DEBUG==False.

Did you test with DEBUG==False? You can do django settings override, or just run the site with tutor local ....

@kdmccormick
Copy link
Copy Markdown
Member

To you other point:

Validate against duplicate values before they reach the database and cause an IntegrityError (this is not implemented here yet)

yep, I agree. I think it's fine to either check for conflicts before renaming the tag, or use a very targeted except IntegrityError to catch the conflict.

@mphilbrick211 mphilbrick211 moved this from Needs Triage to Waiting on Author in Contributions Apr 9, 2026
@jesperhodge
Copy link
Copy Markdown
Author

I did have a look at openedx/openedx-platform#38246, an ADR in review that aims to standardize errors platform-wide. (Also reviewed it and left my thoughts.)

I decided against implementing the proposed format more extensively here at this point, since there is still a lot to discuss around that.

Rather, I think it makes sense that after that ADR is finalized we just implement a top-level CMS error handler, and this temporary solution in this PR can be removed at that time.

@kdmccormick
Copy link
Copy Markdown
Member

To close the loop on this:

Did you test with DEBUG==False? You can do django settings override, or just run the site with tutor local ....

Jesper discussed in Slack that he did test with DEBUG==False, and he and Braden confirmed that this error detail leakage is a standing issue with DRF in general.

Rather, I think it makes sense that after that ADR is finalized we just implement a top-level CMS error handler, and this temporary solution in this PR can be removed at that time.

That approach sounds good to me @jesperhodge . Could you just include a link to the ADR PR in the temporary handlers so that we remember to remove them later?

@jesperhodge jesperhodge requested a review from kdmccormick April 9, 2026 22:08
@jesperhodge jesperhodge marked this pull request as ready for review April 9, 2026 22:09
Copy link
Copy Markdown
Member

@kdmccormick kdmccormick left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for thinking this through and coming up with a stop-gap solution.

log = logging.getLogger(__name__)


def custom_exception_handler(exc, context):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def custom_exception_handler(exc, context):
def _custom_exception_handler(exc: Exception, context: dict) -> Response:

Please use type annotations, and prefix with an underscore unless this is intended for use elsewhere.

if is_expected_exception:
return exception_handler(exc, context)

# if django settings have DEBUG=True
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
# if django settings have DEBUG=True

No need to state the obvious :)


# if django settings have DEBUG=True
if settings.DEBUG:
log.exception(exc)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logging unexpected exceptions is just as important, if not more important, on a prod instance, so I'd move this outside the if.


# if django settings have DEBUG=True
if settings.DEBUG:
log.exception(exc)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

log.exception is only to be called in an exception handler (that is, a literal python except: block, not a DRF exception handler). The first argument is supposed to be an additional message, not an exception itself.

I think best the approach would be to call log.error using the detail string you have constructed below.

Tag.objects.filter(taxonomy_id=taxonomy_id, value__in=tags_list)
.values_list("value", flat=True)
)
missing_tags = [tag for tag in tags_list if tag not in existing_tags]
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
missing_tags = [tag for tag in tags_list if tag not in existing_tags]
missing_tags = set(tags_list) - existing_tags

IMO, this is easier to read when written using set arithmetic


assert response.status_code == status.HTTP_400_BAD_REQUEST
# Check that the error message indicates the duplicate value issue
self.assertIn("Tag value \"Updated Tag\" already exists in this taxonomy", str(response.data))
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
self.assertIn("Tag value \"Updated Tag\" already exists in this taxonomy", str(response.data))
assert "Tag value \"Updated Tag\" already exists in this taxonomy" in str(response.data)

It's preferable to use the assert keyword over self.assert* whenever available. ruff will begin enforcing this soon.

"""
self.client.force_authenticate(user=self.staff)
# simulate debug mode by patching the settings.DEBUG value to True
with patch("django.conf.settings.DEBUG", True):
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
with patch("django.conf.settings.DEBUG", True):
with override_settings(DEBUG=True)

standard practice is to use override_settings for this https://docs.djangoproject.com/en/6.0/topics/testing/tools/#django.test.override_settings

Comment on lines +2216 to +2217
the API handles it gracefully and returns a 500 error with a helpful error message,
instead of crashing or exposing any sensitive information.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
the API handles it gracefully and returns a 500 error with a helpful error message,
instead of crashing or exposing any sensitive information.
the API converts it to a generic 500 response without exposing sensitive info.

Sorry to nitpick your wording, just want to highlight a few things about the framing of this:

  • There's nothing wrong with crashing when there's an unexpected error--it's basically what we're doing, anyway. I would characterize a 5xx response as a presentation of a crash.
  • There error message is not helpful, but that's on purpose.

Comment on lines +2227 to +2231
def put_request(update_data):
return self.client.put(
self.small_taxonomy_url, update_data, format="json"
)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit, non-blocking: This test is very comprehensive, but I find its implementation to be overengineered. In general, I find that our standard DRYness patterns (helper methods, always using constants, etc.) tend to increase brittleness and decrease readability when applied to test code.

Consider the fully-inlined version of this test method, which would be easily skimmable and come out to a similar number of lines or fewer.

    EXPECTED_ERROR_MESSAGE = "An unexpected error occurred while processing the request."
    ...
    ...
    response = self.client.patch(self.small_taxonomy_url, update_data, format="json")
    assert response.status == 500
    assert EXPECTED_ERROR_MESSAGE in str(response.data)

@@ -4,12 +4,14 @@
from __future__ import annotations
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the comprehensive tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

open-source-contribution PR author is not from Axim or 2U

Projects

Status: Waiting on Author

Development

Successfully merging this pull request may close these issues.

Fix API integrity on rename

4 participants