Skip to content

[vioscsi] Fix potential typecast overflow error casting to num_queues#1513

Draft
benyamin-codez wants to merge 1 commit intovirtio-win:masterfrom
benyamin-codez:vioscsi-fix-potential-typecast-overflow
Draft

[vioscsi] Fix potential typecast overflow error casting to num_queues#1513
benyamin-codez wants to merge 1 commit intovirtio-win:masterfrom
benyamin-codez:vioscsi-fix-potential-typecast-overflow

Conversation

@benyamin-codez
Copy link
Copy Markdown
Contributor

Fixes a potential typecast overflow error casting to num_queues caused when num_queues was updated to a ULONG from USHORT. The objects being cast are the ULONG variable num_cpus in VioScsiFindAdapter(), and the ULONG struct member msix_vectors from ADAPTER_EXTENSION accessed in VioScsiHwInitialize(). Removal of the cast is all that is necessary to avoid any potential typecast overflow error. It is noteworthy that the overflow error could presently only occur if MAX_CPU was ever increased to 65,536, which is very unlikely.

Fixes a potential typecast overflow error casting to 'num_queues' caused
when 'num_queues' was updated to a ULONG from USHORT. The objects being
cast are the ULONG variable 'num_cpus' in VioScsiFindAdapter(), and the
ULONG struct member 'msix_vectors' from ADAPTER_EXTENSION accessed in
VioScsiHwInitialize(). Removal of the cast is all that is necessary
to avoid any potential typecast overflow error. It is noteworthy that
the overflow error could presently only occur if MAX_CPU was ever
increased to 65,536, which is very unlikely.

Signed-off-by: benyamin-codez <115509179+benyamin-codez@users.noreply.github.com>
@gemini-code-assist
Copy link
Copy Markdown

Summary of Changes

Hello @benyamin-codez, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a potential typecast overflow issue within the vioscsi driver. By removing explicit USHORT casts during the assignment of num_cpus and msix_vectors to adaptExt->num_queues, the change ensures that ULONG values are correctly handled. This prevents data truncation and improves the robustness of the driver, particularly in scenarios where the number of CPUs or MSI-X vectors might exceed the limits of a USHORT.

Highlights

  • Typecast Removal: Removed explicit (USHORT) casts when assigning num_cpus and msix_vectors to adaptExt->num_queues.
  • Overflow Prevention: This change prevents a potential typecast overflow error, as adaptExt->num_queues is now a ULONG, and the source variables (num_cpus, msix_vectors) are also ULONG.
  • Data Integrity: Ensures that the full range of ULONG values from num_cpus and msix_vectors can be correctly assigned to adaptExt->num_queues without truncation.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@YanVugenfirer
Copy link
Copy Markdown
Collaborator

[Jenkins CI]: Can one of the admins verify this patch?

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request fixes a potential integer overflow by removing unnecessary casts to USHORT when assigning to num_queues, which is now a ULONG. The changes are correct in addressing the overflow. However, I found a logical issue in one of the modified locations. When adjusting the number of queues based on available MSI-X vectors, the logic is flawed and could lead to an incorrect number of queues being configured, potentially causing issues. I've provided a suggestion to fix this logic. Other than that, the change is good.

if (adaptExt->num_queues > 1 && ((adaptExt->num_queues + 3) > adaptExt->msix_vectors))
{
adaptExt->num_queues = (USHORT)adaptExt->msix_vectors;
adaptExt->num_queues = adaptExt->msix_vectors;
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

While removing the cast is correct to prevent a potential overflow, the logic for adjusting num_queues seems incorrect. If the condition (adaptExt->num_queues + 3) > adaptExt->msix_vectors is met, it means there are not enough MSI-X vectors for the requested number of queues. The number of queues should be reduced, but the current code sets num_queues to msix_vectors, which could potentially increase num_queues and worsen the vector deficit.

The number of queues should be reduced to a value that satisfies new_num_queues + 3 <= msix_vectors. A possible correction is to set num_queues to the maximum possible value, which would be msix_vectors - 3.

This should also handle the case where msix_vectors is too small to support even one queue with this scheme (i.e., msix_vectors < 4), by falling back to a single queue. The subsequent driver logic would then need to handle that scenario, possibly by using a single shared interrupt.

        adaptExt->num_queues = (adaptExt->msix_vectors >= 4) ? (adaptExt->msix_vectors - 3) : 1;

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will address this in a separate PR...

Copy link
Copy Markdown
Collaborator

@YanVugenfirer YanVugenfirer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR looks OK.

But I would like to repeat myself in virtio-scsi as well:
" if (adaptExt->num_queues > 1 && ((adaptExt->num_queues + 3) > adaptExt->msix_vectors))"

I don't think we need to distinguish 1 and several queues scenarios for the calculations when MSIx is present.

Basically (1 queue + control queues < msix_vectors) scenario is also problematic.

So if you have energy for PRs - please direct it to the impactful ones.

@YanVugenfirer
Copy link
Copy Markdown
Collaborator

ok to test

@benyamin-codez
Copy link
Copy Markdown
Contributor Author

The PR looks OK.

But I would like to repeat myself in virtio-scsi as well: " if (adaptExt->num_queues > 1 && ((adaptExt->num_queues + 3) > adaptExt->msix_vectors))"

I don't think we need to distinguish 1 and several queues scenarios for the calculations when MSIx is present.

Basically (1 queue + control queues < msix_vectors) scenario is also problematic.

So if you have energy for PRs - please direct it to the impactful ones.

Thanks Yan.

My thoughts were it makes sense to get this properly solved for vioscsi and viostor at the same time.
So that's what I am working on at the moment.
Let's see what I come up with.

@benyamin-codez
Copy link
Copy Markdown
Contributor Author

@kostyanf14

Do you know what happened with the HCK-CI/vioscsi-Win11_25H2x64_host_viommu run?
There was no hlkx file generated...

The Failed_84_DF_-_Reinstall_with_IO_Before_and_After__Reliability test log reports:

The operation timed out waiting for this device to complete a PnP query-remove request due to a potential hang in the device stack of a related device. The system may need to be rebooted to complete the operation.

@kostyanf14
Copy link
Copy Markdown
Member

Do you know what happened with the HCK-CI/vioscsi-Win11_25H2x64_host_viommu run?
There was no hlkx file generated...

@benyamin-codez
Have no idea yet. Rerun of test helps. Maybe one more HLK issue. This is updated Windows 11 to 25H2, previously we run 24H2. Let's see if this is regular error or not

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants