Skip to content

[viostor] Fixup VirtIoAdapterControl()#1516

Draft
benyamin-codez wants to merge 4 commits intovirtio-win:masterfrom
benyamin-codez:viostor-fix-VirtIoAdapterControl
Draft

[viostor] Fixup VirtIoAdapterControl()#1516
benyamin-codez wants to merge 4 commits intovirtio-win:masterfrom
benyamin-codez:viostor-fix-VirtIoAdapterControl

Conversation

@benyamin-codez
Copy link
Copy Markdown
Contributor

Fixes up VirtIoAdapterControl().

By commit:

Cleanup variables

Consolidates the adaptExt declaration and definition to top of list.

Refactors the index variable as list_idx.

Also removes unnecessary whitespace in switch-case ladder.

Tracing

Relocates catch-all Control Type trace message to the default case for unsupported Control Types only.
This is reasonable as all supported Control Types have an identifying trace message anyway.

SupportedControlTypes

Refactors the SupportedControlTypes bit field to use the Storport ScsiAdapterControlMax method rather than an explicit BOOLEAN array.

Add ScsiAdapterSurpriseRemoval

Adds a ScsiAdapterSurpriseRemoval Control Type.

NOTE: Storport does not appear to call the ScsiAdapterSurpriseRemoval Control Type on hot-unplug. It calls the ScsiStopAdapter Control Type instead. We include this Control Type for completeness.

Consolidates the 'adaptExt' declaration and definition to top of list.
Refactors the 'index' variable as 'list_idx'.

Also removes unnecessary whitespace in switch-case ladder.

Signed-off-by: benyamin-codez <115509179+benyamin-codez@users.noreply.github.com>
Relocates catch-all Control Type trace message to the default case for
unsupported Control Types only. This is reasonable as all supported
Control Types have an identifying trace message anyway.

Signed-off-by: benyamin-codez <115509179+benyamin-codez@users.noreply.github.com>
Refactors the 'SupportedControlTypes' bit field to use the Storport
'ScsiAdapterControlMax' method rather than an explicit BOOLEAN array.

Signed-off-by: benyamin-codez <115509179+benyamin-codez@users.noreply.github.com>
Adds a 'ScsiAdapterSurpriseRemoval' Control Type.

NOTE: Storport does not appear to call the ScsiAdapterSurpriseRemoval
      Control Type on hot-unplug. It calls the ScsiStopAdapter
      Control Type instead. We include this Control Type for completeness.

Signed-off-by: benyamin-codez <115509179+benyamin-codez@users.noreply.github.com>
@YanVugenfirer
Copy link
Copy Markdown
Collaborator

[Jenkins CI]: Can one of the admins verify this patch?

@gemini-code-assist
Copy link
Copy Markdown

Summary of Changes

Hello @benyamin-codez, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on refining the VirtIoAdapterControl function within the viostor driver. The changes aim to enhance code readability through variable refactoring, improve the precision of tracing messages, and modernize the management of supported SCSI adapter control types by utilizing ScsiAdapterControlMax. Additionally, it introduces a new control type to explicitly handle surprise device removal events.

Highlights

  • Variable Refactoring: The adaptExt variable declaration and definition have been consolidated and moved to the top of the function. The loop counter Index has been renamed to list_idx for improved clarity.
  • Improved Tracing Logic: The generic RhelDbgPrint for ControlType has been relocated to the default case within the switch statement, ensuring that it only logs for unsupported control types, as supported types already have specific trace messages.
  • Dynamic Supported Control Types: The SupportedControlTypes array has been refactored to use ScsiAdapterControlMax for its size, making it more robust and scalable. Individual supported control types are now explicitly set to TRUE after initialization.
  • New Control Type Added: A new ScsiAdapterSurpriseRemoval control type has been added to handle surprise device removal events, setting the adaptExt->removed flag accordingly. This is included for completeness, although Storport may call ScsiStopAdapter instead on hot-unplug.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog
  • viostor/virtio_stor.c
    • Consolidated the declaration and definition of adaptExt to the beginning of the function.
    • Renamed the loop variable Index to list_idx for better readability.
    • Updated the SupportedControlTypes array to be sized by ScsiAdapterControlMax and explicitly initialized supported control types.
    • Moved the generic ControlType trace message to the default case of the switch statement.
    • Added a new case for ScsiAdapterSurpriseRemoval to handle surprise device removal events.
Activity
  • No human activity (comments, reviews, or progress updates) has been recorded for this pull request yet.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request provides a good cleanup and enhancement of the VirtIoAdapterControl function. The changes, including variable cleanup, improved tracing, and a more robust way of handling supported control types, significantly improve code quality and maintainability. The addition of ScsiAdapterSurpriseRemoval support is also a good step towards completeness. I have a couple of minor suggestions to further improve code style and readability, but overall this is a solid set of changes.

@YanVugenfirer
Copy link
Copy Markdown
Collaborator

Hi @benyamin-codez

Sorry, I understand the will to help. But I am against refactoring for the sake of refactoring.

Let's focus on actual issues, functionality, and enabling better debugging.
Also, I suggest before sending refactoring PRs to open issues or discuss it in Slack - so we can have a discussion of refactoring is needed and with way it should go.

BTW: removing tracing from the existing path is bad. I am for always enabling tracing during init\deinit and control. And on hot path with higher debug levels. This is really needed on order to understand the flow of operations that happens to the drivers from the OS level.

If you want to make something useful here is a feature that I think we could add:
In NetKVM we have extensive statistics gathering regarding what happens in the drivers. For example drop packets, last time the descriptor was submitted to the host, last time interrupt was received, mismatch of the RSS hash with the current CPU and the target CPU, and etc. It is both gathered as part of adapter context (similar concept exists in storage drivers) and can be fetched with WMI. Same can be done here.
This I think can be actually very useful. For example we had a discussion of the performance hit function can fail in storport, great add statistics about it. Message ID anomalies? Add statistics about it and etc. This will be really useful, rather than refactoring that hides debugging degradation.

Best regards,
Yan

@benyamin-codez
Copy link
Copy Markdown
Contributor Author

@YanVugenfirer

Hi Yan,

I was just dumping what I had learnt troubleshooting *SurpriseRemoval. I had worked on this some time ago, but it was also recently highlighted in issue #1483. The behaviour for virtio_blk is a little different than what one might expect, with Storport still calling for a UnitControl after stopping the adapter.

As it stands, I'd like to dump the series of PRs for what I have done already. They really will improve the RAS of the storage drivers. This one is a little bit on the edge but covers the bases for anyone researching this further in the future. If the project doesn't want to use them, then so be it, but I'd like to put them out there for consideration anyway, and you and the team can review them as your time permits.

As for this PR, the tracing change still reports for all pathways. You still get a message for any call, either a human readable output or a type code for the relevant enumerator. It is the same mechanics we have in vioscsi and results in a cleaner trace.

I think I really need to do the ETW PR soon, if not next. It would mean we no longer have a need to differentiate between hot and cold paths in ETW provided we turn off DBG - which we should for Release drivers anyway. The DBG option would then become something devs will compile for as the need arises. With that PR, you will get the full suite of WPP macros such as ENTER_FN(), EXIT_FN(), etc. and the tracing becomes much more readable.

Adding statistics... It could be of some advantage provided it doesn't impact performance and I think that would be the challenge. I could work on something, but that's quite a pivot, and I'd prefer to first dump what I already have and clear my horizon before embarking on a new mission. Also, the need for this might be somewhat diminished with proper tracing output as problems tend to be more easily discerned. Do you have a NetKVM function that can serve as a primer for me to consider what you have in mind...?

Best regards,
Ben

@kevmw
Copy link
Copy Markdown
Contributor

kevmw commented Feb 9, 2026

How do you get a surprise removal in practice? virtio-blk is a PCI device and my understanding is that QEMU never just rips a PCI device out, but instead it requests the guest OS to remove the device before it is actually removed on the hardware side.

@YanVugenfirer
Copy link
Copy Markdown
Collaborator

@kevmw The complicated way is to run the surprise removal test from HLK manually: https://learn.microsoft.com/en-us/windows-hardware/test/hlk/testref/reproduce-the-test-failure-by-running-the-test-from-the-command-line

I am trying to find the old pnpdtest.exe utility that was much simpler. It was part of the older WDK kits.

@YanVugenfirer
Copy link
Copy Markdown
Collaborator

@kevmw you can download the old (Windows 7) kit from here https://download.microsoft.com/download/4/A/2/4A25C7D5-EFBE-4182-B6A9-AE6850409A78/GRMWDK_EN_7600_1.ISO

There is pnpdtest.exe for appropriate architecture. It might say that some DLL is missing, but it is also part of the kit, so just copy to the same directory.
After you run the executable, you will see device tree. Find your device and click on "Test surprise remove" button.

If you want, I can send you only the test binaries.

@benyamin-codez
Copy link
Copy Markdown
Contributor Author

@kevmw

So as Yan has pointed out, there is a HLK requirement, but in practice we don't expect QEMU to do this.

When we do remove the device on QEMU, we observe some differences between the storage drivers in how this operates:

In vioscsi, we get a ScsiUnitSurpriseRemoval followed by a ScsiUnitRemove. We then get the ScsiStopAdapter.
I assume this is because an event is sent to remove the LUN first.

In viostor, we get a ScsiStopAdapter followed by a ScsiUnitRemove. It appears somewhat back-to-front.
I assume this is because there is no mechanism to remove the LUN first.
Is it possible we need a QEMU fix for this? I note it still works, it just doesn't appear clean.

@YanVugenfirer
Copy link
Copy Markdown
Collaborator

@benyamin-codez Surprise removal is a standard Windows PnP operation.

BTW: when having removable device on Windows and removing it from system tray GUI (remember the infamous eject USB device on Windows), I think in such case we also can get a surprise removal.

Regarding the difference between actual HW and QEMU - yes, this is a tricky part. With actual HW, I would expect something like a device reset as well or a device is actually gone. The HW will not attempt to do DMA anymore. But in our case the submitted descriptors to the host might be touched from the host side.
NetKVM can be a good example of handling the corner cases

VOID ParaNdis_OnPnPEvent(PARANDIS_ADAPTER *pContext, NDIS_DEVICE_PNP_EVENT pEvent, PVOID pInfo, ULONG ulSize)
...
    if (pEvent == NdisDevicePnPEventSurpriseRemoved)
    {
        // on simulated surprise removal (under PnpTest) we need to reset the device
        // to prevent any access of device queues to memory buffers

        // sync with lazy alloc thread action
        CMutexLockedContext sync(pContext->systemThread.PowerMutex());

        pContext->bSurprizeRemoved = TRUE;
        pContext->m_StateMachine.NotifySupriseRemoved();
        ParaNdis_ResetVirtIONetDevice(pContext);
    }

There is a state machine of the device that accounts for the surprise removal and notifies the "flows" and bunch of places with if(pContext->bSurprizeRemoved)

@YanVugenfirer
Copy link
Copy Markdown
Collaborator

@benyamin-codez Check ParaNdis-SM.h for the state machine code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants