Return correct type when invoking GetFieldType and GetProviderSpecificFieldType for vector float32 column#4105

Open
apoorvdeshmukh wants to merge 5 commits into main from dev/ad/4104

Conversation

@apoorvdeshmukh
Contributor

Description

Fixes #4104.
GetFieldType returned the type from MetaType.ClassType, which is byte[] for the Vector metatype rather than SqlVector<float>.
The fix follows the same pattern as GetValue to determine the correct type for the Vector metatype.
The same fix is applied to GetProviderSpecificFieldType, which had a similar issue.

Issues

#4104

Testing

Extended NativeVectorFloat32Tests with test coverage for above APIs.

@apoorvdeshmukh apoorvdeshmukh requested a review from a team as a code owner March 31, 2026 09:55
Copilot AI review requested due to automatic review settings March 31, 2026 09:55
@github-project-automation github-project-automation bot moved this to To triage in SqlClient Board Mar 31, 2026
@apoorvdeshmukh apoorvdeshmukh added this to the 7.0.1 milestone Mar 31, 2026
Contributor

Copilot AI left a comment

Pull request overview

Fixes SqlDataReader.GetFieldType() and GetProviderSpecificFieldType() to report the correct CLR type for vector(float32) columns (matching how GetValue() already materializes the value as SqlVector<float>), addressing issue #4104.

Changes:

  • Update SqlDataReader to return typeof(SqlVector<float>) for SqlDbTypeExtensions.Vector (float32) in GetFieldTypeInternal.
  • Apply the same correction to GetProviderSpecificFieldTypeInternal.
  • Add a manual test covering both APIs for a vector(float32) column.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlClient/SqlDataReader.cs | Special-cases Vector metadata so GetFieldType/GetProviderSpecificFieldType report SqlVector<float> instead of the metatype’s default byte[]. |
| src/Microsoft.Data.SqlClient/tests/ManualTests/SQL/VectorTest/NativeVectorFloat32Tests.cs | Adds test coverage to validate the updated type reporting and alignment with GetValue(). |

@rhuijben

rhuijben commented Mar 31, 2026

Shouldn't this be fixed in the SqlBuffer layer? It already identifies SqlVector and does the deserialization.

Something like

diff --git a/src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlClient/SqlBuffer.cs b/src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlClient/SqlBuffer.cs
index dc1abf1c..04683c3c 100644
--- a/src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlClient/SqlBuffer.cs
+++ b/src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlClient/SqlBuffer.cs
@@ -1136,8 +1136,16 @@ namespace Microsoft.Data.SqlClient
                     case StorageType.String:
                         return String;
                     case StorageType.SqlBinary:
-                    case StorageType.Vector:
                         return ByteArray;
+                    case StorageType.Vector:
+                        var elementType = (MetaType.SqlVectorElementType)_value._vectorInfo._elementType;
+                        switch (elementType)
+                        {
+                            case MetaType.SqlVectorElementType.Float32:
+                                return GetSqlVector<float>();
+                            default:
+                                throw SQL.VectorTypeNotSupported(elementType.ToString());
+                        }
                     case StorageType.SqlCachedBuffer:
                         {
                             // If we have a CachedBuffer, it's because it's an XMLTYPE column
@@ -1213,6 +1221,15 @@ namespace Microsoft.Data.SqlClient
                     case StorageType.Json:
                         return typeof(SqlJson);
                         // Time Date DateTime2 and DateTimeOffset have no direct Sql type to contain them
+                    case StorageType.Vector:
+                        var elementType = (MetaType.SqlVectorElementType)_value._vectorInfo._elementType;
+                        switch (elementType)
+                        {
+                            case MetaType.SqlVectorElementType.Float32:
+                                return typeof(SqlVector<float>);
+                            default:
+                                throw SQL.VectorTypeNotSupported(elementType.ToString());
+                        }
                 }
             }
             else
@@ -1262,7 +1279,14 @@ namespace Microsoft.Data.SqlClient
                     case StorageType.Json:
                         return typeof(string);
                     case StorageType.Vector:
-                        return typeof(byte[]);
+                        var elementType = (MetaType.SqlVectorElementType)_value._vectorInfo._elementType;
+                        switch (elementType)
+                        {
+                            case MetaType.SqlVectorElementType.Float32:
+                                return typeof(SqlVector<float>);
+                            default:
+                                throw SQL.VectorTypeNotSupported(elementType.ToString());
+                        }
 #if NET
                     case StorageType.Time:
                         return typeof(TimeOnly);

I think this last hunk is the current issue. The type returned should be retrievable.

Not sure whether some of this should be extracted into a helper function, though. Also, more than just Float32 could be supported these days, such as half-precision float vectors.
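
One way the repeated switch in the hunks above could be folded into a helper is sketched here. This is only an illustration: `VectorElementType`, `SqlVectorStandIn<T>`, and `VectorTypeHelper` are hypothetical stand-ins so the snippet compiles on its own, and `NotSupportedException` stands in for `SQL.VectorTypeNotSupported`. None of these names come from the driver.

```csharp
using System;

// Hypothetical stand-in for MetaType.SqlVectorElementType (values are illustrative).
public enum VectorElementType : byte
{
    Float32
}

// Hypothetical stand-in for SqlVector<T>, present only so the sketch is self-contained.
public readonly struct SqlVectorStandIn<T> where T : unmanaged
{
}

public static class VectorTypeHelper
{
    // Maps a vector element type to the CLR type a reader would report for the column.
    // Unknown element types throw, mirroring the default branch in the diff above.
    public static Type GetClrType(VectorElementType elementType) => elementType switch
    {
        VectorElementType.Float32 => typeof(SqlVectorStandIn<float>),
        _ => throw new NotSupportedException(
            $"Vector element type '{elementType}' is not supported."),
    };
}
```

With a helper of this shape, each `case StorageType.Vector:` branch would reduce to a single call, and adding another element type later would touch one place instead of three.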

@apoorvdeshmukh apoorvdeshmukh marked this pull request as draft March 31, 2026 10:30
@mdaigle mdaigle modified the milestones: 7.0.1, 7.1.0-preview1 Apr 1, 2026
@mdaigle mdaigle added the Hotfix 7.0.1 When this PR merges, automatically open a PR to cherry-pick to the 7.0.1 branch label Apr 1, 2026
@mdaigle mdaigle moved this from To triage to In progress in SqlClient Board Apr 1, 2026
@apoorvdeshmukh
Contributor Author

Shouldn't this be fixed on the SqlBuffer layer. It already identifies SqlVector and does the deserialization.

Something like

diff --git a/src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlClient/SqlBuffer.cs b/src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlClient/SqlBuffer.cs
index dc1abf1c..04683c3c 100644
--- a/src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlClient/SqlBuffer.cs
+++ b/src/Microsoft.Data.SqlClient/src/Microsoft/Data/SqlClient/SqlBuffer.cs
@@ -1136,8 +1136,16 @@ namespace Microsoft.Data.SqlClient
                     case StorageType.String:
                         return String;
                     case StorageType.SqlBinary:
-                    case StorageType.Vector:
                         return ByteArray;
+                    case StorageType.Vector:
+                        var elementType = (MetaType.SqlVectorElementType)_value._vectorInfo._elementType;
+                        switch (elementType)
+                        {
+                            case MetaType.SqlVectorElementType.Float32:
+                                return GetSqlVector<float>();
+                            default:
+                                throw SQL.VectorTypeNotSupported(elementType.ToString());
+                        }
                     case StorageType.SqlCachedBuffer:
                         {
                             // If we have a CachedBuffer, it's because it's an XMLTYPE column
@@ -1213,6 +1221,15 @@ namespace Microsoft.Data.SqlClient
                     case StorageType.Json:
                         return typeof(SqlJson);
                         // Time Date DateTime2 and DateTimeOffset have no direct Sql type to contain them
+                    case StorageType.Vector:
+                        var elementType = (MetaType.SqlVectorElementType)_value._vectorInfo._elementType;
+                        switch (elementType)
+                        {
+                            case MetaType.SqlVectorElementType.Float32:
+                                return typeof(SqlVector<float>);
+                            default:
+                                throw SQL.VectorTypeNotSupported(elementType.ToString());
+                        }
                 }
             }
             else
@@ -1262,7 +1279,14 @@ namespace Microsoft.Data.SqlClient
                     case StorageType.Json:
                         return typeof(string);
                     case StorageType.Vector:
-                        return typeof(byte[]);
+                        var elementType = (MetaType.SqlVectorElementType)_value._vectorInfo._elementType;
+                        switch (elementType)
+                        {
+                            case MetaType.SqlVectorElementType.Float32:
+                                return typeof(SqlVector<float>);
+                            default:
+                                throw SQL.VectorTypeNotSupported(elementType.ToString());
+                        }
 #if NET
                     case StorageType.Time:
                         return typeof(TimeOnly);

I think this last hunk is the current issue. The type returned should be retrievable.

Not sure if some of this should be extracted to a helper function or something though. And more than just Float32 could be supported these days, like Half Precision Float Vectors

Thanks for the detailed analysis @rhuijben! You're right that SqlBuffer.GetTypeFromStorageType(isSqlType: false) returning typeof(byte[]) for StorageType.Vector is inconsistent with how other APIs surface vector data. However, GetFieldType()/GetProviderSpecificFieldType() work from column metadata (available before Read()), not from SqlBuffer (populated after Read()), so the SqlDataReader fix is needed regardless.

And yes, support for Half precision vectors is coming soon in SqlClient. :)
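
The metadata-versus-buffer distinction can be checked from the consumer side. The sketch below is written against the base `DbDataReader` API (which `SqlDataReader` derives from) and compares the type reported by `GetFieldType` before `Read()` with the runtime type of the first value. `FieldTypeCheck` and `MetadataMatchesValue` are names made up for this illustration. With this PR applied, a vector(float32) column would be expected to pass the check, assuming `GetValue()` returns `SqlVector<float>` as described above.

```csharp
using System;
using System.Data.Common;

public static class FieldTypeCheck
{
    // Returns true when the type reported from column metadata is compatible
    // with the runtime type of the first row's value. Also returns true when
    // there is no non-null value to compare against.
    public static bool MetadataMatchesValue(DbDataReader reader, int ordinal)
    {
        // Resolved from column metadata, so no row needs to be read first.
        Type declared = reader.GetFieldType(ordinal);

        if (!reader.Read() || reader.IsDBNull(ordinal))
        {
            return true;
        }

        // Materialized from the row buffer after Read().
        object value = reader.GetValue(ordinal);
        return declared.IsInstanceOfType(value);
    }
}
```

A reader that reports `byte[]` from metadata while `GetValue()` hands back a vector object would fail this check, which is the mismatch reported in #4104.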

@apoorvdeshmukh apoorvdeshmukh marked this pull request as ready for review April 2, 2026 14:11
Copilot AI review requested due to automatic review settings April 2, 2026 14:12
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

@codecov

codecov bot commented Apr 2, 2026

Codecov Report

❌ Patch coverage is 93.75000% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 66.49%. Comparing base (60d4b92) to head (3b81aa1).
⚠️ Report is 14 commits behind head on main.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| ...ient/src/Microsoft/Data/SqlClient/SqlDataReader.cs | 93.75% | 1 Missing ⚠️ |

❗ There is a different number of reports uploaded between BASE (60d4b92) and HEAD (3b81aa1).

HEAD has 1 upload less than BASE

| Flag | BASE (60d4b92) | HEAD (3b81aa1) |
| --- | --- | --- |
| CI-SqlClient | 1 | 0 |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4105      +/-   ##
==========================================
- Coverage   73.22%   66.49%   -6.73%     
==========================================
  Files         280      274       -6     
  Lines       43000    65794   +22794     
==========================================
+ Hits        31486    43752   +12266     
- Misses      11514    22042   +10528     

| Flag | Coverage Δ |
| --- | --- |
| CI-SqlClient | ? |
| PR-SqlClient-Project | 66.49% <93.75%> (?) |

@paulmedynski paulmedynski self-assigned this Apr 2, 2026

Labels

Hotfix 7.0.1 When this PR merges, automatically open a PR to cherry-pick to the 7.0.1 branch

Projects

Status: In progress

Development

Successfully merging this pull request may close these issues.

GetFieldType() does not Return SqlVector<float>

5 participants