
Fix #480: Multi Vector > 10000 documents throws scoring error #574

Open
JiwaniZakir wants to merge 1 commit into redis:main from
JiwaniZakir:fix/480-multi-vector-10000-documents-throws-scor

Conversation


JiwaniZakir commented Apr 4, 2026

Closes #480

When a MultiVectorQuery runs against an index with more than 10,000 documents, Redis does not yield a distance value for every document, causing the bare (2 - @distance_i)/2 expression in MultiVectorQuery.__init__ (redisvl/query/aggregate.py) to raise "Could not find the value for a parameter name". The fix wraps each apply() call in an if(exists(@distance_i), (2 - @distance_i)/2, 0) guard so that documents missing a distance score default to 0 instead of erroring.

  • redisvl/query/aggregate.py, MultiVectorQuery.__init__: replaced the two bare apply() expressions with if(exists(...), ..., 0) variants.
  • tests/unit/test_aggregation_types.py, test_multi_vector_query_string: updated the expected query string to match the new if(exists(...)) form.

The fix was verified by updating the unit test in test_aggregation_types.py, which asserts the full serialized aggregation string and now passes with the new expression.
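For reference, the guarded expression described above can be sketched as a small string builder. This is an illustration only, not the code in redisvl/query/aggregate.py: the helper names `guarded_score_expr` and `score_expressions` and the `score_<i>` alias names are assumptions made for this sketch; only the expression text itself comes from this PR.

```python
def guarded_score_expr(index: int) -> str:
    """Build the guarded APPLY expression for the index-th vector distance."""
    field = f"@distance_{index}"
    # (2 - d)/2 is the distance-to-score conversion quoted in this PR;
    # if(exists(...), ..., 0) makes a missing distance evaluate to 0.
    return f"if(exists({field}), (2 - {field})/2, 0)"


def score_expressions(num_vectors: int) -> dict[str, str]:
    """Map a hypothetical alias (score_<i>) to its guarded expression."""
    return {f"score_{i}": guarded_score_expr(i) for i in range(num_vectors)}


if __name__ == "__main__":
    for alias, expr in score_expressions(2).items():
        print(f"{alias} <- {expr}")
```

A unit test in the spirit of the updated test_multi_vector_query_string would then compare the serialized query against these exact strings.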


This PR was created with AI assistance (Claude). The changes were reviewed by quality gates and a critic model before submission.


Note

Low Risk
Low risk: small query-string change that only affects how missing @distance_i values are handled, plus a matching unit-test update.

Overview
Fixes MultiVectorQuery scoring failures when Redis does not return @distance_i for some documents by wrapping each per-vector score calculation in if(exists(@distance_i), (2 - @distance_i)/2, 0) so missing distances default to 0 instead of erroring.

Updates the unit test asserting the serialized aggregation query string to match the new guarded APPLY expressions.

Reviewed by Cursor Bugbot for commit c1e69dc. Bugbot is set up for automated code reviews on this repo.
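To make the intended semantics concrete, here is a plain-Python model of what the guarded expression computes per row. It does not call Redis; representing rows as dictionaries and the function name `similarity_score` are assumptions for illustration.

```python
from typing import Mapping, Optional


def similarity_score(row: Mapping[str, float], index: int) -> float:
    """Mirror if(exists(@distance_i), (2 - @distance_i)/2, 0) for one row."""
    distance: Optional[float] = row.get(f"distance_{index}")
    if distance is None:
        # The row carries no distance for this vector: default to 0
        # rather than failing, as the guarded expression does.
        return 0.0
    return (2 - distance) / 2


# Hypothetical pipeline rows: the second row lacks distance_1.
rows = [
    {"distance_0": 0.0, "distance_1": 0.5},
    {"distance_0": 1.0},
]
scores = [[similarity_score(row, i) for i in range(2)] for row in rows]
print(scores)  # [[1.0, 0.75], [0.5, 0.0]]
```

Note the consequence of this design choice: a document with no distance for a given vector contributes a score of 0 for that vector, the same value as the largest distance (2) would produce.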

When an index has more than 10 000 documents, Redis FT.AGGREGATE can
produce pipeline rows where a VECTOR_RANGE distance attribute is absent,
causing "Could not find the value for a parameter name" errors. Wrap
each score APPLY expression with if(exists(@distance_i), ..., 0) so
missing distances default to 0 instead of crashing.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings April 4, 2026 09:04

jit-ci bot commented Apr 4, 2026


Contributor

Copilot AI left a comment


Pull request overview

Fixes an aggregation scoring failure in MultiVectorQuery when Redis does not yield a distance_i value for some documents (reported when querying indexes with >10,000 documents), by guarding the distance-to-similarity conversion with exists() and defaulting missing distances to a score of 0.

Changes:

  • Wrap per-vector similarity computation in if(exists(@distance_i), ..., 0) to prevent APPLY from erroring when @distance_i is missing.
  • Update the unit test’s expected serialized aggregation string to match the new guarded APPLY expressions.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| redisvl/query/aggregate.py | Adds an exists() guard around per-vector distance normalization to avoid aggregation errors when distance_i is absent. |
| tests/unit/test_aggregation_types.py | Updates the expected query string for MultiVectorQuery to reflect the new if(exists(...)) scoring expression. |
