Add fast skip pre-check to avoid loading full datasets for up-to-date entries by koenvo · Pull Request #73 · PySport/ingestify

koenvo · 2026-04-09T07:51:27Z

Before processing batches, this change loads a lightweight {identifier_key: last_modified_at} dict from the database in a single query (no joins to the revision/file tables). Datasets where last_modified_at >= max(file.last_modified) are skipped immediately, without the expensive get_dataset_collection call.
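A minimal sketch of the skip condition described above, not the code from this PR. The function name `can_skip`, the string identifier keys, and the way file timestamps are passed in are assumptions made for illustration; only the comparison `last_modified_at >= max(file.last_modified)` comes from the description.

```python
from datetime import datetime, timezone
from typing import Dict, Iterable, Optional

# Hypothetical stand-in for the PR's DatasetLastModifiedAtMap alias:
# identifier key -> last_modified_at stored in the database.
LastModifiedAtMap = Dict[str, datetime]


def can_skip(
    identifier_key: str,
    file_last_modified: Iterable[Optional[datetime]],
    last_modified_map: LastModifiedAtMap,
) -> bool:
    """Return True only when the stored dataset is provably up to date."""
    stored = last_modified_map.get(identifier_key)
    if stored is None:
        # Unknown dataset: never skip, let the full check decide.
        return False

    timestamps = list(file_last_modified)
    if not timestamps or any(ts is None for ts in timestamps):
        # Missing timestamp information: be conservative and do not skip.
        return False

    # Skip when nothing upstream is newer than what is already stored.
    return stored >= max(timestamps)


if __name__ == "__main__":
    stored_map = {"match_id=1": datetime(2026, 4, 9, tzinfo=timezone.utc)}
    older = datetime(2026, 4, 1, tzinfo=timezone.utc)
    print(can_skip("match_id=1", [older], stored_map))
```

Returning `False` in every uncertain case (unknown key, no files, missing timestamp) means such datasets are handed to the full check rather than skipped.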

The cache is built once per (provider, dataset_type) in the loader and reused across selectors within the same run.
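The once-per-(provider, dataset_type) caching could look roughly like the sketch below. The class `LastModifiedCache` and the injected `fetch` callable are hypothetical; in the PR the map comes from a database query (the commits mention `get_dataset_last_modified_at_map`), whose actual signature is not shown here.

```python
from datetime import datetime, timezone
from typing import Callable, Dict, Tuple

LastModifiedAtMap = Dict[str, datetime]


class LastModifiedCache:
    """Builds each (provider, dataset_type) map at most once per run."""

    def __init__(self, fetch: Callable[[str, str], LastModifiedAtMap]):
        # `fetch` is a hypothetical hook standing in for the single
        # database query that returns {identifier_key: last_modified_at}.
        self._fetch = fetch
        self._maps: Dict[Tuple[str, str], LastModifiedAtMap] = {}

    def get(self, provider: str, dataset_type: str) -> LastModifiedAtMap:
        key = (provider, dataset_type)
        if key not in self._maps:
            # First selector for this pair pays for the query.
            self._maps[key] = self._fetch(provider, dataset_type)
        # Later selectors in the same run reuse the stored map.
        return self._maps[key]


if __name__ == "__main__":
    calls = []

    def fake_fetch(provider: str, dataset_type: str) -> LastModifiedAtMap:
        calls.append((provider, dataset_type))
        return {"match_id=1": datetime(2026, 4, 9, tzinfo=timezone.utc)}

    cache = LastModifiedCache(fake_fetch)
    cache.get("some_provider", "match")
    cache.get("some_provider", "match")
    print(len(calls))
```

Keying on the pair rather than on the selector is what lets several selectors for the same provider and dataset type share one query.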

No false negatives: datasets that might need updating always fall through to the full should_refetch check.

koenvo added 4 commits April 9, 2026 09:50

Rename to get_dataset_last_modified_at_map

38d7309

Add DatasetLastModifiedAtMap type alias

0c5552a

Move DatasetLastModifiedAtMap type alias to dataset.py

dbb7b75

koenvo merged commit 3c77e1b into main Apr 9, 2026
13 checks passed

koenvo deleted the feature/fast-skip-precheck branch April 9, 2026 11:22

koenvo merged 4 commits into main from feature/fast-skip-precheck
