Merged
…ether.
* Reset fire class usage directly.
* Add easy screenshot saving with multi-view.
* Sync the viewpoint across different windows.
* Visualize the lidar center tf if `slc` is set to True.
* As we found, the AV2-provided eval_mask sometimes includes ground points in the first few frames, and some ground points have ground-truth flows because of bounding-box labeling, etc.
* The method trend should still be safe: every method here sets ground-point flow to pose_flow, so all methods share the same error if ground points are included. The dataset changes may be reverted later, since they only affect AV2.
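The ground-point convention described above can be sketched as follows; `apply_pose_flow_to_ground` and its argument names are hypothetical helpers for illustration, not the repository's actual API:

```python
import torch

def apply_pose_flow_to_ground(flow: torch.Tensor,
                              pose_flow: torch.Tensor,
                              ground_mask: torch.Tensor) -> torch.Tensor:
    """Replace predicted flow on ground points with the ego-motion
    (pose) flow. Since every method applies this, including ground
    points in the eval mask adds the same error term to all of them."""
    out = flow.clone()
    out[ground_mask] = pose_flow[ground_mask]
    return out
```
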
* Add teflowLoss into the codebase.
* Update chamfer3D with CUDA stream-style batched computation.

AI summary:
- Added automatic collection of self-supervised loss function names in `src/lossfuncs/__init__.py`.
- Improved documentation and structure of self-supervised loss functions in `src/lossfuncs/selfsupervise.py`.
- Refactored loss calculation logic in `src/trainer.py` to support new self-supervised loss functions.
- Introduced the `ssl_loss_calculator` method for handling self-supervised losses.
- Updated the training step to differentiate between self-supervised and supervised loss calculations.
- Enhanced error handling during training and validation steps to skip problematic batches.
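The "automatic collection" of self-supervised loss names could look like the sketch below; the `*Loss` naming convention and the `collect_ssl_losses` helper are assumptions for illustration, not the actual code in `src/lossfuncs/__init__.py`:

```python
import types

def collect_ssl_losses(module: types.ModuleType) -> dict:
    """Collect every callable in `module` whose name ends in 'Loss',
    so a new self-supervised loss registers itself simply by being
    defined (hypothetical convention)."""
    return {name: obj for name, obj in vars(module).items()
            if callable(obj) and name.endswith("Loss")}

# Usage sketch with a throwaway module standing in for src.lossfuncs
mod = types.ModuleType("fake_lossfuncs")
mod.chamferLoss = lambda est, gt: 0.0
mod.teflowLoss = lambda est, gt: 0.0
mod.helper_fn = lambda: None  # not collected: no 'Loss' suffix
```

A registry built this way lets the trainer look up a loss by the string name given in a config, instead of hard-coding each loss class.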
Update the slurm scripts and commands for TeFlow.
Kin-Zhang
commented
Mar 12, 2026
assets/cuda/chamfer3D/__init__.py (Outdated)
Comment on lines +96 to +128
```python
def batched(self,
            pc0_list: List[torch.Tensor],
            pc1_list: List[torch.Tensor],
            truncate_dist: float = -1) -> torch.Tensor:
    """Parallel Chamfer loss via B CUDA streams.

    Returns mean-over-samples: (1/B) * Σ_i [mean(dist0_i) + mean(dist1_i)].
    ~1.14× faster than serial loop on RTX 3090 @ 88K pts/sample;
    more importantly, keeps GPU busy with one sustained work block per frame.
    """
    B = len(pc0_list)
    if B == 1:
        return self.forward(pc0_list[0], pc1_list[0], truncate_dist)

    streams = self._ensure_streams(B)
    main = torch.cuda.current_stream()
    per_loss: List[torch.Tensor] = [None] * B  # type: ignore[list-item]

    for i in range(B):
        streams[i].wait_stream(main)
        with torch.cuda.stream(streams[i]):
            d0, d1, _, _ = ChamferDis.apply(pc0_list[i].contiguous(),
                                            pc1_list[i].contiguous())
            if truncate_dist <= 0:
                per_loss[i] = d0.mean() + d1.mean()
            else:
                v0, v1 = d0 <= truncate_dist, d1 <= truncate_dist
                per_loss[i] = torch.nanmean(d0[v0]) + torch.nanmean(d1[v1])

    for i in range(B):
        main.wait_stream(streams[i])

    return torch.stack(per_loss).mean()
```
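For readers without the CUDA extension, the same (truncated) Chamfer loss can be sketched as a small CPU reference. Note that `chamfer_cpu` is not part of the repository, and the use of squared nearest-neighbor distances is an assumption about what the kernel returns:

```python
import torch

def chamfer_cpu(pc0: torch.Tensor, pc1: torch.Tensor,
                truncate_dist: float = -1.0) -> torch.Tensor:
    """O(N*M)-memory reference Chamfer loss for small clouds / testing.
    Assumes squared nearest-neighbor distances, mirroring the
    d0.mean() + d1.mean() reduction used in the CUDA path."""
    d = torch.cdist(pc0, pc1) ** 2       # squared pairwise distances
    d0 = d.min(dim=1).values             # each pc0 point -> nearest in pc1
    d1 = d.min(dim=0).values             # each pc1 point -> nearest in pc0
    if truncate_dist > 0:                # drop far-away (outlier) matches
        d0, d1 = d0[d0 <= truncate_dist], d1[d1 <= truncate_dist]
    return d0.mean() + d1.mean()
```
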
Member
Author
Speed Performance: CUDA Streams vs For-loop
Quick demo benchmark (1 GPU, bz=8, 312 samples):
Stream CUDA: 1.14 s/it → Epoch 1: 46%|███████▍ | 18/39 [00:20<00:23, 1.14s/it]
For-loop: 1.29 s/it → Epoch 1: 46%|███████▍ | 18/39 [00:23<00:27, 1.29s/it]
1.132× faster (~13.2% speedup)
Based on the previous full training run (8 GPUs, bz=16, 153,932 samples), this reduces self-supervised training time from ~11 hours to ~9.5 hours.
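A note on reproducing numbers like those above: CUDA kernel launches are asynchronous, so wall-clock timing should synchronize the device around the timed region. A minimal, hypothetical helper (not part of this PR):

```python
import time
import torch

def bench(fn, *args, iters: int = 10) -> float:
    """Average seconds per call. Synchronizes before and after the timed
    loop so queued async CUDA work is actually counted; falls back to
    plain wall-clock timing on CPU-only machines."""
    fn(*args)                                  # warm-up (kernels, caches)
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    t0 = time.perf_counter()
    for _ in range(iters):
        fn(*args)
    if torch.cuda.is_available():
        torch.cuda.synchronize()
    return (time.perf_counter() - t0) / iters
```
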
* update training script.
Kin-Zhang
commented
Mar 31, 2026
and update version to 1.0.6
AI Summary:

TeFlow merged:
- `src/lossfuncs/selfsupervise.py`: enables TeFlow-style training signals (i.e., losses that rely on cross-frame consistency rather than only supervised labels).
- `src/lossfuncs/__init__.py`: registers the new losses so they can be selected/configured cleanly from training.
- `train.py`, `src/trainer.py`: updated to run TeFlow experiments end-to-end (config → dataset → forward → TeFlow losses → logging/metrics).

CUDA / plugin / infrastructure updates:
- `assets/cuda/chamfer3D/__init__.py`: now supports batched Chamfer distance computation using CUDA streams, significantly improving multi-sample throughput and GPU utilization. Added new utility methods, improved docstrings, and streamlined the interface for both single and batched inputs. Enhanced the test section for batched processing and index validation.
- `assets/cuda/chamfer3D/setup.py`: version bumped from 1.0.5 to 1.0.6, reflecting the new features and optimizations.

Documentation:
- `README.md`
- `assets/README.md`

TeFlow got accepted by CVPR 2026. 🎉🎉🎉 Now I'm working on releasing the code.
Please check the progress in the forked teflow branch.
Once it's ready, I will merge it into the codebase with the updated README.