Reddit user seeks better strand clustering with YOLO and XGBoost
70% accuracy on 8-group strand detection leaves room for improvement…
A computer vision developer on Reddit is tackling a tricky clustering problem: after training a YOLO model to detect individual strands in video frames, they need to group those detections into clusters based on spatial separation and output a left-to-right string per column (e.g., '1-2-3' or '1-2-3-2-3'). The constraints are tight — at most 8 groups and each group can contain no more than 3 strands.
Their first attempt, an XGBoost classification model, currently reaches only ~70% accuracy. The poster's Bayes error analysis, however, indicates that the data is separable enough to support much better performance, which prompted them to ask the community for alternative approaches. The project also involves handling background detections (visible in the fourth column of the visualizations) and scaling detections by bounding box area, both of which add complexity to the clustering task.
- YOLO detects strands in video frames; an XGBoost classifier then assigns them to groups, with at most 8 groups and at most 3 strands per group.
- Current accuracy is ~70%, although the Bayes error analysis suggests better performance is achievable.
- Output is a left-to-right string representing strand counts per cluster (e.g., '1-2-3').
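To make the target output concrete, here is a minimal sketch of a gap-based grouping baseline. It is not the poster's XGBoost approach. It assumes detections arrive as `(x1, y1, x2, y2)` pixel boxes and that groups are separated by horizontal gaps wider than a fraction of the median box width. The function names, the box format, and the `gap_factor` parameter are all hypothetical, and background detections are not filtered here.

```python
from typing import List, Tuple

# Assumed box format: (x1, y1, x2, y2) in pixels. Hypothetical, not from the post.
Box = Tuple[float, float, float, float]

MAX_GROUPS = 8      # constraint stated in the post
MAX_PER_GROUP = 3   # constraint stated in the post


def group_strands(boxes: List[Box], gap_factor: float = 0.5) -> str:
    """Group boxes left to right by horizontal gaps and return e.g. '1-2-3'."""
    if not boxes:
        return ""
    ordered = sorted(boxes, key=lambda b: b[0])
    # Use the median box width as a scale for what counts as a "large" gap.
    widths = sorted(b[2] - b[0] for b in ordered)
    threshold = gap_factor * widths[len(widths) // 2]

    counts = [1]
    right_edge = ordered[0][2]
    for x1, _, x2, _ in ordered[1:]:
        if x1 - right_edge > threshold:
            counts.append(1)      # gap is wide enough: start a new group
        else:
            counts[-1] += 1       # still inside the current group
        right_edge = max(right_edge, x2)
    return "-".join(str(c) for c in counts)


def satisfies_constraints(label: str) -> bool:
    """Check the 8-group and 3-strands-per-group limits on an output string."""
    if not label:
        return True
    counts = [int(c) for c in label.split("-")]
    return len(counts) <= MAX_GROUPS and all(c <= MAX_PER_GROUP for c in counts)


if __name__ == "__main__":
    demo = [(0, 0, 10, 50), (12, 0, 22, 50), (60, 0, 70, 50)]
    label = group_strands(demo)
    print(label, satisfies_constraints(label))
```

A rule like this could serve as a reference point when comparing learned models, since any output that fails `satisfies_constraints` signals either a missed gap or spurious detections.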
Why It Matters
This problem mirrors real-world tracking and grouping challenges in industrial video analytics.