Step 1: Fast Local Scanning
The detector connects to your RTSP streams or video files. A fast YOLO model scans every frame looking for specific objects.
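
As a rough illustration of the "looking for specific objects" part, the sketch below filters a frame's raw detections down to the labels you care about. It assumes detections arrive as simple label/confidence/box records; the Detection class, the select_detections helper, and the 0.5 threshold are hypothetical names and values chosen for this example, not the detector's actual API.

```python
from dataclasses import dataclass
from typing import Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Detection:
    """One object reported by the model for a single frame (hypothetical shape)."""
    label: str
    confidence: float
    box: Tuple[int, int, int, int]  # (x1, y1, x2, y2) in pixels


def select_detections(
    detections: Iterable[Detection],
    wanted_labels: Set[str],
    min_confidence: float = 0.5,  # assumed default, tune per camera
) -> List[Detection]:
    """Keep only detections with a wanted label and enough confidence."""
    return [
        d for d in detections
        if d.label in wanted_labels and d.confidence >= min_confidence
    ]


if __name__ == "__main__":
    frame_detections = [
        Detection("person", 0.91, (40, 60, 120, 300)),
        Detection("cat", 0.88, (200, 220, 260, 280)),
        Detection("person", 0.22, (300, 50, 340, 150)),
    ]
    for hit in select_detections(frame_detections, {"person"}):
        print(hit.label, hit.confidence)
```

In a real pipeline the input list would presumably be built from the YOLO model's output for each decoded frame; only the filtering step is shown here.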
Step 2: Smart VLM Verification (Optional)
To filter out false alarms, you can ask a vision-language model (VLM) such as Gemini a custom question about the footage (e.g., 'Is there really a person?').
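
One way to picture this verification step is a small wrapper that sends the image and your question to the model and turns the free-text reply into a yes/no decision. This is a minimal sketch: verify_detection, parse_yes_no, and the ask_vlm callable are hypothetical, and ask_vlm stands in for whatever client actually calls the VLM. Treating an unclear reply as "no" is an assumption made here, not documented behavior.

```python
from typing import Callable


def parse_yes_no(answer: str) -> bool:
    """Return True only if the reply starts with the word 'yes'."""
    words = answer.strip().lower().split()
    if not words:
        return False  # empty reply: treat as not confirmed (assumption)
    first = words[0].strip(".,!:;\"'")
    return first == "yes"


def verify_detection(
    image_bytes: bytes,
    question: str,
    ask_vlm: Callable[[bytes, str], str],  # hypothetical VLM client hook
) -> bool:
    """Ask the VLM a custom question and reduce its reply to a boolean."""
    prompt = f"{question} Answer with 'yes' or 'no' only."
    return parse_yes_no(ask_vlm(image_bytes, prompt))


if __name__ == "__main__":
    # Stand-in for a real model call, used only to show the flow.
    def fake_vlm(image: bytes, prompt: str) -> str:
        return "Yes, a person is standing near the door."

    print(verify_detection(b"<jpeg bytes>", "Is there really a person?", fake_vlm))
```

Passing the model call in as a function keeps the parsing logic testable without network access or an API key.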
Step 3: Instant Alerts
Confirmed detections can be saved to disk, posted to a webhook, or sent straight to your Telegram chat, complete with images and video clips.
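
For the webhook option, an alert could be delivered as a JSON POST. The sketch below only builds the request using Python's standard library; the payload fields (label, confidence, camera), the build_webhook_request helper, and the example URL are assumptions for illustration, and the real payload format may differ.

```python
import json
import urllib.request


def build_webhook_request(
    url: str, label: str, confidence: float, camera: str
) -> urllib.request.Request:
    """Build (but do not send) a JSON POST describing one confirmed detection."""
    payload = {
        "label": label,
        "confidence": round(confidence, 3),
        "camera": camera,  # hypothetical field naming the source stream
    }
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    req = build_webhook_request(
        "https://example.com/hook", "person", 0.9123, "front-door"
    )
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```

To actually send it, you would pass the request to urllib.request.urlopen(req, timeout=10) and handle any network errors around that call.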
