Computer Vision Engineer
SportVot
2 - 5 years
Mumbai
Posted: 05/01/2026
Job Description
Key Responsibilities
- Core Video Analysis: Implement and optimize fundamental video processing algorithms, including Optical Flow, Motion Estimation, and Background Subtraction to handle dynamic sports footage.
- Spatio-Temporal Modeling: Develop models that understand time as well as space, utilizing 3D CNNs, RNNs/LSTMs, or Temporal Transformers to recognize actions and events over sequences of frames.
- Object Tracking & Re-Identification: Build robust tracking pipelines using industry-standard algorithms (e.g., Kalman Filters, SORT, DeepSORT) to maintain the identity of players and objects through occlusions.
- Advanced Architectures: Research and integrate state-of-the-art models, including Vision Transformers (ViT) and Attention Mechanisms, to improve accuracy beyond traditional CNN limits.
- Agentic AI Workflows: Assist in designing Agentic AI systems where autonomous agents plan multi-step video analysis tasks (e.g., deciding when to focus on specific game events) with minimal human intervention.
- Data & Pipeline Strategy: Manage video datasets and collaborate with the engineering team to deploy efficient inference pipelines.
Required Skills & Qualifications
- Education: Bachelor's or Master's degree in Computer Science, AI, Data Science, or a related field.
- Core Computer Vision: Strong understanding of traditional CV concepts:
  - Image Geometry & Camera Calibration
  - Feature Extraction (SIFT, SURF, ORB)
  - Image Filtering & Edge Detection
- Deep Learning for Video: In-depth knowledge of neural network architectures tailored for video:
  - CNNs (ResNet, EfficientNet) for spatial features.
  - Sequence Models (RNN, LSTM, GRU) for temporal dependencies.
  - 3D CNNs (C3D, I3D, X3D) for spatiotemporal feature learning.
- Transformers & Attention: Understanding of Self-Attention mechanisms, Vision Transformers (ViT), and how they differ from convolutional approaches.
- Programming: Proficiency in Python with libraries like OpenCV, NumPy, Pandas, and Scikit-learn.
- Frameworks: Hands-on experience with PyTorch or TensorFlow.
Good to Have (Bonus)
- Experience with Agentic AI frameworks (e.g., LangChain) applied to visual tasks.
- Knowledge of Multimodal AI (Video + Audio/Text).
- Familiarity with model optimisation tools (TensorRT, ONNX) for real-time video inference.
