Computer Vision Engineer

SportVot

2 - 5 years

Mumbai

Posted: 04/01/2026

Job Description

Key Responsibilities

  • Core Video Analysis: Implement and optimize fundamental video processing algorithms, including Optical Flow, Motion Estimation, and Background Subtraction to handle dynamic sports footage.
  • Spatio-Temporal Modeling: Develop models that understand time as well as space, utilizing 3D CNNs, RNNs/LSTMs, or Temporal Transformers to recognize actions and events over sequences of frames.
  • Object Tracking & Re-Identification: Build robust tracking pipelines using industry-standard algorithms (e.g., Kalman Filters, SORT, DeepSORT) to maintain the identity of players and objects across occlusions.
  • Advanced Architectures: Research and integrate state-of-the-art models, including Vision Transformers (ViT) and Attention Mechanisms, to improve accuracy beyond traditional CNN limits.
  • Agentic AI Workflows: Assist in designing Agentic AI systems where autonomous agents plan multi-step video analysis tasks (e.g., deciding when to focus on specific game events) with minimal human intervention.
  • Data & Pipeline Strategy: Manage video datasets and collaborate with the engineering team to deploy efficient inference pipelines.

Required Skills & Qualifications

  • Education: Bachelor's or Master's degree in Computer Science, AI, Data Science, or a related field.
  • Core Computer Vision: Strong understanding of traditional CV concepts:
      • Image Geometry & Camera Calibration
      • Feature Extraction (SIFT, SURF, ORB)
      • Image Filtering & Edge Detection
  • Deep Learning for Video: In-depth knowledge of neural network architectures tailored for video:
      • CNNs (ResNet, EfficientNet) for spatial features.
      • Sequence Models (RNN, LSTM, GRU) for temporal dependencies.
      • 3D CNNs (C3D, I3D, X3D) for spatiotemporal feature learning.
  • Transformers & Attention: Understanding of Self-Attention mechanisms, Vision Transformers (ViT), and how they differ from convolutional approaches.
  • Programming: Proficiency in Python with libraries like OpenCV, NumPy, Pandas, and Scikit-learn.
  • Frameworks: Hands-on experience with PyTorch or TensorFlow.

Good to Have (Bonus)

  • Experience with Agentic AI frameworks (e.g., LangChain) applied to visual tasks.
  • Knowledge of Multimodal AI (Video + Audio/Text).
  • Familiarity with model optimisation tools (TensorRT, ONNX) for real-time video inference.
