Robotics Engineer
Awign
2 - 5 years
Bengaluru
Posted: 15/05/2026
Job Description
We're looking for a Robotics Perception Engineer to build end-to-end spatial perception systems that combine multi-camera vision, IMU data, and learning-based models into a unified 3D understanding of the world.
You'll work on problems spanning SLAM, 6DoF pose estimation, multi-device sensor fusion, and calibration, while also leveraging modern computer vision and ML techniques (e.g., monocular depth, action/skill understanding, VLA).
This role sits at the intersection of robotics, 3D computer vision, and applied ML.
What You'll Work On
- Build real-time perception pipelines combining:
  - Multi-camera systems (head-mounted + wrist-mounted cameras)
  - IMU + RGB fusion for accurate camera pose estimation
- Develop and optimize SLAM / visual-inertial odometry (VIO) systems
- Design multi-device sensor fusion to align multiple viewpoints into a single scene
- Implement 3D / 6DoF hand and object pose estimation from RGB / RGB-D inputs
- Implement object detection models
- Work on stereo + multi-view geometry pipelines
- Build robust camera calibration systems:
  - Intrinsics / extrinsics
  - Cross-device calibration
- Integrate or research ML models for:
  - Monocular depth estimation
  - Action / skill labeling
  - Vision-Language-Action (VLA) systems
- Optimize pipelines for real-time performance and robustness
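Aligning multiple viewpoints into a single scene, as in the multi-device fusion item above, often reduces to estimating a rigid SE(3) transform between devices from corresponding 3D points. A minimal sketch, assuming Python/NumPy and synthetic correspondences (the function name and data are illustrative, not part of any specific stack), using the classic Kabsch / orthogonal Procrustes solution:

```python
import numpy as np

def align_rigid(src, dst):
    """Estimate R, t such that dst ~= src @ R.T + t (Kabsch / Procrustes).

    src, dst: (N, 3) corresponding 3D points, e.g. the same landmarks
    triangulated independently by two calibrated devices.
    """
    src_c = src - src.mean(axis=0)          # center both point sets
    dst_c = dst - dst.mean(axis=0)
    U, _, Vt = np.linalg.svd(src_c.T @ dst_c)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against a reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst.mean(axis=0) - R @ src.mean(axis=0)
    return R, t

# Synthetic check: recover a known rotation (30 deg about z) and translation.
rng = np.random.default_rng(0)
pts = rng.standard_normal((20, 3))
th = np.pi / 6
R_true = np.array([[np.cos(th), -np.sin(th), 0.0],
                   [np.sin(th),  np.cos(th), 0.0],
                   [0.0,         0.0,        1.0]])
t_true = np.array([0.5, -1.0, 2.0])
R_est, t_est = align_rigid(pts, pts @ R_true.T + t_true)
```

With real, noisy correspondences this least-squares solution would typically be wrapped in an outlier-rejection loop such as RANSAC.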
What We're Looking For
Core Skills (Must-Have)
- Strong foundation in 3D Computer Vision & Geometry
  - Multi-view geometry, epipolar geometry, transformations
- Experience with SLAM / VIO / sensor fusion
  - Visual SLAM, visual-inertial fusion, state estimation
- Hands-on experience with camera calibration
  - Intrinsics, extrinsics, stereo calibration
- Experience working with multi-camera systems
- Strong programming skills in C++ and/or Python
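To give a flavor of the multi-view geometry skills listed above: for a calibrated camera pair with relative pose (R, t), the essential matrix E = [t]x R links corresponding normalized image coordinates through the epipolar constraint x2ᵀ E x1 = 0. A minimal sketch, assuming Python/NumPy and synthetic geometry:

```python
import numpy as np

def skew(v):
    """3x3 skew-symmetric matrix so that skew(v) @ u == np.cross(v, u)."""
    return np.array([[0.0,  -v[2],  v[1]],
                     [v[2],  0.0,  -v[0]],
                     [-v[1], v[0],  0.0]])

def essential_matrix(R, t):
    """E = [t]x R for the two-view relation x2 = R @ x1 + t."""
    return skew(t) @ R

# Synthetic check: project 3D points into two views and verify the
# epipolar constraint x2^T E x1 = 0 on normalized image coordinates.
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, (10, 3)) + np.array([0.0, 0.0, 5.0])
th = 0.2
R = np.array([[np.cos(th),  0.0, np.sin(th)],
              [0.0,         1.0, 0.0],
              [-np.sin(th), 0.0, np.cos(th)]])
t = np.array([0.3, 0.0, 0.1])
E = essential_matrix(R, t)
residuals = []
for Xc1 in X:
    x1 = Xc1 / Xc1[2]          # normalized coordinates in camera 1
    Xc2 = R @ Xc1 + t          # same point in camera 2's frame
    x2 = Xc2 / Xc2[2]          # normalized coordinates in camera 2
    residuals.append(abs(x2 @ E @ x1))
max_residual = max(residuals)
```

In practice E is estimated from point matches (e.g. with a five-point solver) and then decomposed back into R and t; the forward direction above is the standard correctness check.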
Good to Have (High Impact)
- Experience with hand pose / human pose estimation (2D/3D/6DoF)
- Familiarity with RGB-D / depth sensors
- Experience with learning-based vision models
  - Monocular depth
  - Pose estimation
  - Action recognition
- Exposure to Vision-Language-Action (VLA) or embodied AI systems
- Experience optimizing for real-time systems (latency, memory, throughput)
- Familiarity with frameworks like:
  - OpenCV, PyTorch, ROS, COLMAP, ORB-SLAM, OpenVINS, etc.
Nice to Have (Bonus)
- Experience with multi-device synchronization (time alignment, sensor clocks)
- Background in robotics, AR/VR, or embodied AI systems
- Experience deploying models on edge devices / mobile systems
What Makes This Role Unique
- You'll work on complex multi-sensor setups (not just single-camera CV)
- Ownership of end-to-end perception stack (not just modeling or infra)
- Blend of classical geometry + modern ML
- Opportunity to shape next-gen embodied / spatial AI systems
Ideal Candidate Profile
Someone who:
- Can move fluidly between math, systems, and ML
- Is comfortable debugging real-world sensor noise and calibration issues
- Has built or worked on real-time perception systems, not just offline models