Robotics Engineer
Awign
2 - 5 years
Bengaluru
Posted: 15/05/2026
Job Description
We're looking for a Robotics Perception Engineer to build end-to-end spatial perception systems that combine multi-camera vision, IMU data, and learning-based models into a unified 3D understanding of the world.
You'll work on problems spanning SLAM, 6DoF pose estimation, multi-device sensor fusion, and calibration, while also leveraging modern computer vision and ML techniques (e.g., monocular depth, action/skill understanding, VLA).
This role sits at the intersection of robotics, 3D computer vision, and applied ML.
What You'll Work On
- Build real-time perception pipelines combining:
  - Multi-camera systems (head-mounted + wrist-mounted cameras)
  - IMU + RGB fusion for accurate camera pose estimation
- Develop and optimize SLAM / visual-inertial odometry (VIO) systems
- Design multi-device sensor fusion to align multiple viewpoints into a single scene
- Implement 3D / 6DoF hand and object pose estimation from RGB / RGB-D inputs
- Implement object detection models
- Work on stereo + multi-view geometry pipelines
- Build robust camera calibration systems:
  - Intrinsics / extrinsics
  - Cross-device calibration
- Integrate or research ML models for:
  - Monocular depth estimation
  - Action / skill labeling
  - Vision-Language-Action (VLA) systems
- Optimize pipelines for real-time performance and robustness
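Aligning multiple viewpoints into a single scene, as in the multi-device fusion item above, often reduces to estimating a rigid SE(3) transform between devices from corresponding 3D points. A minimal sketch, assuming Python/NumPy and synthetic correspondences (the function name and data are illustrative, not part of any specific stack), using the classic Kabsch / orthogonal Procrustes solution:

```python
import numpy as np

def align_rigid(src, dst):
    """Estimate R, t such that dst ~= src @ R.T + t (Kabsch / Procrustes).

    src, dst: (N, 3) corresponding 3D points, e.g. the same landmarks
    triangulated independently by two calibrated devices.
    """
    src_c = src - src.mean(axis=0)          # center both point sets
    dst_c = dst - dst.mean(axis=0)
    U, _, Vt = np.linalg.svd(src_c.T @ dst_c)
    d = np.sign(np.linalg.det(Vt.T @ U.T))  # guard against a reflection
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = dst.mean(axis=0) - R @ src.mean(axis=0)
    return R, t

# Synthetic check: recover a known rotation (30 deg about z) and translation.
rng = np.random.default_rng(0)
pts = rng.standard_normal((20, 3))
th = np.pi / 6
R_true = np.array([[np.cos(th), -np.sin(th), 0.0],
                   [np.sin(th),  np.cos(th), 0.0],
                   [0.0,         0.0,        1.0]])
t_true = np.array([0.5, -1.0, 2.0])
R_est, t_est = align_rigid(pts, pts @ R_true.T + t_true)
```

With real, noisy correspondences this least-squares solution would typically be wrapped in an outlier-rejection loop such as RANSAC.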
What We're Looking For
Core Skills (Must-Have)
- Strong foundation in 3D Computer Vision & Geometry
  - Multi-view geometry, epipolar geometry, transformations
- Experience with SLAM / VIO / sensor fusion
  - Visual SLAM, visual-inertial fusion, state estimation
- Hands-on experience with camera calibration
  - Intrinsics, extrinsics, stereo calibration
- Experience working with multi-camera systems
- Strong programming skills in C++ and/or Python
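To give a flavor of the multi-view geometry skills listed above: for a calibrated camera pair with relative pose (R, t), the essential matrix E = [t]x R links corresponding normalized image coordinates through the epipolar constraint x2ᵀ E x1 = 0. A minimal sketch, assuming Python/NumPy and synthetic geometry:

```python
import numpy as np

def skew(v):
    """3x3 skew-symmetric matrix so that skew(v) @ u == np.cross(v, u)."""
    return np.array([[0.0,  -v[2],  v[1]],
                     [v[2],  0.0,  -v[0]],
                     [-v[1], v[0],  0.0]])

def essential_matrix(R, t):
    """E = [t]x R for the two-view relation x2 = R @ x1 + t."""
    return skew(t) @ R

# Synthetic check: project 3D points into two views and verify the
# epipolar constraint x2^T E x1 = 0 on normalized image coordinates.
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, (10, 3)) + np.array([0.0, 0.0, 5.0])
th = 0.2
R = np.array([[np.cos(th),  0.0, np.sin(th)],
              [0.0,         1.0, 0.0],
              [-np.sin(th), 0.0, np.cos(th)]])
t = np.array([0.3, 0.0, 0.1])
E = essential_matrix(R, t)
residuals = []
for Xc1 in X:
    x1 = Xc1 / Xc1[2]          # normalized coordinates in camera 1
    Xc2 = R @ Xc1 + t          # same point in camera 2's frame
    x2 = Xc2 / Xc2[2]          # normalized coordinates in camera 2
    residuals.append(abs(x2 @ E @ x1))
max_residual = max(residuals)
```

In practice E is estimated from point matches (e.g. with a five-point solver) and then decomposed back into R and t; the forward direction above is the standard correctness check.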
Good to Have (High Impact)
- Experience with hand pose / human pose estimation (2D/3D/6DoF)
- Familiarity with RGB-D / depth sensors
- Experience with learning-based vision models
  - Monocular depth
  - Pose estimation
  - Action recognition
- Exposure to Vision-Language-Action (VLA) or embodied AI systems
- Experience optimizing for real-time systems (latency, memory, throughput)
- Familiarity with frameworks like:
  - OpenCV, PyTorch, ROS, COLMAP, ORB-SLAM, OpenVINS, etc.
Nice to Have (Bonus)
- Experience with multi-device synchronization (time alignment, sensor clocks)
- Background in robotics, AR/VR, or embodied AI systems
- Experience deploying models on edge devices / mobile systems
What Makes This Role Unique
- You'll work on complex multi-sensor setups (not just single-camera CV)
- Ownership of end-to-end perception stack (not just modeling or infra)
- Blend of classical geometry + modern ML
- Opportunity to shape next-gen embodied / spatial AI systems
Ideal Candidate Profile
Someone who:
- Can move fluidly between math, systems, and ML
- Is comfortable debugging real-world sensor noise and calibration issues
- Has built or worked on real-time perception systems, not just offline models