We are building OS, the intelligence layer that transforms every human conversation into action. Just as Windows made computers usable and Android made phones smart, OS makes every device conversation-aware.

We are actively engineering the future of ambient computing through our wearable device that captures, understands, and acts on conversations in real time. We are proving that ambient AI isn't science fiction; it's engineering reality, validated by enterprise partnerships and real users in production.

The Role

You will own the entire AI stack. This is not about wrapping OpenAI APIs. You will be the architect of the infrastructure that powers this vision, moving from raw audio to high-level multi-agent reasoning. You will train custom models, optimize architectures from first principles, and build the "brain" that allows the system to remember and reason.

What You Will Build & Own

1. Custom Model Development (Beyond Foundation Models)

You will build specialized models where generic APIs fail.

Indian Context: Train speaker diarization for Indian accents, multi-speaker environments, and Hindi-English code-mixing.
Embeddings: Fine-tune embedding models for semantic search across temporal contexts (memory).
NER: Develop custom Entity Recognition for noisy, mixed-language conversational transcripts.

2. Memory Architecture & Reasoning

You will build the system that "remembers" everything.

Temporal Knowledge Graphs: Design bi-temporal data models that ingest conversations and extract entities/relationships over time.
Hybrid Retrieval: Combine dense embeddings with graph traversal to answer questions like "What did I promise Priya last week?"
Privacy: Implement privacy-preserving embeddings and federated learning approaches.

3. End-to-End Speech Pipeline

You will own the audio processing stack.

Diarization & ASR: Fine-tune Whisper/wav2vec2 for 15+ Indian languages and overlapping speech.
Edge Optimization: Compress models (quantization, pruning, ONNX) to run on mobile devices with under 100 ms latency.
Quality Assessment: Build CNNs for audio quality assessment and Voice Activity Detection (VAD) in noisy environments.

4. Multi-Agent Orchestration

You will design the system where specialized agents collaborate.

Routing: Train intent classification models to route queries to the right agent.
Action Planning: Build RL-based systems for multi-step action planning and conflict arbitration.

The "Tuesday Problems"

These aren't interview brain-teasers. These are the actual engineering challenges you will tackle:

The Cocktail Party: Train a diarization model that handles 4+ speakers in a Hindi-English code-mixed conversation with background noise.
Long-term Memory: Fine-tune an embedding model so that "What did Sarah say about the budget?" retrieves the correct segment from 3 months ago.
Latency: Optimize a Transformer model from 200ms to <50ms latency for mobile deployment without losing accuracy.
Temporal NER: Build a system that links "my manager" mentioned today to "Priya" from last week's conversation.

Technical Stack You Will Master

Frameworks: PyTorch (Primary), TensorFlow/Keras, JAX.
Speech/NLP: Transformers, wav2vec2, Whisper, BERT variants, LoRA/QLoRA.
Infrastructure: Python Async, Ray (distributed training), Docker/K8s, ONNX/TensorRT.
Data: Neo4j (Graph DB), Qdrant (Vector DB), Streaming pipelines.

Who You Are

2-5 Years Production Experience: You have built and deployed ML/DL models serving real users at scale.
Deep Learning Native: You understand architectures (Transformers, CNNs, RNNs) deeply. You don't just import libraries; you debug gradients and optimize loss functions.
Full Stack ML: You own the loop: Data Pipeline → Training → Production → Monitoring → Iteration.
Hacker Spirit: You prefer shipping weekly and validating with data over endless theoretical debates.

Why This Role is Special

Greenfield Problems: You are solving problems that don't have pre-trained solutions on Hugging Face yet (e.g., Indian accent diarization + code-mixing).
Founding Team: Work directly with technical founders (IIT Madras AI background). No middle management, high autonomy.
Real Impact: Your models will power how families stay connected and professionals manage intelligence.
Resources: MacBook Pro + GPU workstation, full ML experimentation budget, and conference passes.

Artificial Intelligence Engineer

Recro
