Data Engineer
Live Connections
2 - 5 years
Pune
Posted: 17/12/2025
Job Description
Data Engineer - Python + AI/ML Pipelines
Job Title: Data Engineer - Python + AI/ML Pipelines
Company Location: Kharadi, Pune (Work-from-office / Hybrid as per company policy)
CTC: 15 LPA (including variables & statutory bonus)
Experience Required: 5-9 years of hands-on Data Engineering experience (candidates with 10+ years will not be considered)
Notice Period: Immediate to 30 days; only candidates who can join within 30 days
Key Highlights:
High-visibility role directly powering AI-driven automation projects
Modern AI/ML tech stack with heavy use of LLMs, RAG, vector DBs & real-time pipelines
Fast-growing product/SI organization with excellent learning & growth opportunities
Job Description:
We are looking for a strong Data Engineer with deep Python expertise to design, build and maintain scalable, production-grade data pipelines that fuel AI & automation initiatives for global clients.
Key Responsibilities:
Design & develop robust, scalable data pipelines for AI/ML and automation use cases
Build and optimize ETL/ELT workflows for model training, fine-tuning and inference
Ingest and integrate structured & unstructured data from multiple sources (APIs, databases, cloud storage, logs, etc.)
Collaborate closely with Data Scientists, AI Engineers & business teams to deliver clean, model-ready datasets
Implement data quality checks, validation frameworks and monitoring alerts
Support MLOps: model deployment, versioning, monitoring and retraining pipelines
Work with real-time streaming (Kafka, Kinesis, Pulsar) and vector databases (Pinecone, Weaviate, Chroma, etc.)
Ensure data security, governance and compliance (GDPR, SOC2, etc.)
Continuously improve pipeline performance, cost and reliability
Must-Have Skills & Experience:
5+ years of hands-on Data Engineering experience
Expert-level Python programming (pandas, NumPy, PySpark, etc.)
Very strong SQL and database skills
Hands-on experience with at least one major cloud: Azure (preferred) / AWS / GCP
Experience with Azure Data Factory, Databricks, Synapse, or AWS Glue, EMR, etc.
Workflow orchestration tools: Airflow (must), dbt, Prefect, Dagster
Good understanding of AI/ML lifecycles and MLOps practices
Exposure to TensorFlow / PyTorch / LangChain / LlamaIndex is a big plus
Experience with real-time data streaming (Kafka/Kinesis) and vector databases is highly desirable
Solid grasp of data modeling, warehousing and lake-house architectures
Good to Have:
Prompt engineering & RAG pipeline experience
Exposure to Kubernetes, Docker, Terraform
Contribution to open-source data/AI projects
Immediate joiners and candidates with a notice period of at most 30 days will be given strong preference.
How to Apply
Mail your updated resume with the subject line:
Data Engineer Python AI Pipelines Pune (Your Name) (Current CTC) (Expected CTC) (Notice Period) to:
Regards,
Victor Paul
Talent Acquisition Team
