Data Engineer – Data & AI – Institutional Equities
Kotak Securities
2 - 5 years
Mumbai
Posted: 08/01/2026
Job Description
Role Summary:
The Data Engineer is responsible for building and maintaining scalable, reliable, and high-performance data platforms. The role is hands-on, with a strong focus on engineering solutions for data storage, real-time processing, and platform integrations. Collaboration with Data Architects and Cloud Engineers is key to operationalizing and optimizing core data infrastructure components.
Key Responsibilities:
- Design, build, and optimize data pipelines for batch and real-time processing using Spark, Python, and related technologies (see the sketch after this list).
- Set up, configure, and manage databases such as Postgres, ClickHouse, MongoDB, DynamoDB, and other analytical or NoSQL systems.
- Develop and maintain data models, indexing strategies, partitioning, and schema management to support scalable data solutions.
- Engineer and manage data storage formats and lakehouse table systems such as Delta Lake, Iceberg, and Hudi for efficient data access and analytics.
- Integrate databases with cloud components (AWS services, Databricks, internal microservices) to enable seamless data flow across platforms.
- Work with real-time platforms such as Kafka and Flink for streaming ingestion, event processing, and low-latency data delivery.
- Collaborate with Cloud Engineers to ensure infrastructure provisioning, networking connectivity, containerization, and access controls are aligned with data engineering needs.
- Troubleshoot and optimize data pipeline performance, including slow queries, write amplification, compaction issues, indexing strategies, and cluster configurations.
- Support platform observability by installing, configuring, and maintaining monitoring systems such as Prometheus and Grafana.
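For illustration, a minimal sketch of the kind of batch pipeline described in this list: a PySpark job that reads raw CSV drops, derives a date column, and writes a partitioned Delta Lake table. All paths, column names, the schema, and the job name are hypothetical, and the sketch assumes the delta-spark package is available to the Spark session.

# Minimal sketch of a batch pipeline writing a partitioned Delta Lake table.
# All paths, columns, and the job name are hypothetical; assumes delta-spark
# is installed and registered with the Spark session.
from pyspark.sql import SparkSession, functions as F

spark = (
    SparkSession.builder
    .appName("trades-batch-ingest")  # hypothetical job name
    .config("spark.sql.extensions", "io.delta.sql.DeltaSparkSessionExtension")
    .config("spark.sql.catalog.spark_catalog",
            "org.apache.spark.sql.delta.catalog.DeltaCatalog")
    .getOrCreate()
)

# Read raw CSV drops from a landing zone (hypothetical S3 path and layout).
raw = (
    spark.read
    .option("header", True)
    .csv("s3://example-bucket/landing/trades/")
)

# Light transformation: type the timestamp and derive a date partition column.
trades = (
    raw.withColumn("trade_ts", F.to_timestamp("trade_ts"))
       .withColumn("trade_date", F.to_date("trade_ts"))
)

# Write as a Delta table, partitioned by date for efficient pruning.
(
    trades.write
    .format("delta")
    .mode("append")
    .partitionBy("trade_date")
    .save("s3://example-bucket/lake/trades/")
)

Partitioning by a date column is a common starting point; the actual partitioning and compaction strategy would follow the platform's data volumes and query patterns.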
Required Skills:
- Strong Python skills.
- Experience with distributed table formats (Delta Lake, Iceberg, Hudi).
- Competency in Kafka (consumer groups, offsets, partitions) and Flink for stream processing (see the sketch after this list).
- Experience with PySpark for data ingestion and transformation workflows.
- Deep knowledge of Postgres (indexing, replication, partitioning, optimization).
- Hands-on experience with ClickHouse (setup, tuning, materialized views, TTLs).
- Familiarity with NoSQL (MongoDB, DynamoDB) schema design and access patterns.
- Familiarity with AWS (EC2, S3, VPC, IAM, Glue, Lambda).
- Understanding of database security, encryption, role management, and backup strategies.
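As a concrete, hypothetical illustration of the Kafka competencies listed above, a minimal consumer sketch using the kafka-python package: it joins a consumer group, polls a topic, and commits offsets manually after processing. The broker address, topic name, and group id are placeholders.

# Minimal Kafka consumer sketch using kafka-python (assumed installed).
# Broker address, topic name, and group id below are placeholders.
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "trades",                          # hypothetical topic
    bootstrap_servers="localhost:9092",
    group_id="trades-etl",             # consumer group: partitions are balanced across members
    enable_auto_commit=False,          # commit offsets explicitly after processing
    auto_offset_reset="earliest",      # start from the oldest offset for a new group
)

for message in consumer:
    # Each record carries topic, partition, offset, key, and value (raw bytes).
    print(message.topic, message.partition, message.offset, message.value)
    # Commit the offset only once the record has been handled, so a crash
    # before this point causes a replay rather than data loss.
    consumer.commit()

Committing per message keeps the example simple; a real pipeline would typically batch commits or lean on Flink's checkpointing for stronger delivery guarantees.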
Good-to-Have Skills:
- Experience with Java frameworks such as Spring Batch or Hibernate.
- Experience with Databricks workflows, catalog integration, or table ingestion patterns.
- Exposure to containerization (Docker) for database sandboxing or API deployments.
- Knowledge of infrastructure orchestration (Terraform) for database provisioning.
- Ability to contribute to datastore benchmarking and performance testing (a toy sketch follows below).
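Below, a toy sketch of the kind of datastore benchmarking mentioned above: timing a parameterized Postgres query over repeated runs with psycopg2. The connection string, table, and query are placeholders, not a real environment.

# Toy benchmark harness: repeatedly time a Postgres query with psycopg2.
# The DSN, table, and query below are placeholders.
import time
import statistics
import psycopg2

DSN = "dbname=market user=etl host=localhost"   # hypothetical connection string
QUERY = "SELECT count(*) FROM trades WHERE trade_date = %s"

def time_query(cur, params, runs=20):
    """Execute the query `runs` times and return per-run latencies in ms."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        cur.execute(QUERY, params)
        cur.fetchall()
        latencies.append((time.perf_counter() - start) * 1000)
    return latencies

with psycopg2.connect(DSN) as conn:
    with conn.cursor() as cur:
        lat = time_query(cur, ("2026-01-08",))
        print(f"p50={statistics.median(lat):.2f} ms  "
              f"max={max(lat):.2f} ms over {len(lat)} runs")

A harness like this only measures round-trip latency from the client; a fuller benchmark would also watch server-side metrics (for example via Prometheus, as noted in the responsibilities).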
