Officer - Big Data Engineer - C11 - Hybrid - Chennai
Citi Bank
2 - 5 years
Chennai
Posted: 30/07/2025
Job Description
Responsible for designing, developing, and optimizing data processing solutions using a combination of Big Data technologies. Focus on building scalable, efficient data pipelines that handle large datasets and enable batch and real-time data streaming and processing.
Responsibilities:
> Develop Spark applications using Scala or Python (PySpark) for data transformation, aggregation, and analysis.
> Develop and maintain Kafka-based data pipelines: This includes designing Kafka Streams applications, setting up Kafka clusters, and ensuring efficient data flow.
> Create and optimize Spark applications using Scala and PySpark: Leveraging these languages to process large datasets and implement data transformations and aggregations.
> Integrate Kafka with Spark for real-time processing: Building systems that ingest real-time data from Kafka and process it using Spark Streaming or Structured Streaming.
> Collaborate with data teams: This includes data engineers, data scientists, and DevOps, to design and implement data solutions.
> Tune and optimize Spark and Kafka clusters: Ensuring high performance, scalability, and efficiency of data processing workflows.
> Write clean, functional, and optimized code: Adhering to coding standards and best practices.
> Troubleshoot and resolve issues: Identifying and addressing any problems related to Kafka and Spark applications.
> Maintain documentation: Creating and maintaining documentation for Kafka configurations, Spark jobs, and other processes.
> Stay updated on technology trends: Continuously learning and applying new advancements in functional programming, big data, and related technologies.
Proficiency in:
Hadoop ecosystem big data tech stack (HDFS, YARN, MapReduce, Hive, Impala).
Spark (Scala, Python) for data processing and analysis.
Kafka for real-time data ingestion and processing.
ETL processes and data ingestion tools.
Deep hands-on expertise in PySpark, Scala, and Kafka.
Programming Languages:
Scala, Python, or Java for developing Spark applications.
SQL for data querying and analysis.
Other Skills:
Data warehousing concepts.
Linux/Unix operating systems.
Problem-solving and analytical skills.
Version control systems.
About Company
Citi Bank, officially known as Citibank, is a global financial institution and the consumer division of Citigroup, a leading multinational banking corporation. Established in 1812, Citibank provides a wide range of financial services, including retail banking, credit cards, personal loans, wealth management, and investment banking. With a strong presence in over 100 countries, it serves millions of customers worldwide, offering both individual and business banking solutions. Citibank is known for its digital banking innovations, global reach, and commitment to financial inclusion and economic growth.