Big Data (Spark, Scala) - C12 - CHENNAI
Citi Bank
2 - 5 years
Chennai
Posted: 27/02/2025
Job Description
The Applications Development Senior Programmer Analyst is an intermediate level position responsible for participation in the establishment and implementation of new or revised application systems and programs in coordination with the Technology team. The overall objective of this role is to contribute to applications systems analysis and programming activities.
• Design and Develop Scalable Data Pipelines: Lead the design, development, and maintenance of high-performance, scalable data pipelines using Apache Spark, Scala, and PySpark to handle large-scale datasets in the financial industry.
• ETL Process Implementation: Implement ETL (Extract, Transform, Load) processes for data integration, transforming complex data from multiple sources into structured, actionable insights.
• Data Optimization and Performance Tuning: Monitor, troubleshoot, and optimize the performance of data pipelines and applications, ensuring high availability, low latency, and efficient resource usage.
• Data Workflow Orchestration: Use Apache Airflow to orchestrate and automate complex data workflows, ensuring seamless integration and efficient execution of tasks across systems.
• Real-Time Data Processing: Integrate data streaming solutions using Apache Kafka to process and manage large volumes of data in real time.
• Collaboration with Business and Technical Teams: Work closely with business stakeholders and cross-functional teams to define data requirements and ensure solutions meet business objectives.
• Data Security and Compliance: Ensure that all data processing workflows adhere to relevant regulatory standards and security best practices.
• Cloudera Ecosystem Utilization: Leverage Cloudera Ecosystem tools (Hadoop, HDFS, Hive, Impala, Flume, etc.) to manage and process big data, ensuring that systems are scalable and efficient.
• Continuous Improvement: Identify opportunities for continuous improvement in data architecture and processing, contributing to the overall modernization of the organization’s data systems.
Responsibilities:
Key Responsibilities:
• Design, develop, and optimize scalable distributed data processing pipelines using Apache Spark and Scala.
• Implement data integration and transformation processes for large-scale datasets from various sources.
• Collaborate with data architects and analysts to ensure efficient data processing and delivery for real-time and batch applications.
• Monitor and troubleshoot Spark jobs, ensuring performance optimization and error resolution.
• Work on the modernization of big data architecture to meet business requirements in a financial services context.
- Conduct tasks related to feasibility studies, time and cost estimates, IT planning, risk technology, applications development, model development, and establish and implement new or revised applications systems and programs to meet specific business needs or user areas
- Monitor and control all phases of development process and analysis, design, construction, testing, and implementation as well as provide user and operational support on applications to business users
- Utilize in-depth specialty knowledge of applications development to analyze complex problems/issues, provide evaluation of business process, system process, and industry standards, and make evaluative judgement
- Recommend and develop security measures in post implementation analysis of business usage to ensure successful system design and functionality
- Consult with users/clients and other technology groups on issues, recommend advanced programming solutions, and install and assist customer exposure systems
- Ensure essential procedures are followed and help define operating standards and processes
- Serve as advisor or coach to new or lower level analysts
- Has the ability to operate with a limited level of direct supervision.
- Can exercise independence of judgement and autonomy.
- Acts as SME to senior stakeholders and/or other team members.
- Appropriately assess risk when business decisions are made, demonstrating particular consideration for the firm's reputation and safeguarding Citigroup, its clients and assets, by driving compliance with applicable laws, rules and regulations, adhering to Policy, applying sound ethical judgment regarding personal behavior, conduct and business practices, and escalating, managing and reporting control issues with transparency.
Qualifications:
• Experience: 7+ years of hands-on experience in big data development, focusing on Apache Spark, Scala, and distributed systems.
• Proficiency in Functional Programming: High proficiency in Scala-based functional programming for developing robust and efficient data processing pipelines.
• Proficiency in Big Data Technologies: Strong experience with Apache Spark, Hadoop ecosystem tools such as Hive, HDFS, and YARN.
• Programming and Scripting: Advanced knowledge of Scala and a good understanding of Python for data engineering tasks.
• Data Modeling and ETL Processes: Solid understanding of data modeling principles and ETL processes in big data environments.
• Analytical and Problem-Solving Skills: Strong ability to analyze and solve performance issues in Spark jobs and distributed systems.
• Version Control and CI/CD: Familiarity with Git, Jenkins, and other CI/CD tools for automating the deployment of big data applications.
Desirable Experience:
• Real-Time Data Streaming: Experience with streaming platforms such as Apache Kafka or Spark Streaming.
• Financial Services Context: Familiarity with financial data processing, including its scalability, security, and compliance requirements.
• Leadership in Data Engineering: Proven ability to work collaboratively with teams to develop robust data pipelines and architectures.
Education:
- Bachelor’s degree/University degree or equivalent experience
About Company
Citi Bank, officially known as Citibank, is a global financial institution and the consumer division of Citigroup, a leading multinational banking corporation. Established in 1812, Citibank provides a wide range of financial services, including retail banking, credit cards, personal loans, wealth management, and investment banking. With a strong presence in over 100 countries, it serves millions of customers worldwide, offering both individual and business banking solutions. Citibank is known for its digital banking innovations, global reach, and commitment to financial inclusion and economic growth.