Sr PySpark Developer (Big Data) AVP - C12 - Chennai
Citi Bank
2 - 5 years
Chennai
Posted: 4/15/2025
Job Description
We are looking for a highly skilled PySpark Developer with deep expertise in distributed data processing. The ideal candidate will be responsible for optimizing Spark jobs and ensuring efficient data processing on a Big Data platform. This role requires a strong understanding of Spark performance tuning, distributed computing, and Big Data architecture.
- Create data tools that help analytics and data science team members build and optimize our product into an innovative industry leader.
- Work with data and analytics experts to strive for greater functionality in our data system.
- 8+ years of relevant experience in apps development or systems analysis
- Ability to adjust priorities quickly as circumstances dictate
Key Responsibilities:
- Analyze and comprehend existing data ingestion and reconciliation frameworks
- Develop and implement PySpark programs to process large datasets in Hive tables and Big Data platforms
- Perform complex transformations including reconciliation and advanced data manipulations
- Fine-tune Spark jobs for performance optimization, ensuring efficient data processing at scale
- Work closely with Data Engineers, Architects, and Analysts to understand data reconciliation requirements
- Collaborate with cross-functional teams to improve data ingestion, transformation, and validation workflows
Required Skills and Qualifications:
- Extensive hands-on experience with Python, PySpark, and PyMongo for efficient data processing across distributed and columnar databases
- Expertise in Spark optimization techniques, and ability to debug Spark performance issues and optimize resource utilization
- Proficiency in Python and the Spark DataFrame API, and strong experience in complex data transformations using PySpark
- Experience working with large-scale distributed data processing, and solid understanding of Big Data architecture and distributed computing frameworks
- Strong problem-solving and analytical skills
- Experience with CI/CD for data pipelines
- Experience with Snowflake for data processing and integration
Education:
- Bachelor’s degree/University degree or equivalent experience in computer science
- Master’s degree preferred
About Company
Citi Bank, officially known as Citibank, is a global financial institution and the consumer division of Citigroup, a leading multinational banking corporation. Established in 1812, Citibank provides a wide range of financial services, including retail banking, credit cards, personal loans, wealth management, and investment banking. With a strong presence in over 100 countries, it serves millions of customers worldwide, offering both individual and business banking solutions. Citibank is known for its digital banking innovations, global reach, and commitment to financial inclusion and economic growth.