Dataproc Lead, Spark, OSS Technologies, Google Cloud
5 - 7 years
Bengaluru
Posted: 22/06/2025
Job Description
Minimum qualifications:
- Bachelor’s degree or equivalent practical experience.
- 5 years of experience with software development in one or more programming languages, and with data structures/algorithms.
- Experience in software development and engineering, incorporating design methodologies, leveraging open source technologies, and working with distributed computing systems, including Apache Spark, Apache Hadoop, and Apache Hive.
- Experience in Open Source technologies, Big Data, Data Analytics, Artificial Intelligence, Machine Learning, and Database Internals.
Preferred qualifications:
- Experience with database optimizations such as query and executor optimizations.
- Experience with data lake table formats like Apache Iceberg, Apache Hudi, Delta Lake, etc.
- Experience with OpenTelemetry, JMX, and other monitoring solutions.
- Experience with OSS projects like Spark, Hive, Trino, Ray, Flink, etc.
- Experience working with data science tools such as Jupyter notebooks.
- Experience developing Cloud or SaaS products.
About the job
Google Cloud's software engineers develop the next-generation technologies that change how billions of users connect, explore, and interact with information and one another. We're looking for engineers who bring fresh ideas from all areas, including information retrieval, distributed computing, large-scale system design, networking and data storage, security, artificial intelligence, natural language processing, UI design and mobile; the list goes on and is growing every day. As a software engineer, you will work on a specific project critical to Google Cloud's needs with opportunities to switch teams and projects as you and our fast-paced business grow and evolve. You will anticipate our customers' needs and be empowered to act like an owner, take action, and innovate. We need our engineers to be versatile, display leadership qualities, and be enthusiastic to take on new problems across the full stack as we continue to push technology forward.
Cloud Dataproc enables open source data analytics users (Apache Hadoop, Spark, Trino, Flink, etc.) to lift and modernize their workloads into the cloud. Dataproc is a fast, easy-to-use, fully managed cloud service for running Apache Spark, Apache Hadoop, and dozens of other open source tools in a simpler, more performant, and more cost-efficient way. Dataproc also integrates easily with other Google Cloud Platform (GCP) services such as BigQuery, Dataplex (governance, lineage), and catalog stores, giving a powerful and complete platform for data processing, analytics, and machine learning.
Google Cloud accelerates every organization’s ability to digitally transform its business and industry. We deliver enterprise-grade solutions that leverage Google’s cutting-edge technology, and tools that help developers build more sustainably. Customers in more than 200 countries and territories turn to Google Cloud as their trusted partner to enable growth and solve their most critical business problems.
Responsibilities
- Build high-impact, customer-facing features that make Cloud Dataproc the best place to run Spark, Ray, Trino, Flink, and newer technologies in the cloud.
- Define the roadmap for Open Source technologies like Spark, Ray, Trino, Flink, etc.
- Define and implement the next generation of data lakes and lakehouses, focusing on technologies like Iceberg, Hudi, and Delta.
- Optimize the open source technologies for performance and efficiency.
- Design and build a software stack that takes advantage of Google technologies for faster cluster setup, efficient cluster operations, and comprehensive monitoring and observability.
About Company
Google is a multinational technology company founded in 1998 by Larry Page and Sergey Brin. It is best known for its search engine but also develops products and services in areas like online advertising (Google Ads), cloud computing (Google Cloud), operating systems (Android, Chrome OS), web browsers (Chrome), and consumer electronics (Pixel devices, Nest). Google is a subsidiary of Alphabet Inc., its parent company formed in 2015. It plays a major role in shaping the internet, AI, and digital innovation globally.