Site Reliability Developer 4
Oracle
6 - 10 years
Bengaluru, Hyderabad
Posted: 03/03/2026
Job Description
Job Summary:
As a Principal Cloud Engineer (SRE), you will play a key role in ensuring the reliability, performance, and scalability of modern cloud-based data platforms. This position involves close collaboration with development, operations, and security teams to automate processes, monitor system health, and maintain optimal uptime for critical production workloads. You will leverage your technical expertise to design, automate, and maintain large-scale data pipelines and lakehouse infrastructure, supporting mission-critical data engineering and analytics initiatives.
Key Responsibilities:
- Design, implement, and maintain scalable, secure cloud infrastructure for large data platforms (data lakes, data warehouses, and lakehouse solutions) on OCI, AWS, Azure, or GCP.
- Collaborate with Data Engineering teams to build robust, automated ETL/ELT pipelines using tools such as Apache Spark, Databricks, Kafka, or Oracle Cloud Data Integration.
- Implement site reliability engineering best practices tailored for data systems: SLO/SLI definition, error budgeting, automated monitoring, data integrity validation, and incident response for data workloads.
- Design and optimize data storage solutions for both structured and unstructured data (object storage and data lake/lakehouse platforms such as Delta Lake and Iceberg).
- Automate infrastructure provisioning and CI/CD deployments for data pipelines and analytic workloads with tools like Terraform, Ansible, or CloudFormation.
- Instrument and monitor data platform components for performance, availability, resource consumption, and data quality using observability tools (e.g., Grafana, Splunk).
- Troubleshoot and resolve complex data pipeline or infrastructure issues, conducting root cause analyses and post-incident reviews.
- Advocate for and implement security, governance, and compliance best practices, including data privacy, encryption, and access controls.
- Mentor junior team members and promote knowledge sharing around data platform reliability.
Qualifications:
- Bachelor’s or Master’s degree in Computer Science, Engineering, Data Science, or a related field, or equivalent experience.
- 6 or more years of experience in cloud engineering, SRE, or DevOps roles, with at least 4 years supporting data engineering initiatives.
- Practical experience designing and operating large-scale cloud-based data platforms (data lakes, warehouses, or lakehouses).
- Strong hands-on skills with infrastructure-as-code (e.g., Terraform), automation (Python/Scala), and containerization (Kubernetes, Docker).
- Familiarity with data processing frameworks (Apache Spark, Databricks, Hadoop), as well as orchestration tools (Airflow, Oozie, or similar).
- Working knowledge of distributed storage, data formats (Parquet, Avro), and modern analytics platforms.
- Solid understanding of networking, cloud security, and regulatory compliance for data platforms.
- Strong analytical, troubleshooting, and communication skills.
- Preferred certifications: Cloud Architect/Engineer (OCI, AWS, Azure, GCP), Databricks, or relevant data engineering credentials.
About Company
Oracle Corporation is a global technology company best known for its enterprise software products and cloud solutions. It specializes in database management systems, cloud infrastructure, enterprise resource planning (ERP), customer relationship management (CRM), and supply chain management software. Oracle helps organizations of all sizes manage, store, and analyze data efficiently, offering both on-premises and cloud-based solutions.
