Solutions Architect - AI Infrastructure, Private Cloud
LTM
2 - 5 years
Bengaluru
Posted: 28/02/2026
Job Description
We are seeking an experienced Solutions Architect with deep expertise in AI/ML infrastructure, High Performance Computing (HPC), and container platforms to join our team delivering Cloud AI and Enterprise AI Factory solutions. This role is instrumental in architecting, deploying, and optimizing private cloud environments that support enterprise-grade AI workloads at scale, leveraging validated reference architectures and industry-standard frameworks.
The ideal candidate will bring strong technical expertise in AI infrastructure, container orchestration platforms, and hybrid cloud environments, and will play a key role in delivering scalable, secure, and high-performance AI platform solutions.
Key Responsibilities
1. Leadership and Strategy
- Provide delivery assurance and serve as the lead design authority for enterprise-grade container platforms such as Red Hat OpenShift and SUSE Rancher, aligned with customer AI/ML strategies and business objectives.
- Align solution architecture with modern Enterprise AI Factory design principles including modular scalability, GPU optimization, and hybrid cloud orchestration.
- Oversee planning, risk management, and stakeholder alignment throughout the project lifecycle.
2. Solution Planning and Design
- Architect and optimize end-to-end solutions across container orchestration and HPC workload management, leveraging platforms such as Red Hat OpenShift and SUSE Rancher, and workload schedulers such as Slurm and Altair PBS Pro.
- Ensure seamless integration of container and AI platforms with the broader software ecosystem, including open-source DevOps and AI/ML tools and frameworks.
3. Opportunity Assessment
- Lead technical responses to RFPs, RFIs, and customer-driven inquiries.
- Conduct Proof-of-Concept (PoC) engagements to validate performance, feasibility, and integration.
- Assess customer environments and recommend optimal configurations based on validated industry reference architectures and open-source integrations.
4. Innovation and Research
- Stay current with emerging technologies, industry trends, and best practices across HPC, Kubernetes, container platforms, hybrid cloud, and security domains.
5. Customer-Centric Mindset
- Serve as a trusted advisor to enterprise customers, aligning AI solutions with business objectives.
- Translate complex technical concepts into clear value propositions for technical and non-technical stakeholders.
6. Team Collaboration
- Collaborate with cross-functional teams, including experts in infrastructure components such as servers, storage, networking, and data science, to ensure cohesive delivery.
- Mentor technical consultants and contribute to internal knowledge-sharing sessions, tech talks, and innovation initiatives.
Required Skills
1. HPC & AI Infrastructure
- Extensive knowledge of HPC technologies and workload schedulers such as Slurm and Altair PBS Pro.
- Experience with HPC cluster management tools.
- Strong understanding of high-speed networking technologies such as InfiniBand and Ethernet.
- Experience with performance tuning of HPC components.
2. Containerization & Orchestration
- Hands-on experience with container technologies: Docker, Podman, Singularity.
- Proficient in at least two container orchestration platforms:
  - CNCF Kubernetes
  - Red Hat OpenShift
  - SUSE Rancher
  - RKE / K3s
  - Canonical Charmed Kubernetes
- Strong understanding of GPU-based workload environments, including GPU health and performance monitoring frameworks.
3. Operating Systems & Virtualization
- Strong Linux system administration skills: package management, boot processes, troubleshooting, performance tuning, networking.
- Hands-on experience with at least two Linux distributions: RHEL, SLES, Ubuntu.
- Experience with virtualization technologies such as KVM and enterprise virtualization for hybrid cloud deployments.
4. Cloud, DevOps & MLOps
- Solid understanding of hybrid cloud deployments, cloud architecture patterns, and cloud automation frameworks.
- Experience with CI/CD, infrastructure-as-code, and automation tooling.
Skills
Mandatory Skills:
- Azure Cloud Architecture
- Cloud Solution Architecture
- Kubernetes
Good to Have Skills:
- Azure DevOps
- Network Migration
