Job Description
Develop custom software solutions to design, code, and enhance components across systems or applications. Use modern frameworks and agile practices to deliver scalable, high-performing solutions tailored to specific business needs.
Job Title: Cloud, Infrastructure & AI Operations Manager
Requirements
Proven experience managing hybrid cloud environments with hands-on expertise in AWS and Azure.
Strong background in Kubernetes orchestration, containerization, and cloud-native infrastructure.
Proficient in Splunk administration, including architecture, deployment, and performance tuning.
Solid understanding of AI implementations, including experience with or exposure to Large Language Models (LLMs) and their integration into enterprise systems.
Proficiency in Python scripting for automation, infrastructure management, and AI workflows.
Familiarity with Infrastructure as Code (IaC) tools such as Terraform and Ansible.
Strong troubleshooting skills across network, system, and application layers.
Excellent communication and leadership skills with a focus on operational excellence and continuous improvement.
Responsibilities
Lead the design, implementation, and management of cloud infrastructure across AWS and Azure platforms.
Oversee the deployment and lifecycle management of Kubernetes clusters and containerized applications.
Manage and optimize Splunk environments for log aggregation, monitoring, and alerting across cloud and on-prem systems.
Define and enforce infrastructure standards, policies, and best practices across environments.
Drive automation initiatives for provisioning, configuration management, and monitoring using Python, Ansible, and Terraform.
Collaborate with application and data teams to support AI workloads, including LLM-based applications and services.
Manage incident response, root cause analysis, and resolution of infrastructure-related issues.
Provide technical leadership and mentorship to engineering teams and ensure alignment with enterprise architecture.
Evaluate emerging technologies, including AI and cloud-native tools, and recommend solutions to improve operational efficiency and resilience.
Ensure compliance with security and governance standards across all infrastructure components.
Preferred Qualifications
Strong experience in Python backend development for infrastructure automation and AI integration.
Demonstrated success in delivering Infrastructure as a Service (IaaS) or Platform as a Service (PaaS) solutions.
Experience migrating legacy or on-premise applications to Kubernetes-based cloud environments.
Exposure to AI/ML platforms, LLM APIs, or AI model deployment workflows is a plus.
AWS and/or Azure certifications (e.g., Solutions Architect, DevOps Engineer) are highly desirable.
Familiarity with CI/CD pipelines and DevOps practices is a plus.
A minimum of 7 years of experience is required.