AI/DevOps Engineer

Universal Access and Systems Solutions Inc.

Philippines, Central Luzon

2-4 Years

Job Description

Job Overview:

· The AI / DevOps Engineer is responsible for building, deploying, and operating AI-powered software and the infrastructure that runs it. The role combines hands-on software development with DevOps and platform engineering, with a strong focus on Large Language Model (LLM) applications, agentic systems, and workflow automation.

· The individual in this role designs and develops AI-enabled applications and automations, integrates LLM APIs and self-hosted models, and builds the pipelines, infrastructure, and observability needed to ship and run these systems reliably across cloud, on-premise, private cloud, and GPU environments.

· This position bridges development and operations across the full lifecycle, from requirements and system design through CI/CD, deployment, monitoring, security, and post-production support, while continuously evaluating emerging AI tools and practices to improve efficiency and quality.

· Continuous learning is essential, as the AI / DevOps Engineer must stay current with a fast-moving AI and infrastructure landscape, including new models, agentic coding tools, orchestration frameworks, and automation techniques.

Responsibilities:

AI & LLM Application Development

· Design, develop, and maintain AI-powered applications and services that integrate Large Language Models (LLMs) and other machine learning models into business workflows.

· Integrate LLM APIs from Anthropic (Claude) and other providers, as well as self-hosted and open-source models running on private GPU infrastructure.

· Build retrieval-augmented generation (RAG) pipelines using vector databases, embeddings, and semantic search to ground model outputs in enterprise data.

· Design and implement agentic systems and tool-use workflows, including integrations through the Model Context Protocol (MCP) and connections to internal and third-party services.

· Apply prompt engineering, evaluation, and guardrail techniques to improve accuracy, safety, reliability, and cost-efficiency of AI features.

· Write clean, efficient, reusable, and well-tested code following established standards and secure coding practices.

Software Development & Integration

· Design and develop scalable, secure backend services, APIs, and integrations that support AI and automation use cases.

· Perform application and data integration with internal and external systems using RESTful APIs, web services, webhooks, and message queues.

· Translate business requirements into functional and technical specifications, and participate in architecture and design discussions.

· Ensure solutions are compatible across multiple platforms and environments, including cloud, on-premise, and private cloud deployments.

Workflow & Process Automation

· Design, develop, and deploy automation workflows that combine traditional automation with AI-driven decision-making.

· Build automations using modern tools and platforms such as Power Automate, n8n, Zapier-style iPaaS, and custom scripts, replacing manual and repetitive processes.

· Develop and operate desktop and agentic automation, including AI desktop agents (for example, Cowork- or Open Claw-style assistants) that perform tasks across applications.

· Implement web automation and data extraction where required using tools such as Playwright, Puppeteer, or Selenium.

· Use agentic coding tools such as Claude Code to accelerate development, automate engineering tasks, and build internal tooling.

· Automate IT and operational workflows such as provisioning, monitoring, alerting, ticketing, and incident response.

Infrastructure, Cloud & DevOps

· Build, maintain, and optimize infrastructure across cloud (AWS, GCP), on-premise, and private cloud environments for efficiency, scalability, and reliability.

· Provision and manage GPU compute for model inference and AI workloads, optimizing for performance and cost.

· Design and maintain CI/CD pipelines to automate building, testing, and deployment of applications, models, and automations.

· Manage source control and Git-based workflows on platforms such as GitHub, GitLab, or Bitbucket, including branching strategies, pull/merge requests, and code review processes.

· Containerize and orchestrate workloads using Docker and Kubernetes, and manage infrastructure as code (e.g., Terraform).

· Handle deployment, release, and configuration management, and support the smooth promotion of changes from development to production.

· Administer Linux/Unix and Windows environments supporting development and production systems.

Monitoring, Reliability & Performance

· Implement monitoring, logging, alerting, and observability for applications, infrastructure, and AI/LLM workloads using tools such as Prometheus, Grafana, the ELK stack, Loki, Datadog, or cloud-native services (e.g., AWS CloudWatch).

· Track AI-specific metrics such as latency, token usage, cost, accuracy, and quality, and act on the results.

· Proactively identify, troubleshoot, and resolve performance issues and production incidents within agreed timelines.

· Participate in root cause analysis and drive preventive improvements to system reliability and stability.

Security & Compliance

· Apply security best practices across development, automation, and operations, including secrets management, access control, and network security.

· Address AI-specific security and governance concerns such as data privacy, prompt injection, safe handling of sensitive data, and responsible use of models.

· Ensure activities comply with organizational policies, security standards, and audit requirements, and maintain proper version control and documentation.

Collaboration & Continuous Improvement

· Work closely with developers, data and AI engineers, operations staff, business analysts, and other stakeholders to deliver end-to-end solutions.

· Participate in Agile/Scrum ceremonies including sprint planning, daily stand-ups, reviews, and retrospectives.

· Act as a liaison between technical teams and stakeholders, and communicate solutions, trade-offs, and results clearly.

Qualifications:

· Bachelor's degree in Computer Science, Information Technology, Software Engineering, or a related field (or equivalent practical experience).

· 2–4 years of combined experience in software development, DevOps, or automation; experience with AI/LLM-based solutions is strongly preferred.

· Demonstrated experience building and deploying applications in cloud, on-premise, or private cloud environments.

· Experience integrating APIs and third-party services, and building automated workflows.

· Working knowledge of Agile/Scrum methodologies and collaboration tools.

· Relevant certifications in cloud (AWS, GCP), DevOps, or AI/ML are an advantage.

Technical Skills

Programming Languages & Core Stack

· Python (required, primary): main language for AI/LLM development, automation, data work, and scripting; experience with frameworks and libraries such as FastAPI or Flask, plus LangChain, LlamaIndex, or the Anthropic and OpenAI SDKs.

· TypeScript / JavaScript (required): for backend services (Node.js) and front-end or full-stack work (React or similar), API integrations, and building agentic and MCP-based tooling.

· Bash / Shell scripting (required): for automation, CI/CD, and Linux/Unix system administration.

· SQL (required): for querying and managing relational databases such as PostgreSQL, MySQL, or SQL Server.

· PowerShell (preferred): for Windows administration and automation in mixed environments.

· Go and/or C# (advantageous): for performant backend services, infrastructure tooling (Go), or .NET-based enterprise integrations (C#).

· Configuration and infrastructure-as-code languages: YAML and JSON for pipelines and config, and HCL (Terraform) for provisioning infrastructure.

· Strong proficiency in the core languages above, with the ability to pick up additional languages as project needs evolve.

· Hands-on experience integrating LLM APIs (e.g., Anthropic Claude, OpenAI) and building AI features such as RAG, agents, and tool use.

· Familiarity with AI frameworks and libraries such as LangChain, LlamaIndex, or similar, and with vector databases (e.g., Pinecone, Weaviate, pgvector, or FAISS).

· Experience with the Model Context Protocol (MCP) and agentic coding tools such as Claude Code is an advantage.

· Strong knowledge of cloud platforms, especially AWS and GCP, plus on-premise and private cloud deployment.

· Experience provisioning and using GPU compute for model inference and training.

· Solid DevOps skills: CI/CD pipelines (e.g., GitHub Actions, GitLab CI/CD, Jenkins), Docker, Kubernetes, and infrastructure as code (e.g., Terraform).

· Experience with Linux/Unix and Windows administration and scripting (e.g., Bash, Python).

· Knowledge of RESTful APIs, JSON, webhooks, and system integration concepts.

· Experience with databases including PostgreSQL, MySQL, MongoDB, or SQL Server.

· Familiarity with workflow automation tools (e.g., Power Automate, n8n) and web automation (e.g., Playwright, Puppeteer, Selenium).

· Strong experience with Git and Git-based platforms such as GitHub, GitLab, and Bitbucket, including branching strategies, pull/merge requests, code review, and repository management, along with modern software development life cycle practices.

· Hands-on experience with monitoring and observability tooling such as Prometheus, Grafana, the ELK stack, Loki, Datadog, or cloud-native services (e.g., AWS CloudWatch), including metrics, logs, dashboards, and alerting for applications, infrastructure, and AI workloads.

· Awareness of AI safety, security, and governance considerations, including data privacy and prompt-injection risks.