Our client, a leading enterprise, is seeking a Senior AI Engineer specializing in Agentic Workflows and LLM Integration. This specialized engineering role sits at the cutting edge of AI innovation, with an organization-wide mandate to design, deploy, and own multi-step autonomous agent systems. The successful candidate will build robust backend infrastructure, orchestrate tool-calling logic, and manage advanced retrieval architectures to deliver resilient, production-grade AI applications within an enterprise framework.
Key Accountabilities
AI Architecture & Agent Workflow Engineering
- Agent Execution Patterns: Design, build, and test complex multi-step agent workflows using established architectural patterns such as ReAct, planner-executor, and tool chaining.
- LLM Core Integration: Integrate flagship Large Language Models (including Anthropic Claude and OpenAI) with legacy enterprise APIs and internal microservices, engineering robust fault tolerance for retries, edge cases, and degraded ecosystem states.
- Orchestration & Failure Resilience: Implement programmatic tool calling, function orchestration pipelines, and automated compensating actions to guarantee agent workflows remain stable under catastrophic or unexpected third-party failure conditions.
- Human-in-the-Loop Controls: Architect and deploy conditional human-in-the-loop validation frameworks, including automated executive approvals, smart escalations, and exception-handling logic mandated by business governance or risk considerations.
Data Engineering, Prompts & Retrieval (RAG)
- Context & Memory Architecture: Build and manage agent memory layers and data retrieval mechanisms using vector databases and Retrieval-Augmented Generation (RAG), tuning indexing schemas so that agents retrieve relevant context.
- Prompt Management: Develop, maintain, optimize, and version-control complex prompt logic, semantic routing rules, and supporting technical documentation in accordance with strict enterprise engineering standards.
Production Deployment, Security & Observability
- Cloud Operations: Deploy mission-critical AI services into production cloud environments, actively monitoring logs, distributed traces, and telemetry metrics to rapidly isolate and patch behavioral anomalies.
- Enterprise Governance: Ensure all deployed solutions comply with enterprise-grade security controls, identity management requirements, and rigorous data governance protocols.
- Reliability Engineering: Partner with QA and Core Operations teams to continually upgrade automated test coverage, build out operational runbooks, establish incident response protocols, and drive system performance optimizations.
Requirements
Education & Experience
- Technical Tenure: Minimum of 4 years of hands-on experience developing backend or service-based software architectures using C# and/or Python.
- AI Specialization: At least 1 year of production-level experience working directly with large language models, structured prompt engineering frameworks, or agentic AI-enabled systems.
- Analytical Reasoning: Elite debugging skills with a proven capacity to reason across distributed APIs, asynchronous data flows, and non-deterministic AI system behaviors.
Technical Skills (Required)
- Programming Ecosystems: Production-grade fluency in C# and/or Python, including async workflow patterns, service building, and automated test frameworks.
- Cloud Platform: Microsoft Azure (compute, scalable storage, identity and access management (IAM), and automated deployment pipelines).
- LLM Integration & Tooling: Direct integration with Claude and/or OpenAI APIs (handling tool calling, prompt tokenization, rate-limit mitigation, and state and error handling).
- Agent Orchestration Frameworks: Experience using Azure AI Foundry or Microsoft Agent Framework. Hands-on knowledge of LangGraph or LangChain is highly valued.
- Vector Architectures: Experience with Pinecone or equivalent vector databases (handling structural indexing, query retrieval, and semantic relevance tuning).
- APIs & Identity: RESTful API design, enterprise-grade service authentication, and secure service-to-service integrations.
- Observability Suites: Distributed application logging, performance metrics tracking, and transaction tracing at production scale.
Preferred Qualifications (Desirable)
- Containerized deployments using Docker and Azure container services (Azure Kubernetes Service (AKS) or Azure Container Apps (ACA)).
- Experience architecting relational databases (SQL Server) and NoSQL document stores (Azure Cosmos DB).
- Prior experience operating within highly regulated, compliance-driven, or audit-heavy corporate environments.
- Foundational understanding of emerging AI governance and AI security practices (e.g., the OWASP Top 10 for LLM Applications).