Job Summary
- Responsible for leading the transition of our Generative AI initiatives from experimental proofs-of-concept to resilient, enterprise-scale production systems. This role is pivotal in establishing the architectural standards and operational rigor required to maintain high-performing, secure, and cost-effective LLM applications across our cloud and on-premise environments.
How will you contribute
- 1. LLM Experimentation & Evaluation Frameworks
  - Architect and maintain systematic environments for prompt versioning, hyperparameter tuning, and model comparison.
  - Establish automated evaluation pipelines utilizing metrics such as faithfulness, relevancy, and toxicity to benchmark model iterations.
  - Implement human-in-the-loop (HITL) workflows to validate automated metrics against domain-specific banking requirements.
- 2. Architectural Oversight & Integration
  - Define and govern the end-to-end architecture for LLM-based systems, focusing specifically on Retrieval-Augmented Generation (RAG) and agentic workflows.
  - Ensure seamless synchronization between data ingestion layers, vector databases, and inference endpoints.
- 3. Production Deployment
  - Design and implement CI/CD/CT (Continuous Testing) pipelines tailored to the non-deterministic nature of Generative AI.
  - Manage model serving infrastructure, focusing on optimizing throughput, minimizing latency, and implementing robust auto-scaling strategies.
  - Ensure proper audit-ready documentation of Gen AI / LLM assets and testing standards.
- 4. Adversarial Red Teaming & Security Governance
  - Lead proactive red teaming exercises to identify and mitigate risks including prompt injection, jailbreaking, and inadvertent PII exposure.
  - Deploy real-time guardrails, including PII masking/redaction and content-moderation layers, to ensure alignment with corporate compliance and safety standards.
  - Conduct due diligence on third-party model, tool, infrastructure, and Gen AI solution providers.
- 5. GenAI Lifecycle & Cost Management
  - Monitor and optimize the Total Cost of Ownership (TCO) by analyzing token consumption, cache hit ratios, and model performance-to-cost efficiency.
  - Oversee version control and lifecycle management for foundation models, fine-tuned adapters, and embedding models.
What will make you successful
- Degree in Mathematics, Statistics, Computer Science, Engineering, Physics or related field.
- 5+ years of total experience in Data Science, Software Engineering, or DevOps.
- Hands-on experience deploying scalable, secure LLM applications in production-grade cloud and hybrid environments.
- Experience with MLOps and DevOps frameworks and tools.
- Core Engineering: Expert proficiency in Python and SQL; competency in API development (e.g., FastAPI/Flask).
- Orchestration & Workflow Management: Mastery of frameworks such as LangChain and LlamaIndex for managing complex application logic.
- Vector Database Operations: Advanced administration of vector databases, including index optimization and metadata filtering strategies.
- LLM Observability: Proficiency in specialized monitoring platforms for trace analysis and drift detection.