
AI ENGINEER
SHIFT: DAY / MORNING PHT
TYPE: FULL TIME, INDEPENDENT CONTRACTOR
Our client is the travel retail business of the Lagardère group in Australia and New Zealand. They operate stores across three business lines (Travel Essentials & Specialty, Duty Free & Luxury, and Food Service) in every major airport in the region. Our purpose: to create magical moments for travellers, every day.
The Transformation Office leads Data & AI, Digital, IT, Master Data, and Supply Chain Planning across the group. They are mid-way through a significant uplift: Duty Free is in 3PL transition (NZ went live April 2026, AU goes live November 2026), and our Data & Analytics platform on Microsoft Fabric is becoming the operating spine of the business. AI is the next layer.
Why this role exists
We need someone who can take an idea (a forecast of passenger spend by category, an agent that drafts replenishment proposals from sell-through data, a copilot that lets category managers query Fabric in natural language) from prototype to production. Not a researcher. Not a prompt engineer. An engineer who ships AI systems that real operators use, on the Microsoft stack we already run.
You will work directly with Harry and the broader Data & Analytics function. Your work will be visible at executive level within 90 days.
What you will do
In the first 90 days
• Ship one production-grade LLM application (agent or copilot) end-to-end — backend, evaluation harness, web front-end — addressing a specific Duty Free or Travel Essentials business problem.
• Stand up an evaluation and observability layer (Langfuse, LangSmith, or equivalent) for all LLM-powered features going forward.
Ongoing
• Design, build, and operate LLM-powered agents and applications on the Microsoft AI stack — primarily Azure AI Foundry, Microsoft Fabric (notebooks, OneLake, semantic models), and Azure OpenAI; supplemented with Anthropic and open-source models where they win on quality or cost.
• Build the web front-ends end-users actually interact with (React + TypeScript; FastAPI or .NET back-ends; auth via Entra ID).
• Apply rigorous data science where the problem warrants it: forecasting, classification, anomaly detection, uplift modelling. Know when a SQL query, a tree-based model, and an LLM are each the right tool — and when none of them is.
• Own the eval harness for every model you ship. No production deployment without a measurable quality bar.
• Pair daily with Sydney-based data engineers and analytics engineers; write and review code in our shared repos using Claude Code.
• Stay close to the commercial reality of the business: passenger segments, category performance, supplier dynamics. The best AI use cases come from understanding what operators actually do.
What you must have
Technical — non-negotiable
• 6+ years of professional software, data, or ML engineering experience, of which at least 1.5 years has been hands-on building and operating production systems that use large language models (RAG, agents, structured generation, tool use, function calling). Hobby projects, hackathons, and proof-of-concepts that never saw real users do not count for this part.
• Demonstrable production experience with at least one agent framework — LangChain / LangGraph, Microsoft Semantic Kernel, AutoGen, the Anthropic Agents SDK, Pydantic AI, or equivalent — and the judgement to know when an agent is the wrong abstraction. Many problems we solve don't need agents; the engineers who recognise this are the ones we want.
• Production experience with at least one evaluation framework (Langfuse, LangSmith, Ragas, DeepEval, Phoenix, or a serious in-house equivalent). You have built golden datasets, defined task-specific metrics beyond vibes, run regression evals in CI, and can talk about an LLM regression you caught before it shipped.
• Strong RAG fundamentals: chunking strategies, embedding model selection, hybrid search, reranking, evaluation of retrieval quality independent of generation quality. You have moved a RAG system from a demo that worked to a system that handles edge cases at scale.
• Strong data science foundations: comfortable with the maths behind the methods, not just the API calls. Python (Pandas, scikit-learn, PyTorch or TensorFlow), SQL at depth (window functions, CTEs, query optimisation, dimensional modelling), and at least one notebook-driven workflow on a serious dataset. You can pick the right model for a tabular problem without reaching for an LLM.
• Full-stack web development capability: you have personally built and shipped at least two front-end applications with non-trivial state (React, Next.js, or Vue). You can wire authentication (OAuth2, OIDC, or session-based), deploy to a cloud, instrument with telemetry, and debug a production incident on your own.
• Deep familiarity with the Microsoft data and AI ecosystem: Azure (App Service, Functions, Container Apps), Azure OpenAI, Azure AI Foundry, Microsoft Fabric (Lakehouse, notebooks, semantic models, Data Factory pipelines, DirectLake), Entra ID for auth. Equivalent AWS or GCP experience plus a credible willingness to retool will be considered, but Microsoft is the home stack and you must be productive on it within 60 days.
• Fluent with AI-assisted development tooling. Our team uses Claude Code daily for engineering work. You are productive with it (or with Cursor or equivalent agentic coding tools) on day one — not learning it for the first time. You can describe the workflow patterns you use, what you trust an agent to do, and where you still review every diff.
• Comfortable with the full engineering lifecycle: Git workflows, code review, CI/CD pipelines (GitHub Actions or Azure DevOps), containerisation (Docker), basic infrastructure-as-code (Bicep, Terraform, or equivalent), observability (logs, metrics, traces). You don't throw work over a wall to a separate ops team — there isn't one.
• Production-grade engineering habits: meaningful tests (not coverage theatre), structured logging, error handling that distinguishes between transient and permanent failures, retry and backoff logic, cost monitoring for LLM API calls, prompt versioning, and an awareness of the security and privacy surface of any system that handles enterprise data.
• Professional written and spoken English. You will work asynchronously with an Australian team across a time-zone gap; ambiguity, technical specifications, code reviews, and incident write-ups all happen in English. We will assess this directly in the interview process.
Behavioural — non-negotiable
• Autonomous. You do not need to be managed by the hour. You take an ambiguous problem, scope it, propose an approach, check in at the right moments, and ship — without anyone in Sydney chasing you. The Data & Analytics Manager is here to unblock you, not to assign your tasks day-to-day. If you need close direction to do good work, this is not the role for you.
• Disciplined documenter. You write down what you build, why you made the choices you made, and what would have to be true for the next engineer to maintain it. Architecture decision records, runbooks, prompt libraries, eval results, post-incident write-ups — these are not afterthoughts; they are part of the work. We have just experienced what it looks like to lose institutional knowledge with a departing engineer; we are not doing that again.
• Async-first, written-first communicator. You document decisions in writing, leave context for the next reader, and don't rely on synchronous calls to make progress. Across a time-zone gap with Sydney, this is the difference between someone who is productive and someone who is constantly blocked.
• Evidence-driven. You instrument what you build and you can name the metric that says it is working. Vibes are not a deployment criterion.
• Bias to shipping. You would rather have something imperfect in production with a feedback loop than something perfect on your laptop. You know the difference between "not yet good enough" and "perfect is the enemy of done".
• Comfortable in a small, senior team where there is no one to escalate ambiguity to. You scope your own work.
What would set you apart
• Production experience with Anthropic's API, including Claude Code, Computer Use, the Files API, or MCP server development.
• Retail, travel retail, FMCG, or consumer-goods domain experience — particularly category management, demand forecasting, or assortment planning.
• Experience operating in a cross-border data context (data residency, masked-data pipelines for development environments, handling personally identifiable information across jurisdictions).
• Open-source contributions in the LLM / agent / RAG space, or speaking at AI engineering communities.
• Microsoft Fabric certifications (DP-600, DP-700) or Azure AI Engineer Associate (AI-102).
What you get from us
• A small, senior, English-speaking team that values shipping. No middle management between you and the CTO-level sponsor (the Chief Transformation Officer).
• Real business problems with measurable commercial impact, on a Microsoft Fabric platform we are still actively shaping — your architectural choices will compound.
• Compensation benchmarked to senior technology bands in your local market, with annual review.
• Full statutory benefits package via our Employer of Record in your country of residence.
• Hardware and the productivity and AI tooling stack the team uses.
Home Office Requirements:
Bachelor's degree
Kinetic Innovative Staffing
Job ID: 148736841
Skills:
Git, React, Apis, Code, Python, AWS, GitHub Copilot, AI tools, SQL-based systems, Playwright, Claude
Skills:
Power Platform, MLops, Azure Functions, Terraform, App Services, Arm, Vms, Python, Azure DevOps, RAG workflows, vector search, security governance, AKS, Azure OpenAI, Azure AI Services, knowledge integration, Compliance, GitHub Actions, Bicep, Azure Monitor, Application Insights, Copilot Studio
Skills:
Typescript, automation, Authentication, Python, Databases, Apis, Anthropic Claude, logs, browser agents, tool calling, vector databases, LangGraph, structured outputs, workflow automation tools, OpenAI APIs, API integrations, LangChain, CrewAI, Vercel AI SDK, Production Systems, AutoGen, agentic workflows, LLM applications, queues, RAG, MCP, function calling