Senior DevOps Engineer APAC/EMEA (with Level 1 & 2 Support Role)
This role is for a Linux-strong, operationally mature DevOps Engineer who can own incidents, improve messy systems, and support clients confidently in a high-autonomy fintech environment.
Role Overview
This is a hands-on DevOps / Operations Engineer role focused on stabilizing systems, improving runbooks, and supporting clients during live operational windows.
In the first 90 days, success is defined less by shipping large infrastructure projects and more by:
- Improving operational readiness
- Reducing alert noise
- Strengthening runbooks and tooling
- Confidently handling client-facing incidents during assigned coverage hours
This is a high-autonomy, remote-first role in a growing fintech environment.
What You'll Be Responsible For
1. Operations & Incident Handling (Primary Focus)
- Monitor and respond to production alerts during your coverage window
- Diagnose issues using logs, system data, and runbooks
- Decide when to investigate further vs escalate to other teams
- Communicate clearly with customers during incidents, even when a fix is not yet known
- Help evolve incident response processes and postmortems
2. Runbooks, Tooling & Operational Maturity
- Improve and expand runbooks and documentation
- Identify gaps, unclear procedures, and recurring pain points
- Create small scripts or utilities to speed up investigations and reduce manual work
- Help clean up and rationalize alerting and monitoring
3. Observability & Platform Support
- Work with existing and evolving observability tools
- Contribute to a custom, in-house observability platform (backend-focused)
- Use monitoring data to guide investigation and automation decisions
4. Customer-Facing Support (as part of operations)
- Handle inbound client issues during your coverage hours
- Triage, diagnose, and route issues appropriately
- Communicate calmly and clearly with customers who are under technical and emotional stress
- Understand customer workflows well enough to assess severity and urgency
Required Technical Skills (Must-Have)
Years of Related DevOps Experience
Linux & Systems
- Strong Linux experience in production environments
- Comfortable working directly on servers via SSH
- Confident navigating filesystems and logs
- Skilled with shell tools (grep, awk, sort, etc.)
- Able to create small shell or Python scripts to streamline work
- Basic understanding of how hardware and CPU affinity appear in the OS
Networking & Connectivity
- Basic networking troubleshooting (connectivity, DNS, VPN)
- Comfortable validating service reachability and dependencies
Python (Practical, Not Academic)
- Python 3 experience
- Used for scripting, automation, and backend support
- Expected to contribute at a high level (logic, rules, integration)
- Python work is initially 10-20% of the role
C (Reading, Not Writing)
- Able to read C code for troubleshooting and understanding message flows
- No expectation to write or modify C code
Fintech Experience
- Capital markets or fintech exposure (enough familiarity with the terminology to understand client needs and time sensitivity)
- Appreciation of the mission-critical nature of fintech systems
Nice to Have (Trainable)
- Git and CI/CD familiarity
- Postgres basics (connectivity, simple queries, data inspection)
- Observability tools (ELK or similar)
- Windows troubleshooting for customer-facing applications
- FIX protocol familiarity (conceptual understanding is sufficient)
Working Style & Autonomy (Critical)
- Comfortable working independently with minimal supervision
- Makes sound decisions during incidents and escalates appropriately
- Communicates progress and issues clearly in an asynchronous environment
- Improves systems rather than just running them
- Values simplicity and clarity over over-engineering
Communication & Professional Maturity
- Clear spoken and written English
- Calm, structured communication under pressure
- Able to explain what is known, unknown, and next steps during incidents
- Demonstrates ownership and accountability in a fully remote setup
On-Call & Coverage Expectations
- Participate in operational coverage aligned with APAC / EMEA time zones
- Handle routine incidents independently
- Escalate appropriately, including across time zones when needed
- Participate in post-incident reviews and improvements