Linux Infrastructure Engineering
- Design, deploy, and manage a globally distributed fleet of Linux servers (primarily Ubuntu 24.04 LTS)
- Own the full infrastructure lifecycle, including provisioning, configuration, patching, performance tuning, and decommissioning
- Develop and maintain standardised server builds, templates, and configuration baselines
- Optimise systems for performance, reliability, scalability, and cost efficiency
Automation & Configuration Management
- Use SaltStack extensively for:
  - Configuration management
  - Orchestration
  - Automated deployments
  - State enforcement and drift correction
- Develop automation to support rapid, repeatable infrastructure deployment across multiple providers
- Automate operational workflows and eliminate manual intervention wherever possible
- Contribute to infrastructure-as-code standards and practices
Networking & Systems Integration
- Configure and troubleshoot complex networking environments, including:
  - IPv4 and IPv6 addressing and routing
  - Policy routing
  - DNS configuration and troubleshooting
  - Firewall management (ufw / iptables / nftables / provider-level firewalls)
  - VPN connectivity where required
- Diagnose and resolve network-level and system-level performance issues
Monitoring & Operational Excellence
- Work with our custom in-house monitoring and observability platform
- Monitor infrastructure health, system metrics, logs, and alerts
- Perform root cause analysis and implement long-term solutions
- Continuously improve monitoring coverage, alert quality, and operational visibility
CI/CD & Deployment Integration
- Collaborate with software engineering teams to integrate infrastructure with CI/CD pipelines
- Support automated deployment processes across development, staging, and production environments
- Ensure infrastructure supports rapid and reliable code deployment
Troubleshooting & Incident Resolution
- Act as a senior escalation point and subject-matter expert (SME) for complex infrastructure issues
- Diagnose problems across the full stack, including:
  - Operating system
  - Network
  - Storage
  - Virtualisation
  - Provider-specific issues
- Implement permanent fixes and preventative improvements
Continuous Improvement
- Identify opportunities to improve reliability, automation, performance, and cost efficiency
- Maintain clear technical documentation and operational procedures
- Contribute to infrastructure strategy and architecture decisions
Technical Environment
You will work with technologies including:
- Ubuntu Linux (24.04 LTS)
- SaltStack
- Multi-provider cloud infrastructure, including:
  - OVH
  - Linode
  - DigitalOcean
  - Hetzner
  - AWS
  - Other global providers
- IPv4 and IPv6 networking
- Custom monitoring and observability platform
- CI/CD pipelines and automation tooling
- Virtualisation platforms, including VMware, KVM, and provider-managed solutions
- Dedicated physical servers
Required Skills & Experience
Core Linux & Infrastructure
- 5+ years of hands-on Linux systems engineering experience
- Experience performing safe in-place upgrades to new Ubuntu LTS versions
- Deep knowledge of Linux internals, including:
  - System startup and systemd
  - Networking stack
  - Disk and filesystem management
  - Process management and performance tuning
- Proven experience managing large-scale Linux environments
Automation & Configuration Management
- Strong experience with SaltStack (or transferable experience with an equivalent tool such as Ansible, Puppet, or Chef)
- Experience implementing infrastructure as code
- Strong scripting ability in:
  - Bash (essential)
  - Perl (essential)
  - Python (preferred)
Networking
- Strong practical networking knowledge, including:
  - TCP/IP fundamentals
  - IPv4 and IPv6 addressing and routing
    - Provider failover IPs and floating IPs
    - Advanced IPv6 subnet routing
  - DNS
    - Provider API automation for forward and reverse zones
  - Firewall configuration
    - ufw
    - iptables
    - nftables
  - Network troubleshooting and diagnostics
Cloud & Virtualisation
- Experience working with cloud infrastructure providers
- Understanding of virtualised environments and underlying infrastructure
Monitoring & Reliability
- Experience working with monitoring and alerting systems
- Strong troubleshooting and root cause analysis skills
Desirable Skills
- Experience with CI/CD tools such as Jenkins
- Experience with containerisation (Docker, Kubernetes)
- Experience with Squid or other proxy technologies
- Experience working with globally distributed infrastructure
- Experience working in performance-sensitive environments
Personal Attributes
- Strong problem-solving ability and technical curiosity
- Ability to work independently and take ownership
- Strong attention to detail
- Clear communication skills
- Ability to work effectively with engineering and product teams