August 99

Site Reliability Engineer Night Shift

  • Posted 6 days ago

Job Description

Overview

Site Reliability Engineering (SRE) is an engineering discipline that combines software and systems engineering to build and run production systems. SRE ensures that our services (both our internally critical and our externally visible systems, e.g. GitLab/developer tooling and hosted client sites for the company) have reliability and uptime appropriate to users' needs and a fast rate of improvement, while keeping an ever-watchful eye on capacity and performance.

Key Responsibilities

  • Oversee the full lifecycle of services from design and deployment to ongoing operations and continuous improvement.
  • Build, configure, and maintain reliable server infrastructure to meet performance and security standards.
  • Monitor system health, performance, and capacity to ensure uptime and quick issue resolution.
  • Strengthen system security through proactive monitoring, threat assessment, and implementation of best practices.
  • Support incident response with a blameless culture and continuous learning from postmortems.
  • Optimize resources and fine-tune performance to enhance reliability and efficiency.
  • Apply system updates, patches, and service enhancements regularly to maintain stability and security.
  • Develop automation tools and configuration procedures to streamline operations.
  • Design, debug, and enhance software programs and tools to support database, application, and network needs.
  • Lead and mentor developers, providing technical guidance and implementing architectural improvements.
  • Identify and mitigate potential vulnerabilities by staying updated on emerging security threats.
  • Exercise independent judgment in managing complex projects and collaborating across teams.

Minimum Qualifications

  • Degree in computer science or a similar IT field.
  • Working experience with multiple POSIX operating systems (e.g. CentOS, Ubuntu, macOS).
  • Advanced knowledge of at least one server-grade GNU/Linux distribution (e.g. CentOS, Ubuntu).
  • Advanced knowledge of database optimization and SQL queries (specifically MySQL/MariaDB).
  • Good scripting skills using POSIX scripting toolkits (bash, sed, awk, Python, Perl, etc.). Knowledge of general-purpose programming languages such as PHP, C, C++, and Java is a plus.
  • Expert or advanced knowledge of WordPress setup and configuration.
  • Demonstrated experience working with monitoring and analytics tools (e.g. Sysdig, Papertrail, Nagios, Cacti, Splunk).
  • Knowledge of best practices for security/encryption and service configuration (SSL/TLS, SFTP, password management, access restrictions, firewalls, ports, etc.).
  • Basic knowledge of AWS, Rackspace, or Google Cloud services and tools.
  • RHCSA/RHCE, or a comparable certification, is required.

Job ID: 134852451