Airlock Digital

Site Reliability Engineer

Reposted 3 Days Ago

In-Office

Melbourne, Victoria

Senior level

The Senior Site Reliability Engineer ensures reliability and performance of systems, collaborating with teams to address issues and improve processes.

The summary above was generated by AI

Location: Australia - Remote

Who Are We? About Airlock Digital:

Airlock Digital is a global leader in application control and allowlisting. We seek to empower every organization to run only what they trust and operate free from malware and ransomware. 

With rapid growth across Australia, North America, and EMEA, we support a diverse and expanding global customer base. We are committed to our core values: respect, determination, and integrity. At Airlock, we pride ourselves on being a team of humble, collaborative, and driven professionals who support one another and share a passion for cybersecurity.

What We Are Looking For:

The Senior Site Reliability Engineer (SSRE) is responsible for ensuring the reliability, scalability, performance, and efficiency of our systems, applications, and services. The SSRE works closely with cross-functional teams such as development, operations, and infrastructure to proactively identify, troubleshoot, and resolve issues, ensuring optimal performance and uptime.

Key Responsibilities:

Design, implement, and maintain highly available, scalable, and fault-tolerant systems and services.
Introduce best practices into Airlock Digital around observability, SLOs, and reliability.
Continuously monitor the performance, availability and security of Airlock Digital systems and services and proactively identify and resolve issues.
Identify areas for improvement across the organisation and drive engineering-wide technical change in the field of site reliability.
Collaborate with cross-functional teams to implement and maintain deployment pipelines, monitoring tools and automated testing frameworks.
Develop and maintain documentation of systems, processes, and procedures to ensure knowledge transfer and continuity.
Lead incident response, root cause analysis and post-mortem activities to identify and address underlying issues.
Work with Software Developers to design and implement scalable and resilient applications, services, and infrastructure.
Participate in on-call rotation to ensure 24/7 support for critical systems and services.
Use Zabbix for monitoring system health, performance, and availability to proactively identify and address issues.

Required Skills & Qualifications:

5+ years of hands-on experience in Site Reliability and Observability Engineering, DevOps, or Infrastructure Engineering, including debugging, diagnosing, and resolving high-severity incidents.
Commercial experience in at least one programming language such as Python or Go.
Solid experience with automation tools such as Ansible and containerisation tools like Docker and Podman.
Deep understanding of distributed systems, networking, operating systems, and cloud computing.
Strong troubleshooting and problem-solving skills, and experience in incident response, root cause analysis, and post-mortem activities.
Systematic problem-solving approach, coupled with effective communication skills and a sense of ownership and drive.
Excellent communication skills and the ability to share your ideas and opinions through respectful proposals, presentations, and discussions in a collaborative environment.
Experience with Splunk for log management and data analysis.
Experience in leading and mentoring team members is advantageous.

What We Offer:

We don’t think money is everything, but we know it is an important part of your decision to apply for a role. Additional factors considered in extending an offer include the responsibilities of the job, education, location, experience, knowledge, skills, abilities, internal equity, alignment with market data, and applicable laws.

Flexible Work Environment, Hybrid or Remote - Time Off - Paid Volunteering Time - Birthday Leave - Paid Parental Leave - Home Office Allowance

Our Commitment:

We believe in supporting our team members both personally and professionally. Named one of Australia’s Greatest Places to Work and the 5th best technology company for 2025, we value flexibility, trust, and a work environment that empowers our team to do their best work.

We will be assessing applications as they come in, so we encourage you to send your resume through to us as soon as possible.

All official job offers from our company are extended directly by our recruitment team and will be sent through an official BambooHR document for your review and signature. Please be aware that we do not ask for personal information, such as financial details, in the process of extending offers of employment. Upon acceptance of any offer, we will request such information as part of the onboarding process prior to or on your first day of employment, and only after completing a National Police Check through an authorized third-party vendor. If you receive any communication asking for personal details outside of these processes, please contact us immediately to verify the authenticity of the request. Your security is important to us, and we are committed to a safe and transparent hiring experience.

No contact from recruitment agencies, thank you.

Top Skills: Ansible, Docker, Podman, Python, Splunk, Zabbix
