Illumio

Staff Site Reliability Engineer

Reposted 17 Days Ago

Australia

Senior level

The Staff Site Reliability Engineer will design and maintain scalable, secure cloud infrastructure, focusing on reliability, automation, and collaboration with development teams to ensure high availability of critical applications.

Location: This role will be remote in Australia

Onwards Together!

Illumio, the pioneer and market leader of Zero Trust segmentation, prevents breaches from becoming cyber disasters. Illumio protects critical applications and valuable digital assets with proven segmentation technology purpose-built for the Zero Trust security model. Illumio ransomware mitigation and segmentation solutions see risk, isolate attacks, and secure data across cloud-native apps, hybrid and multi-clouds, data centers, and endpoints, enabling the world’s leading organizations to strengthen their cyber resiliency and reduce risk.

Illuminate the future with Illumio and join a team that’s passionate about developing cutting-edge security solutions that protect the world's most critical assets.

Our Team's Vision:

Our Engineering team's culture thrives on visionary leadership, autonomy, and ownership, creating a dynamic synergy that drives us forward in the ever-evolving landscape of cybersecurity.

When you join our team, you become part of the leader in Zero Trust Segmentation. You will work with a cutting-edge technology stack that spans operating systems, distributed applications, and immersive UI/visualization tools.

We're shaping the future of cybersecurity. And together, we will continue to build world-class products—led by people with different perspectives, backgrounds, and a commitment to innovation in a time when the world faces its greatest cybersecurity threats in history.

The Cloud Operations team at Illumio is working to deploy and manage our SaaS services by reducing human error, aggressively focusing on automation, and providing deep insight into application behaviour and health! We do that by incorporating aspects of software engineering and applying them to infrastructure and operations problems to create and manage scalable and reliable distributed software systems.

Your Impact:

We are looking for a backend/platform engineer or SRE with a demonstrated track record of building secure, large-scale, highly available services using automation and Infrastructure as Code, who is well versed in cloud architecture (with a focus on Kubernetes) and loves to delight the engineers they support.

This engineer will be an essential member of our Operations team, collaborating with the Platform and Data engineers to deliver the latest Illumio products.

The Cloud Platform SRE will be responsible for designing and deploying scalable, reliable, and secure cloud infrastructure. This individual must have a thorough understanding of, and experience with, AWS and/or Azure. The platform is based on Kubernetes and built using cloud-native technologies. The Cloud Platform SRE is responsible for building, operating, and maintaining this platform, and for defining and meeting platform SLOs, capacity utilization, cost visibility, security compliance, and similar objectives. This role is critical to the success of the Multi-cloud Platform.

Drive reliability improvements back into applications
Build code to resolve reliability and resiliency issues
Mentor and educate team members to strengthen technical expertise
Collaborate closely with cloud architects to drive cloud solutions
Curate SLIs and SLOs that accurately measure error budgets
Embed with development teams to assist with cloud methodologies during product development so that each deliverable is as reliable as possible
Work with development teams to build and strengthen application security and compliance
Manage high-impact situations that involve technically challenging issues across diverse audiences: find the root cause, mitigate, and identify a solution
Focus on observability

Your Toolkit:

Bachelor's degree in Computer Science, Engineering, or related field; or equivalent work experience
6+ years of relevant SRE, DevOps, Platform, or Infrastructure Engineering experience
4+ years in a production support role in a fast-paced industry or organization
Experience deploying, tuning, and maintaining Linux-based, highly available, fault-tolerant web platforms in public cloud providers such as AWS, Azure, and GCP
Experience with common monitoring, log aggregation, and metrics-gathering platforms (Icinga, Sensu, Splunk, Telegraf/InfluxDB, et al.)
Experience with configuration management and orchestration tools such as Chef, Ansible, and AWS services and APIs, or equivalent
Experience scripting/coding with Python, Java, Ruby, and/or Go
Experience with MySQL, PostgreSQL, Redis, or similar
Solid knowledge of Linux operating systems (Ubuntu, RHEL, OEL7) is required
Experience with EKS and/or AKS
Knowledge of or experience with incident management and on-call tooling such as PagerDuty
Knowledge of database technologies, release management, REST, SRE, etc.
Knowledge of load balancers and traffic managers
Experience working with Kubernetes, Docker, or other virtualization & containerization technologies
Networking basics and troubleshooting skills
Good understanding of production deployment and distributed environments is required
Strong problem solving and operational process skills, attention to detail
Application support and debugging experience in a dynamic fast-paced production environment
Experience with SDLC principles, architecture, and operations
Experience working with senior leadership both inside and outside of engineering
Ability to manage multiple tasks and competing priorities to deliver projects on schedule
Azure certifications such as Azure Administrator, Azure Developer, or AWS/GCP certifications are a plus

Benefits:

At Illumio we offer a wide range of benefits to our eligible team members. Our benefit programs vary by location and can include:

Medical, Dental, Vision Coverage
Health and Dependent Savings Accounts
Life and Disability Programs
Paid Parental Leave
Voluntary Benefit Programs
Company Sponsored Wellness Program
Wellness Reimbursement Program
Retirement Savings
Equity Opportunities
Paid time off and Paid Holidays
Employee Incentive Program

Our Commitment:

Illumio believes that an environment of unique backgrounds, experiences, viewpoints, and individual contributions drives our success and makes us stronger together. We are dedicated to creating and maintaining a diverse culture and emphasizing inclusion and belonging.

All official job offers from our company are extended directly by our recruitment team and will be sent through an official DocuSign document for your review and signature. Please be aware that we do not ask for any personal information in the process of extending offers of employment, such as financial details or social security numbers. Upon acceptance of any offer, we will request such information as part of the onboarding process prior to or on your first day of employment, and only after completing a background check through an authorized third-party vendor. If you receive any communication asking for personal details outside of these processes, please contact us immediately to verify the authenticity of the request. Your security is important to us, and we are committed to a safe and transparent hiring experience.

Top Skills

Ansible, AWS, Azure, Chef, Docker, GCP, Java, Kubernetes, MySQL, Postgres, Python, Redis, Ruby
