JPMorganChase

Lead Site Reliability Engineer

Posted 11 Days Ago
Hybrid
Fort Worth, TX
Senior level
As a Lead Site Reliability Engineer, guide teams in reliability practices, automate processes with AI, lead incident management, and mentor engineers to optimize operational performance.
The summary above was generated by AI
Job Description
Take on a critical role in shaping the future of a globally recognized firm, with a direct and significant impact in an area built for top achievers in site reliability.
As a Lead Site Reliability Engineer at JPMorgan Chase on the Global Finance Tech team within Corporate Technology, you hold a leadership role in your team, demonstrate strong knowledge across multiple technical domains, and advise others on the technical and business issues facing them. You lead and conduct resiliency design reviews, break complex problems into digestible work for other engineers, act as a technical lead for medium to large-sized products, and provide advice and mentoring to other engineers.
Job responsibilities
  • Advocate and embody site reliability principles, fostering a culture of excellence and technical influence within your team.
  • Leverage AI tools to enhance operational effectiveness and automate processes, ensuring high-quality customer service.
  • Spearhead projects aimed at enhancing the reliability and stability of applications and platforms.
  • Utilize data-driven analytics and AI technologies to automate detection, diagnosis, and resolution processes, elevate service levels, and promote continuous improvement.
  • Engage stakeholders to establish realistic service level objectives and error budgets, ensuring alignment with customer expectations.
  • Exhibit advanced technical proficiency in one or more domains, proactively addressing technology-related bottlenecks.
  • Employ AI-driven solutions to streamline processes and enhance operational efficiency.
  • Serve as the primary contact during major incidents, demonstrating the ability to swiftly identify and resolve issues to prevent financial losses.
  • Act as a culture carrier by documenting and disseminating knowledge through internal forums and communities of practice.
  • Mentor team members, guiding them in the strategic adoption of AI technologies to enhance operational effectiveness and customer service.

Required qualifications, capabilities, and skills
  • Formal training or certification in site reliability engineering concepts and 5+ years of applied experience.
  • Proven success in an SRE or senior DevOps role, with deep knowledge of service level indicators/objectives (SLIs/SLOs), incident management, postmortem analysis, and systems reliability.
  • Expert with observability stacks (e.g., Prometheus, Grafana, Splunk, OpenTelemetry), including deep experience correlating telemetry across services and time.
  • Hands-on skills in coding (at least one high-level programming language), cloud platforms (AWS or GCP), container orchestration (Kubernetes), infrastructure as code (Terraform), and resilient CI/CD pipelines.
  • Active experience or deep curiosity in applying AI to operations, such as LLM-based copilots, anomaly detection, automated runbooks, autonomous agents (e.g., CrewAI, LangGraph), or Retrieval-Augmented Generation (RAG) workflows for support.
  • A track record of delivering under pressure. You finish what you start, adapt to uncertainty, and thrive in high-accountability environments.
  • You deconstruct complexity, organize effectively, and drive clarity into ambiguous operational environments. Documentation and design are second nature.
  • Outstanding communication, empathy, and professionalism, especially during incidents. You recognize that great systems serve real people.

Preferred qualifications, capabilities, and skills
  • Experience with operational and compliance rigor in banking, fintech, or similar.
  • Practical use of LLM frameworks (e.g., LangChain, Semantic Kernel), AI orchestration tools, vector databases, or custom agents supporting reliability workflows.
  • Experience with game days, chaos experiments, or failure-mode analysis to improve service robustness.
  • A background in mentoring engineers or leading technical knowledge-sharing, especially around AI and SRE best practices.

About Us
Chase is a leading financial services firm, helping nearly half of America's households and small businesses achieve their financial goals through a broad range of financial products. Our mission is to create engaged, lifelong relationships and put our customers at the heart of everything we do. We also help small businesses, nonprofits and cities grow, delivering solutions to solve all their financial needs.
We offer a competitive total rewards package including base salary determined based on the role, experience, skill set and location. Those in eligible roles may receive commission-based pay and/or discretionary incentive compensation, paid in the form of cash and/or forfeitable equity, awarded in recognition of individual achievements and contributions. We also offer a range of benefits and programs to meet employee needs, based on eligibility. These benefits include comprehensive health care coverage, on-site health and wellness centers, a retirement savings plan, backup childcare, tuition reimbursement, mental health support, financial coaching and more. Additional details about total compensation and benefits will be provided during the hiring process.
We recognize that our people are our strength and the diverse talents they bring to our global workforce are directly linked to our success. We are an equal opportunity employer and place a high value on diversity and inclusion at our company. We do not discriminate on the basis of any protected attribute, including race, religion, color, national origin, gender, sexual orientation, gender identity, gender expression, age, marital or veteran status, pregnancy or disability, or any other basis protected under applicable law. We also make reasonable accommodations for applicants' and employees' religious practices and beliefs, as well as mental health or physical disability needs. Visit our FAQs for more information about requesting an accommodation.
Equal Opportunity Employer/Disability/Veterans
About the Team
Our Consumer & Community Banking division serves our Chase customers through a range of financial services, including personal banking, credit cards, mortgages, auto financing, investment advice, small business loans and payment processing. We're proud to lead the U.S. in credit card sales and deposit growth and have the most-used digital solutions - all while ranking first in customer satisfaction.

Top Skills

AI
AWS
GCP
Grafana
Kubernetes
OpenTelemetry
Prometheus
Splunk
Terraform
