It's fun to work in a company where people truly BELIEVE in what they're doing!

Job Description

You will be a technical expert on our team, responsible for architecting, building, and scaling our infrastructure to meet the demands of a rapidly growing business. You will drive key initiatives, mentor other engineers, and set the standards for reliability and operational excellence. This role requires deep technical expertise and the ability to influence both the SRE team and the broader engineering organization.

What You'll Do:

Technical Leadership: Lead the design and implementation of large-scale, complex infrastructure projects on AWS and GCP. You will own critical reliability goals and drive their execution.

Architecture & Strategy: Define and evolve the technical roadmap for our cloud infrastructure, focusing on scalability, security, and cost optimization.

Mentorship & Guidance: Act as a senior mentor to junior engineers, helping them grow their skills and navigate technical challenges. You will set a high bar for operational best practices.

Tooling & Automation: Champion the development of advanced automation and internal tooling that empowers the entire engineering organization to build and deploy with greater efficiency and reliability.

Continuous Improvement: Proactively identify systemic weaknesses and lead initiatives to address them, whether through new technology adoption, architectural changes, or process improvements.

Incident Management: Lead major incident response efforts, serving as a technical commander during critical events. You will be responsible for shaping our post-mortem culture and ensuring we learn from every incident.

Requirements

What We're Looking For:

Experience: More than 6 years of relevant experience in a senior Site Reliability, DevOps, or System Engineering role, with a proven track record of leadership and impact.

Cloud Expertise: Deep hands-on experience with both AWS and GCP at scale. You should be able to architect and manage complex, multi-region cloud deployments.

System Design: Strong understanding of distributed systems and a history of designing and implementing highly available and resilient infrastructure.

Scripting & Automation: Expert-level skills in a language like Python or Go, used for building complex automation and internal services.

IaC: Extensive experience with Terraform, managing infrastructure across multiple environments.

Leadership: Demonstrated experience leading technical projects, mentoring other engineers, and influencing technical direction.

Communication: Exceptional communication skills, with the ability to articulate complex technical concepts to both technical and non-technical audiences.

If you like wild growth and working with happy, enthusiastic over-achievers, you'll enjoy your career with us!