Site Reliability Engineer

Snowplow Analytics

Software Engineering
London, UK · Greater London, UK
Posted 6+ months ago
Located in UK / Europe - Remote
About Snowplow
Snowplow, the leader in next-generation customer data infrastructure (CDI), empowers data-driven organizations to own and unlock the full potential of their customer behavioral data.
The Snowplow platform fuels AI, advanced analytics, and personalized experiences by enabling companies like Burberry, Strava, and Auto Trader to collect, manage, and operationalize real-time event data from their central data platform of choice. This empowers analytics, data science, product, and marketing teams to gain deeper insights into customer journeys, predict behaviors, deliver unique customer experiences, and detect fraud.
With thousands of companies relying on Snowplow worldwide, we are at the forefront of transforming how data-driven organizations leverage their customer behavioral data.
Following our $40 million Series B funding led by global venture capital firm NEA, known for investments in Databricks, MongoDB, and Elastic, we are seeking creative and innovative individuals to help us shape the future of Snowplow.
The Opportunity
Our Private SaaS offering has grown significantly over the past year, and we now orchestrate and monitor Snowplow event pipelines across more than 150 customer-owned AWS and GCP sub-accounts. Each account has its own individualised and optimised stack, and every stack is capable of processing many billions of events per month. We also have a new trial experience that lets prospects self-serve a cut-down version of Snowplow, likewise in their own cloud.
We are looking for another SRE to help us grow to managing 1,000 and then 10,000 AWS, GCP and Azure accounts. You will pioneer solutions for managing estates of this size through cutting-edge monitoring and automation. You’ll work closely with our SRE Lead on all aspects of our proprietary deployment, orchestration and monitoring stacks. Across all of our domains (full service, trial and our own) we are striving to increase service reliability, fulfil customer requests in a timely fashion, and automate recurring tasks. Task automation is essential because our infrastructure estate, unlike that of most software businesses, scales linearly with our customer numbers.
Automating the maintenance and deployment of thousands of individualised stacks is an enormously ambitious undertaking and a hugely exciting infrastructure automation challenge.
What you’ll be doing:
● Maintaining and developing our growing Terraform infrastructure-as-code stacks which we use to deploy infrastructure for all internal and client use cases
● Maintaining our internal infrastructure stacks which include the HashiCorp suite as well as our Snowplow BDP and VPNs
● Participating in our on-call rotation to help us serve our client base 24/7
● Taking rotations of L3 Technical Support where you will be responsible for triaging and dealing with infrastructure issues
● Handling high-severity internal or customer incidents, ensuring we meet all SLAs
We’d love to hear from you if you:
● Have worked with AWS and/or GCP in a production capacity (Azure is a bonus)
● Have worked with Terraform, CloudFormation or some other form of infrastructure-as-code tooling
● Have any experience with the HashiCorp stack (Vault, Consul, Nomad) and understand its role in infrastructure automation (a bonus)
● Have worked with Docker and are familiar with container-based architectures (Kubernetes is a bonus)
● Are knowledgeable about the Linux operating system and how to manage servers in a production capacity
● Are knowledgeable about cloud networking principles and how to troubleshoot issues in this space
● Are comfortable scripting in one or more of: Bash, Python, Ruby or Perl
● Are comfortable programming in one or more of: Java, Scala, Golang or Python
● Have experience working with online marketplaces (a bonus)
What you get in return for being awesome:
💰 A competitive package, including share options
🧘 Flexible working
🏖️ A generous holiday allowance no matter where you are in the world
🫂 Mental health support including therapy sessions
💻 MacBook and home office equipment allowance
🫶 1 week of volunteering a year for a cause you feel passionate about
👪 Enhanced maternity, paternity, shared parental and adoption leave
Snowplow is dedicated to building and supporting a brilliant, diverse and hugely inclusive team. We don't discriminate on the basis of gender, race, religion or belief, disability, age, marital status or sexual orientation. Whatever your background may be, we welcome anyone with talent, drive and emotional intelligence.