Site Reliability Engineer (SRE)

Our SRE team is responsible for the overall performance and reliability of Evernote’s service and products. We serve over 200 million passionate and engaged users around the world, who store billions of notes and files. We are looking for a Site Reliability Engineer to help us in the ongoing mission of delivering an outstanding service to our users.

We participate in all aspects of running our platform at scale, focusing both on the service as it runs today and on ensuring we can deliver new and exciting features rapidly to users. We have a real passion for automation and we continually seek to improve. We work hand-in-hand with product teams to help them ship production-ready services and get new features into our users' hands. We define Service Level Objectives (SLOs) based on Key Performance Indicators (KPIs) for each of our services, and they let us move quickly while maintaining the quality of service our users expect.

What you’ll do

  • Work closely with engineering teams to maintain and scale our existing production platform
  • Help us evolve what it means to be an SRE at Evernote
  • Evolve and implement production readiness standards for new services
  • Champion our SLOs and look to continuously improve them
  • Develop and maintain automation to reduce operational toil for the team
  • Participate in an on-call rotation for our production services

What we’re looking for

  • You possess a contagious sense of ownership and the tenacity to always find a way
  • You focus on quality to build manageable, scalable, and maintainable systems
  • You know that perfection is the enemy of done and when to make trade-offs
  • You emphasize the importance of making decisions based on data
  • You enjoy solving tough technical problems
  • You exercise judgement in a way that reduces risk
  • You share enthusiastically to reduce disconnects and communication breakdowns
  • You always want to understand the why in order to better see patterns and improve quality

What you’ve done

  • You know Linux systems like the back of your hand
  • You’ve managed production environments at scale in a public cloud environment (AWS or GCP)
  • You have a strong familiarity with web application stacks, including MySQL, Java, and Apache
  • You’ve attained a deep understanding of networking protocols (e.g., TCP/IP, HTTP, DNS)
  • You’ve implemented and used third-party metrics and monitoring platforms such as DataDog and PagerDuty
  • You possess the ability to wrangle problems quickly using the tools at your disposal
  • You’ve used configuration management and orchestration tools and you understand why they’re important
  • You’ve built extensible and maintainable automation (Shell, Python, or Go preferred)
  • You’ve run containerized microservices using Kubernetes

Skills that are particularly meaningful to us

  • Google Cloud Platform: GLB, Pub/Sub, Spanner, GCS, App Engine, and GKE
  • Monitoring: PagerDuty, DataDog, Splunk
  • Tools: Ansible, Puppet, Helm, Jenkins, Cloud Deployment Manager, Terraform
  • Infrastructure: HAProxy, Envoy, ElasticSearch, Consul
  • Languages/Libraries: Go, Python, Java, Thrift, gRPC

We are committed to an inclusive and diverse Evernote. We believe that different perspectives lead to better ideas, and better ideas allow us to better understand the needs and interests of our diverse, global Evernote Community. We welcome people of different backgrounds, experiences, abilities and perspectives and are an equal opportunity employer.

About Evernote

Evernote is a place for individuals and teams to assemble, nurture, and share ideas in any form.

We’ve assembled an incredibly talented, diverse, and spirited team to build products that impact the lives of millions of people around the world. Our employees enjoy access to the best tools available and an open and collaborative work environment, and they end each day knowing that they’ve made a tangible impact.

The Evernote app is available on desktop, mobile, and the web, meaning your ideas are always with you, always accessible, and always in sync. We believe that no idea is too big to lead or too small to matter, and we continually develop a service and reputation based on innovation and trust.