Site Reliability Engineer
About G2 - Our People
G2 was founded to create a place where people will love to work. We strive to create meaning in work and provide more than just a job: a true calling. At the heart of our community and culture are our people. Our global G2 team comes from a wide range of backgrounds and experiences, and that’s what makes our G2 community strong and vibrant. We want everyone to bring their authentic selves to work, and we do this through our company and team events, our G2 Gives charitable initiatives, and our Employee Resource Groups (ERGs).
Our employee-led, leadership-supported ERGs celebrate the diversity of our team, foster inclusivity and belonging, and create a space to connect to each other. Through connections and understanding, we build a stronger and more dynamic global team and help every person reach their personal peak.
We support our employees by offering generous benefits, such as flexible work, parental leave, insurance, and wellness benefits. Click here to learn more about our benefits.
About G2 - The Company
When you join G2, you join the global team behind the largest and most trusted software marketplace. Every month, 5.5 million people come to G2 to inform smarter software decisions based on honest peer reviews. Authenticity is our focus, and every day we help thousands of companies, and hundreds of employees, propel their potential. Ready for meaningful work that starts and ends with compassion and heart? You’ve come to the right place.
G2 is going through exciting growth! We’ve recently secured our Series D funding of $157 million, which will further allow us to grow and develop our product and people. Read about it here!
About The Role
G2 is looking for a Site Reliability Engineer. You will be responsible for ensuring the reliability, availability, and performance of our SaaS services and infrastructure. You will work closely with cross-functional teams to design, automate, and maintain systems that can withstand the demands of our customers while minimizing downtime and maximizing scalability through world-class observability.
Our highly collaborative Platform team combines creativity, curiosity, and passion to solve business problems and to continuously get better at our craft.
In This Role, You Will:
- Improve incident response time and the knowledge base for dealing with operational issues.
- Monitor and repair infrastructure to enable new levels of distributed incident management.
- Identify tech debt reduction stories in the DevOps and SRE Jira board for self-service capabilities.
- Grow, maintain, and optimize our production, staging and test infrastructure-as-code developed in Terraform and hosted on AWS.
- Work with our Information Security team to implement various controls, resolve infrastructure vulnerabilities, support disaster recovery, and develop a proactive security monitoring and mitigation strategy.
- Build dashboards, alerts, and scripting for automated incident remediation.
- Ensure local development is frictionless to provide a satisfying customer experience and customer delight.
- Contribute to building a culture of infrastructure-as-code and self-help tooling.
- Improve upon the logging and monitoring systems for capacity, alerting and performance.
- Build monitoring systems in PagerDuty to redefine, improve and control our V2MOM SLAs.
- Support and facilitate emergency incident response procedures.
Minimum Qualifications:
- 2+ years of experience in Cloud/DevOps/SRE engineering.
- 1+ years of experience managing cloud infrastructure, AWS preferred.
- Proficient in at least one scripting language (Python, Bash, etc.).
- Strong knowledge of relational database indexing, queries and commands.
- Strong skills with Git version control, branching, and workflows.
- Experience with multi-tier distributed systems involving load balancers, caching layers, real-time event processing, and software-defined networking.
- Demonstrated continuous learning in software development skills.
- Eagerness to take on responsibility and capability to work creatively with a team.
- Willingness to participate in all Platform Agile ceremonies.
- Excellent collaboration skills and the ability to influence across G2, especially during incident response swarm meetings.
- Excellent oral and written communication skills with a keen sense of customer service.
- Strong time management skills.
What Can Help Your Application Stand Out:
- Experience with Docker or containerization technology.
- Experience in building dashboards and aggregating metrics.
- Experience writing Terraform infrastructure-as-code in a modular, version-controlled, and reusable fashion.
- Experience with Elasticsearch, Sidekiq, and Kafka is nice to have.
- Prior pairing or mentoring experience.
Our Commitment to Inclusivity and Diversity
At G2, we are committed to creating an inclusive and diverse environment where people of every background can thrive and feel welcome. We consider applicants without regard to race, color, creed, religion, national origin, genetic information, gender identity or expression, sexual orientation, pregnancy, age, or marital, veteran, or physical or mental disability status. Learn more about our commitments here.