- Work alongside extremely accomplished engineers on a truly hard problem: scaling a distributed, multi-tenant, high-performance compute system.
- Write software, from system automation to network services, to scale our platform.
- Use your deep experience and problem-solving skills to help prevent and investigate production issues.
- Participate in the design and implementation of new system layers, using a first-principles understanding of high-complexity compute environments.
- Participate in a shared on-call rotation.
- Own capacity planning and management.
- Drive the company through "Disaster Recovery Tests", where we manually turn down pieces of infrastructure to test Jivox's overall resiliency to failures.
Our ideal Site Reliability Engineer will have:
- Extremely strong problem-solving and troubleshooting skills.
- Excellent interpersonal skills.
- You are willing to "carry the pager" but strive to build a system reliable enough that you don't get paged.
- Strong programming skills (Java / Python / Shell scripting / Ruby). Must have CS fundamentals and a track record of implementing highly reliable software. A formal CS degree is not required.
- Prior experience working alongside an Engineering team developing and supporting a complicated technical product.
- Fundamental understanding of *NIX systems.
- Strong networking troubleshooting skills.
- Prior experience on a large LAMP stack implementation, as well as Java-based microservices.
- Prior experience supporting an Internet-facing application at scale. Think petabytes of data, hundreds of nodes, and terabytes of content served per day.
- Prior experience (at least 5 years) with a cloud infrastructure service provider (AWS, Azure, GCP). Certification at the level of Solution Architect in AWS or equivalent is required.
- Prior experience with container technologies like Docker and resource managers like Kubernetes / Mesosphere Marathon.
- Prior experience with time-series-based monitoring / observability systems like Nagios, Wavefront, and Datadog, as well as indexing systems like ELK.
- Prior experience with high performance networks.
Interested candidates can get in touch with us by email at firstname.lastname@example.org or call us at +919742896707.