Do you want to bring game-changing Monitoring and Instrumentation practices to a dynamic and globally distributed engineering organization? The Engineering Productivity team builds the systems and processes that enable Xandr engineers to do their best work every day. We own the build, deployment & test systems that power our next-generation ad-tech platform. The Engineering Productivity team is at the forefront of adopting new tools & techniques for the entire Xandr engineering community. We're hands-on agents of change who exemplify and create the next wave of best practices. We're highly collaborative, and our highest values include quality, scalability, automation, and teamwork.
We’re on the lookout for someone to build and implement scalable, high-performance instrumentation solutions, enabling intelligent operations through continuous observation. This highly visible role works closely with engineering, ops, and product teams to maximize observability and proactively minimize the impact of application performance issues.
As a Software Engineer joining this role, the candidate can be expected to:
- Own a project/feature from inception to deployment and rollout.
- Continuously learn industry-wide observability practices and apply them to the Xandr Observability Stack.
- Work on projects centered on building new products (open source & 3rd party) and improving existing tools (automation, scaling, optimization) to meet Xandr-wide monitoring and instrumentation needs.
- Build, administer, and, to a certain extent, operate the tools that our team owns.
- Work across multiple environments, including bare metal, Kubernetes, cloud, and hybrid cloud.
- Write custom code for fetching data from APIs, data warehousing, and analytics, as some of our projects require.
- Contribute to a good deal of infrastructure-wide automation with tools such as Puppet, Terraform, and Ansible.
Roles & responsibilities:
• Developing an enterprise-wide instrumentation strategy to support real-time observability, health checks, and escalations
• Enabling engineering teams to quickly set up instrumentation tools and interfaces by automating observability where possible
• Writing integration, plugin, and configuration solutions
• Consuming and integrating REST APIs
• Defining observability standards, documentation and best practices
• Creating operational dashboards to track KPI performance
• Providing monitoring and instrumentation support and maintenance
• Conducting presentations and training engineers on observability tool usage
Qualifications:
- 5+ years’ engineering experience in a distributed, high-availability environment
- Knowledge of data structures and algorithms
- Working knowledge of system and application metric monitoring tools
- Extensive experience with Kubernetes from both an operational and development perspective
- Excellent written and oral communication skills with the insight to translate business goals into tech requirements
- Bachelor’s degree or higher in Computer Science or a related technical field, or equivalent related experience
- A collaborative and customer-centric mindset with excellent analytical and troubleshooting skills
- Flexibility to fix production monitoring issues and work in real time with team members around the globe
- Flexibility to work from 1 PM to 10 PM IST
- Experience designing and developing customized tools, scripts and dashboards
- Experience building robust SaaS monitoring and instrumentation frameworks
- Experience working with CI/CD (continuous integration and continuous delivery)
- Exposure to cloud platforms
- Coding experience, preferably in Python
- Experience with the ELK stack or a similar log management tool
- Exposure to or hands-on experience with time-series databases such as Prometheus, InfluxDB, or Graphite
Work timings: 1 PM to 10 PM IST
Job ID 2141883X Date posted 07/22/2021
More about you:
• You are passionate about a culture of learning and teaching. You love challenging yourself to constantly improve, and sharing your knowledge to empower others.
• You like to take risks when looking for novel solutions to complex problems. If faced with roadblocks, you continue to reach higher to make greatness happen.
• You care about solving big, systemic problems. You look beyond the surface to understand root causes so that you can build long-term solutions for the whole ecosystem.
• You believe in not only serving customers, but also empowering them by providing knowledge and tools.