About the Company:
At AT&T, we’re connecting the world through the latest tech, top-of-the-line communications and the best in entertainment. Our groundbreaking digital solutions provide intuitive and integrated experiences for millions of customers across online, retail and care channels. Join our mission to deliver compelling communication and entertainment experiences to customers around the world as we continue to evolve as a technology-powered, human-centered organization. As part of our team, you’ll transform the way we deliver a seamless customer experience with digital at the center of all you do. In our world, digital is much larger than just an eCommerce channel: we are transforming all channels to perform digitally as one team and create a better customer experience. As we move through 2021, digital transformation will revolutionize the digital space, and you can build a career that will propel your future.
About the Team:
The mission of our Digital Operations team is to operate a fault-resilient, customer-centered, proactive DevOps team. The team is responsible for supporting systems that deliver AT&T’s customer experience across multiple internet-facing eCommerce applications, databases, platforms and technology stacks. Our customer-journey-centric Ops team is made up of Ops Engineers as well as Site Reliability Engineers (SREs) who are all focused on ensuring a highly available, resilient, performant and secure customer experience.
We’re looking for a Tier 1 Monitoring and Alerting Specialist who is an energetic self-starter and quick learner, with attention to detail and a constant focus on meeting our application availability and mean time to restore goals. This specialist will work on mission-critical, highly available sales applications. Our goal is to expand our Production Support Operations team with people who are familiar with the applications, troubleshoot quickly using log analysis, dashboards and monitoring tools, and communicate effectively about issues and outages so that they get the right attention and the mean time to restore goals are met.
Roles and Responsibilities:
- 24 x 7 Production support and first-level troubleshooting of incidents
- 24 x 7 first-level outage response
- 24 x 7 Application performance monitoring, troubleshooting and corrective actions
- Support incident management and problem management
Shift timing (if any):
- Shifts typically fall between 6 AM and 12 AM India Standard Time (2 rotational shifts). Longer hours may occasionally be required.
- Candidates with a short notice period are preferred.
Primary / Mandatory skills:
- Overall experience: 3+ years of experience performing Production Support for mission-critical, high-performance applications (Telecom and eCommerce experience preferred)
- Experience with Docker, Kubernetes and cloud environments; Unix, networking and troubleshooting knowledge: 2 - Novice (limited experience)
- Experience with Customer Experience Analytics tools like Quantum Metric or TeaLeaf: 2 - Novice (limited experience)
- Experience with relational and NoSQL databases like Oracle and Cassandra; excellent knowledge of SQL: 2 - Novice (limited experience)
- Excellent written and verbal English communication skills to work in a Global team
Secondary / Desired skills:
- Experience with visualization tools like Kibana and Grafana; EFK stack experience preferred: 2 - Novice (limited experience)
- Production support troubleshooting experience: 2 - Novice (limited experience)
- Experience mentoring and training others
- Experience with Site Reliability Engineering preferred: 2 - Novice (limited experience)
Additional information (if any): Must be willing to work shift duties. Willingness to learn is very important, as AT&T offers an excellent environment for learning Digital Transformation skills such as cloud, big data, AI and full stack.
Education Qualification: Bachelor’s / Master’s degree in Computer Science or a related field
Certifications (if any specific): Any Certification related to Primary / Mandatory Skills
- Kubernetes Certified Engineer or equivalent certification
- Azure / AWS certification
- 3+ years of experience in a Production Support / Operations environment
- 2+ years of strong Unix, networking and troubleshooting knowledge
- 2+ years of experience with Customer Experience Analytics tools like Quantum Metric or TeaLeaf
- Solid understanding of and experience with Application Performance Monitoring tools like Dynatrace, AppDynamics, Introscope, etc.
- Experience providing data/information to business leaders
- Experience working in a large-scale, technically diverse organization
AT&T is leading the way to the future – for customers, businesses and the industry. We're developing new technologies to make it easier for our customers to stay connected to their world. Together, we’ve built a premier integrated communications and entertainment company and an amazing place to work and grow. Team up with industry innovators every time you walk into work, creating the world you always imagined.
Ready to #transformdigital with us? Apply now!
Job ID 2130734I Date posted 06/02/2021