About the Company
At AT&T, we’re connecting the world through the latest tech, top-of-the-line communications and the best in entertainment. Our groundbreaking digital solutions provide intuitive and integrated experiences for millions of customers across online, retail and care channels. Join our mission to deliver compelling communication and entertainment experiences to customers around the world as we continue to evolve as a technology-powered, human-centered organization. As part of our team, you’ll transform the way we deliver a seamless customer experience with digital at the center of all you do. In our world, digital is much larger than just an eCommerce channel; we are transforming all channels to perform digitally as one team to create a better customer experience. As we move through 2021, the digital transformation will revolutionize the digital space, and you can build a career that will propel your future.
About the Team
Our SPT Operations Alerting and Monitoring Enablement team is looking for an experienced Senior Software Engineering Manager to help us deliver best-in-class monitoring and alerting capabilities across our online digital ecosystem. As a Senior Software Engineering Manager, you will be responsible for overseeing the daily operations and execution of alerting and monitoring enablement, and for acting as a product owner who helps manage, prioritize and execute the backlog of new feature enhancement requests using Agile/Scrum methodologies.
About the Job
This is a fast-paced, critical position. The conversion is moving very quickly, and the role requires strong technical, managerial, and educational experience, along with a solid understanding of digital ecommerce, self-service capabilities and the various tools used to manage and support our critical online customer journeys across the att.com platform and the native (iOS/Android) application.
As part of the technical team, you’ll ensure alerts and monitors for our applications are effective and proactive. This requires investigating the backlog of product delivery work to identify upcoming changes to alerts and monitors, so that those changes coincide with the release of the new features. You’ll also work regularly with the SPT Operations Incident Management Tier 1 & 2 teams to look for new opportunities and to evaluate existing alerts and monitors.
The teams are moving to Site Reliability Engineering principles, so you’ll be an evangelist for that shift. This also means other teams will create and maintain alerts; you’ll ensure effective governance of those changes.
Another area of responsibility is synthetic monitoring. You’ll help grow this practice, help it mature, and deliver value to leadership and to the Incident Management teams.
Experience with an Agile methodology, and the ability to oversee all parts of Agile delivery, will be another key to success.
Responsibilities and Day-to-Day View
• Provide leadership, strategic direction and oversight for alerting and monitoring teams
• Champion and drive Site Reliability Engineering (SRE) best practices
• Interact with teams to identify new requirements and provide expertise across tool suites
• Oversee improvements in monitoring and alerting capabilities across our digital platforms
• Develop and deploy monitoring to ensure application reliability and stability across the customer journeys
• Maintain awareness of current monitoring technology, applicability and capabilities of tools
• Manage internal customer relationships, and collaborate effectively across organizations
• Ensure the effectiveness of alerts, dashboards, events and synthetic monitoring
• Ensure reports are accurate and timely
• Act as product owner liaison for tools and engagement of external vendors as required
• Participate in all aspects of Agile/Scrum (stand-ups, grooming, retrospectives, etc.)
Qualifications
• 2(+) Years of experience with Site Reliability Engineering and operations for internet/eCommerce applications, especially in large, multi-data center environments
• 2(+) Years in a lead or supervisory position, coaching and mentoring engineers
• 2(+) Years of experience in large scale site operations
• Extensive experience with various site operations monitoring tools like Splunk, ElasticSearch, Dynatrace, Quantum Metric, Adobe and Catchpoint
• Solution-oriented with proven success in a fast-paced environment
• Strong organization and time management skills
• Ability to communicate clearly and effectively with teammates and all levels of management
• Excellent troubleshooting, analytical and problem-solving skills with demonstrated initiative of going the extra mile
• Experience training customers on how to leverage tools to drive business value
• Strong understanding and proven knowledge of Scrum/Agile methodologies
Preferred Qualifications
• A Bachelor's degree in Computer Science, Information Systems, or related field from an accredited College or University
• 5(+) Years in a lead or supervisory position, coaching and mentoring engineers
• 5(+) Years of experience in large scale site operations
• 5(+) Years of experience with various site operations monitoring tools like Splunk, ElasticSearch, Dynatrace, Quantum Metric, Adobe and Catchpoint
• 1(+) Years of experience in architecture and design of systems using Microservices architecture
• 2(+) Years of experience in cloud technologies: AWS, Azure, OpenStack, Docker, Kubernetes, etc.
• Excellent written and verbal communication skills with demonstrated ability to present complex technical information in a clear manner to peers, developers, and senior leaders
• Experience with Continuous Integration and Continuous Delivery concepts and tools
AT&T is leading the way to the future – for customers, businesses and the industry. We're developing new technologies to make it easier for our customers to stay connected to their world. Together, we’ve built a premier integrated communications and entertainment company and an amazing place to work and grow. Team up with industry innovators every time you walk into work, creating the world you always imagined. Ready to #transformdigital with us? Apply now!
Job Code - 40491310
Job ID 2164571-2 Date posted 01/31/2022