Senior Software Engineering Manager - Alerting and Monitoring

El Segundo, California


Discover the undiscoverable.

"AT&T allows me to work on projects that will be seen by millions of customers."

Megan T. — Sr. Specialist, Software Engineer

"I find it incredibly rewarding to be out and see customers enjoying a product I spent my time perfecting."


Don't just imagine the future. Create it.

About the Company

At AT&T, we’re connecting the world through the latest tech, top-of-the-line communications and the best in entertainment. Our groundbreaking digital solutions provide intuitive and integrated experiences for millions of customers across online, retail and care channels. Join our mission to deliver compelling communication and entertainment experiences to customers around the world as we continue to evolve as a technology-powered, human-centered organization. As part of our team, you’ll transform the way we deliver a seamless customer experience with digital at the center of all you do. In our world, digital is much larger than just an eCommerce channel; we are transforming all channels to perform digitally as one team and create a better customer experience. As we move through 2021, the digital transformation will revolutionize the digital space, and you can build a career that will propel your future.

About the Team

Our SPT Operations Alerting and Monitoring Enablement team is looking for an experienced Senior Software Engineering Manager to help us deliver best-in-class monitoring and alerting capabilities across our online digital ecosystem. As a Senior Software Engineering Manager, you will oversee the daily operations and execution of alerting and monitoring enablement and act as a product owner, helping to manage, prioritize and execute the backlog of new feature enhancement requests using Agile/Scrum methodologies.

About the Job

This is a fast-paced, critical position; the conversion is being done very quickly and will require strong technical, managerial, and educational experience with a solid understanding of digital eCommerce, self-service capabilities and the various tools used to manage and support our critical online customer journeys across the platform and the native (iOS/Android) application.

As part of the technical team, you’ll ensure alerts and monitors for our applications are effective and proactive. This requires reviewing the product delivery backlog to identify upcoming changes to alerts and monitors so that they ship alongside the new features, and working regularly with the SPT Operations Incident Management Tier 1 & 2 teams to find new opportunities and evaluate existing alerts and monitors.

The teams are moving to Site Reliability Engineering principles, so you’ll be an evangelist for that shift. This also means other teams will create and maintain alerts; you’ll ensure effective governance of those changes.

Another area of responsibility is synthetic monitoring. You’ll help grow this practice, help it mature, and deliver value to leadership and to the Incident Management teams.

Experience with an Agile methodology, and the ability to oversee all parts of Agile delivery, will be another key to success.

Responsibilities and Day-to-Day View

•    Provide leadership, strategic direction and oversight for alerting and monitoring teams

•    Champion and drive Site Reliability Engineering (SRE) best practices

•    Interact with teams to identify new requirements and provide expertise across tool suites

•    Oversee improvements in monitoring and alerting capabilities across our digital platforms

•    Develop and deploy monitoring for ensuring application reliability and stability across the customer journeys

•    Maintain awareness of current monitoring technology, applicability and capabilities of tools

•    Manage internal customer relationships, and collaborate effectively across organizations

•    Ensure the effectiveness of alerts, dashboards, events and synthetic monitoring

•    Ensure reports are accurate and timely

•    Act as product owner liaison for tools and engagement of external vendors as required

•    Participate in all aspects of Agile/Scrum (stand-ups, grooming, retrospectives, etc.)

Required Qualifications


•    2+ years of experience with Site Reliability Engineering and operations for internet/eCommerce applications, especially in large, multi-data center environments

•    2+ years in a lead or supervisory position, coaching and mentoring engineers

•    2+ years of experience in large-scale site operations

•    Extensive experience with various site operations monitoring tools like Splunk, ElasticSearch, Dynatrace, Quantum Metric, Adobe and Catchpoint

•    Solution-oriented with proven success in a fast-paced environment

•    Strong organization and time management skills

•    Ability to communicate clearly and effectively with teammates and all levels of management

•    Excellent troubleshooting, analytical and problem-solving skills with demonstrated initiative of going the extra mile

•    Experience training customers on how to leverage tools to drive business value

•    Strong understanding and proven knowledge of Scrum/Agile methodologies

Preferred Qualifications

•    A Bachelor's degree in Computer Science, Information Systems, or related field from an accredited College or University

•    5+ years in a lead or supervisory position, coaching and mentoring engineers

•    5+ years of experience in large-scale site operations

•    5+ years of experience with various site operations monitoring tools like Splunk, ElasticSearch, Dynatrace, Quantum Metric, Adobe and Catchpoint

•    1+ years of experience in architecture and design of systems using Microservices architecture

•    2+ years of experience in cloud technologies: AWS, Azure, OpenStack, Docker, Kubernetes, etc.

•    Excellent written and verbal communication skills with demonstrated ability to present complex technical information in a clear manner to peers, developers, and senior leaders

•    Experience with Continuous Integration and Continuous Delivery concepts and tools

AT&T is leading the way to the future – for customers, businesses and the industry. We're developing new technologies to make it easier for our customers to stay connected to their world. Together, we’ve built a premier integrated communications and entertainment company and an amazing place to work and grow. Team up with industry innovators every time you walk into work, creating the world you always imagined. Ready to #transformdigital with us? Apply now!

Job Code - 40491310

Job ID 2164571-1 Date posted 01/31/2022

Invested in your satisfaction and continued success.

We take care of our own here (hint: that could be you). Our benefits and rewards mean we cover some of your biggest needs with some of the coolest offerings. We already think we’re a pretty great place to work. We’re just trying to rack up some bonus points.

Let’s start with the big one: Your work gets rewarded with competitive compensation and benefits. It really does pay to be on our team.


Paid Time Off

Our people have class. Literally. We can help you out on approved education costs with our tuition assistance plan.


Insurance Options

Here’s another reason to breathe easy: You and your family get access to excellent medical, dental and vision insurance options.

Wanna make your friends really jealous? You’ll get discounted access to the latest and greatest AT&T products and services — plus other awesome items, like tickets to live events.


Training & Development

You strike us as an over-achiever (don’t worry, it’s a compliment). Our training and development programs are your ticket to expert status in your job.

When the day comes that you get some much-needed R&R (not that you’d ever want to leave #LifeAtATT), you’ll know your future is set with the AT&T Retirement Savings Plan (ARSP).


Community & Team Events

Give back to your community and connect with colleagues through social and team-building events, and annual paid time off for volunteer efforts of your choice.

The Hiring Process

Step 1

Complete a quick application online and check your status often.

Step 2

Virtual or in-person

Dress professionally and ensure good WiFi if interviewing virtually.

Step 3

Job Offer

After a background check, you're part of the team.

Step 4

Welcome! Onboarding and Training Begins

Our training and certification programs set you up for success.

The values we live by.
  • Live True

    Do the right thing, no compromise.

  • Think Big

    Innovate and get there first.

  • Pursue Excellence

    In everything, every time.

  • Inspire Imagination

    Give people what they don't expect.

  • Stand for Equality

    Speak with your actions.

  • Embrace Freedom

    Press, speech, beliefs.

  • Make a Difference

    Impact your world.

  • Be There

    When customers & colleagues need you most.