- Design and implement pipelines leveraging text embeddings for semantic search, classification, clustering, and document retrieval.
- Work with embedding techniques such as TF-IDF, Word2Vec, GloVe, FastText, and transformer-based models including BERT, Sentence-BERT, OpenAI, and Azure OpenAI embeddings.
- Apply dimensionality reduction methods (PCA, t-SNE, UMAP) to analyze and visualize embedding spaces.
- Use cosine similarity, Euclidean distance, and approximate nearest neighbor libraries such as FAISS and ScaNN for similarity search and clustering.
- Integrate embedding outputs into downstream applications such as intent detection, topic modeling, semantic deduplication, document ranking, and retrieval systems.
- Build and deploy predictive models with logistic/linear regression, random forests, gradient boosting techniques (XGBoost, LightGBM), SVM, Naive Bayes, k-means, and hierarchical clustering.
- Employ statistical inference techniques including hypothesis testing, confidence intervals, bootstrapping, Bayesian inference, multicollinearity diagnostics, residual analysis, and time series forecasting (ARIMA, SARIMA).
- Evaluate model performance using ROC/Precision-Recall curves, AUC, confusion matrices, F1-score, lift/gain charts, and KS statistics.
- Conduct feature selection using Lasso/Ridge regularization and recursive feature elimination (RFE), and use SHAP values for interpretability.
**Experimentation & Causal Inference**
- Design and analyze A/B and multivariate tests and designed experiments (DOE), and apply causal inference methods including propensity score matching, causal forests, and difference-in-differences.
- Translate experimental results into clear, actionable business insights that drive measurable outcomes.
**Data Engineering & Productionization**
- Develop scalable data pipelines using PySpark, SQL, and Azure Data Factory on platforms including Azure Data Lake, Databricks, MongoDB, and Cosmos DB.
- Deploy machine learning solutions with FastAPI, Docker containers, and Azure App Service endpoints, while monitoring model health and model drift with MLflow.
**Collaboration & Leadership**
- Partner effectively with engineering, product, and business teams to define problem statements and deliver impactful solutions.
- Lead technical discussions, perform code reviews, and mentor junior data scientists to foster technical growth.
- Communicate complex analytical insights clearly to both technical and non-technical stakeholders.
Required Skills and Qualifications
- Hands-on experience in machine learning, statistical modeling, and NLP applications.
- Deep expertise in text embeddings and their real-world applications.
- Proficiency in Python, PySpark, and SQL.
- Strong foundation in statistical inference, model diagnostics, and evaluation metrics.
- Experience working with the Azure cloud ecosystem, Databricks, and production deployment of ML models.
- Proven ability to design, execute, and interpret experiments with statistical rigor.
Preferred (Good-to-Have) Skills
- Familiarity with transformer-based large language models (LLMs), LangChain, or OpenAI APIs.
- Experience with MLOps tools such as MLflow, and with GitHub Actions CI/CD pipelines that deploy to Azure App Service.
- Exposure to graph analytics, retrieval-augmented generation (RAG) pipelines, or agent-based systems.
Day-to-Day Responsibilities
You will architect and implement advanced NLP and machine learning pipelines leveraging diverse text embeddings for semantic search, classification, and clustering tasks. Applying sound statistical modeling and causal inference techniques, you will lead experimentation efforts and build scalable data workflows using PySpark, SQL, and Azure services. Cross-functional collaboration will be a core part of your role as you translate analytical insights into strategic business outcomes.
It is the policy of AT&T to provide equal employment opportunity (EEO) to all persons regardless of age, color, national origin, citizenship status, physical or mental disability, race, religion, creed, gender, sex, sexual orientation, gender identity and/or expression, genetic information, marital status, status with regard to public assistance, veteran status, or any other characteristic protected by federal, state or local law. In addition, AT&T will provide reasonable accommodations for qualified individuals with disabilities. AT&T is a fair chance employer and does not initiate a background check until an offer is made.