We have the following open position: Data Scientist (Remote).
Role: Data Scientist
Location: Remote
Duration: Long Term
Mandatory Areas:
- Data Scientist with LLMs, NLP, GenAI, ML – 6+ years
- Python – 6+ years
- PySpark – 6+ years
- PyTorch – 6+ years
- GCP – 3+ years
- Docker – 4+ years
- Kubeflow – 4+ years
Job Description:
The client is looking for a highly energetic and collaborative Senior Data Scientist with experience building enterprise-level GenAI applications and designing and developing MLOps pipelines. The ideal candidate has a deep understanding of the NLP field, hands-on experience designing and developing NLP models, and experience building LLM-based applications. Excellent written and verbal communication skills, with the ability to collaborate effectively with domain experts and the IT leadership team, are key to success in this role. We are looking for candidates with expertise in Python, PySpark, PyTorch, LangChain, GCP, web development, Docker, and Kubeflow.
Key Responsibilities:
- Work with Walmart's AI/ML Platform Enablement team within the eCommerce Analytics team, instrumental in enabling the broader team's vision.
- Collaborate with other Data Scientists to assist with production models and maintain them in production.
- Deploy and configure Kubernetes components for production clusters, including API Gateway, Ingress, Model Serving, Logging, Monitoring, Cron Jobs, etc. Improve the model deployment process for MLE for faster builds and simplified workflows.
- Serve as a technical leader on various projects across platforms and contribute hands-on to the entire platform's architecture.
- Lead operational excellence initiatives in the AI/ML space, focusing on efficient resource use, identifying optimization opportunities, and forecasting capacity.
- Design and implement various architectures to enhance system performance and resiliency.
- Develop capability requirements and transition plans for the next generation of AI/ML enablement technology, tools, and processes.
Tools/Skills (hands-on experience is a must):
- Ability to create designs from the ground up and lead innovation in system design.
- Deep understanding of GenAI applications and the NLP field.
- Hands-on experience in the design and development of NLP models.
- Experience in building LLM-based applications.
- Design and development of MLOps pipelines.
- Fundamental understanding of parametric and non-parametric algorithms in data science.
- Knowledge of AI/ML application lifecycles and workflows.
- Experience in designing and developing ML pipelines using containerized components.
- Experience with at least one Kubernetes cloud offering (EKS/GKE/AKS) or on-prem Kubernetes.
- Programming experience in Python, PySpark, PyTorch, and LangChain; hands-on experience with Docker and Kubeflow.
- Ability to use observability tools (Splunk, Prometheus, Grafana) to diagnose issues within the system.
- Experience with web development.
Education & Experience:
- 6+ years of relevant experience in roles with responsibility over data platforms and operations, dealing with large volumes of data in cloud-based distributed computing environments.
- Graduate degree preferred in a quantitative discipline (e.g., computer engineering, computer science, economics, math, operations research).
- Proven ability to solve enterprise-level data operations problems at scale requiring cross-functional collaboration for solution development, implementation, and adoption.
ROBOTICS TECHNOLOGIES LLC is an equal opportunity employer inclusive of female, minority, disability, and veterans (M/F/D/V). Hiring, promotion, transfer, compensation, benefits, discipline, termination, and all other employment decisions are made without regard to race, color, religion, sex, sexual orientation, gender identity, age, disability, national origin, citizenship/immigration status, veteran status, or any other protected status.