Company: Acceler8 Talent
Location: San Francisco
Closing Date: 04/11/2024
Hours: Full Time
Type: Permanent
Job Requirements / Description
Revolutionize AI with Us by Helping Everyone Save Time
Join our mission to redefine human-computer collaboration and automate workflows with cutting-edge AI products. Be part of a team shaping the future of enterprise operations, leveraging Large Language Models (LLMs) to elevate organizational impact.
Your Impact:
- Collaborate on delivering captivating experiences through Large Language Models.
- Architect Scalable ML Systems: Design and implement scalable machine learning and distributed systems for LLMs.
- Optimize Under the Hood: Innovate at the lower levels of the stack, building high-performance infrastructure with custom kernels.
- Master Parallelism Methods: Develop parallelism methods for large-scale distributed LLM training.
Your Skills:
- Experience training LLMs with frameworks such as Megatron and DeepSpeed, and deploying them with serving stacks such as vLLM, TGI, and TensorRT-LLM.
- Strong grasp of the architectures of cutting-edge AI accelerators such as TPU, IPU, and HPU, and their associated tradeoffs.
- Proficiency working under the hood with kernel languages such as OpenAI Triton and Pallas, and compilers such as XLA.
- Proven hands-on experience in tuning LLM workloads. Familiarity with MLPerf or production workloads is a plus.
If you're passionate about driving AI innovation and pushing the boundaries of what's possible, we invite you to join our collaborative and forward-thinking team.