Machine Learning Systems

Company: Acceler8 Talent
Location: San Francisco
Closing Date: 24/10/2024
Hours: Full Time
Type: Permanent
Job Requirements / Description

Revolutionize AI with Us by Helping Everyone Save Time


Join our mission to redefine human-computer collaboration and automate workflows with cutting-edge AI products. Be part of a team shaping the future of enterprise operations, leveraging Large Language Models (LLMs) to elevate organizational impact.


Your Impact:

  • Collaborate on delivering captivating experiences through Large Language Models.
  • Architect Scalable ML Systems: Design and implement scalable machine learning and distributed systems for LLMs.
  • Optimize Under the Hood: Innovate at the lower levels of the stack, building high-performance infrastructure with custom kernels.
  • Master Parallelism Methods: Develop parallelism methods for large-scale distributed LLM training.


Your Skills:

  • Experience training LLMs with frameworks such as Megatron and DeepSpeed, and deploying them with serving stacks such as vLLM, TGI, and TensorRT-LLM.
  • Strong grasp of the architectures of cutting-edge AI accelerators such as TPU, IPU, and HPU, and of their associated tradeoffs.
  • Proficiency working under the hood with kernel languages such as OpenAI Triton and Pallas, and with compilers such as XLA.
  • Proven hands-on experience in tuning LLM workloads. Familiarity with MLPerf or production workloads is a plus.


If you're passionate about driving AI innovation and pushing the boundaries of what's possible, we invite you to join our collaborative and forward-thinking team.
