Senior Research Engineer

Company: The Rundown AI, Inc.
Location: Palo Alto
Closing Date: 22/10/2024
Salary: $150K - $200K Per Annum
Hours: Full Time
Type: Permanent
Job Requirements / Description

We are looking for engineers with significant problem-solving experience in PyTorch, CUDA, and distributed systems. You will work with Research Scientists to build and train cutting-edge foundation models on thousands of GPUs.

Responsibilities

  • Ensure efficient implementation of models & systems for data processing, training, inference and deployment
  • Identify and implement optimization techniques for massively parallel and distributed systems
  • Identify and remedy efficiency bottlenecks (memory, speed, utilization) by profiling and implementing high-performance CUDA, Triton, C++, and PyTorch code (a minimal profiling sketch follows this list)
  • Work closely with the research team to ensure systems are designed to be as efficient as possible from end to end
  • Build tools to visualize, evaluate and filter datasets
  • Implement cutting-edge product prototypes based on multimodal generative AI
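
To give a flavor of the profiling work described above, here is a minimal, hypothetical sketch using PyTorch's built-in profiler; the toy model, tensor shapes, and iteration count are illustrative assumptions, not code from the team.

    import torch
    from torch.profiler import profile, ProfilerActivity

    # Toy stand-in for a real model (illustrative assumption only).
    model = torch.nn.Sequential(
        torch.nn.Linear(4096, 4096),
        torch.nn.GELU(),
        torch.nn.Linear(4096, 4096),
    ).cuda()
    x = torch.randn(64, 4096, device="cuda")

    # Profile a few training-style iterations on both CPU and GPU, tracking
    # memory so that speed and memory bottlenecks show up together.
    with profile(
        activities=[ProfilerActivity.CPU, ProfilerActivity.CUDA],
        profile_memory=True,
        record_shapes=True,
    ) as prof:
        for _ in range(10):
            y = model(x)
            y.sum().backward()

    # Rank operators by total GPU time to see where optimization effort
    # (e.g. a fused op or a custom CUDA/Triton kernel) would pay off most.
    print(prof.key_averages().table(sort_by="cuda_time_total", row_limit=10))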

Experience

  • Experience training large models using Python & PyTorch, including practical experience with the entire development pipeline, from data processing, preparation, and data loading to training and inference.
  • Experience optimizing and deploying inference workloads for throughput and latency across the stack (inputs, model inference, outputs, parallel processing etc.)
  • Experience profiling CPU & GPU code in PyTorch, e.g. with NVIDIA Nsight or similar tools.
  • Experience writing & improving highly parallel & distributed PyTorch code, with familiarity in DDP, FSDP, Tensor Parallel, etc. (a minimal FSDP sketch follows this list).
  • Experience writing high-performance parallel C++. Bonus if done in an ML context with PyTorch, e.g. for data loading, data processing, or inference code.
  • Experience with high-performance Triton / CUDA and writing custom PyTorch kernels. Top candidates will be able to utilize tensor cores, optimize CUDA memory usage, and apply similar performance skills.
  • Good to have experience working with deep learning concepts such as Transformers, and with multimodal generative models such as diffusion models and GANs.
  • Good to have experience building inference / demo prototype code (incl. Gradio, Docker, etc.).
  • Please note this role is not meant for recent grads.
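
Similarly, here is a minimal, hypothetical FSDP training-loop sketch of the kind referenced in the distributed-training requirement above; the launch command, toy model, and hyperparameters are illustrative assumptions, not the team's actual setup.

    import os
    import torch
    import torch.distributed as dist
    from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

    def main():
        # Assumes launch via `torchrun --nproc_per_node=<num_gpus> train.py`,
        # which sets RANK / WORLD_SIZE / LOCAL_RANK in the environment.
        dist.init_process_group(backend="nccl")
        local_rank = int(os.environ.get("LOCAL_RANK", 0))
        torch.cuda.set_device(local_rank)

        # Toy model standing in for a real foundation model (assumption).
        model = torch.nn.Sequential(
            torch.nn.Linear(4096, 4096),
            torch.nn.GELU(),
            torch.nn.Linear(4096, 4096),
        ).cuda()

        # FSDP shards parameters, gradients, and optimizer state across ranks,
        # trading communication for the memory headroom large models need.
        model = FSDP(model)
        optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

        for _ in range(10):
            x = torch.randn(8, 4096, device="cuda")
            loss = model(x).pow(2).mean()
            loss.backward()
            optimizer.step()
            optimizer.zero_grad()

        dist.destroy_process_group()

    if __name__ == "__main__":
        main()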

In addition to base pay, you'll receive a sizable grant of Luma's equity.

The pay range for this position is for the Bay Area. Base pay offered may vary depending on job-related knowledge, skills, candidate location, and experience.

Applications are reviewed by real people.

Apply Now