Research Engineer, Speech Generation - FAIR

Company: META

Location: Menlo Park

Closing Date: 08/11/2024

Hours: Full Time

Type: Permanent

Job Requirements / Description

Summary:
Our team released the Seamless Communication models at the end of 2023, the very first massively multilingual, streaming and expressive multimodal translation systems. We are looking for a Research Engineer with expertise in speech generation to take these models to the next level by making them production ready.

Over time, this project will be transitioned fully to an infrastructure team, and the role will support our next research vision to build a personalizable, controllable foundation model for synchronous, multimodal and expressive behavior generation.

Meta Fundamental AI Research (FAIR) is a research organization committed to advancing open AI research, and we will push the boundaries of human-centric understanding and generation. Our team's technology will enable next-generation human-to-human and human-to-machine communication.

Research Engineer, Speech Generation - FAIR Responsibilities:
Collaborate and execute on research that pushes forward the state of the art in human-centric understanding and generation.
Directly contribute to experiments, including designing experimental details, writing reusable code, running evaluations, and organizing results.
Develop methodology and benchmarks to evaluate different approaches.
Work with a large and globally distributed team.

Minimum Qualifications:
Bachelor's degree in Computer Science, Computer Engineering, relevant technical field, or equivalent practical experience.
Master's degree in Computer Science or relevant technical field.
3+ years of industry, academic or government laboratory experience.
Experience holding an industry, faculty, or government researcher position.
Experience developing machine learning algorithms or machine learning infrastructure in Python.
Experience writing software and executing complex experiments involving large AI models and datasets.
Experience in speech generation and text-to-speech.

Preferred Qualifications:
A PhD in AI, computer science, data science, or related technical fields.
Direct experience in generative AI and LLM research.
First-author publication experience at peer-reviewed AI conferences (NeurIPS, CVPR, ICML, ICLR, ICCV, ACL, EMNLP, Interspeech, etc.).
Experience in multimodal generation modeling, in particular human motion generation modeling.

Public Compensation: $70.67/hour to $208,000/year + bonus + equity + benefits

Industry: Internet

Equal Opportunity:
Meta is proud to be an Equal Employment Opportunity and Affirmative Action employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender, gender identity, gender expression, transgender status, sexual stereotypes, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics. We also consider qualified applicants with criminal histories, consistent with applicable federal, state and local law. Meta participates in the E-Verify program in certain locations, as required by law. Please note that Meta may leverage artificial intelligence and machine learning technologies in connection with applications for employment.

Meta is committed to providing reasonable accommodations for candidates with disabilities in our recruiting process. If you need any assistance or accommodations due to a disability, please let us know at
