
Senior Software Engineer II, ML Foundations

US Remote
$152,000 - $223,500
Software Development
Python
C++
Machine Learning
Distributed Systems
Cloud Infrastructure
The Machine Learning Foundation team builds and supports the critical Distributed Training Framework and tools for every machine learning engineer at Cruise. Our goal is to greatly accelerate the development cycle of machine learning models across the whole company, empowering machine learning engineers to focus on improving the car’s safety and performance, instead of worrying about their infrastructure. We care about performance, ease of use and reliability of our products. We are driven by the success of our partner teams who rely on our work to build the most advanced driverless cars in the world.
What you'll be doing:
  • Lead the design and implementation of modeling performance tools, including profilers, benchmark systems, and telemetry systems.
  • Own technical projects from start to finish and be responsible for major technical decisions and tradeoffs. Engage effectively in the team's planning, code reviews, and design discussions.
  • Consider the effects of projects across multiple teams and proactively manage prioritization. Work closely with partner teams to ensure they benefit from the systems we build.
  • Influence partner teams' technical solutions so that training performance is treated as a core principle and lands as products, features, and processes in their designs, implementations, and operations.
  • Conduct technical interviews with well-calibrated standards and play an essential role in recruiting activities. Effectively onboard and mentor junior engineers and interns.
What you must have:
  • 5+ years of experience building large-scale distributed applications with high-quality API design
  • Experience with ML development lifecycle and ML Ops
  • Strong coding skills in Python or C++
  • Experience with distributed training
  • Experience with optimizing model training performance
  • Experience scaling model training to a large number of GPUs, CPUs, or other accelerators
  • Passion for self-driving technology and its potential impact on the world
  • BS, MS or PhD in CS, Math or equivalent real-world experience
  • Can-do attitude and willingness to code
Bonus points!
  • Knowledge of and experience with machine learning algorithms
  • Experience building distributed systems on cloud infrastructure
  • Experience with deep learning frameworks such as PyTorch and TensorFlow
  • Experience building frameworks with high-quality, lasting APIs
  • Understanding of SOTA training optimization algorithms, their performance profiles and their effects on model convergence
  • Experience scaling model performance optimization work across many teams
  • Experience with build systems (Bazel, Buck, Blaze, or CMake)
  • Experience working with Docker and Kubernetes

The salary range for this position is $152,000 - $223,500. Compensation will vary depending on location, job-related knowledge, skills, and experience. You may also be offered a bonus and benefits. These ranges are subject to change.
