Software Engineer - ML Reliability

United States (Remote)
$155,000 - $220,000
Software Development
Machine Learning
Python
Go
Cloud Computing
Software Engineering

AS A BACKEND ENGINEER ON THE CONVERSION ML TEAM, YOU WILL:

  • Design, implement, and maintain robust ML architecture to ensure high availability, reliability, and performance of ML models in production.
  • Implement monitoring tools and processes to track the performance of machine learning models in production, identifying any issues or degradation over time.
  • Provide best practices and run proofs of concept for automated, efficient model operations at large scale.
  • Lead and participate in incident response efforts, conducting root cause analysis and implementing corrective actions to prevent recurrence.
  • Create and maintain comprehensive documentation for ML infrastructure, processes, and best practices.
  • Work closely with cross-functional teams, including data scientists, software engineers, and product managers, to align on goals and deliverables.
  • Contribute to an “engineering excellence” culture through state-of-the-art tools, risk-driven testing, explainable systems, and code review.
  • Join a nimble, consistently excellent, and experienced engineering team.

Responsibilities:

  • Have end-to-end ownership of projects, and collaborate with a small team of world-class engineers with diverse backgrounds.
  • Ship code multiple times a day, and within seconds see its quantified impact on millions of users and our business's revenue.
  • Be part of an “engineering excellence” culture through state-of-the-art tools, risk-driven testing, explainable systems, and code review.
  • Become an authority in Clojure, Go, and the many other cutting-edge open source technologies that maximize our development velocity.

Requirements: 

  • 5+ years of software engineering experience.
  • 3+ years of experience in machine learning, software engineering, or reliability engineering, with a focus on production systems.
  • Solid core CS fundamentals (data structures, algorithms, architecting systems).
  • Proficiency in Python, Go, or similar programming languages.
  • Experience with ML frameworks (e.g., TensorFlow, PyTorch) and cloud platforms (e.g., AWS, GCP, Azure).
  • Experience with ML monitoring tools (e.g., Prometheus, Grafana).
  • Experience with big data engines such as Trino and Spark is a big plus.
  • Strong problem-solving skills and the ability to work collaboratively across teams.
  • Excitement about working on large-scale ML and data systems.
  • Ability to lead across team and role boundaries to effect large-scale change in culture and systems.
  • A healthy sense of fun!

Nice to have:

  • Experience with ML systems for training Transformer models or CTR prediction models.
  • Experience in AdTech is a strong plus.

Location:

This role is eligible for full-time remote work in one of our entities: CA, CO, ID, IL, FL, GA, MA, MN, MO, NC, NJ, NV, OR, PA, RI, TX, UT, and WA. We are a remote-first company with US hubs in Redwood City, Los Angeles, and NYC.

Liftoff offers all employees a full compensation package that includes equity and health/vision/dental benefits associated with your country of residence. Base compensation will vary based on candidate location and experience. The following are our base salary ranges for this role:

SF Bay Area, NYC, Los Angeles/Orange County: $180,000 - $220,000
Seattle/Olympia, Austin, San Diego, Santa Barbara, Boston: $165,000 - $205,000
All other cities and towns in our approved states: $155,000 - $190,000

We use Covey as part of our hiring and/or promotional process for jobs in NYC, and certain features may qualify it as an AEDT. As part of the evaluation process, we provide Covey with job requirements and candidate-submitted applications. We began using Covey Scout for Inbound on January 22, 2024.

Please see the independent bias audit report covering our use of Covey here.

Copyright © RemoteWLB 2025
