What You’ll Do
- Help design, build and improve our new inference and ML computation platform, acting as technical lead for one or more of our implementation teams
- Work with management, product and other internal business partners to drive technical decisions based on business and market needs
- Work on the architecture of our distributed systems to ensure best-in-class reliability and efficiency, while helping to minimize operational costs and toil
- Provide your team with empathetic leadership and mentorship to help them grow their skills and abilities
- Build products around a large range of ML models and types, including industry-leading research
- Help build safety and fraud systems around both inference and other ML systems
- Handle interesting scaling, hardware and scheduling challenges in a rapidly changing industry sector
You
- Are an experienced lead software engineer with ten or more years of experience working on business-critical distributed systems.
- Have a history of leading projects from inception to production, including making technical decisions, authoring design and decision documents, and advising on staffing needs.
- Have significant experience architecting systems around relational databases, document databases, queue datastores, block storage, object storage, unreliable networks, and caches.
- Have a deep understanding of the balance between initial build costs and operational costs, and what it takes to launch a product quickly but with a good technical foundation.
- Can write both Go and Python to a high standard, and can pick up other languages as needed.
- Are very familiar with building integrated test frameworks and using CI/CD systems.
- Are product-oriented and focused on great user experiences, and are invested in building the best product possible for users.
- Are good at working cross-functionally and solving problems across teams, including empathetic conflict resolution when working alongside teams with different priorities.
- Have recent team leadership experience (on a team of four or more people).
Nice to Have
- Experience writing Kubernetes operators or other Kubernetes integrations
- Experience running ML/GPU workloads in production
- Experience with computation dispatch and orchestration systems
- Bare-metal hardware experience
Salary Range Information
Based on market data and other factors, the salary range for this position is $186,000 - $294,000. However, a salary higher or lower than this range may be appropriate for a candidate whose qualifications differ meaningfully from those listed in the job description.