Staff Data & Machine Learning Engineer

The Team:

Our team is responsible for building Celonis’ end-to-end Task Mining solution. Task Mining is the technology that allows businesses to capture user interaction (desktop) data so they can analyze how teams get work done and how they can do it even better. We own all related components, including the desktop client, the backend services, the data processing capabilities, and the Studio frontend applications.

The Role:

Celonis is looking for a Staff Data & Machine Learning Engineer to improve and extend our existing Task Mining ETL pipeline and to build production-ready, AI-based features into the Task Mining product. You will own the solution that simplifies the extraction of insights from Task Mining data. This role demands a blend of expertise in data engineering, software development, and machine learning, all using Python.

The work you’ll do:

Design, build, and maintain robust, scalable data pipelines that facilitate the ingestion, processing, and transformation of large datasets
Drive the development of AI-powered features and applications from scratch within the Task Mining product
Implement data strategies and develop data models
Collaborate with other engineering teams to implement, deploy, and monitor ML models in production, ensuring their performance and accuracy
Leverage machine learning techniques to provide actionable insights and recommendations for process optimization
Write performant, scalable, and easy-to-understand SQL queries and optimize existing ones
Learn PQL (Process Query Language – Celonis’ own language for analytical formulas and expressions) and use it to query data from our process mining engine
Own the implementation of end-to-end solutions, leading the design, implementation, build, and delivery to customers
Provide technical leadership and mentorship to other engineers and team members
Lead design discussions, code reviews, and technical planning sessions to ensure high standards and knowledge sharing

The qualifications you need:

8+ years of practical experience in a Computer Science/Data Science related field, or a PhD in a Data Science/AI/ML area with 5+ years of practical experience
Experience building production-ready, scalable AI/ML applications in the Python ecosystem
Ability to optimize data pipelines, applications, and machine learning models for high performance and scalability
Understanding of ETL jobs, data warehouses/lakes, data modeling, and schema design
Excellent command of SQL, including query optimization principles
Ability to assess dependencies within complex systems, quickly turn your ideas into an accessible prototype, and explain it efficiently to diverse stakeholders
Experience with containerization and CI/CD pipelines (e.g. Docker, GitHub Actions)
Interest in learning new technologies (e.g. PQL and Object-Centric Process Mining)
Strong communication and collaboration skills (English is a must)
Ability to supervise and coach mid-level and senior colleagues
Knowledge of column-oriented DBMSs (e.g. Vertica) and their specific features would be beneficial
Nice to have: knowledge of the frameworks TensorFlow, PyTorch, LangChain, FastAPI, and SQLAlchemy