The Team:
Our team is responsible for building Celonis’ end-to-end Task Mining solution. Task Mining is the technology that allows businesses to capture user interaction (desktop) data, so they can analyze how teams get work done and how they can do it even better. We own all the related components, e.g. the desktop client, the related backend services, the data processing capabilities, and the Studio frontend applications.
The Role:
Celonis is looking for a Staff Data & Machine Learning Engineer to improve and extend our existing Task Mining ETL pipeline and to build production-ready, AI-based features into the Task Mining product. You will own the solution that simplifies the extraction of insights from Task Mining data. This role demands a blend of expertise in data engineering, software development, and machine learning, using Python.
The work you’ll do:
- Design, build, and maintain robust, scalable data pipelines that facilitate the ingestion, processing, and transformation of large datasets
- Drive the development of AI-powered features and applications from scratch within the Task Mining product
- Implement data strategies and develop data models
- Collaborate with other engineering teams to implement, deploy, and monitor ML models in production, ensuring their performance and accuracy
- Leverage machine learning techniques to provide actionable insights and recommendations for process optimization
- Write performant, scalable, and easy-to-understand SQL queries, and optimize existing ones
- Learn PQL (Process Query Language – Celonis’ own language for analytical formulas and expressions) and use it to query data from our process mining engine
- Own the implementation of end-to-end solutions: lead the design, implementation, build, and delivery to customers
- Provide technical leadership and mentorship to other engineers and team members
- Lead design discussions, code reviews, and technical planning sessions to ensure high standards and knowledge sharing
The qualifications you need:
- 8+ years of practical experience in a Computer Science/Data Science related field
- Or a PhD in a Data Science/AI/ML area with 5+ years of practical experience
- Experience building production-ready, scalable AI/ML applications in the Python ecosystem
- Ability to optimize data pipelines, applications, and machine learning models for high performance and scalability
- Understanding of ETL jobs, data warehouses/lakes, data modeling, and schema design
- Excellent command of SQL, including query optimization principles
- Ability to assess dependencies within complex systems, quickly transform your thoughts into an accessible prototype and efficiently explain it to diverse stakeholders
- Experience with containerization and CI/CD pipelines (e.g. Docker, GitHub Actions)
- Interest in learning new technologies (e.g. PQL and Object-Centric Process Mining)
- Strong communication and collaboration skills (English is a must)
- Ability to supervise and coach mid-level and senior colleagues
- Knowledge of column-oriented DBMSs (e.g. Vertica) and their specific features would be beneficial
- Nice to have: knowledge of the frameworks TensorFlow, PyTorch, LangChain, FastAPI, and SQLAlchemy