Job posting

ML Ops Lead

One of our US-based offices - Bay Area / Detroit

As the ML Ops Lead at Cavnue, you’ll set us up for operational excellence and continuous learning from the early days of our Machine Learning lifecycle, so that we’re prepared for the road ahead. We need someone who has untangled streams of data and landed them in high-availability systems, and who understands how the rate of change in the real world relates to how often you should re-evaluate your algorithms. We want someone who can help our team understand what an algorithm is doing, whether they’re choosing between loss functions or explaining choices to a board member. Help us create the pipelines, infrastructure, and tools needed to integrate high-performance training and inference with unit-testable, validated software.

Responsibilities:

  • Working across the software engineering team and machine learning team to harmonize DevOps best practices and Machine Learning lifecycle best practices
  • Architecting and implementing ML Ops systems that enable end users at each stage of the ML model lifecycle (designing, training, testing, and deploying ML models) while making decisions reviewable and transparent to senior staff
  • Contributing to a culture of operational excellence that accelerates cadence throughout the engineering lifecycle
  • Working with the Product and Engineering teams to ensure that the right data sets are being collected for relevant tasks at varying geospatial and temporal scales and are appropriately available for ML training processes
  • Shepherding build/buy/borrow analysis for ML Ops related functions (e.g., explainable AI, model validation, labeling pipelines) and implementing preferred solutions
  • Monitoring and reporting on overall ML infrastructure usage, and maintaining job scheduling systems with an eye on critical deliverables
  • Designing high-quality ML infrastructure and data pipelines, defining production code standards, conducting code reviews, and working alongside infrastructure, reliability, and hardware engineering teams
  • Expanding Cavnue’s competitive advantage through understanding, applying, or inventing novel ML and CS techniques

Requirements:

  • MS or Ph.D. in Computer Science, Electrical Engineering, Statistical Signal Processing (or related field)
  • 5+ years of work experience with complex, multi-modal data in hyperscale systems
  • Understanding of multi-modal data and data harmonization across complex schemas
  • Expertise in scalable scientific computing and model training pipelines, and familiarity with the latest algorithms, particularly deep learning, reinforcement learning, classification, and pattern recognition
  • Firm grasp of large-scale data structures and data pipelines, data modeling, software architectures, and the latest ML libraries and frameworks (e.g., TensorFlow, PyTorch)
  • C++ and Python experience required; Jupyter experience ideal

Cavnue is an Equal Opportunity Employer and prohibits discrimination or harassment of any kind. All employment decisions at Cavnue are based on business needs, job requirements, and individual qualifications, without regard to race, color, national origin, sex, gender, age, religion or belief, disability, sexual orientation, family or parental status, veteran status, or any other status protected by law.