Machine Learning Engineer: Evaluation
Bedrock Robotics
Location
San Francisco, CA
Employment Type
Full time
Location Type
Hybrid
Department
Engineering
Join the team bringing advanced autonomy to the built world
At Bedrock, we’re moving AI out of the lab and into the real world. Our team is composed of industry veterans who helped launch Waymo, scaled Segment to a $3.2B acquisition, and grew Uber Freight to $5B in revenue. Today, we’re deploying autonomous systems on heavy construction machinery across the country, accelerating the schedules of billion-dollar infrastructure projects and improving safety on job sites. Backed by $350M in funding, we’re working quickly to close the gap between America's surging demand for housing, data centers, and manufacturing hubs and the construction industry's growing labor shortage.
This is where algorithms meet steel-toed boots. You’ll collaborate with construction veterans and world-class engineers to solve physical-world problems that simulations can’t touch. If you're ready to apply cutting-edge technology to solve meaningful problems alongside a talented team, we'd love to have you join us.
Machine Learning Engineer: Evaluation
Bedrock is bringing autonomy to the construction industry! We’re a group of veterans from the autonomous vehicle industry who are passionate about bringing the benefits of automation to areas in the construction industry currently underserved by the market.
We’re looking for a highly motivated engineer with experience evaluating complex ML systems deployed in the real world. Your Mission: Translate the infinite nuance of the built world into actionable, AI-native evaluations that accelerate Bedrock Operator adoption.
The ideal candidate has hands-on experience building evaluation systems and designing and executing statistical tests to gauge performance deltas between system iterations. More importantly, you’ve iterated on complex ML systems running in production environments, and you understand the complexities that come with them.
What you’ll do:
Design and maintain eval systems:
Build pipelines for measuring system performance across open-loop and closed-loop simulation, hardware-in-the-loop systems, and field data from machinery equipped with Bedrock Operator. Help other teams gain insights earlier in the development cycle through streamlined workflows.
Develop metrics:
Connect product goals and system behavior by translating real-world specifications into measurable indicators from logged data. Enable confident decision-making, from parameter tuning to program planning, by cutting through the noise and delivering objective insights.
Classify data sources for training and testing:
Implement infrastructure and classifiers that automatically annotate data and enable the creation of datasets for a variety of training and evaluation use cases. Leverage models to source rich annotations for massive datasets to accelerate model iteration.
Predict system performance:
Model metrics and interpret results from sources ranging from raw sensor data to key leading indicators. Determine whether new construction sites pose hidden challenges, and drive business decisions about deployment readiness.
What we’re looking for:
Engineers who are currently Senior or Staff level with 5+ years of professional software engineering, data science, or research experience
2+ years of professional experience analyzing modern ML or robotics system performance on real-world problems
Proficiency in Python and a data warehouse query language, and comfort developing on infrastructure within parallelized, cloud-based frameworks
Strong statistical analysis skills (e.g. classification, model fit bias determination, hypothesis testing, and uncertainty quantification)
Experience working with large datasets
Bonus points: We’re especially interested in engineers who have applied a statistical background to ML research or real-world robotics applications.
Our roles are often flexible. If you don't fit all the criteria, or are in another location (especially one where we have an office, like SF or NY), please apply anyway! We'd love to consider you.