Under submission to ICRA 2024!
To create a more sustainable future, we need to work towards robots that can assist in farms and gardens to reduce the mental and physical workload of farm workers in a time of increasing crop shortages. We tackle the problem of intelligent object retrieval in a farm environment, providing a method that allows a robot to semantically reason about the location of an unseen goal object among a set of previously seen objects in the environment using a Large Language Model (LLM). We leverage object-to-object semantic relationships to best determine the best location to visit in order to most likely find our goal object. We deployed our system on the Boston Dynamics Spot Robot and found a success rate of 79%, with more improvements on the way.
This project presents a robust method for localizing objects in 3D space using Signed Distance Fields (SDFs). SDFs represent an object's surface as a volumetric field where each point's value corresponds to its signed distance from the surface, enabling efficient geometric and normal-based reasoning. Given a point cloud of an object and its corresponding SDF, the approach estimates the object's 6DOF pose (rotation and translation) by minimizing both the distances to the SDF surface and the angle between surface normals derived from the SDF gradients and the point cloud. The method is evaluated on objects from the YCB dataset, demonstrating high accuracy on full and partial point cloud views, even for symmetric and complex geometries.
Presented at the 2024 Michigan AI Symposium!
A vision pipeline for grape state estimation to support automated robotic harvesting. This pipeline consists of a grape bunch and stem segmentation model built using PyTorch and the WGISD dataset, and a 3D reconstruction algorithm that combines the grape masks and a depth map to reconstruct a grape point cloud for pose estimation. Read more about it on the project website!
Since open-vocabulary detectors suffer from noisy outputs when presented with slightly cluttered scenes, we combine the open-vocabulary detector with a probabilistic filter to calculate better object state estmates to use for robotic grasping. In this case, we combined OWL-ViT and FAST Segment-Anything and leveraged fast CUDA computations to calculate multple segmentation masks of an object given just a text prompt, and then combined the estimates using a probabilistic filter, resulting in a highly reliable and accurate object tracking systems that robots can use for open-vocabulary grasping!
Presented at the 2024 Michigan Robotics Undergraduate Research Symposium!
Crosswalk Buddy is an independent research project under UM Robotics with the goal of developing a robot that will increase safety in pedestrian spaces. The aim of this robot is to increase driver visibility of pedestrians in low visibility scenarios. There are two phases - HRI Research and Autonomy Software Development. We developed a simulation of the robot in a city environment to construct human trials to gauge the best proximities and positions of the robot that create a comfortable experience for the pedestrian. Parallely, we have been developing the autonomy stack for a real robot platform. Above, we use YOLOv3 and simple 1D Kalman Filtering to construct a vision system for pedestrian state estimation.