To create a more sustainable future, we need robots that can assist on farms and in gardens, reducing the mental and physical workload of farm workers in a time of increasing crop shortages. We tackle the problem of intelligent object retrieval in a farm environment with a method that lets a robot use a Large Language Model (LLM) to reason semantically about where an unseen goal object is likely to be, given the set of objects it has previously seen in the environment. We leverage object-to-object semantic relationships to choose the location most likely to contain the goal object. We deployed our system on the Boston Dynamics Spot robot and achieved a success rate of 79%, with more improvements on the way. Research paper and website coming soon!
Open-vocabulary detectors produce noisy outputs in even slightly cluttered scenes, so we combine an open-vocabulary detector with a probabilistic filter to compute better object state estimates for robotic grasping. Specifically, we combine OWL-ViT and FAST Segment-Anything and use fast CUDA computations to calculate multiple segmentation masks of an object from just a text prompt. We then fuse the resulting estimates with a probabilistic filter, yielding a reliable and accurate object tracking system that robots can use for open-vocabulary grasping!
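As a rough illustration of how several noisy estimates of one object can be combined, here is a minimal sketch of inverse-variance fusion. It assumes each segmentation mask yields an independent Gaussian position estimate with a single scalar variance. The project does not specify which probabilistic filter it uses, so the `fuse_estimates` function and the `(position, variance)` input format are hypothetical.

```python
from typing import List, Sequence, Tuple

# Hypothetical input format: (position vector, scalar variance) per mask.
Estimate = Tuple[Sequence[float], float]


def fuse_estimates(estimates: List[Estimate]) -> Tuple[List[float], float]:
    """Fuse independent Gaussian estimates by inverse-variance weighting.

    Estimates with lower variance (higher confidence) pull the fused
    position more strongly. The fused variance is never larger than the
    smallest input variance.
    """
    if not estimates:
        raise ValueError("need at least one estimate")

    dim = len(estimates[0][0])
    total_precision = 0.0
    weighted_sum = [0.0] * dim

    for position, variance in estimates:
        if variance <= 0.0:
            raise ValueError("variance must be positive")
        precision = 1.0 / variance
        total_precision += precision
        for i in range(dim):
            weighted_sum[i] += precision * position[i]

    fused_position = [w / total_precision for w in weighted_sum]
    fused_variance = 1.0 / total_precision
    return fused_position, fused_variance


if __name__ == "__main__":
    # Two equally confident estimates of an object's 2D position.
    position, variance = fuse_estimates([([0.0, 0.0], 1.0), ([2.0, 4.0], 1.0)])
    print(position, variance)
```

A recursive filter applies the same weighting one measurement at a time, which is how a tracker can keep refining its estimate as new masks arrive.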
A vision pipeline for grape state estimation to support automated robotic harvesting. The pipeline consists of a grape bunch and stem segmentation model, built with PyTorch and the WGISD dataset, and a 3D reconstruction algorithm that combines the grape masks with a depth map to reconstruct a grape point cloud for pose estimation. Read more about it on the project website!
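To make the mask-plus-depth step concrete, here is a minimal sketch of back-projecting masked depth pixels into a 3D point cloud with a pinhole camera model. The intrinsics `fx`, `fy`, `cx`, `cy`, the function names, and the nested-list image format are assumptions for illustration. The project's actual reconstruction algorithm may differ.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def mask_to_point_cloud(
    mask: Sequence[Sequence[int]],
    depth: Sequence[Sequence[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
) -> List[Point]:
    """Back-project masked pixels to camera-frame 3D points.

    mask[v][u] is nonzero where the pixel belongs to the grape bunch,
    and depth[v][u] is the depth at that pixel (0 means no reading).
    """
    points: List[Point] = []
    for v, (mask_row, depth_row) in enumerate(zip(mask, depth)):
        for u, (is_grape, z) in enumerate(zip(mask_row, depth_row)):
            if not is_grape or z <= 0.0:
                continue  # skip background and invalid depth
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            points.append((x, y, z))
    return points


def centroid(points: Sequence[Point]) -> Point:
    """Mean of the points, a crude position estimate for the bunch."""
    if not points:
        raise ValueError("empty point cloud")
    n = len(points)
    return (
        sum(p[0] for p in points) / n,
        sum(p[1] for p in points) / n,
        sum(p[2] for p in points) / n,
    )


if __name__ == "__main__":
    mask = [[0, 1], [1, 0]]
    depth = [[1.0, 2.0], [0.0, 1.0]]
    cloud = mask_to_point_cloud(mask, depth, fx=1.0, fy=1.0, cx=0.0, cy=0.0)
    print(cloud, centroid(cloud))
```

The centroid gives only a position. A full pose estimate would also need an orientation, for example from the stem mask or from the principal axes of the point cloud.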
Crosswalk Buddy is an independent research project under UM Robotics with the goal of developing a robot that increases safety in pedestrian spaces. The robot aims to make pedestrians more visible to drivers in low-visibility scenarios. The project has two phases: HRI research and autonomy software development. We developed a simulation of the robot in a city environment to run human trials that gauge which robot proximities and positions create the most comfortable experience for the pedestrian. In parallel, we have been developing the autonomy stack for a real robot platform. Above, we use YOLOv3 and simple 1D Kalman filtering to construct a vision system for pedestrian state estimation.
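For readers unfamiliar with the filtering step, here is a minimal sketch of a scalar (1D) Kalman filter with a random-walk motion model. It assumes the filtered quantity is a single noisy value derived from a detection, such as the horizontal center of a pedestrian's bounding box. The class name, parameters, and noise values are hypothetical and not taken from the project.

```python
class Kalman1D:
    """Scalar Kalman filter with a random-walk (constant-position) model."""

    def __init__(self, x0: float, p0: float, process_var: float, meas_var: float):
        self.x = x0            # state estimate
        self.p = p0            # estimate variance
        self.q = process_var   # process noise variance (assumed)
        self.r = meas_var      # measurement noise variance (assumed)

    def predict(self) -> None:
        # Random walk: the state stays put, but uncertainty grows.
        self.p += self.q

    def update(self, z: float) -> float:
        # Kalman gain: how much to trust the measurement over the prediction.
        k = self.p / (self.p + self.r)
        self.x += k * (z - self.x)
        self.p *= 1.0 - k
        return self.x


if __name__ == "__main__":
    kf = Kalman1D(x0=0.0, p0=1.0, process_var=0.01, meas_var=1.0)
    # Hypothetical noisy bounding-box centers from successive frames.
    for z in [2.1, 1.9, 2.2, 2.0]:
        kf.predict()
        print(round(kf.update(z), 3))
```

A larger `meas_var` makes the filter smoother but slower to react to real motion, while a larger `process_var` makes it follow the detections more closely.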