Bagels, Robots, and Metro Adventures: Part 2


Inside ACN’s Montreal-Inspired Viam Hackathon

(This is part 2 of a two-part series. You can read part 1 here).

“I appreciate my Roomba so much more now”

– Flo Herra-Vega, CEO and CTO, AlleyCorp Nord
Monday, May 23, 2023. Hackathon day.

Participants created a Viam account, requested access to our org, and tried connecting to the robots via SSH. However, most tasks were accomplished through the Viam SDK, using the credentials available in the Viam app, where the two rovers and the large arm had been made available. Most participants used the Python SDK, but there was one rebel who chose Golang.
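Getting a script talking to a rover only takes a few lines of boilerplate. Here is a minimal sketch with the Python SDK, assuming the location-secret credential flow the SDK used at the time (newer SDK versions use API keys instead), with the address and secret copied from the robot's page in the Viam app:

```python
import asyncio

from viam.robot.client import RobotClient
from viam.rpc.dial import Credentials, DialOptions


async def connect() -> RobotClient:
    # Address and location secret come from the robot's page in the Viam app.
    creds = Credentials(type="robot-location-secret", payload="<LOCATION-SECRET>")
    opts = RobotClient.Options(
        refresh_interval=0,
        dial_options=DialOptions(credentials=creds),
    )
    return await RobotClient.at_address("<ROVER-ADDRESS>.viam.cloud", opts)


async def main():
    robot = await connect()
    print("Resources on the rover:", robot.resource_names)
    await robot.close()


if __name__ == "__main__":
    asyncio.run(main())
```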

Software engineer at work, debugging code.

The Green Line

We marked out a “metro simulation” on the floor using green tape.

Unfortunately, the tutorial code used an outdated version of Viam's library, but we managed to update it and get our robot up and running.

Our first challenge was to fine-tune the green detector’s sensitivity. We selected the colour from an image taken by the robocam and adjusted the hue accordingly. Before we learned how to digitally crop images to ignore irrelevant background noise, the robot ended up wearing a paper plate hat, much to everyone’s amusement. Eventually, the robot successfully followed the line from one end to the other.
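Viam's colour detector does the real work from its configuration in the app, but a quick local sanity check along these lines helped us reason about hue and crop choices. The sketch below is illustrative: the camera name "cam" is hypothetical, and it assumes `get_image()` returns a PIL image, as it did in the SDK version we were using.

```python
from viam.components.camera import Camera


async def green_pixel_ratio(robot, target_hue=85, tolerance=15) -> float:
    """Rough check of how much tape-green is visible in the lower half of the frame."""
    cam = Camera.from_robot(robot, "cam")  # hypothetical camera name
    frame = await cam.get_image()          # a PIL image in the SDK version we used

    # Crop to the bottom half of the frame so posters, shoes, and office
    # clutter in the background are ignored.
    width, height = frame.size
    roi = frame.crop((0, height // 2, width, height))

    # Count pixels whose hue falls inside the tolerance band.
    # Pure green sits near 85 on PIL's 0-255 hue scale.
    hues = roi.convert("HSV").getdata(band=0)
    matches = sum(1 for h in hues if abs(h - target_hue) <= tolerance)
    return matches / (roi.width * roi.height)
```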

Julien checking the rover’s camera view of the route.

To indicate stop stations visually, we initially used pink post-its. We defined the pink hue in the colour detector's vision service. However, we encountered a problem: the camera wouldn't start. After some tinkering, we discovered that our chosen pale pink was too close to the colours the detector cannot handle (white, gray, and black, which have no distinct hue). By selecting a hue closer to red, we finally detected the sticky notes.

Two software engineers focused on their laptops with a manager supervising.
Hackathons are challenging but really fun when things finally work.

We faced an “early stop” problem where the robot would interpret pink hues on the brown carpet as stations. We overcame this by increasing the size threshold for detection. But this introduced a new issue: when the robot was almost at the station, the marker would move partially out of frame, and the in-frame portion wasn’t large enough to trigger detection. To resolve this, we switched to using blue station markers, increasing contrast with the carpet.
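Conceptually, the fix amounts to an area threshold on the detector's bounding boxes. A minimal sketch of the idea (the helper name and the threshold value are made up for illustration, not the value we actually tuned to):

```python
def looks_like_station(detections, min_area_px=4000):
    """Treat a detection as a station only if its bounding box is big enough.

    Small pink speckles on the carpet produce tiny boxes, while a real marker
    close to the rover fills a much larger chunk of the frame.
    """
    return any(
        (d.x_max - d.x_min) * (d.y_max - d.y_min) >= min_area_px
        for d in detections
    )
```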

A remaining issue was that sometimes the stop station was detected at a slight distance from the actual station. To fix this, we implemented a state machine.

Our robot ended up wearing a paper plate hat…

A State Machine for Metro Station Stops

We defined four states: drive, approach, station, and bagel.

In the drive state, the robot follows the green line while scanning for bagels and stations. Upon bagel detection, it enters the bagel state. If a station is detected, it moves to the approach state.

In the bagel state, we take a pause to ponder the glory of bagels before returning to the drive state. We had to manually remove the bagel from the frame, or the contemplation session would restart immediately.

The approach state operates similarly to the drive state. The robot still searches for bagels, but the station detection is reversed. We wait until a station is no longer detected, assume this indicates we are over the station, and enter the station state.

The station state is a brief pause accompanied by the harmonious tunes of the STM, followed by a return to drive state.

If there is no line or station detected for three consecutive cycles, the robot becomes lost and ceases operation. We later introduced a lost state where the robot tries to locate the track for a set number of cycles before giving up.
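In code, the loop looked roughly like the sketch below. The helpers (sees_line, sees_station, sees_bagel, follow_line, play_metro_chime, search_for_line) are placeholders standing in for our vision-service queries and base commands, and the pause durations are illustrative rather than our tuned values.

```python
import asyncio
from enum import Enum, auto


class State(Enum):
    DRIVE = auto()
    APPROACH = auto()
    STATION = auto()
    BAGEL = auto()
    LOST = auto()


async def run(robot, max_lost_cycles=3, search_cycles=10):
    state = State.DRIVE
    missed = 0

    while True:
        # Placeholder helpers: each wraps a vision-service query or base command.
        line = await sees_line(robot)
        station = await sees_station(robot)
        bagel = await sees_bagel(robot)

        if state is State.DRIVE:
            if bagel:
                state = State.BAGEL
            elif station:
                state = State.APPROACH
            elif line:
                missed = 0
                await follow_line(robot)
            else:
                missed += 1
                if missed >= max_lost_cycles:
                    state = State.LOST

        elif state is State.APPROACH:
            if bagel:
                state = State.BAGEL
            elif not station:
                # The marker has slipped out of frame: we are over the station.
                state = State.STATION
            else:
                await follow_line(robot)

        elif state is State.STATION:
            await play_metro_chime(robot)  # the STM chime
            await asyncio.sleep(2)
            state = State.DRIVE

        elif state is State.BAGEL:
            await asyncio.sleep(5)         # ponder the glory of bagels
            state = State.DRIVE            # (the bagel must be moved out of frame by hand)

        elif state is State.LOST:
            for _ in range(search_cycles):
                await search_for_line(robot)   # e.g. spin slowly in place
                if await sees_line(robot):
                    state = State.DRIVE
                    break
            else:
                print("Couldn't find the track; stopping.")
                return
```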

Our company mascot, Nordy, taking a joy ride on the rover.

Sound

We configured a Bluetooth speaker to work with each of the rovers. We had two custom sounds: the metro chime to play at stops, and the Montreal snow plow siren to play as the second rover zoomed around the room. We managed to play sound via the command line and with Python libraries executed from the Raspberry Pi.
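From the command line this can be as simple as calling aplay; in Python, a thin subprocess wrapper does the job. The file paths below are hypothetical, and the sketch assumes ALSA's default sink is routed to the Bluetooth speaker:

```python
import subprocess


def play_sound(path: str) -> None:
    """Play a WAV file through the Pi's default audio sink using ALSA's aplay."""
    subprocess.run(["aplay", path], check=True)


# Hypothetical file locations for our two custom sounds.
play_sound("/home/pi/sounds/stm_chime.wav")
play_sound("/home/pi/sounds/snow_plow_siren.wav")
```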

To play sounds on the rover via the SDK while running the rover scripts from our laptops, we attempted to write a sound service to be called through viam-server. We ran into issues here: viam-server did not have access to the libraries we had installed on the Pi, and there was no clear documentation on these limitations or on default configurations for speakers or sound. We also encountered issues with the way custom modules were implemented, requiring us to manually quit a crashing service to prevent the entire system from failing.

Bagel Detection

Viam allows the addition of “detector” services that use an ML model to return the bounding boxes of identified objects. Here’s how we got that working:

Being resourceful ML engineers, we searched for pre-trained computer vision models in the right format (TensorFlow Lite, aka ‘model.tflite’). TF Hub had a tiny EfficientNet, great for Raspberry Pis. Trained on the COCO dataset, this model had labels for donuts, cakes, and pizzas: the bagel's close-enough culinary relatives! Details for fine-tuning are also available.

  • We created a Model with Viam’s UI and uploaded the weights (tip: match the Model Name and filename!)
  • Under the Services tab, we created an ML Model service that connected to those model weights.
  • We crafted a Vision Service that leveraged that ML Model (basically: Model → ML Model → Vision Service with ML Model).
  • We added the camera transform to plot the bounding boxes and voila, instant bagel detector!
bagel with overlay of detection dimensions
Bagel detection
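
For completeness, querying the detector from the Python SDK looks roughly like this. The service and camera names are hypothetical, and the vision API has changed shape across SDK versions, so treat it as a sketch rather than our exact code:

```python
from viam.services.vision import VisionClient

# COCO has no "bagel" class, so we accept its nearest culinary relatives.
BAGEL_ADJACENT = {"donut", "cake", "pizza"}


async def find_bagels(robot, min_confidence=0.5):
    detector = VisionClient.from_robot(robot, "bagel_detector")    # hypothetical service name
    detections = await detector.get_detections_from_camera("cam")  # hypothetical camera name
    return [
        d for d in detections
        if d.class_name in BAGEL_ADJACENT and d.confidence >= min_confidence
    ]
```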

For the extra-curious, here are more time-lapse videos of the hackathon.

Next Goals

We have plenty of ideas for future projects, such as getting the Yahboom arm working, setting up our custom arm/servos, creating a custom command line as a service, and creating some custom components such as a temperature sensor.

From a project standpoint, we'd love to have a dedicated office Pi that we can keep upgrading into a little assistant.

To learn more about how your organization could incorporate robotics and machine learning, reach out to see how we can help!

Thanks to Michelle King and Hailey Ellbogen for handling event logistics.

Thanks also to Arnaud Fosso Pouangue, Becks Simpson, Brock Jenken, Eren Bilgin, Julien Blin, Kim Phan, Ksenia Nadkina, Louis-Philippe Perron, Mona Ghassemi, Raphael Souza, and Rob Linton for their participation in the event and for contributing their notes to this blog post.

Our code: https://github.com/AlleyCorpNord/viam-rover

By Mona Ghassemi

Software Developer/DevOps, AlleyCorp Nord