
Student Showcase: Interacting with MiRo via Auditory-Mediated Sensations

This is the start of our Student Showcase blog series, which highlights some of the amazing student projects utilising MiRo. These articles are written by the students about their own projects; this piece is by Logan Miller from the University of Sheffield.


During my third year at the University of Sheffield, my team and I were tasked with adding human-robot interaction to the MiRo robot. MiRo has a soft tail and ears which produce a distinctive crunchy noise when pinched or stroked. We planned to use the four microphones in MiRo's body to detect the noise from stroking or pinching these body parts. This could then be used to train the robot, with stroking rewarding it and pinching punishing it, a common machine learning technique called reinforcement learning.

To detect whether MiRo has been stroked or pinched, and where it happened, we used a classifier. Classification is a form of supervised learning which uses labelled training data to map the various sounds to the actions that produced them.

We used MiRo's four microphones to record five minutes of us stroking the tail and ears, followed by five minutes of us pinching them, and then five minutes of background noise in our lab. After recording our training data, we used Google's VGGish library in Python to convert the audio files into semantically meaningful, high-level 128-dimensional embeddings, which produced better results when fed into a classification model than the raw audio files did.
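To give a flavour of this step, the sketch below converts the recordings into embeddings using a community PyTorch port of VGGish loaded through torch.hub; the file names and the port itself are illustrative assumptions rather than our exact code.

```python
# Sketch: turning the recorded WAV files into 128-dimensional VGGish embeddings.
# Assumes the torchvggish port of Google's VGGish, loaded via torch.hub;
# file names are placeholders.
import numpy as np
import torch

model = torch.hub.load('harritaylor/torchvggish', 'vggish')
model.eval()

recordings = {
    'stroke': 'stroking_5min.wav',
    'pinch': 'pinching_5min.wav',
    'background': 'background_5min.wav',
}

embeddings, labels = [], []
for label, wav_path in recordings.items():
    # VGGish produces one 128-dimensional embedding per ~0.96 s of audio.
    emb = model.forward(wav_path).detach().numpy()
    embeddings.append(emb)
    labels.extend([label] * len(emb))

X = np.concatenate(embeddings)   # shape: (number of frames, 128)
y = np.array(labels)
np.save('embeddings.npy', X)
np.save('labels.npy', y)
```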

We then mapped each 128-dimensional embedding from the training data to its corresponding action and trained a classification model using a linear support-vector machine (SVM) classifier from the scikit-learn Python library. As our project focused on detecting pinches and strokes, MiRo responded with a blue, green or red LED depending on whether it detected background noise, stroking or pinching respectively.
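In outline, the training step looks something like the sketch below, using scikit-learn's LinearSVC on the saved embeddings; the array names carry on from the previous sketch, and the split and LED mapping are illustrative.

```python
# Sketch: training a linear SVM on the 128-dimensional VGGish embeddings.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC
from sklearn.metrics import classification_report

X = np.load('embeddings.npy')   # (number of frames, 128)
y = np.load('labels.npy')       # 'stroke', 'pinch' or 'background' per frame

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)

clf = LinearSVC(C=1.0)
clf.fit(X_train, y_train)
print(classification_report(y_test, clf.predict(X_test)))

# At run time, each newly detected class is mapped to an LED colour.
LED_COLOURS = {'background': 'blue', 'stroke': 'green', 'pinch': 'red'}
```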

Images of MiRo demonstrating the three LED colour results

Background noise can cause a lot of errors in audio classification, so removing the background noise from the pinching and stroking recordings would improve the classifier's results. One way of achieving this is a noise gate, which defines a specific sound level that audio must exceed to pass through. If this level is not met, no audio signal is recorded (Hodgson, 2010).
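A basic amplitude gate is only a few lines of Python; the sketch below simply silences any sample that falls below a chosen threshold (the threshold value is illustrative).

```python
# Sketch: a simple amplitude noise gate applied to a mono signal.
import numpy as np

def noise_gate(signal: np.ndarray, threshold: float) -> np.ndarray:
    """Zero out samples whose absolute amplitude is below the threshold."""
    gated = signal.copy()
    gated[np.abs(gated) < threshold] = 0.0
    return gated

# Example: gate a recording normalised to the range [-1, 1].
# cleaned = noise_gate(audio, threshold=0.02)
```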

We opted to use a slightly more advanced technique where the noise gate is adapted into a spectral noise gate. This is where a signal is split into individual frequencies, with custom thresholds set for each frequency. A noise gate is then applied to each frequency, resulting in a more tailored approach depending on the situation (Audacity, 2021).
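The sketch below shows the idea of a spectral gate: a clip of pure background noise sets a threshold for each frequency bin, and any bin of the signal falling below its threshold is silenced. The STFT settings and the 1.5x safety margin are illustrative assumptions, not our exact parameters.

```python
# Sketch: a spectral noise gate with per-frequency thresholds
# estimated from a recording of background noise.
import numpy as np
from scipy.signal import stft, istft

def spectral_gate(signal, noise_clip, fs, nperseg=512, margin=1.5):
    # Estimate a threshold for each frequency bin from the noise clip.
    _, _, noise_spec = stft(noise_clip, fs=fs, nperseg=nperseg)
    thresholds = margin * np.mean(np.abs(noise_spec), axis=1, keepdims=True)

    # Apply a gate to each frequency bin of the signal independently.
    _, _, spec = stft(signal, fs=fs, nperseg=nperseg)
    spec[np.abs(spec) < thresholds] = 0.0

    # Reconstruct the time-domain signal from the gated spectrogram.
    _, cleaned = istft(spec, fs=fs, nperseg=nperseg)
    return cleaned
```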

We used Tiny Machine Learning (tinyML) techniques to allow our detection to run on the low-powered Raspberry Pi 3B+ inside MiRo. This was to improve the speed at which MiRo could respond to pinches and strokes compared with running the detection on an external machine communicating with MiRo over a WiFi network. The main objective in tinyML is to reduce the size of pre-trained models, which is achieved through quantization, pruning and fusing.

Quantization is often the most effective technique; it reduces the precision of values by using fewer bits. PyTorch's quantization functionality can shrink a model to around a quarter of its original size by using 8-bit integers instead of floating-point numbers (Quantization, n.d.).
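For instance, PyTorch's post-training dynamic quantization converts the weights of linear layers to 8-bit integers in a single call. The small fully connected model below is purely illustrative and is not our actual network.

```python
# Sketch: post-training dynamic quantization of a small PyTorch model.
import torch
import torch.nn as nn

model = nn.Sequential(
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, 3),   # stroke, pinch, background
)

# Replace the float32 weights of the Linear layers with 8-bit integers.
quantized_model = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8)
```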

Pruning is a technique which removes parameters that contribute little to the model's output. First a pruning criterion has to be chosen, which ranks the set of elements to be pruned. The main pruning methods are random pruning, where elements are ranked in a random order, and magnitude-based pruning, where weights below a certain threshold are set to zero.
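PyTorch's pruning utilities support both approaches; the sketch below applies random and then magnitude-based (L1) unstructured pruning to a single linear layer, with the 30% amounts chosen purely for illustration.

```python
# Sketch: random and magnitude-based pruning of a linear layer's weights.
import torch.nn as nn
import torch.nn.utils.prune as prune

layer = nn.Linear(128, 64)

# Random pruning: zero out 30% of the weights chosen at random.
prune.random_unstructured(layer, name='weight', amount=0.3)
prune.remove(layer, 'weight')   # make the pruning permanent

# Magnitude-based pruning: zero out the 30% of weights closest to zero.
prune.l1_unstructured(layer, name='weight', amount=0.3)
prune.remove(layer, 'weight')
```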

Finally, fusing combines multiple operations into a single one. However, we decided not to implement it, as there is very little performance difference between fused and non-fused versions of small models such as ours (PyTorch Fusing, n.d.).
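For completeness, this is roughly what fusing looks like in PyTorch; the convolution, batch normalisation and ReLU block here is a generic illustration rather than part of our project.

```python
# Sketch: fusing a convolution, batch normalisation and ReLU into one module.
import torch
import torch.nn as nn

class SmallNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv = nn.Conv2d(1, 8, kernel_size=3)
        self.bn = nn.BatchNorm2d(8)
        self.relu = nn.ReLU()

    def forward(self, x):
        return self.relu(self.bn(self.conv(x)))

model = SmallNet().eval()   # fusion is applied to models in eval mode
fused = torch.quantization.fuse_modules(model, [['conv', 'bn', 'relu']])
```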

Overall, this was a really fun project which allowed my team and me to get hands-on experience with robots. It also gave us a greater understanding of the theoretical biomimetics we learnt alongside this project: the combination of analysis (investigating and understanding complex biological systems) and synthesis (building a system to test our scientific understanding).

 
 

POST AUTHOR

Logan Miller

BSc Computer Science graduate from the University of Sheffield

 

REFERENCES

Hodgson, J. (2010). Understanding Records: A Field Guide to Recording Practice. Bloomsbury Academic.

Alternative Noise Reduction Techniques. (2021, November 16). Audacity Manual. Retrieved May 12, 2022, from https://manual.audacityteam.org/man/alternative_noise_reduction_techniques.html

Quantization — PyTorch 1.11.0 documentation. (n.d.). PyTorch. Retrieved May 12, 2022, from https://pytorch.org/docs/stable/quantization.html

Fuse Modules Recipe — PyTorch Tutorials 1.11.0+cu102 documentation. (n.d.). PyTorch. Retrieved May 12, 2022, from https://pytorch.org/tutorials/recipes/fuse.html