Grasping complexity in bird songs

         Pitt CRC's Barry Moore II, left, and Justin Kitzes
         showing off one of the simple acoustic sensors
         at the heart of the OpenSoundscape project.
         The sensors record thousands of hours of bird calls
         that are analyzed by a machine learning program.

Picture yourself in a meadow surrounded by woods. A bird calls. Do you recognize it? Can you pick it out among other bird calls? Maybe it is a red-winged blackbird, but how many are there? How many other birds? How far do the woods extend before they reach a highway?

The citizen-scientists who now collect most biodiversity data can estimate how many birds are within earshot at a given time in a given location but can’t accurately estimate bird populations within a habitat larger than the range of human hearing. Even if humans could hear every bird within a larger habitat, it is unlikely they could discern useful patterns.

For Justin Kitzes, this limited picture is insufficient. At a time when natural habitats are increasingly sliced into smaller parcels by human activity, it is crucial that surveys of biodiversity become both more expansive and more precise. He has developed a new approach – acoustic sensors acquiring vast sets of bird calls, and machine learning programs making sense of the data.

Kitzes, an assistant professor in Pitt’s Biological Sciences Department, works with Pitt CRC consultant Barry Moore II to refine OpenSoundscape – a machine learning program Kitzes developed to create dense, meaningful and applicable data on the distribution and survival of species.

The strategy of pairing bird calls with machine learning has earned recognition. In 2018, Kitzes was one of 11 recipients of a Microsoft and National Geographic AI for Earth Innovation Award. The award provides cloud-based tools and artificial intelligence services to researchers in agriculture, biodiversity, conservation, climate change and water conservation, along with free access to machine learning tools on Microsoft’s Azure platform.

Kitzes explains the computational challenge of OpenSoundscape. “Bird calls have lots of variability between them, but also lots of similarity. Audio spectra of individual bird calls can vary greatly within one species, and birds sing simultaneously on top of each other. It is a complex picture.”

OpenSoundscape begins with simple technology – $50 acoustic sensors running on three AA batteries. The sensors are deployed at intervals within defined areas – along a stretch of road, for instance. The sensors record continuously on a schedule, and once a month lab members retrieve the data cards and give each sensor a fresh set of batteries.

Now the data problem. Even a relatively small group of sensors recording for several hours a day in a limited area can produce terabytes of data that a human couldn’t possibly listen to in any realistic timeframe, much less analyze. The Kitzes lab alone expects to hit an estimated 100 million observations as of 2020. Humans must cede the role of listening to the computer.
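To get a feel for the scale, here is a rough back-of-the-envelope calculation in Python. Every number in it is an illustrative assumption, not a figure from the Kitzes lab: 100 sensors, 4 hours of recording per day, a 30-day deployment, and uncompressed mono audio at 32 kHz with 16-bit samples. The function name recording_bytes is likewise made up for this sketch.

```python
def recording_bytes(sensors, hours_per_day, days,
                    sample_rate_hz, bytes_per_sample, channels=1):
    """Raw size in bytes of uncompressed audio from a sensor deployment."""
    seconds_per_sensor = hours_per_day * 3600 * days
    return (sensors * seconds_per_sensor
            * sample_rate_hz * bytes_per_sample * channels)


# Hypothetical deployment; none of these values come from the article.
SENSORS = 100
HOURS_PER_DAY = 4
DAYS = 30

total_bytes = recording_bytes(SENSORS, HOURS_PER_DAY, DAYS,
                              sample_rate_hz=32_000, bytes_per_sample=2)
audio_hours = SENSORS * HOURS_PER_DAY * DAYS

# Working days needed for one person to hear it all at 8 hours per day.
listening_days = audio_hours / 8

print(f"{total_bytes / 1e12:.2f} TB of audio, {audio_hours} hours to listen to")
print(f"about {listening_days:.0f} working days for a single listener")
```

Under these assumptions a single month produces close to 3 terabytes and 12,000 hours of audio, far more than one person could hear before the next batch of data cards arrives.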

Moore is a research faculty member and consultant at Pitt CRC who collaborates with Kitzes on OpenSoundscape.

“The fundamental question is: can one create a relatively simple machine learning model which functions well with soundscapes? It is a difficult problem due to background noises, weather conditions, simultaneous calls and the amount of data we need to process.”

OpenSoundscape creates templates of audio spectra that match the audio spectra of bird calls. Starting with the program’s existing code, Moore and Kitzes seek to generate spectra that better tolerate background noise, which can otherwise make the program unstable.
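The general idea of matching a template against an audio spectrum can be sketched in a few lines of Python. This is a toy illustration of template matching by normalized correlation, not OpenSoundscape's actual code: the function names, the tiny hand-made spectrogram and the 0.8 detection threshold are all assumptions chosen for the example. Rows stand for frequency bins (high at the top, low at the bottom) and columns for time frames.

```python
import math


def normalized_correlation(patch, template):
    """Correlation coefficient between two equally sized 2-D grids."""
    a = [v for row in patch for v in row]
    b = [v for row in template for v in row]
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(v * v for v in da) * sum(v * v for v in db))
    return sum(x * y for x, y in zip(da, db)) / denom if denom > 0 else 0.0


def match_scores(spectrogram, template):
    """Slide the template along the time axis; one score per time offset."""
    width = len(template[0])
    n_offsets = len(spectrogram[0]) - width + 1
    scores = []
    for k in range(n_offsets):
        patch = [row[k:k + width] for row in spectrogram]
        scores.append(normalized_correlation(patch, template))
    return scores


# Hypothetical template: a rising sweep, standing in for a simple call.
template = [
    [0, 0, 1],
    [0, 1, 0],
    [1, 0, 0],
]

# Hypothetical recording: faint background, a broadband vertical line in
# column 1 (like a click or glitch) and the rising sweep in columns 4 to 6.
spectrogram = [
    [0.1, 0.9, 0.0, 0.1, 0.0, 0.1, 1.0, 0.1],
    [0.0, 0.9, 0.1, 0.0, 0.1, 0.9, 0.1, 0.0],
    [0.1, 0.9, 0.0, 0.1, 1.0, 0.0, 0.1, 0.1],
]

scores = match_scores(spectrogram, template)
best_offset = max(range(len(scores)), key=lambda k: scores[k])
detections = [k for k, s in enumerate(scores) if s > 0.8]  # assumed threshold
print(best_offset, detections)
```

In this toy case the sweep scores close to 1 at the offset where it actually occurs, while the vertical line scores near zero. Real recordings are far messier, with noise, weather and overlapping calls, which is why the simple approach needs the refinements Moore and Kitzes are pursuing.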




Moore first trained a model based on bamboo rat calls. “Rat calls are relatively boring looking – the call produces an audio spectrum of a series of vertical lines. The difficult part with rats is that if we are using a computationally efficient overlap algorithm, the call looks similar to a microphone malfunction. Bird calls produce more complicated shapes in the audio spectra and, in real situations, may overlap each other.”

“We need a method to make the templates of known sounds match the audio spectra without matching other sounds,” Moore explains. “For the rat project we used a more resource intensive matching algorithm to separate malfunctions from rats, but this algorithm will not be fast enough for bird calls. Our focus moving forward is building models which can deal with both these situations.”

Kitzes and Moore are now analyzing data storage and other aspects of OpenSoundscape. They want to develop an interface that works for various levels of users, in which one user can work with data in a simple GUI model, but a more sophisticated user can tune the program, change parameters and control the analysis.

Kitzes is certain OpenSoundscape will soon be used in the field. “We want to see it integrated into daily use by government agencies, private land management, conservation groups. This will be a common technique in 20 years, but the exciting thing is that it’s happening now, not in the future.”



Brian Connelly
Pitt Center for Research Computing

Friday, September 4, 2020