Sad Songs, Artificial Intelligence and Gracenote’s Quest to Unlock the World’s Music – Variety

Posted: August 1, 2017 at 6:18 pm

It's all about that vibe. Anyone who has ever compiled a mixtape, or a Spotify playlist for that matter, knows that compilations succeed when they carry a certain emotional quality across their songs.

That's why the music data specialists at Gracenote have long been classifying the world's music by moods and emotions. Only, Gracenote's team hasn't actually listened to each and every one of the 100 million individual song recordings in its database. Instead, it has taught computers to detect emotions, using machine listening and artificial intelligence (AI) to figure out whether a song is dreamy, sultry, or just plain sad.

"Machine learning is a real strategic edge for us," said Gracenote's GM of music Brian Hamilton during a recent interview.

Gracenote began its work on what it calls "sonic mood classification" about 10 years ago. Over time, that work has evolved, as more traditional algorithms were swapped out for cutting-edge neural networks. And quietly, it has become one of the best examples of the music industry's increasing reliance on artificial intelligence.

First things first: AI doesn't know how you feel. "We don't know what effect a musical work will have on an individual listener," said Gracenote's VP of research Markus Cremer during an interview with Variety. Instead, the company is trying to identify the intention of the musician as a kind of inherent emotional quality. In other words: It wants to teach computers which songs are truly sad, not which songs may make you feel blue because of some heartbreak in your teenage years.

Still, teaching computers to identify emotions in music is a bit like therapy: First, you name your feelings. Gracenote's music team initially developed a taxonomy of more than 100 vibes and moods, and has since expanded that list to more than 400 such emotional qualities.

Some of these include obvious categories like "sultry" and "sassy," but there are also extremely specific descriptors like "dreamy sensual," "gentle bittersweet," and "desperate rabid energy." New categories are constantly being added, while others are fine-tuned based on how well the system performs. "It's sort of an iterative process," explained Gracenote's head of content architecture and discovery Peter DiMaria. "The taxonomy morphs and evolves."

In addition to this list of moods, Gracenote also uses a so-called training set for its machine learning efforts. The company's music experts have picked and classified some 40,000 songs as examples for these categories. Compiling that training set is an art of its own. "We need to make sure that we give it examples of music that people are listening to," said DiMaria. At the same time, songs have to be the best possible example of any given emotion. "Some tracks are a little ambiguous," he said.

The current training set includes Lady Gaga's "LoveGame" as an example of a "sexy stomper," Radiohead's "Pyramid Song" as "plaintive," and Beyoncé's "Me Myself & I" as an example of "soft sensual & intimate."
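To make the idea of the training set concrete, here is a minimal sketch of what such labeled examples could look like in code. The three song-to-mood pairings come from the article itself; the Python data structure and field names are assumptions for illustration, not Gracenote's actual format.

```python
# Hypothetical sketch of a mood-classification training set. The three
# example labels are the ones Gracenote's editors cite; the structure
# itself is an assumption for illustration.
TRAINING_EXAMPLES = [
    {"artist": "Lady Gaga", "title": "LoveGame",      "mood": "sexy stomper"},
    {"artist": "Radiohead", "title": "Pyramid Song",  "mood": "plaintive"},
    {"artist": "Beyoncé",   "title": "Me Myself & I", "mood": "soft sensual & intimate"},
]

# A supervised learner needs numeric targets, so each mood in the
# 400-plus-category taxonomy would map to a class index.
MOOD_TO_INDEX = {mood: i for i, mood in enumerate(
    sorted({example["mood"] for example in TRAINING_EXAMPLES}))}
```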

Just like the list of emotions itself, that training set needs to be kept fresh constantly. "Artists are creating new types of musical expressions all the time," said DiMaria. "We need to make sure the system has heard those." Quickly evolving genres like electronica and hip-hop require especially frequent updates.

Once the system has been trained with these songs, it is let loose on millions of tracks. But computers don't simply listen to long playlists of songs, one by one. Instead, Gracenote's system cuts up each track into 700-millisecond slices, and then extracts some 170 different acoustic values, like timbre, from each slice.

In addition, it sometimes takes larger chunks of a track to analyze its rhythm and similar features. Those values are then compared against existing data to classify each song. The result isn't just a single mood, but a mood profile.
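The exact features and model Gracenote uses aren't public, but the pipeline described above (700-millisecond slices, per-slice acoustic values, per-track aggregation) can be sketched roughly in Python. In this hedged example, librosa's MFCCs stand in for the roughly 170 acoustic values, and a simple average of per-slice class probabilities stands in for whatever aggregation actually produces a mood profile; the per-slice probabilities themselves would come from a trained classifier, such as the neural networks mentioned earlier.

```python
# Rough sketch of the described pipeline, not Gracenote's implementation:
# slice audio into 700 ms windows, extract a timbre-like feature vector per
# slice, then average per-slice class probabilities into a mood profile.
import numpy as np
import librosa  # a common open-source stand-in for the feature extraction step

SLICE_SECONDS = 0.7  # the 700-millisecond slices described above

def slice_features(path: str) -> np.ndarray:
    """Return one feature vector per 700 ms slice of a recording."""
    y, sr = librosa.load(path, sr=22050, mono=True)
    samples_per_slice = int(SLICE_SECONDS * sr)
    vectors = []
    for start in range(0, len(y) - samples_per_slice + 1, samples_per_slice):
        chunk = y[start:start + samples_per_slice]
        # 20 averaged MFCCs per slice; Gracenote extracts some 170 values.
        mfcc = librosa.feature.mfcc(y=chunk, sr=sr, n_mfcc=20)
        vectors.append(mfcc.mean(axis=1))
    return np.array(vectors)

def mood_profile(per_slice_probs: np.ndarray, moods: list) -> dict:
    """Average per-slice class probabilities into a whole-track mood profile."""
    track_probs = per_slice_probs.mean(axis=0)
    return {mood: float(p) for mood, p in zip(moods, track_probs)}
```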

All the while, Gracenote's team has to periodically make sure that things don't go wrong. "A musical mix is a pretty complex thing," explained Cremer.

With instruments, vocals, and effects layered on top of each other, and the result optimized for car stereos or internet streaming, there is a lot for a computer to listen to, including things that aren't actually part of the music.

"It can capture a lot of different things," said Cremer. Left unsupervised, Gracenote's system could, for example, decide to pay attention to compression artifacts and match them to moods, with Cremer joking that the system might decide: "It's all 96 kbps, so this makes me sad."

Once Gracenote has classified music by moods, it delivers that data to customers, who use it in a number of different ways. Smaller media services often license Gracenote's music data as their end-to-end solution for organizing and recommending music. Media center app maker Plex, for example, uses the company's music recommendation technology to offer its customers personalized playlists and something the company calls "mood radio." Plex users can pick a mood like "gentle bittersweet," press play, and then wait for Mazzy Star to do its thing.
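As a toy illustration of how a "mood radio" feature could sit on top of such mood profiles, the snippet below simply ranks a catalog by one mood's score. The catalog data is invented for the example, and Plex's actual integration is not public.

```python
# Toy "mood radio": rank tracks by their score for a single mood in the
# kind of mood profile sketched earlier. Catalog data is invented.
def mood_radio(catalog: dict, mood: str, limit: int = 25) -> list:
    ranked = sorted(catalog, key=lambda track: catalog[track].get(mood, 0.0),
                    reverse=True)
    return ranked[:limit]

catalog = {
    "Mazzy Star - Fade Into You": {"gentle bittersweet": 0.91, "dreamy sensual": 0.78},
    "Lady Gaga - LoveGame":       {"sexy stomper": 0.88, "gentle bittersweet": 0.04},
}
print(mood_radio(catalog, "gentle bittersweet"))  # Mazzy Star ranks first
```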

Gracenote also delivers its data to some of the industry's biggest music service operators, including Apple and Spotify. These big players typically don't like to talk about how they use Gracenote's data in their products. Bigger streaming services generally operate their own music recommendation algorithms, but they often still make use of Gracenote's mood data to train and improve those algorithms, or to help human curators pre-select songs that are then turned into playlists.

This means that some music fans may be acutely aware of Gracenote's mood classification work, while others may have no idea that the company's AI technology has helped improve their music listening experience.

Either way, Gracenote has to make sure that its data translates internationally, especially as it licenses it into new markets. On Tuesday, the company announced that it will begin to sell its music data product, which among other things includes mood classification as well as descriptive, cleaned-up metadata for cataloging music, in Europe and Latin America. To make sure that nothing is lost in translation, the company employs international editors who don't just translate a word like "sentimental," but actually listen to example songs to figure out which expression works best in their cultural context.

And the international focus goes both ways. Gracenote is also constantly scouring the globe to feed its training set with new, international sounds. "Our data can work with every last recording on the planet," said Cremer.

In the end, classifying all of the world's music is really only possible if companies like Gracenote rely not just on humans, but also on artificial intelligence and technologies like machine listening. And in many ways, teaching computers to detect sad songs can actually help humans have a better and more fulfilling music experience, if only because relying on humans alone would have left many millions of songs unclassified, and thus out of reach of the personalized playlists of their favorite music services.

Using data and technology to unlock these songs from all over the world has been one of the most exciting parts of his job, said Cremer: "The reason I'm here is to make sure that everyone has access to all of that music."
