
credit: Joseph Llanes
Alexa
-
Voice Basics
Who, or more appropriately what, is Alexa, you may ask?
Alexa is best known as the voice (and brain) behind the Amazon Echo smart speaker and is integrated with hundreds of other devices including Sonos, Garmin, Denon, Bose, and more.
Using just your voice, you can ask Alexa to control your music, get the news, check the weather, set alarms, and control smart home devices such as lights, outlets, and locks. Alexa can tell you about upcoming concerts, get directions, find local restaurants and answer your questions. There’s no shortage of questions to ask.
Amidst all of that, “Alexa, play music” remains one of the top customer requests.
-
Getting Heard
By making the listening experience as simple as possible, Amazon Music introduces millions of listeners to streaming music via Alexa. In fact, Amazon Music was made with voice in mind. We’ve been the leader in developing a streaming service that combines the simplicity and magic of voice controls with music.
Here are some of the many ways we help listeners connect with your music on Alexa:
-
How It Works
Using Amazon Music with Alexa can be magical, but a lot of work goes on behind the scenes to make it happen. Here are the basics on how the technology works and some voice-related terminology to level up your understanding.
Utterance - By definition, an utterance is “a spoken word, statement, or vocal sound”. We use the word to refer to the specific request a listener makes of Alexa.
Wake Word - The wake word is the bridge between you and the technology that powers the Alexa voice experience. In most cases, this is “Alexa”, but it can also be changed to “Computer”, “Echo”, or “Amazon.” Echo devices are designed to detect only your chosen wake word.
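To make the two terms concrete, here is a minimal sketch that separates a wake word from the utterance that follows it. It is an illustration only: real wake word detection runs on audio on the device, not on text, and the function name `split_wake_word` is made up for this example.

```python
# Hypothetical, text-level illustration only: real wake word detection
# works on audio on the device, not on transcribed text.
WAKE_WORDS = ("alexa", "computer", "echo", "amazon")  # the options named above


def split_wake_word(spoken, chosen_wake_word="alexa"):
    """Return the utterance that follows the chosen wake word, or None."""
    if chosen_wake_word not in WAKE_WORDS:
        raise ValueError(f"unsupported wake word: {chosen_wake_word!r}")
    words = spoken.strip().split(maxsplit=1)
    if not words:
        return None
    first = words[0].strip(",.!?").lower()
    if first != chosen_wake_word:
        return None  # only the chosen wake word triggers a response
    return words[1].strip() if len(words) > 1 else ""


print(split_wake_word("Alexa, play music"))
print(split_wake_word("Echo, play music"))
```

With the default wake word, the first call yields the utterance "play music", while the second is ignored because "Echo" is not the chosen wake word.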
It's a generalization of the tech, but there are four major steps that happen after you summon Alexa with a wake word and utterance:
- Automatic Speech Recognition (ASR): When you ask Alexa to play a song, ASR converts your speech into text.
- Natural Language Understanding (NLU): The second step is to extract the meaning of that text, making sense of the way you speak naturally, so that Alexa can take the appropriate action to respond. For example, if you say “Alexa, play the album Rattle and Hum”, it knows you want to listen to the album, not watch the documentary.
- Deep Music Intelligence: This encompasses everything from finding the right song, adding correct metadata to fulfill or do more with a request, applying personalization, sequencing, capturing whether or not you liked it, and a whole lot more.
- Text-to-Speech: We convert Alexa’s textual response to the customer's request into spoken audio.
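The four steps above can be pictured as a chain of small functions, each handing its result to the next. The sketch below is a toy stand-in, not how Alexa is actually built: every function name, the `Intent` fields, and the two-entry catalog are assumptions made up for illustration, and the "audio" is simply a string.

```python
from dataclasses import dataclass

# Toy catalog; entries and field names are made up for illustration.
CATALOG = [
    {"title": "Rattle and Hum", "type": "film"},
    {"title": "Rattle and Hum", "type": "album"},
]


@dataclass
class Intent:
    action: str
    media_type: str
    title: str


def recognize_speech(audio):
    """Step 1 (ASR) stand-in: the 'audio' here is already a string."""
    return audio.lower().strip()


def understand(text):
    """Step 2 (NLU) stand-in: pull out the action, media type, and title."""
    prefix = "alexa, play "
    if not text.startswith(prefix):
        return None
    rest = text[len(prefix):]
    for media_type in ("album", "song"):
        marker = f"the {media_type} "
        if rest.startswith(marker):
            return Intent("play", media_type, rest[len(marker):])
    return Intent("play", "song", rest)


def find_music(intent):
    """Step 3 stand-in: match the intent against catalog metadata."""
    for item in CATALOG:
        if item["type"] == intent.media_type and item["title"].lower() == intent.title:
            return item
    return None


def respond(item):
    """Step 4 (TTS) stand-in: build the text that would be spoken."""
    if item is None:
        return "Sorry, I couldn't find that."
    return f"Playing the {item['type']} {item['title']}."


def handle(audio):
    """Run the four steps in order for one request."""
    intent = understand(recognize_speech(audio))
    if intent is None:
        return "Sorry, I didn't catch that."
    return respond(find_music(intent))


print(handle("Alexa, play the album Rattle and Hum"))
```

Because the request says "the album", the lookup skips the film entry with the same title and picks the album, which mirrors the Rattle and Hum example above.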
-
Metadata & Voice
Like plumbing, metadata is not exciting, but extremely important. For instance, accurate metadata helps Alexa to recognize what the customer is asking for. Metadata helps Alexa to understand if a song is happy, good for working out, or even to recall your most recently played song. We maintain this metadata for millions of tracks, so your fans can more naturally connect with your music.
How music is delivered to us has a big impact as well. With that in mind, you can make sure Alexa’s able to respond to requests on your release date with some of these best practices:
- Deliver your music early: Alexa needs time to learn about your music. Because of this, music delivered at the last minute is more likely to be problematic. In general, we recommend making sure your music is delivered to Amazon at least seven days in advance of its intended release date for best results.
- Metadata matters: Providing key metadata such as original release date, version info, clean/explicit flags, and genre/sub-genre information goes a long way toward helping Alexa play your song.
- Special characters: Even when metadata is correctly populated, Alexa can have trouble handling, understanding, and reading special characters. Good examples are artist names such as "6LACK", "!!!", or "P!nk". If special characters are unavoidable, please have your label work with Amazon Music to address these scenarios in advance.
- Reduce duplicates: It can be challenging for Alexa to differentiate between duplicate albums or songs, such as multiple version types.
- Minimize product changes near street date: Frequent redeliveries and changes to your content such as release date, new audio assets, and metadata can impact performance. While redeliveries are often unavoidable, we urge our partners to keep last-minute changes to a minimum. This will ensure the best possible result for your fans listening on Alexa.
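If you like to double-check a release before it goes out, a few of the practices above can be turned into a simple pre-delivery check. This is a rough sketch under stated assumptions, not an Amazon tool: the function `check_delivery`, the field names, and the sample release are all hypothetical, and the only rule taken directly from the list is the seven-day lead time.

```python
from datetime import date

# Field names are hypothetical; they mirror the metadata named in the list above.
REQUIRED_FIELDS = ("original_release_date", "version", "explicit", "genre")
MIN_LEAD_DAYS = 7  # "at least seven days in advance"


def check_delivery(release, delivered_on):
    """Return a list of human-readable warnings for one release."""
    warnings = []

    # Deliver early: compare the delivery date with the release date.
    lead_days = (release["release_date"] - delivered_on).days
    if lead_days < MIN_LEAD_DAYS:
        warnings.append(f"delivered {lead_days} days ahead; aim for {MIN_LEAD_DAYS}+")

    # Metadata matters: flag fields that are absent or empty.
    for field in REQUIRED_FIELDS:
        if release.get(field) in (None, ""):
            warnings.append(f"missing metadata: {field}")

    # Special characters: flag artist names with anything but letters and spaces.
    artist = release.get("artist", "")
    if any(not (ch.isalpha() or ch.isspace()) for ch in artist):
        warnings.append(f"artist name may need review for voice: {artist!r}")

    return warnings


# Made-up sample release, for illustration only.
sample = {
    "artist": "P!nk",
    "release_date": date(2024, 3, 15),
    "original_release_date": date(2024, 3, 15),
    "version": "Album Version",
    "explicit": False,
    "genre": "",
}
for warning in check_delivery(sample, delivered_on=date(2024, 3, 12)):
    print(warning)
```

The sample triggers three warnings: it is delivered only three days ahead, the genre is empty, and the artist name contains a special character. A flagged name is not an error, just a prompt to raise it with Amazon Music in advance.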