Alexa Skills Development Tutorial: Alexa Skills from Scratch – Adding Our Backend Part 3 [Episode 7]

Welcome back! In this video we'll see how to add audio to our skill. I'll be walking you through how to encode the audio properly and how to host it so that Alexa can access it, and we'll be testing out our skill with real audio, actually making it a guitar tuner skill and not just a hello skill.

This is a nice starting boilerplate for our Lambda, and it also contains a bunch of helper functions for more advanced topics. So let's paste in our boilerplate and walk through it briefly. You see here we have the welcome output, which will be triggered in the LaunchRequest, and we have the help, cancel and stop intents, the PlayNoteIntent, the SessionEndedRequest and Unhandled. Just to clarify: all of our logic will be put into these handlers, and these handlers have a one-to-one match with the intents we've defined on this site, which is the developer portal. The LaunchRequest is triggered when the user says "open guitar tuner", the help intent is triggered when the user says "help", cancel when the user says "cancel", the stop intent is used to stop, and the SessionEndedRequest fires when a user is exiting the skill. You'll also see that a PlayNoteIntent handler was auto-generated by Skillinator, because it read that our PlayNoteIntent existed and that it had these slots.

We'll get into how these work a little later, but let's just clean it up. We only need the note slot, we don't need the raw values for now, so let's go ahead and remove those. We can keep the console logs so that we get a log here, but let's update this to say "user requested note" plus the requested note, to make debugging a little easier. And you see this has a placeholder response; we'll change this. Keep in mind we also have to add all the audio and all of the implementation as well. OK, that's about it for now, so let's save and start actually implementing our skill, and by that I mean actually providing audio output from our response.

So we're in Lambda now. Let me zoom in a little and clear this; actually, let me change the theme first to something a bit lighter. OK, so let's start changing what our skill does. Currently, when a user says "open guitar tuner", it triggers this welcome output, which is here, and if the user doesn't say anything it triggers the welcome reprompt, which is "you can say something like give me a low E or play the D string". So once this LaunchRequest is triggered, the skill says "how can I help?", then the user says "play me a low E", and this intent is triggered here: PlayNoteIntent. This PlayNoteIntent, you'll see, has this note slot and this pitch slot. Let's start by just outputting what these are set to, just to make sure the intent is being triggered correctly. So let's say "Ok, I will play a" note slot "with pitch" pitch slot. This is just so we have something a bit more useful than "this is a placeholder response". As a reprompt, the user may want to listen to it again, right? "Would you like to hear it again?" Then the user can say yes or no; this means we need two other intents, yes and no, but we won't get into that just now.

Let's try it out and see if this works. Let's go to our developer portal, say "stop" to make sure we've quit out of the skill, and say "ask guitar tuner for a low E". "Ok, I will play a E with pitch low." It worked: Alexa understood the intent and the request was correctly routed to this PlayNoteIntent.
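For reference, the interim handler described here might look roughly like the sketch below. This is only an illustration in the ASK SDK v1 style used throughout this series (the this.emit(':ask', ...) pattern); the slot names 'note' and 'pitch' are assumed to match what was set up in the developer portal.

    const handlers = {
        'PlayNoteIntent': function () {
            // Read the slot values Alexa sent with the request (e.g. note = "E", pitch = "low").
            const slots = this.event.request.intent.slots;
            const noteSlot = slots.note ? slots.note.value : undefined;
            const pitchSlot = slots.pitch ? slots.pitch.value : undefined;

            // Log the request to make debugging a little easier.
            console.log('User requested note: ' + noteSlot + ' with pitch: ' + pitchSlot);

            const speechOutput = 'Ok, I will play a ' + noteSlot + ' with pitch ' + pitchSlot;
            const reprompt = 'Would you like to hear it again?';

            // ':ask' keeps the session open so the user can answer the reprompt.
            this.emit(':ask', speechOutput, reprompt);
        }
    };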
Looking at the request JSON, you'll see it was an IntentRequest with the intent name PlayNoteIntent, which is fine, and you'll see it also captured the slots correctly: it got an E, so we have a slot named note with value "e", and a pitch slot with value "low". Great. So in our code here we can use these two values to fulfil the logic, and in this case what we want to do is output audio: we want to output the sound of a guitar string playing that specific requested note.

How do we do that? We use something called SSML, or Speech Synthesis Markup Language. You can learn all about it if you go to the simulator and open Voice & Tone; here you can test how Alexa's responses sound, and you can learn more about the supported SSML tags. If we go to that page, which I think is worth checking out, there are a bunch of resources. These are all the tags you can use to change the way Alexa says things, so you're effectively overriding or modulating how Alexa's text-to-speech engine will say something out loud. For example, you can add breaks, you can add emphasis on specific words, you can subdivide the text into paragraphs, you can override the phonetic pronunciation of specific words, and you can change the rate, pitch and volume of responses. So, for example, you can have normal volume for the first sentence and louder volume for the second sentence. If we copy this to try it out, we can just paste it in here and play it: "Normal volume for the first sentence, louder volume for the second sentence." "When I wake up, I speak quite slowly." "I can speak with my normal pitch, but also with a much higher pitch, and also with a lower pitch." Apart from being quite amusing, these are useful for tweaking how Alexa says things.

There are tons of SSML tags. For example, we can override the pronunciation of specific words using the phoneme SSML tag; this is what it sounds like: "You say pecan, I say pecan." People have given Alexa accents by overriding specific words with their phonetic pronunciation. But what we're going to use today for our guitar skill is the audio tag. If we demo it, this is what adding an audio sound bite sounds like: "Welcome. You can order a ride or request a fare estimate. Which will it be?" You see, here I just added an audio clip inside the response, so the text will still be run through the text-to-speech engine, and any SSML will be applied as well, but if we also specify an audio clip, it will be inserted in between the text-to-speech parts of the response, which is pretty cool.

In our case we want to do exactly that, and the reason I chose a guitar tuner skill is so that I could show you how to use audio in your responses, because that opens up a whole other world of opportunities: you can use it to pre-record your responses, maybe with a voice actor, if you're building an adventure game, and it just helps make things a lot more personal. So what we have to do here is add the audio tag. Rather than saying "Ok, I will play a E with pitch low", which doesn't sound very good, what we want to do (we won't go into edge case handling yet, we're just prototyping as we go) is add an audio tag that points to a clip for the requested note, ideally something like low_e.mp3. That's what we want to use, but first we have to create these audio clips, and I think this is a great way to show you how to use audio in your skill responses, because you have to encode it in a specific way.
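The change described here boils down to wrapping a hosted clip in an SSML audio tag inside the response string. A minimal sketch, assuming a hypothetical file URL (the real URL has to point to a publicly readable, correctly encoded MP3, which is exactly what the next steps set up):

    // Inside the PlayNoteIntent handler: play a hosted clip instead of reading the note out loud.
    // The URL below is a placeholder for illustration only.
    const audioSrc = 'https://example.com/sounds/low_e.mp3';
    const speechOutput = 'Here is a low E. <audio src="' + audioSrc + '"/> Would you like to hear it again?';
    this.emit(':ask', speechOutput, 'Would you like to hear it again?');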
So how do we do that? Here I have a bunch of sounds. [Music] What we have to do is host them publicly so that Alexa can access them, and we also have to make sure they're encoded correctly: if we go back to the Speech Synthesis Markup Language page, you'll see they have to be encoded in a specific way. I'm using ffmpeg for that; you could also use something open source like Audacity, but I'm going to use ffmpeg, and it's quite easy to do. All you need is a terminal window. Once you've installed ffmpeg (just look up how to install it for your specific system), you use the command that is provided here; we just have to change the file names. So say we want to convert the D minor chord: we'll name the output something like chord-d-minor-converted, the input will be chord-d-minor, we press Enter, and it's done. If we list the directory again, you'll see we have a chord-d-minor-converted file produced by ffmpeg in exactly the format that Alexa requires, which is perfect. I'm going to do this for the rest of the sounds and add them to this folder called "converted", and then I'll show you how to upload these sounds so that they're publicly hosted. Sweet.
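As a rough guide, the conversion command looks like the line below. The file names are just examples, and the parameters (stereo MP3 at 48 kbps with a 16000 Hz sample rate) are the values the SSML audio tag reference called for at the time of recording, so double-check the current documentation before relying on them.

    # Example only: convert a source recording into the MP3 format the audio tag accepts.
    ffmpeg -i chord_d_minor.wav -ac 2 -codec:a libmp3lame -b:a 48k -ar 16000 chord_d_minor_converted.mp3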
Alright, so I've converted all of my sounds into this folder here, the "converted" folder. What we want to do now is host them publicly, so let me go to S3. We'll keep this window open; I'm just going to open another AWS tab so that we don't have to quit out of our Lambda. We still need AWS, but we're not using Lambda here: we essentially want a publicly available folder where we can dump our sounds and forget about them, and S3 does exactly this. S3 is a place where you create buckets; a bucket is just a folder where you can add folders or files and access them from the web. We'll create a bucket, which we'll call guitar-tuner-alexa for example, and create it in Ireland. Let's go to next and keep everything as is; we'll also keep the permissions as they are, because we'll give public permissions only to the files, not to the bucket itself. So let's go ahead and create the bucket. It creates a guitar-tuner-alexa bucket, and in this bucket we'll upload our files: let's just drag and drop them here, grant public read access to these objects, leave everything else as is, and upload. Perfect.

So now we have all of our notes and our chords publicly available. If we click on a specific sound, copy its link, open it in a new tab and play it, you'll see it's accessible. This means that Alexa will also be able to play this sound, because Alexa won't log into our AWS account; the files need to have public read access. Great, so we have our sounds. I also used a specific naming convention: if we look inside guitar-tuner-alexa, you'll see that all of the notes are just a letter, optionally with a pitch, and chords start with the word "chord" followed by the denomination, so C, D, D minor, etc.

So what we have to do now is formulate a response based on whatever values come in. Now, this series is not meant to be a coding tutorial, nor will this go down in history as the most elegant code ever written, so please see this tutorial as a way to build a skill, not as a way to learn how to code. OK, so let's try and hack something together. What I did is create an object with the notes I currently support, each with a low and a high entry; for most of the notes they point to the same audio, but for the E's I have two versions with different audio sources. I also have an array for the pitches. That way, when a request comes in, I can check whether the note slot is defined and whether it exists in my notes object. First, though, I convert it to lowercase to avoid any issues with capitalization. I then check if the pitch is defined as well and whether it's either low or high, and I assign a default if it isn't; then I just access the right source based on the note and the pitch. I should rename that to "note", and that one too; that's a little more elegant. Cool.

Then I have two different responses. If the note exists, so that this condition succeeds, I play the note and then ask if they would like to hear it again; otherwise I say "Sorry, the note you asked for is not supported." Let's also add a "yet" to keep things hopeful. Then in the reprompt I say that I support A, B, D, low E, high E and G, so that they know what they can actually ask for, and then I just output the response. So if we save that, we can try it here and say "ask guitar tuner for a low E", and there's our note. [guitar audio plays]
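Putting it all together, the final PlayNoteIntent logic described above might look something like the sketch below. The bucket URL, file names and set of supported notes are illustrative placeholders rather than the exact code from the video.

    // Illustrative sketch of the note lookup and response building (ASK SDK v1 style).
    const BASE_URL = 'https://s3-eu-west-1.amazonaws.com/guitar-tuner-alexa/'; // placeholder bucket URL

    const NOTES = {
        a: { low: 'a.mp3', high: 'a.mp3' },
        b: { low: 'b.mp3', high: 'b.mp3' },
        d: { low: 'd.mp3', high: 'd.mp3' },
        g: { low: 'g.mp3', high: 'g.mp3' },
        e: { low: 'low_e.mp3', high: 'high_e.mp3' } // only the E has two different recordings
    };
    const PITCHES = ['low', 'high'];

    const handlers = {
        'PlayNoteIntent': function () {
            const slots = this.event.request.intent.slots;
            // Lowercase the note to avoid capitalization issues; default the pitch to 'low'.
            const note = slots.note && slots.note.value ? slots.note.value.toLowerCase() : undefined;
            let pitch = slots.pitch && slots.pitch.value ? slots.pitch.value.toLowerCase() : 'low';
            if (PITCHES.indexOf(pitch) === -1) {
                pitch = 'low';
            }

            const reprompt = 'I support A, B, D, G, low E and high E. Which note would you like?';
            let speechOutput;

            if (note && NOTES[note]) {
                // Embed the hosted clip with the SSML audio tag.
                speechOutput = '<audio src="' + BASE_URL + NOTES[note][pitch] + '"/> Would you like to hear it again?';
            } else {
                speechOutput = 'Sorry, the note you asked for is not supported yet. ' + reprompt;
            }

            this.emit(':ask', speechOutput, reprompt);
        }
    };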

20 thoughts on “Alexa Skills Development Tutorial: Alexa Skills from Scratch – Adding Our Backend Part 3 [Episode 7]”

  1. Wtf! I was following that nicely, then all of a sudden you did all the slot coding at high speed! I need to know how to assign values to my different slot types!

  2. It was a very helpful tutorial. Thank you so much. It would be better if you could add the link to the Lambda code in the description.

  3. Good tutorial. Would you have any suggestions on how one could code access to a specific chapter and verse of the Bible? Possibly by finding a Bible online and accessing it? Thanks for your time. Your tutorials are much easier to follow than many others I have watched.

  4. Good job with these videos.

    If your Lambda trigger function is assigned a role, could you not use that role to allow access to your objects stored in S3, rather than having to make the objects in S3 publicly readable?

    https://developer.amazon.com/docs/custom-skills/host-a-custom-skill-as-an-aws-lambda-function.html#define-new-role
    https://aws.amazon.com/blogs/security/how-to-restrict-amazon-s3-bucket-access-to-a-specific-iam-role/

  5. I used the skillinator template and the only thing I did differently to you was putting in the app ID of my skill. They had a line that seemed to override my emit (let filledSlots = delegateSlotCollection.call(this);), and when I commented that out, the output JSON did not show a reprompt in the test console, only my initial prompt (I made "reprompt" = my reprompt phrase and added it to my :ask emit). What could be wrong?

  6. I find myself stuck in the middle zone. I'm more advanced than using a builder app like Storyline, but am better at reading than writing JavaScript from scratch. If there's a book that has some basic "here's the JavaScript / Node.js code to make Alexa do XXX" examples, please point me in the right direction. Would gladly purchase. I don't have any other uses for JS, so having to learn the entire language just to use a few bits to run Alexa is overkill. I keep watching videos and doing web searches trying to patch some examples together but it's very spotty (and with the change to SDK v2, I'm not always sure what I'm looking at – old or new). Most JS tutorials want to start from the very beginning and cover things I'll have no need for.

    I also agree with the other comment that said "Thank You" for explaining how to use Slots. I totally get it now!

  7. Is there any way to read out the number of files in a folder using Alexa? Please help me out.

  8. Hello Andrea,

    Thank you a lot for your videos, they are very helpful and clear.

    I am having a problem doing exactly what I would like.
    I want to create an adventure game skill. I just need the user to open the skill, then there is some text and the user can choose where he wants to go. And I don't know how to simply do that in the index.js, because if the user says something else that exists in the story he'll go there, and that wouldn't make any sense, right?
    Could you please help me figure that out?

    My skill has been ready for a few weeks; I built it with the Interactive Adventure Game Skill template that uses Twine. But my skill had a bug because Alexa is not OK with the MP3 audio added by the HTML from Twine. Anyway, I was in contact with Amazon Developer support to try to figure it out, but I didn't get any help.
    That's why I am trying to build everything again.

    I would really appreciate help to be able to have a functional skill!
    Thank you so much!!!!!

  9. Sorry, but when I try to upload the file it gives me a Forbidden error and I can't make the file public… what can I do?

  10. When I try to play it, the page gives me this error: "This XML file does not appear to have any style information associated with it. The document tree is shown below."
    What can I do?

  11. Hi

    In regards to the custom intent handling from the earlier part of the video, speechOutput = "Ok, I will play a " + noteSlot + " with pitch " + pitchSlot;, I am trying to use at least two of those. However, this.emit(":ask", speechOutput, speechOutput); only works on the first speechOutput it is under in the Alexa voice test and not the second one. How can I get more of the speechOutputs to work? I am trying to do this with just voice text, not with my own audio files. Hope that makes sense.

    Thank you

  12. Might be a silly question: are the MP3s used for this tutorial available somewhere so I can put them into my own bucket? Or does everyone have to create their own from scratch and then upload them?

  13. Like your efforts to be quick and concise, keeping videos short and filled with the good stuff! I have two skills I'm working to make custom; they were flash briefings, and I want to make them available to all English-speaking countries. The blueprint flash briefing does not allow that; what's the quickest way to add the countries? Also, when working on the custom "flash briefing" I was using S3 as my storage for the txt file, which I would like to turn into SSML and reference as a file rather than putting the SSML in the code, since the text changes as it is a news headline. I was just uploading the txt files to S3 as needed and the flash briefing would pick them up; however, with a custom skill I can find no way to reference a file in S3 using an ARN, much like is done for the speech output. Any help is appreciated!

  14. When I try to play it and say "open Guitar Tuner", it says
    "There was a problem with the requested skill's response."

  15. Great tutorial, it helped me understand a lot of things. Even though the development environment is slightly different now, it remains a fantastic guide for learning. Thanks Andrea!
