Making an Alexa Skill for Kids

Find the Puppy is an Amazon Alexa skill written for children under age 13, and an entry into the Amazon Alexa Skills Challenge: Kids competition. The objective was to go beyond one-shot “random fact” skills and multiple-choice trivia games and create a more interactive experience that was both fun and a good exercise in spatial visualization.

Inspiration

Anyone who has younger children, nieces, nephews, or cousins can testify that playing hide and seek is always a hit. And what child doesn’t love a playful puppy? Find the Puppy combines these two child favorites into a fun, interactive, voice game that stimulates the imagination and helps develop a sense of direction and spatial reasoning.

Find the Puppy provides a hide and seek game within a (very simple) maze, with a few features that help it work better in a voice-only context. Like real hide and seek, the player is rewarded with a definitive victory when they succeed at their quest.

Game Play

Alexa asks the player if they’d like the puppy to hide in a small, medium, or large house; the answer determines some parameters of the maze described below. The game starts in the entryway. For this room and every subsequent room, Alexa says the name of the room and lists the colors and directions of the available doors. Players indicate which way they’d like to go by saying the color or direction of a door. If the puppy is hiding in the room they enter, the player wins and the game is over. Otherwise, the loop continues.

[Image: Find the Puppy room description]

Additional help is provided by the specific things Alexa says about the rooms and doors, and by the sound of the puppy barking, which gets louder as the player gets closer to him.
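One way to know how close the player is to the puppy is a breadth-first search over the doorways. This is only a sketch under my own assumptions: the function names, the room numbering, and the loudness labels are hypothetical, not taken from the skill.

```python
from collections import deque

def distances_from(start, doors):
    """Breadth-first search: doorways between `start` and every reachable room."""
    adjacent = {}
    for a, b in doors:
        adjacent.setdefault(a, []).append(b)
        adjacent.setdefault(b, []).append(a)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        room = queue.popleft()
        for nxt in adjacent.get(room, []):
            if nxt not in dist:
                dist[nxt] = dist[room] + 1
                queue.append(nxt)
    return dist

def bark_volume(distance):
    """Hypothetical mapping from distance to a loudness label."""
    return {1: "loud", 2: "medium"}.get(distance, "soft")

# Six rooms numbered 0..5; each pair is a doorway between two rooms.
doors = [(0, 1), (1, 2), (0, 3), (1, 4), (4, 5)]
dist = distances_from(5, doors)  # the puppy hides in room 5
```

Computing the distances once, when the maze is built, means each turn only needs a lookup to decide how loud the bark should be.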

At the end of the game, a card is added to the player’s Alexa App. The card includes a picture of the house for that specific game, showing the puppy’s location and the open and closed doors.

Behind the Scenes

The “house” is a simple maze, newly generated for every game. The skill starts with a grid and then generates a random maze by making doorways in the walls between the rooms.

House size   Grid size   Extra doors   Big rooms
Small        3×2         1             0
Medium       4×2         2             2
Large        4×3         3             3

A few extra doors are added next. This makes the layout more like a house and creates the possibility of loops and shortcuts, which is less frustrating than having to completely backtrack along an unsuccessful path.
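The two steps so far can be sketched as follows. The post doesn’t say which maze algorithm the skill uses, so this assumes a randomized depth-first search for the initial doorways; build_maze and its parameters are hypothetical names.

```python
import random

def build_maze(cols, rows, extra_doors, rng=None):
    """Return a set of doors; each door is a frozenset of two adjacent cells."""
    rng = rng or random.Random()

    def neighbors(cell):
        x, y = cell
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if 0 <= nx < cols and 0 <= ny < rows:
                yield (nx, ny)

    # Step 1: carve a spanning tree so every room is reachable.
    doors = set()
    visited = {(0, 0)}
    stack = [(0, 0)]
    while stack:
        cell = stack[-1]
        unvisited = [n for n in neighbors(cell) if n not in visited]
        if not unvisited:
            stack.pop()
            continue
        nxt = rng.choice(unvisited)
        doors.add(frozenset((cell, nxt)))
        visited.add(nxt)
        stack.append(nxt)

    # Step 2: open a few of the remaining walls to create loops and shortcuts.
    walls = {frozenset((c, n)) for c in visited for n in neighbors(c)} - doors
    walls = sorted(walls, key=sorted)  # stable order before shuffling
    rng.shuffle(walls)
    doors.update(walls[:extra_doors])
    return doors

# A large house: 4x3 grid with 3 extra doors.
doors = build_maze(4, 3, extra_doors=3, rng=random.Random(1))
```

A spanning tree over 12 rooms always has 11 doors, so the large house in this sketch ends up with 14 doorways in total.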

For medium and large houses, some rooms are enlarged by removing a wall. This makes the layout more interesting and slightly increases the challenge of mentally keeping track of where you are.

Finally, the puppy’s hiding place is selected, room names are assigned, and door colors are chosen. The algorithm is careful to ensure that no room has more than one door of any color, and no walls have more than one door, even in expanded rooms.
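The door color rule can be illustrated with a greedy assignment. This is a sketch, not the skill’s code: the palette, the function name, and the room numbering are assumptions, and it only covers rooms with at most four doors.

```python
import random

# Hypothetical palette; seven colors are enough for the greedy rule below.
COLORS = ["red", "green", "blue", "yellow", "white", "purple", "orange"]

def color_doors(doors, rng=None):
    """Give each door a color that is unique within both rooms it connects."""
    rng = rng or random.Random()
    used = {}      # room id -> colors already on that room's doors
    coloring = {}
    for a, b in doors:
        taken = used.setdefault(a, set()) | used.setdefault(b, set())
        # With at most 4 doors per room, the other doors of the two rooms
        # block at most 6 colors, so one of the 7 is always free.
        free = [c for c in COLORS if c not in taken]
        color = rng.choice(free)
        coloring[(a, b)] = color
        used[a].add(color)
        used[b].add(color)
    return coloring

# Doors of a small 3x2 house, rooms numbered 0..5 row by row.
doors = [(0, 1), (1, 2), (0, 3), (1, 4), (2, 5), (4, 5)]
coloring = color_doors(doors, random.Random(7))
```

Because a door has the same color on both sides, a player can say “the red door” from either room and mean the same doorway.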

[Image: large house maze]

Code Organization

Find the Puppy is written entirely in Python and is self-hosted on a private web server. Amazon pushes AWS for hosting and encourages developers to use the AWS Lambda service, which takes care of many of the fundamental deployment, communication, and validation aspects. Since I was reusing existing code, keeping everything on my server was easier. I’m not sure I would have had access to some of the libraries I wanted to use in the Lambda environment.

The main web interface uses Apache and mod_wsgi. The overall structure of the wsgi code is described in Home Automation with Amazon Echo Apps, Part 2, and the source code for that older version of the echo_handler.wsgi file is available in the Maker Musings Github Repository. Head over to the earlier posts about building Amazon Echo apps if you’re interested in the parts of the development process that aren’t discussed here.

The wsgi code handles each of the Alexa intent requests along with the LaunchRequest and SessionEndedRequest messages. Everything about constructing the proper speech response is handled here. A second file provides PuppyGame, PuppyGame.Config, and Room classes. These classes handle the maze generation and manage the game state; they don’t know anything about voice input or voice output.
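A rough sketch of that dispatch, assuming the request body has already been parsed from JSON. The request types and AMAZON.HelpIntent come from the Alexa request format; MoveIntent, the handler names, and the spoken text are hypothetical.

```python
def handle_move(request):
    return "You walk through the door."             # placeholder speech

def handle_help(request):
    return "Say the color or direction of a door."  # placeholder speech

def handle_unknown(request):
    return "Sorry, I didn't catch that."            # placeholder speech

INTENT_HANDLERS = {
    "MoveIntent": handle_move,         # hypothetical custom intent name
    "AMAZON.HelpIntent": handle_help,  # built-in Alexa intent
}

def dispatch(body):
    """Return the text to speak for one parsed Alexa request body."""
    request = body.get("request", {})
    kind = request.get("type")
    if kind == "LaunchRequest":
        return "Welcome to Find the Puppy!"
    if kind == "SessionEndedRequest":
        return ""  # the session is over; nothing is spoken
    if kind == "IntentRequest":
        name = request.get("intent", {}).get("name")
        return INTENT_HANDLERS.get(name, handle_unknown)(request)
    return handle_unknown(request)

text = dispatch({"request": {"type": "IntentRequest",
                             "intent": {"name": "MoveIntent"}}})
```

Keeping the handlers in a dictionary makes it easy to add an intent without touching the dispatch logic.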

Security Requirements

Since Find the Puppy is a published skill available to the public, Amazon requires several validation checks to keep everything secure. They provide libraries to do this in Java, and AWS Lambda handles the checks automatically. For my standalone Python code, I had to implement each of the checks.

The security verification is worth looking at in detail, but it involves too much code to include here. See Securing Self-Hosted Alexa Skills with Python for the full walkthrough.

State

To keep my web service stateless, without having to persist information on the server, I return the full game state in each Alexa response, which is passed back to my code in the next Alexa request. That way, each incoming Alexa request contains everything needed to handle the player’s input and return a response.

reply = {"version" : "1.0"}
reply['sessionAttributes'] = {'state':game.State(), 'img':game.Visualize()}

The PuppyGame.State() method returns a Python list that contains everything needed to reconstitute the full PuppyGame object when passed to the corresponding PuppyGame.SetState() method. The PuppyGame.Visualize() method returns a list of strings that represent a character version of the maze, which was incredibly useful for debugging and testing.

{
"sessionAttributes": {
  "state": [[2, 3, 1, 0], [1, 0], "S", 1, 11, [21513, 10, 3, [3, -1, 1, -1]], [19467, 9, 2, [0, 3, 2, -1]], [10, 5, 1, [-1, 0, 4, -1]], [13317, 6, 4, [5, -1, -1, 1]], [11270, 1, 3, [-1, 5, -1, 2]], [260, 15, 0, [-1, -1, -1, 4]]], 
  "img": [
    "+---+---+---+",
    "| 10|  9|  5|",
    "|   Y   B   |",
    "| 3 | 2 | 1 |",
    "+-R-+-G-+-W-+",
    "|  6|  1| 15|",
    "|   P   | P |",
    "| 4 | 3 | 0 |",
    "+---+---+---+"
]}}

When each request is received, a new PuppyGame object is created and the state is restored.

game = puppyfinder.PuppyGame()

game_state = session_attrs.get('state', None)
if game_state is not None:
    game.SetState(game_state)
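To see the whole round trip in one place, here is a small sketch. TinyGame is a hypothetical stand-in for PuppyGame with the same State/SetState pattern, and handle is an illustrative name; the only Alexa-specific detail is that the sessionAttributes object in a response comes back as session.attributes in the next request.

```python
import json

class TinyGame(object):
    """Hypothetical stand-in for PuppyGame."""
    def __init__(self):
        self.position = 0
        self.moves = 0

    def State(self):
        return [self.position, self.moves]

    def SetState(self, state):
        self.position, self.moves = state

def handle(request_body):
    """Restore the game from the request, take one turn, return the reply."""
    session_attrs = request_body.get("session", {}).get("attributes") or {}
    game = TinyGame()
    game_state = session_attrs.get("state", None)
    if game_state is not None:
        game.SetState(game_state)
    game.moves += 1  # stand-in for handling the player's input
    return {"version": "1.0", "sessionAttributes": {"state": game.State()}}

first = handle({"session": {"new": True}})
# Simulate the trip through Alexa: serialize, then echo the attributes back.
wire = json.loads(json.dumps(first))
second = handle({"session": {"attributes": wire["sessionAttributes"]}})
```

The server keeps nothing between the two calls; the move counter survives only because it travels inside the response and the following request.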

Optimizing Properties

Depending on the size of the maze, each game has between 6 and 12 rooms. Each Room object has 15 boolean properties, such as whether the room has been visited, whether there’s a door to the east, and so on. If these were represented as separate member fields in each object, the JSON-encoded version would get a bit large, somewhere between 500 and 1200 characters just for these boolean properties alone.

To generate the card pictures for the Alexa App, it was useful to store the state in a MySQL varchar(255) field, so I wanted a more compact representation. I chose a bit field approach.

This could have been performed in the marshalling/unmarshalling code in PuppyGame.State() and PuppyGame.SetState(). However, I chose to implement the bit field in the object itself.

class Room(object):
    def __init__(self):
        self._flags = 0
        self.name = 0
        self.distance = 100000
        self.door_colors = [-1, -1, -1, -1]

    def State(self):
        return [self._flags, self.name, self.distance, self.door_colors]

    def SetState(self, state):
        self._flags = state[0]
        self.name = state[1]
        self.distance = state[2]
        self.door_colors = state[3]

    def getter(self, f):
        # True if the bit(s) in mask f are set.
        return (self._flags & f) != 0

    def setter(self, f, v):
        # Set or clear the bit(s) in mask f according to v.
        if v:
            self._flags |= f
        else:
            self._flags &= ~f

    @property
    def door_e(self):
        return self.getter(1)
    @door_e.setter
    def door_e(self, v):
        self.setter(1, v)

    # ... plus 14 more properties
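Writing fifteen nearly identical property pairs is repetitive. One possible alternative, which is not what the code above does, is a small factory that builds each property from its bit mask. Only door_e and its mask of 1 come from the original class; the other names and bit positions here are made up for illustration.

```python
def flag_property(mask):
    """Build a boolean property backed by one bit of self._flags."""
    def getter(self):
        return (self._flags & mask) != 0

    def setter(self, value):
        if value:
            self._flags |= mask
        else:
            self._flags &= ~mask

    return property(getter, setter)

class Room(object):
    door_e = flag_property(1)
    door_n = flag_property(2)   # hypothetical name and bit position
    visited = flag_property(4)  # hypothetical name and bit position

    def __init__(self):
        self._flags = 0

room = Room()
room.door_e = True
room.visited = True
room.door_e = False
```

After these three assignments only the visited bit remains set, so the whole set of flags serializes as the single integer 4.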

Making Conversation

When Alexa says the exact same thing over and over, time after time, the repetition leads to boredom, and players are less likely to keep using the skill. Even a little variety in what Alexa says helps make the interaction feel more natural and conversational. Just about everything Alexa says in Find the Puppy is chosen from a list of dozens, sometimes hundreds, of ways to say the same thing.

For example, consider this sentence telling the player that the puppy is hiding very close. (The ‘{}’ gets replaced by the <audio> tag for the barking sound.)

You hear the puppy barking very nearby. {}

Instead of only that specific sentence, the code randomly selects one of several alternatives:

BARK_CLOSE = [
  "You can hear the puppy barking in the next room. {}",
  "You can hear the puppy barking just one room away. {}",
  "You can hear the puppy barking nearby. {}",
  "You can hear the puppy barking very close. {}",
  "You can hear the puppy barking very nearby. {}",
  "You can hear the puppy in his hiding place in the next room. {}",
  "You can hear the puppy in his hiding place just one room away. {}",
  "You can hear the puppy in his hiding place nearby. {}",
  "You can hear the puppy in his hiding place very close. {}",
  "You can hear the puppy in his hiding place very nearby. {}",
  "You can hear the puppy in the next room. {}",
  "You can hear the puppy just one room away. {}",
  "You can hear the puppy nearby. {}",
  "You can hear the puppy very close. {}",
  ...
]
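Picking a phrase and filling in the placeholder takes only a couple of lines. In this sketch the list is shortened, and the audio URL and the bark_close function name are assumptions.

```python
import random

BARK_CLOSE = [
    "You can hear the puppy barking in the next room. {}",
    "You can hear the puppy barking nearby. {}",
    "You can hear the puppy very close. {}",
]

# Hypothetical URL for the barking sound.
BARK_AUDIO = '<audio src="https://example.com/ftp/audio/bark.mp3"/>'

def bark_close(rng=random):
    """Choose one phrasing at random and insert the audio tag."""
    return rng.choice(BARK_CLOSE).format(BARK_AUDIO)

speech = "<speak>" + bark_close() + "</speak>"
```

Passing the random source as a parameter makes it possible to seed it in tests while the live skill uses the module default.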

Using the Amazon Utterance Expander, it’s very easy to generate this list and paste it into the source code. The following input string generates 137 variations for the list:

({}/You (/can) hear (the/your) puppy (/barking/waiting for you/in his hiding place) (very close/(/very) nearby/in the next room/just one room away). {}/{} (The/Your) (puppy sounds (/like he is)/puppy's hiding place (sounds/must be)/puppy must be hiding/is (/hiding)) very (near/nearby/close/close by).)

Easter Eggs

Even with the speech alternatives and the dynamically-generated puzzle each time, it’s hard to keep the experience fresh and interesting after several games. One proven technique is to sprinkle the interaction with rare surprises.

Find the Puppy does this with several additional comments about each room that are unexpected and a little funny. The code includes them in the response randomly with a low probability. What’s more, they’re only used when entering a room for the first time. So within a single game, no matter how many times you go in and out of the room trying to hear it again, you won’t succeed. But maybe you’ll hear it again, or something different, three, five, or ten games later.
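The rule boils down to two checks: is this the first visit, and did a low-probability roll succeed? A sketch, with a made-up probability, made-up remarks, and hypothetical names:

```python
import random

EASTER_EGG_CHANCE = 0.05  # hypothetical probability

# Hypothetical remarks, keyed by room name.
EGGS = {
    "kitchen": ["Something smells like dog biscuits in here."],
    "garage": ["There are muddy paw prints on the car."],
}

def room_comment(room_name, first_visit, eggs=EGGS, rng=random):
    """Return a rare extra remark, or an empty string most of the time."""
    if not first_visit:
        return ""  # never repeat within a game by re-entering the room
    candidates = eggs.get(room_name, [])
    if candidates and rng.random() < EASTER_EGG_CHANCE:
        return rng.choice(candidates)
    return ""

comment = room_comment("kitchen", first_visit=False)
```

Since a room’s visited flag is part of the game state, the first-visit check needs no extra bookkeeping.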

Alexa App Image

When the skill receives an intent request (player action) that finds the puppy and ends the game, the response includes a card to be added to the user’s Alexa App. The card includes a picture of the solved maze.

{
  "version": "1.0",
  "response": {
    ...
    "card": {
      "text": "You can say, \"Ask Find the Puppy to play in a medium house\" for a more challenging game, or in a large house if you're a super seeker!",
      "image": {
        "smallImageUrl": "https://example.com/ftp/img/s/835",
        "largeImageUrl": "https://example.com/ftp/img/l/835"
      },
      "type": "Standard",
      "title": "You Found the Puppy!"
    }
  }
}

Rather than generate and store the images as actual files, the /ftp/img endpoint is handled by a different function of the same wsgi code that generates the image on the fly. The code parses the last two elements of the path to get the size of the image to return (‘s’ or ‘l’) and the ID of the successful game. It then looks up the game state for the specified game ID and generates a PNG image. This is part of the Python code and uses the Pillow image library.
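The path handling might look something like this sketch. The function name and the validation rules are assumptions, as is the mapping of ‘s’ and ‘l’ to the two card image sizes Amazon supports; the Pillow drawing step is left out.

```python
# Assumed mapping from the size letter to pixel dimensions.
IMAGE_SIZES = {"s": (720, 480), "l": (1200, 800)}

def parse_image_path(path):
    """Split '/ftp/img/<size>/<game id>' into (size, game_id), or None if invalid."""
    parts = path.rstrip("/").split("/")
    if len(parts) < 2:
        return None
    size, game_id = parts[-2], parts[-1]
    if size not in IMAGE_SIZES or not game_id.isdecimal():
        return None
    return size, int(game_id)

parsed = parse_image_path("/ftp/img/s/835")
```

Rejecting anything that isn’t a known size letter followed by a numeric ID keeps malformed URLs away from the database lookup.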

[Image: Find the Puppy card image]

Amazon appears to retrieve the image, cache it on their side, and generate data: URLs for the Alexa App. So, no need to worry about storage space for card images! Interestingly, despite Amazon’s advice to supply both sizes of images they support, 720×480 and 1200×800, they appear to scale the image down to 180×120 in the browser and leave a bunch of unused white space anyway.

Contest Entry

Find the Puppy was entered into the Amazon Alexa Skills Challenge: Kids. The project write-up on Devpost is a briefer version of this blog post. You can view the required 5-minute video if you’d like to see a demo of the game.

Currently, Devpost is showing 458 submissions, although many of them don’t appear to contain a video demo and there’s no way to know how many won’t be certified by Amazon before the deadline on January 17, 2018.

As the two-stage judging process plays out, watch for any further updates here. In the meantime, if you have an Echo, please give Find the Puppy a try. I’d appreciate your review and rating if you feel so inclined.

 

 
