There has never been a more exciting time to be a part of a team designing and building digital products! As an example, here are just a few of the projects that I was involved with last year:
● A Slack bot that serves up cloud application and network performance alerts and graphs, and provides proactive push recommendations.
● A dedicated NLP (natural-language processing) assistant for a major hospital chain, able to find doctors and locations, set up appointments and more, right from the home page.
● NLP search for analytics applications, which maps a natural-language query into an Elastic faceted search UI as keywords and selected facet values for further refinement and exploration.
● An Alexa Skill that delivers predictive corrosion notifications and daily asset performance analytics reports for industrial plants.
And this just scratches the surface: the range of conversational bot applications is wide, far-reaching and exciting. Today, consumer sophistication is at an all-time high, and people in the consumer and enterprise space demand polished, context-aware, personalized cross-platform experiences fully integrated into their work and leisure activities. Tech giants are rising to this challenge, competing with one another by providing extraordinarily sophisticated frameworks and cloud-based services for almost every component of your app, including (most importantly for us) robust language parsing and speech recognition frameworks. Examples include: Microsoft Bot Framework with LUIS (Language Understanding Intelligent Service) and Cortana, Siri from Apple, Amazon Lex, Google Assistant, IBM Watson, and many more.
Most frameworks offer similar basic functionality and structure for building bots, as well as free developer licenses for conversational UI and voice recognition services.
Microsoft Bot Framework strikes the right balance between power and learnability. At the time of writing, it’s free for developers, has a decent admin UI, is cloud-based right out of the box (which has the advantage of introducing you to cloud-based app architecture), and can be coded using many mainstream languages, including Node.js, which we’ll use in our examples. Most importantly, Microsoft Bot Framework boasts heaps of step-by-step tutorials and examples, and truly impressive documentation.
Amazon Alexa is newer and, in many ways, easier to set up and configure than LUIS, and offers many useful shortcuts. However, there is one crucial difference: an Alexa bot is best set up as an Alexa Skill, whereas a LUIS bot is a standalone chatbot. This means that to invoke our bot – let’s call it GUPPI (more on naming in a moment) – and pass it a command (called an invocation), you have to invoke Alexa first. Compare the two invocations below:
Standalone bot: “GUPPI, play Rolling Stones.”
Alexa Skill bot: “Alexa, ask GUPPI to play Rolling Stones.”
While Alexa Skill invocation works fine for simple commands, it creates awkward verbal constructs for more complex queries. Fortunately, bot developers have some flexibility in invoking their Alexa Skill. In principle, all of the utterances below would work equally well to launch GUPPI and successfully pass the command along:
“Alexa, ask GUPPI to play Rolling Stones.”
“Alexa, tell GUPPI to play Rolling Stones.”
“Alexa, talk to GUPPI and tell it to play Rolling Stones.”
“Alexa, play Rolling Stones using GUPPI.”
In practice, there are some important limitations, and trying to override Alexa’s own commands like “Play,” “Time,” “Weather,” and so on will often result in a buggy and inconsistent experience.
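To make the structure of these phrasings concrete, here is a minimal Node.js sketch that splits an utterance into the skill name and the command passed along to it. It is an illustration only: Alexa does this matching on its own servers, and the function name parseInvocation and the regular expressions are assumptions made for this example, not part of any Alexa API.

```javascript
// Hypothetical helper (not an Alexa API): splits an Alexa-style utterance
// into the skill's invocation name and the command handed to that skill.
// The patterns mirror the four phrasings listed above.
const LEADING_PATTERNS = [
  /^alexa, (?:ask|tell) (\w+) to (.+?)\.?$/i,
  /^alexa, talk to (\w+) and tell it to (.+?)\.?$/i,
];

// "Alexa, play Rolling Stones using GUPPI." puts the command first.
const TRAILING_PATTERN = /^alexa, (.+?) using (\w+)\.?$/i;

function parseInvocation(utterance) {
  const text = utterance.trim();
  for (const pattern of LEADING_PATTERNS) {
    const match = text.match(pattern);
    if (match) return { skill: match[1], command: match[2] };
  }
  const trailing = text.match(TRAILING_PATTERN);
  if (trailing) return { skill: trailing[2], command: trailing[1] };
  return null; // not a skill invocation this sketch recognizes
}
```

All four phrasings resolve to the same result, { skill: 'GUPPI', command: 'play Rolling Stones' }, while the standalone form “GUPPI, play Rolling Stones.” returns null because it never mentions Alexa.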
Another important consideration is the length of the command, typically restricted to a single English sentence. Having to invoke Alexa every time as part of your request places additional limits on the length of your command. Supplementary Amazon Alexa APIs (such as Smart Home, List Skill, and Flash Briefing) are also available, offering some flexibility for specialized applications. However, at the time of writing, there is no easy way to invoke an Alexa Skill outside Alexa. The key takeaway is that the Alexa Skills framework is set up explicitly to invoke your bot inside Alexa as a Skill, whereas the LUIS framework allows you to have a standalone bot (with, perhaps, a little more work).
Google Assistant is similar to the Alexa model, with a few significant differences. The Google Assistant conversation bot (called an app) can be launched from within Google Assistant using an invocation phrase, just like an Alexa Skill. Unlike Alexa, however, the user cannot pass parameters to the Google Assistant app during an invocation. Thus, during the initial user response phase (called fulfillment), the user is typically greeted with a Default Welcome Intent message variant, which launches a separate conversation with the bot. Once the user invokes a custom app, Alexa prefers to take care of everything in one session, whereas the Google Assistant allows for a more robust session context which can store variables between invocations (accessible via a lastSeen attribute).
Lastly, it pays to keep in mind that Alexa uses intent-based bot architecture (similar to Microsoft LUIS), so the developer has to add a dialog construct later in the development process with a chunk of custom code (an AWS Lambda function, for example). In contrast, Google Assistant assumes everything happens by default within a dialog, using the Dialogflow wrapper, which also handles intents and entities. Thus, Google Assistant gives developers access to powerful visual dialog builder features (shown in Figure 1), allowing developers to set up simple applications almost entirely using the UI, with minimal edits of the underlying code. If a developer requires additional configurability, the Google Assistant framework offers access to the raw JSON conversation webhook format, which communicates directly with the Assistant.
Figure 1: Google Assistant provides powerful visual builder features for creating dialogs.
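As a rough illustration of the intent-based approach described for Alexa, the sketch below shows the kind of custom code a developer might put in an AWS Lambda function: the request names an intent, its slots carry the entities, and the code decides what to say. It uses only the plain JSON request and response shapes rather than an SDK. The intent name PlayMusicIntent, the artist slot, and the spoken phrases are hypothetical, and in a real deployment the handler function would be exported as the Lambda entry point.

```javascript
// Each intent name maps to a function that reads its slots (entities) and
// returns what to say. PlayMusicIntent and the "artist" slot are
// hypothetical names chosen for the GUPPI example.
const intentHandlers = {
  PlayMusicIntent: (slots) => {
    const artist = slots.artist && slots.artist.value;
    return artist
      ? { text: `Playing ${artist}.`, endSession: true }
      : { text: 'Which artist would you like to hear?', endSession: false };
  },
};

// Wraps plain text in the JSON envelope an Alexa Skill returns.
function buildResponse({ text, endSession }) {
  return {
    version: '1.0',
    response: {
      outputSpeech: { type: 'PlainText', text },
      shouldEndSession: endSession,
    },
  };
}

// Entry point: routes the incoming request by its type and intent name.
async function handler(event) {
  const request = event.request;
  if (request.type === 'LaunchRequest') {
    return buildResponse({ text: 'GUPPI here. What would you like to do?', endSession: false });
  }
  if (request.type === 'IntentRequest') {
    const handle = intentHandlers[request.intent.name];
    if (handle) return buildResponse(handle(request.intent.slots || {}));
  }
  return buildResponse({ text: 'Sorry, I did not catch that.', endSession: true });
}
```

Note that nothing here remembers where the user is in a longer conversation. That is the dialog construct the developer has to add separately in an intent-based architecture.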
In contrast to Alexa and Google Assistant (where the user invokes the custom bot from within the primary digital assistant), IBM Watson is a standalone bot framework, which makes it useful for autonomous tasks. IBM Watson is in many ways similar to Microsoft Bot Framework with LUIS. One nice improvement over LUIS is that while both frameworks use intents and entities as a base, Watson also has a powerful visual builder for the dialog construct, similar to that of Google Assistant, as shown in Figure 2.
Figure 2: IBM Watson’s version of the visual dialog builder.
Microsoft LUIS only stores intents and entities in the bot’s definition, while the additional Microsoft Bot Framework application code provides the waterfall dialog interaction (more on this later). In contrast, IBM Watson offers the option to have the dialogs built directly as part of the bot’s JSON definition, providing a comprehensive high-level visual diagram of your bot’s waterfall dialog logic, shown in Figure 3.
Figure 3: IBM Watson’s visual logic diagram for a demo bot.
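For readers new to the term, a waterfall dialog can be pictured as an ordered list of steps, where each step consumes the user’s latest reply and produces the next prompt. The Node.js sketch below illustrates the idea in plain JavaScript. It is not the Microsoft Bot Framework or IBM Watson API, and the createDialog helper and the appointment-booking steps are invented for this example.

```javascript
// A waterfall is an ordered list of steps. Each step receives the user's
// latest reply plus the state gathered so far, and returns the next prompt.
// The appointment flow is a made-up example.
const bookAppointment = [
  () => 'Which doctor would you like to see?',
  (reply, state) => {
    state.doctor = reply;
    return 'What day works for you?';
  },
  (reply, state) => {
    state.day = reply;
    return `Booked ${state.doctor} on ${state.day}.`;
  },
];

// Hypothetical helper: walks the steps one user message at a time.
function createDialog(steps) {
  const state = {};
  let index = 0;
  return {
    // Feed the next user message in; get the bot's next prompt back.
    next(reply) {
      if (index >= steps.length) return null; // dialog already finished
      const step = steps[index];
      index += 1;
      return step(reply, state);
    },
    isDone: () => index >= steps.length,
    state,
  };
}
```

Calling next('book an appointment'), then next('Dr. Lee'), then next('Friday') on a dialog created from bookAppointment walks through the three prompts in order and ends with “Booked Dr. Lee on Friday.” With LUIS, logic of this kind lives in your application code, whereas Watson lets you draw it in the visual builder.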
IBM Watson also provides integrated testing of the conversational experience using a built-in testing app, offering an excellent all-in-one framework to design, train and test your bot. The IBM Watson bot framework will appeal to many designers, as it allows you to build a reasonably sophisticated, fully functional standalone bot without writing any code at all.
Do you need a standalone bot? It depends on your particular application. Write out the invocations you plan to use and see if they would sound exceptionally long or awkward if the user has to invoke Alexa first. If so, use a standalone framework such as LUIS or IBM Watson; otherwise, you’re in luck, and you can use the Alexa Skill framework or Google Assistant as a quick shortcut. I can personally attest that setting up an Alexa Skill on an Echo device creates particularly impressive workplace demos you can throw together in just a few hours.
One last point: if your bot has a notifications or alerts feature, Amazon recommends invoking “Notifications” at the top level as aggregated Alexa notifications, not from within the Skill itself. The good news is that users will get an accurate visual indication that they have notifications (in the form of a flashing orange ring on their Echo device). The bad news is that you have to conform to Alexa’s notification framework. If deeply customized interactive notifications are essential to your experience, you may be better off with a standalone bot.
The Google Assistant bot framework does not yet offer generally available app-level notifications (they are currently in developer preview), though they should be available soon. Google Assistant notifications appear to offer a hybrid middle ground, promising more comprehensive customization options at the cost of some additional complexity compared to Amazon Alexa.
Many other bot frameworks exist, including Facebook Bot Engine, Dialogflow, Aspect CXP, and Aspect NLU, and many more are in active development. Detailed competitive analysis is outside the purview of this chapter – for a more detailed comparison of various frameworks, see Olga Davydova’s well-researched article, “25 Chatbot Platforms: A Comparative Table,” published in Chatbots Journal.
Despite some differences, at their core most bot frameworks are fairly similar. Design patterns and UX principles should translate well between various frameworks, even as technical specifics experience rapid growth and evolution. Since a standalone bot offers a more complex and interesting use case, I used LUIS as the primary example in my chapter on Designing Chatbots and Virtual Assistants in the Smashing Book #6, using Alexa as a counterpoint of a “skill bot” operating from within another digital assistant.
Regardless of the framework you choose, the first step in developing your bot is to break up generic bot commands (or invocations) into the basic building blocks of conversational UI: intents, entities, and dialogs. Ready to design our chatbot? Smashing Book #6 is now available here: https://www.smashingmagazine.com/2018/06/meet-smashing-book-6/