
Voice controlled PHP apps with

In this tutorial we’ll be looking into an API that lets us build apps which understand natural language, much like Siri. It accepts either text or speech as input, parses it, and returns a JSON string that the code we write can interpret.

All the files we’ll use in this tutorial are available in this GitHub repository.



Before we move on to the practical part, it’s important that we first understand the following concepts:

  • agents – agents are applications. We create an agent as a means of grouping individual entities and intents.

  • entities – entities are custom concepts that we want to incorporate into our application. We give a concept meaning by adding examples of it. A sample entity would be ‘currency’. We define it by adding synonyms such as ‘USD’, ‘US Dollar’, or just ‘Dollars’, and each synonym is assigned to a reference value that can be used in the code. In other words, an entity is just a list of words which can be used to refer to that concept. The service already provides some basic entities, such as @sys.number, which refers to any number, and a similar built-in entity which refers to any email address. We can use the built-in entities by specifying @sys as the prefix.

  • intents – intents allow us to define which actions the program will execute depending on what a user says. A sample intent would be ‘convert currency’. We then list all the possible phrases or sentences the user might say when they want to convert currency. For example, a user could say ‘how much is @sys.number:number @currency:fromCurrency in @currency:toCurrency?’. In this example, we’ve used 2 entities: @sys.number and @currency. The colon after the entity lets us define an alias for that entity, which we can then use in our code to get the entity’s value. Because @currency appears twice, each occurrence needs its own alias so that we can treat them separately in our code. To read the above intent as a human would, all we have to do is substitute the entities with actual values. So a user might say ‘How much is 900 US Dollars in Japanese Yen?’ and the service would map ‘900’ as the value for @sys.number, ‘US Dollars’ for the fromCurrency @currency and ‘Japanese Yen’ for the toCurrency @currency.

  • contexts – contexts represent the current context of a user expression. For example, a user might say ‘How much is 55 US Dollars in Japanese Yen?’ and then follow with ‘What about in Philippine Peso?’. In this case, the service uses what was previously spoken by the user, ‘How much is 55 US Dollars’, as the context for the second expression.

  • aliases – aliases provide a way of referring to a specific entity in your code, as we saw earlier in the explanation for the intents.

  • domains – domains are pre-defined knowledge packages. We can think of them as a collection of built-in entities and intents. In other words, they are tricks that the service can perform with little to no setup or coding required. For example, a user can say, ‘Find videos of Pikachu on YouTube.’ and the service would already know how to parse that and return ‘Pikachu’ as the search term and ‘YouTube’ as the service. From there, we can just use the data returned to navigate to YouTube and search for ‘Pikachu’. In JavaScript, it’s only a matter of setting the location.href to point to YouTube’s search results page:

    window.location.href = "";
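
To make the entity, intent, and alias concepts above more concrete, here is a minimal sketch in plain JavaScript. It does not call the real API: the `currencyEntity` table, the `parameters` object, and the `resolveEntity` and `describeConversion` helpers are all hypothetical names invented for illustration, and the actual response format may differ. It only shows the idea that synonyms map to a reference value and that each alias gives our code a separate key to read.

```javascript
// Hypothetical entity definition: each reference value lists its synonyms.
// This mirrors the 'currency' entity described above; it is not the API's format.
const currencyEntity = {
  USD: ["USD", "US Dollar", "US Dollars", "Dollars"],
  JPY: ["JPY", "Japanese Yen", "Yen"],
};

// Resolve a spoken synonym to its reference value (case-insensitive).
// Returns null when the phrase is not a known synonym.
function resolveEntity(entity, spoken) {
  const needle = spoken.trim().toLowerCase();
  for (const [reference, synonyms] of Object.entries(entity)) {
    if (synonyms.some((synonym) => synonym.toLowerCase() === needle)) {
      return reference;
    }
  }
  return null;
}

// Assumed shape of a parsed result for
// 'How much is 900 US Dollars in Japanese Yen?', keyed by the aliases
// from the intent template (number, fromCurrency, toCurrency).
const parameters = {
  number: "900",
  fromCurrency: "US Dollars",
  toCurrency: "Japanese Yen",
};

// Read each alias separately, even though two of them share one entity.
function describeConversion(params) {
  return {
    amount: Number(params.number),
    from: resolveEntity(currencyEntity, params.fromCurrency),
    to: resolveEntity(currencyEntity, params.toCurrency),
  };
}

const conversion = describeConversion(parameters);
console.log(conversion);
```

Because `fromCurrency` and `toCurrency` are distinct aliases, the code can tell the source currency from the target currency even though both come from the same `@currency` entity.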
