How to Annoy Your Coworkers for $42 using ClojureScript and Rust

12 Sep 2014 hacks

Control Spotify music, Hue lights, and everything else at your office with voice commands and HipChat messages

In our last few Friday Hacks, we’ve been on a mission to create a real-life version of our interactive home automation demo for our office. Project Cambridge has been a great way for us to play with some new technologies and dogfood our own product at the same time. Today we’re releasing the two major components of Project Cambridge: Witd and Cherry.

Witd: Wit SDK for Raspberry Pi

One of Wit’s coolest features is that the API accepts both text and voice commands. We try hard to make it easy for developers to record and send audio to the Wit API with tools like our iOS and Android SDKs. So when we turned our attention to the Raspberry Pi, we knew we had to give it a super simple way to record audio. Witd is our solution for recording audio and streaming it to the Wit API.
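
To make that concrete, here’s a minimal sketch of the round trip (illustrative only, not the actual witd code: it uses the reqwest crate for brevity, witd streams the audio as it records rather than reading a file, and WIT_TOKEN stands in for your instance’s API token):

use std::fs;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Pretend we already recorded an utterance to a WAV file.
    let audio = fs::read("command.wav")?;
    // POST the audio to the Wit speech endpoint.
    let response = reqwest::blocking::Client::new()
        .post("https://api.wit.ai/speech")
        .header("Authorization", "Bearer WIT_TOKEN")
        .header("Content-Type", "audio/wav")
        .body(audio)
        .send()?
        .text()?;
    // The response is JSON describing the intent and entities.
    println!("{}", response);
    Ok(())
}
A rough sketch of witd’s job: ship recorded audio to the Wit API and get structured meaning back.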

Raspberry Pi hanging out during some new plugin development

The initial goal of Witd was to have a small, stand-alone, open-source daemon that would handle both the voice recording and the queries to Wit.ai. We originally implemented one in Python, because the language is easy, widely used, flexible and compatible with many different platforms. However, as we moved away from our powerful Intel Core i7 laptops to the world of embedded devices, we realized that spawning a Python interpreter was less than ideal for performance, and that a lightweight, self-contained executable would be much nicer. C or C++ seemed like the most reasonable choice, but it was also a terribly unsexy one. Instead, we decided to give a chance to the promising new star in the world of systems programming languages: Rust.

Since the language is designed to help you write safe code, the compiler is really unforgiving. The initial steps can be a bit difficult, but it makes for a deep feeling of satisfaction when your program finally compiles. At this point you have:

  • a high confidence that your code is correct
  • a super-fast and lightweight binary

Rust also makes it very easy for you to use existing C libraries, which is especially important since the ecosystem is very young. For example, we used the SoX library for recording audio. There is a bit of boilerplate involved, because you have to redeclare all the prototypes of the C functions, structs and enums you use, but once you get used to moving back and forth between Rust types and C types, it becomes a fairly easy process. The main caveat is that your Rust code becomes less “pure”: you have to enclose your C function calls in unsafe {} blocks, and you may need to do some memory management and casting yourself. Rust does a pretty good job at making this as non-ugly as possible, though.
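
For the curious, here’s roughly what those bindings look like (a minimal sketch, not our actual witd code; sox_init and sox_quit are real libsox entry points, but a real recorder has to declare many more):

#[link(name = "sox")]
extern "C" {
    // Redeclare each C prototype you need, by hand.
    fn sox_init() -> std::os::raw::c_int;
    fn sox_quit() -> std::os::raw::c_int;
}

fn main() {
    // Calls across the FFI boundary must live in unsafe blocks:
    // the compiler cannot check what the C side does.
    unsafe {
        assert_eq!(sox_init(), 0);
        // ... open the audio device and record here ...
        assert_eq!(sox_quit(), 0);
    }
}
Declaring and calling C functions from Rust.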

Overall, we’re really satisfied with our Rust experience, and we will probably use it for other projects. The language is moving very fast, but it’s shaping up to be an excellent way to do some low-level programming without ending up with an unmaintainable mess after a few thousand lines of code. And you can even bring on board some of your beloved functional programming constructs and type-safe macro wizardry.

Cherry: An Extensible Hub for Office and Home Automation

Cherry is an easily extensible framework for home automation. It runs on Node.js and is written in ClojureScript. Cherry acts as the hub of your smart home, allowing your connected components to communicate with each other. Cherry’s power comes from its plugin system. Plugins are building blocks that you assemble to create your own smart system.

Hooking up a system on Cherry is easy. As an example, let’s say you have a Philips Hue light and you want to turn it on by pressing a button. You just need a few lines of JavaScript:

module.exports = function (cherry) {
    console.log("lightswitch ready to rock!");
    cherry.handle({
        // Called for every "pin" message (e.g. from the cherry-gpio plugin).
        pin: function (message) {
            var plugins = cherry.plugins();
            if (message.state === "high") {
                // Button pressed: lights on.
                plugins.hue({on: true});
            } else if (message.state === "low") {
                // Button released: lights off.
                plugins.hue({on: false});
            }
        }
    });
};
A simple Cherry plugin to connect Hue lights to a GPIO.

Cherry also acts as a repository for home automation and Internet of Things plugins. We believe in an open, crowd-sourced solution for home automation, and this is our go at it. We already created a handful of plugins to show how easy it is to connect devices together (cherry-wit, cherry-spotify, cherry-hue, cherry-gpio). As new devices come out, we’ll be releasing more plugins (Nest, Myo, Google Glass, Pebble, …). We’ll also feature plugins from the community.

Project Cambridge - The Automated Office

Our office automation system is built on a cheap Raspberry Pi running Witd and Cherry with its suite of plugins. Through those plugins, the Pi is connected to various services, and it also has a microphone for voice recording and a pair of speakers. Using the microphone, you can ask the system to search for some new music, change the color of the lights, or give you some details about itself. We’ve also integrated HipChat, so we can control the music in our office through text as well as voice commands. Check out the video below to see Project Cambridge fully in action.

Both Cherry and Witd are open source and getting started is easy!

Check out the Cherry README to get started with Cherry. We’ve even included a small script to get everything up and running on a Raspberry Pi.

Witd is written in Rust and we recommend you use Cargo to build it.

A few plugins to interact with your favorite components:

  • cherry-spotify uses Spop to play music from Spotify
  • cherry-wit adds natural language understanding to your system
  • cherry-hue turns your lamps on and off and changes their color
  • cherry-gpio makes it easy to interact with your electronic circuit

Team Wit


Introducing Silence Detection in the iOS SDK

21 Aug 2014 feature

At Wit.ai we work hard to make voice recognition interfaces simple to implement for any developer. Wit.ai’s iOS SDK is the easiest way to integrate our powerful voice recognition and artificial intelligence API into any iOS application. The SDK allows you to capture voice commands and returns structured, actionable information you can use to drive your applications.

Today we’re introducing Voice Activity Detection in our iOS SDK. Starting with version 1.3, the SDK includes silence detection: it detects when a user has stopped talking and automatically proceeds to the next action. If you’re curious about the algorithm we use to accomplish this, take a look at this paper.
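
Language aside, the core idea of silence detection is easy to sketch. Here’s a toy energy-threshold version in Rust (for illustration only; the SDK uses the more robust algorithm from the paper above, and the threshold and frame counts here are made up):

// Toy endpoint detection over 16-bit PCM frames.
fn rms(frame: &[i16]) -> f64 {
    let sum_sq: f64 = frame.iter().map(|&s| (s as f64) * (s as f64)).sum();
    (sum_sq / frame.len() as f64).sqrt()
}

// True when the last `needed` frames all fall below `threshold`.
fn stopped_talking(frames: &[Vec<i16>], threshold: f64, needed: usize) -> bool {
    frames.len() >= needed
        && frames.iter().rev().take(needed).all(|f| rms(f) < threshold)
}

fn main() {
    // One second of pure silence in 10 ms frames (16 kHz, 160 samples each).
    let frames = vec![vec![0i16; 160]; 100];
    assert!(stopped_talking(&frames, 500.0, 30));
    println!("speaker went quiet, moving on to the next action");
}
A toy endpoint detector: enough consecutive low-energy frames means the user stopped talking.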

This is the first exciting step toward a full “hands-free” voice-activated interface. We are already hard at work implementing the ability to detect when a user begins speaking, and we hope to release full voice activity detection as an experimental feature in the coming weeks.

Team Wit


Friday Hacks - The Pi that could sing

01 Aug 2014 hacks

At Wit, we strive for great products. The end goal is to provide the best experience to our users, cranking out features and making sure everything stays in check.

However, we’re also Wit users and love to tinker with (i.e. break) the latest technologies to make our lives easier. You may remember our interactive house demo (video and demo page). Recently, we decided we needed to implement it in real life and started to work on our smart office project. It was great in many ways: we could spend some quality time as a team, crafting something new, shiny and not production-critical. It also allowed us to dogfood our product and fix UX glitches.

Finally, we could take the first step towards an open-source home automation framework (based on Node.js), as well as a Wit SDK for Raspberry Pi (any Linux machine, really) that handles the recording of voice commands and communicates with Wit. More on these in a later post.

We wanted to interact with the system like we would with a human, using simple, improvised voice commands. This way, we could control the lights, music, etc. in a natural way, without having to memorize a particular grammar or set of commands.

The tech we used:

  • Wit, obviously, to turn a human sentence into a machine action
  • Raspberry Pi, as a hub controlling the other devices and recording voice commands
  • Rust, to create witd, our lightweight recording daemon
  • ClojureScript on Node.js, to create electric-cherry, an extensible open-source home automation hub
  • HipChat, to get user input via XMPP
  • Mopidy, to play music from popular providers
  • Philips Hue, to control lights via HTTP

In the end, we managed to build an easy-to-deploy and easy-to-extend home automation system that exceeded our expectations. Videos of the results will be uploaded in later posts. We architected electric-cherry so that modules can be written in a very simple way. Even though the repo is available on GitHub, we need to clean it up and add proper documentation before announcing it, so please be patient :)

Here are some photos of the afternoon.

Eating lunch

The plan

The Raspberry Pi

Screens are cool

Terminals are cooler

Sincerely,
Team Wit


Ambiguity in natural language

27 Jun 2014 feature

TL;DR: Wit.ai now returns several probable results instead of just one (documentation).

Natural language is ambiguous: one phrase often has multiple meanings. That’s the result of many thousands of years of evolution, and it’s actually very efficient. If we always had to be explicit, our phrases and sentences would be much, much longer, and boring. When A says something to B, A makes assumptions about B’s context, common sense, beliefs and knowledge. The verbal message itself contains just the minimum information needed, on top of this pre-existing information, for B to get it.

Language can be seen as a very efficient compression algorithm. The phrase the speaker chooses is the shortest message, given the context of the receiver.

In computer science, Natural Language Processing (NLP) struggles a lot with ambiguity. Trying to have software decide the meaning of a piece of text or audio without taking into account the context (and common sense, culture, etc.) is a lost battle. And yet we are fighting this battle every day!

At Wit.ai, our assumption is that we don’t have enough information about the speaker’s context to make a final decision about the meaning of their utterances. Your app or your device might have more information, though. That’s why we are introducing a new feature called “N-Best Outcomes”. Simply put, Wit.ai will now return not just one outcome, but the n most probable outcomes. In many cases, your app or your device will have enough contextual information to choose the best among these n outcomes.

For instance, if you have two intents tv_control and lights_control and the speaker says “Turn it off”, Wit.ai will return both intents (more information in the documentation).

  • If your app knows that the current topic is the TV, it will choose tv_control
  • If your app knows that the TV is off but the lights are on, it will choose lights_control
  • If your app has no idea about the context, you may ask the user “Did you mean the lights, or the TV?”
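
In code, that client-side disambiguation can be as simple as the following sketch (in Rust for illustration; the Outcome and Context types are hypothetical, and only the two intent names come from the example above):

// Simplified view of an outcome: an intent plus a confidence score.
struct Outcome {
    intent: String,
    confidence: f64,
}

// Whatever state your app already tracks about the devices.
struct Context {
    tv_is_on: bool,
    lights_are_on: bool,
}

// Discard outcomes that make no sense given the device states,
// then keep the most confident survivor.
fn choose(outcomes: &[Outcome], ctx: &Context) -> Option<String> {
    outcomes.iter()
        .filter(|o| match o.intent.as_str() {
            "tv_control" => ctx.tv_is_on,
            "lights_control" => ctx.lights_are_on,
            _ => true,
        })
        .max_by(|a, b| a.confidence.partial_cmp(&b.confidence).unwrap())
        .map(|o| o.intent.clone())
}

fn main() {
    let outcomes = vec![
        Outcome { intent: "tv_control".to_string(), confidence: 0.6 },
        Outcome { intent: "lights_control".to_string(), confidence: 0.4 },
    ];
    // "Turn it off": the TV is already off, so it must be the lights.
    let ctx = Context { tv_is_on: false, lights_are_on: true };
    println!("{:?}", choose(&outcomes, &ctx)); // Some("lights_control")
}
Using app context to pick the best of the n outcomes.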

In the future, Wit.ai will do more and more to help you leverage the user’s context to resolve ambiguities. For now, we prefer to avoid making arbitrary and opaque decisions, and would rather put this decision in your hands.

Team Wit


Stop reinventing the wheel

04 Jun 2014 community, feature

Instead of starting a Wit.ai instance from scratch, you may want to use someone else’s open instance as the starting point for your own. For example, if you are building a speech interface for your home, the Explore page already lists instances that contain useful intents to manage lights, thermostats, blinds, etc. Why not start from there?

This is known as “forking” on code hosting services like GitHub, so we called this feature “fork” too! You can now fork any open instance, or any private instance you own.

The starting point to find instances to fork is the Explore page. In the near future we’ll add more ways to discover open instances that are relevant to what you are building.

A special use case is forking your own instance. This is useful when an instance is running in production and you want to start working on a version 2 without messing with the production instance.

To fork an instance, go to the instance’s page and click the “Fork” button on the right:

fork an instance

Happy forking!

Team Wit