EMNLP Recap

23 Sep 2015 conference, recap

The 2015 Conference on Empirical Methods in Natural Language Processing is the high mass for the NLP community. It happened last weekend in Lisbon, Portugal. Here are a few thoughts I’d like to share.

First, NLP is HOT. I could see the Sponsors/Hunters/Gatherers of Talent from many well-known Silicon Valley companies… hiding in the dark alleys, waiting for their PhD prey. Hey, I had to hide my Facebook badge because I think some researchers took me for a recruiter and ran away from me!

Application-wise, apart from some people who seem obsessed with “Information Retrieval” (read: automatic surveillance of social media), it was the old usual suspects: machine translation, summarization, question answering, search, and a bunch of classical academic tasks like parsing, semantic role labelling, etc. Nothing really new.

So what’s new? The deep learning tsunami continues to take over NLP.

I’m proud that I saw the names of my FAIR friends and colleagues Yann LeCun, Ronan Collobert, Antoine Bordes, Jason Weston and many others cited on presentation slides over and over. The insider joke in Lisbon was that the E in EMNLP now stands for Embedding (instead of Empirical) – yes I know, when a full room in a restaurant laughs about something like that, you know you are in a special place. After all, 99% of modern NLP is empirical anyway. The opening keynote was delivered by Yoshua Bengio. Many papers were kind of “the state of the art for X was Y. We replaced the hand-crafted, manually hacked, heavily engineered Z by an RNN. It improved state of the art by 5 points.” The poor guys who presented deep learning-free papers invariably got the question: “did you also try with a [insert deep net technique here]?”

We are now in a new phase where, beyond just using deep learning to improve some components (like the acoustic model in speech recognition), researchers are starting to ship complete end-to-end systems certified 100% deep learning (with 0 added rule-based engine!), an approach pioneered by the now classic NLP (almost) from scratch.

I was a bit disappointed that nobody spoke about “grounded NLP”. In fact, deep learning brings so much improvement potential that researchers may be tempted to improve existing systems instead of venturing into uncharted territories. I can’t wait to see them try to teach machines the actual, experienced, “felt” meaning of language. I took a Super Bock beer to forget.

Finally, I had interesting conversations about Facebook M, an ambitious project the Wit team is involved with at Facebook, with lots of researchers. We’ll need to improve and expand many aspects of Wit in order to develop M, and I’m glad the community will also benefit from that. That’s the beginning of an interesting journey.

As always, feel free to reach out if you have any questions, comments, or suggestions.

Keep hacking,

Alex Lebrun & Team Wit



Introducing composite entities

05 May 2015 entities, console, feature

As your Wit app matures, you will encounter more and more complex queries. Even using roles, it can become difficult to associate relevant entities with each other within a complex query like, “Find me a 2007 red Mustang and a 2003 grey Prius”.

To make your life easier, today we roll out composite entities. Composite entities are custom entities which have entities within them. For example, you could tag “2007 red Mustang” as a car and tag “Mustang” as the model, “red” as the color, and “2007” as the year. Nested entities can be other custom entities or any of Wit’s builtin entities.

Composite entities are tagged just like regular entities. First, select the entire span of the composite entity and assign it to a custom entity just like you would before. Then, select spans within the tagged entity and tag them as any entity you want (custom or builtin). Currently, Wit only processes one level of nested entities; you cannot have a nested entity within another nested entity.

You can attach roles to composite entities and nested entities just like you would for standard entities.

In the API response, nested entities found within composite entities are returned as a list under the field “entities” and are formatted just like regular entities. An example response is below.

{
  "msg_id" : "$ID",
  "_text" : "find a 2007 red mustang",
  "outcomes" : [ {
    "_text" : "find a 2007 red mustang",
    "intent" : "find_car",
    "entities" : {
      "car" : [ {
        "value" : "2007 red mustang",
        "entities" : {
          "number" : [ {
            "value" : 2007,
            "type" : "value"
          } ],
          "color" : [ {
            "value" : "red"
          } ],
          "model" : [ {
            "value" : "mustang"
          } ]
        }
      } ]
    },
    "confidence" : 0.52
  } ]
}
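If you want to pull the nested pieces out of a response like this one, here is a minimal Python sketch. It assumes the response has already been decoded into a dict shaped like the example above; the helper name `flatten_composites` and the `parts` key are made up for illustration and are not part of the Wit API.

```python
# Decoded copy of the example response above (shape taken from this post).
RESPONSE = {
    "msg_id": "$ID",
    "_text": "find a 2007 red mustang",
    "outcomes": [{
        "_text": "find a 2007 red mustang",
        "intent": "find_car",
        "entities": {
            "car": [{
                "value": "2007 red mustang",
                "entities": {
                    "number": [{"value": 2007, "type": "value"}],
                    "color": [{"value": "red"}],
                    "model": [{"value": "mustang"}],
                },
            }]
        },
        "confidence": 0.52,
    }],
}


def flatten_composites(outcome):
    """Hypothetical helper: return one record per composite entity.

    Each record keeps the composite's name and value, plus a "parts" dict
    mapping every nested entity name to the list of its values.
    """
    records = []
    for name, instances in outcome.get("entities", {}).items():
        for instance in instances:
            nested = instance.get("entities")
            if not nested:
                continue  # a regular entity with nothing nested inside it
            parts = {
                nested_name: [item["value"] for item in nested_items]
                for nested_name, nested_items in nested.items()
            }
            records.append({"entity": name, "value": instance["value"], "parts": parts})
    return records


cars = flatten_composites(RESPONSE["outcomes"][0])
print(cars)
```

Because only one level of nesting is processed, a single pass over the “entities” field of each composite is enough; no recursion is needed.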

We hope that composite entities will help you better interpret the information Wit extracts from your users’ statements so your app can make more intelligent decisions.

As always, feel free to reach out if you have any questions, comments, or suggestions.

Keep hacking,

Team Wit



Toward Dialog: Thread

23 Apr 2015 dialog, feature

Since day one, we decided to focus on trying to solve one problem at a time. This is why we have been working on understanding one aspect of human language at a time. We knew it was a simplistic approach; communications between humans and machines often require back and forth interactions. We knew we would eventually need to help developers manage dialog, but it was too soon. As a result, our developers had to implement this part themselves in their app or device.

Last year, we made our first step toward helping you manage dialog. By adding an optional state object to the request, developers can activate and deactivate specific intents, giving Wit some additional context about the conversation at hand.

Moving forward, we will be releasing more features to help you train and manage human conversations. Today, we are happy to release threads in Wit. This will be the foundation for future dialog features.

You can now add a thread_id parameter to your API request, which lets you group requests per conversation.

Here is an example:

curl -H 'Authorization: Bearer YOUR_TOKEN' 'https://api.wit.ai/message?v=20150414&q=open%20the%20door&thread_id=1234'

This message will then appear in your Logs page under its own thread, along with any future messages that share its thread_id.
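For those calling the API from code rather than curl, the same request can be put together with Python's standard library. This is only a sketch: the endpoint, version string, and parameter names are copied from the curl example above, while the helper name `build_message_request` is ours. It builds the request without sending it.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request

API_URL = "https://api.wit.ai/message"  # endpoint from the curl example


def build_message_request(token, text, thread_id, version="20150414"):
    """Hypothetical helper: build (but do not send) a threaded /message request."""
    # quote_via=quote encodes spaces as %20, matching the curl example.
    query = urlencode({"v": version, "q": text, "thread_id": thread_id}, quote_via=quote)
    return Request(f"{API_URL}?{query}", headers={"Authorization": f"Bearer {token}"})


req = build_message_request("YOUR_TOKEN", "open the door", "1234")
print(req.full_url)
```

Passing the resulting object to `urllib.request.urlopen` would send it. Reusing the same thread_id on later calls is what groups them into one conversation.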

As always, don’t hesitate to reach out if you have any questions.

Team Wit



Together we stand: leveraging the community

31 Mar 2015 community, feature, console

Instead of starting a Wit instance from scratch, you may want to reuse someone else’s work as the starting point for your own instance. Last year, we released the Fork feature. It allowed you to copy someone else’s open instance and start from there. However, it was almost impossible to discover open instances. That was just the beginning.

Today we’re happy to fill this gap and unleash the power of our community. To help Wit understand what people say, you need to first define intents. Instead of building and training some from scratch, you can now see if someone from the community has already created the intents you need.

Imagine you want to create an app that will allow you to check the weather. Like before, you type a sentence you want your app to understand. But starting today, you will be presented with some existing intents from the community.

You can then “get” (aka copy) the intents that look relevant to your use case and directly use and improve them in your own instance.

This feature is still experimental, so please bear with us as we learn how to improve the accuracy and UX of this feature!

As always, please reach out if you have any questions.

Team Wit



Unlocking the Archive

03 Mar 2015 inbox, archive

Ever wish you could teach Wit to ignore certain phrases as irrelevant or out-of-context? Well, you’re in luck. Today Wit is proud to announce the roll-out of our newest feature: The Archive.

The Archive is a place to put expressions that are outside the scope of your app or that might one day be relevant for future features, but are just noise for now.

When reviewing expressions in the Inbox, just click Archive and the expression will be stored for later use. Future occurrences of these expressions will skip your Inbox, leaving you with a cleaner validation experience.

Even better, Wit will learn to treat these expressions as out of context and return the intent UNKNOWN, so your app knows that it doesn’t know. Here’s an example response:

{
  "msg_id" : "624420bf-daf2-4daa-83e8-17ccd9dfc3b5",
  "_text" : "hello",
  "outcomes" : [ {
    "_text" : "hello",
    "intent" : "UNKNOWN",
    "entities" : { },
    "confidence" : 0.98
  } ]
}
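On the app side, this means you can check for UNKNOWN before acting on a result. Here is one possible Python sketch, assuming a decoded response shaped like the example above. The helper name `pick_intent` and the `min_confidence` cutoff are our own illustration; Wit does not prescribe a threshold.

```python
def pick_intent(response, min_confidence=0.5):
    """Hypothetical helper: return the best intent name, or None if there is nothing to act on.

    None is returned when there are no outcomes, when the best outcome is
    UNKNOWN, or when its confidence falls below min_confidence (an assumed,
    app-specific cutoff).
    """
    outcomes = response.get("outcomes", [])
    if not outcomes:
        return None
    best = max(outcomes, key=lambda outcome: outcome.get("confidence", 0.0))
    if best.get("intent") == "UNKNOWN":
        return None
    if best.get("confidence", 0.0) < min_confidence:
        return None
    return best["intent"]


archived = {
    "msg_id": "624420bf-daf2-4daa-83e8-17ccd9dfc3b5",
    "_text": "hello",
    "outcomes": [
        {"_text": "hello", "intent": "UNKNOWN", "entities": {}, "confidence": 0.98}
    ],
}

if pick_intent(archived) is None:
    print("Sorry, I can't help with that yet.")
```

Note that a high confidence on UNKNOWN means Wit is confident the message is out of scope, so the intent is checked before the confidence.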

Archived expressions can be found under the Archive tab in the inbox. There, you can review your archived expressions and reassign misclassified ones or retrieve saved expressions to quickly start training a newly created intent.

New Archive tab in the inbox

Please let us know if you have any questions or issues. Good luck and keep hacking.

Team Wit