Multi-Task Sequence Labeling Under Adverse Conditions (2019-2021)

Despite unprecedented advances in Natural Language Understanding (NLU), our models still dreadfully lack the ability to generalize to conditions that differ from those encountered during training. Such adverse conditions range from learning on noisy domains up to the extreme case of adaptation: entirely new languages. Recent work on transfer learning offers great promise to remedy the problem, particularly Multi-Task Learning (MTL). MTL has been applied successfully across NLU. However, most of this work has limited scope: e.g., sharing across only a few tasks or domains, and typically considering a single language. Little is known about when, and for which type of sharing, MTL is most beneficial, especially if we want to scale NLU to dozens of languages or customer-specific domains. In this project, we focus on a core NLU problem, sequence tagging, and ask: how can we create the best sequence labelers at scale, under adverse conditions, when little to no annotated data exists? We propose to combine diverse sources of supervision to bridge the gap, while also learning what and how to successfully share in MTL, to derive a set of best practices and models that quickly scale to new conditions.
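
As a minimal sketch of one common MTL sharing scheme for sequence labeling, the code below implements hard parameter sharing: a single encoder is shared across tasks and each task gets its own tagging head. All module names, dimensions, and label sets are illustrative assumptions, not the project's actual architecture:

```python
# Hard parameter sharing for multi-task sequence labeling:
# one shared encoder, one tagging head per task.
import torch
import torch.nn as nn

class SharedTagger(nn.Module):
    def __init__(self, vocab_size, emb_dim, hidden_dim, task_label_sizes):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Shared BiLSTM encoder: updated by gradients from every task.
        self.encoder = nn.LSTM(emb_dim, hidden_dim, batch_first=True,
                               bidirectional=True)
        # One linear tagging head per task (e.g. POS, NER).
        self.heads = nn.ModuleDict({
            task: nn.Linear(2 * hidden_dim, n_labels)
            for task, n_labels in task_label_sizes.items()
        })

    def forward(self, token_ids, task):
        states, _ = self.encoder(self.embed(token_ids))
        return self.heads[task](states)  # per-token label logits

# Usage: alternate batches from different tasks; the shared encoder
# accumulates supervision from all of them.
model = SharedTagger(vocab_size=10000, emb_dim=64, hidden_dim=128,
                     task_label_sizes={"pos": 17, "ner": 9})
loss_fn = nn.CrossEntropyLoss()
batch = torch.randint(0, 10000, (8, 20))  # 8 sentences, 20 tokens each
gold = torch.randint(0, 9, (8, 20))       # toy NER labels
logits = model(batch, task="ner")
loss = loss_fn(logits.reshape(-1, 9), gold.reshape(-1))
loss.backward()
```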

This project was partially funded by an Amazon Research Award.

Parsing Algorithms for Uncertain Input (2015-2019)

The automated analysis of natural language is an important ingredient for future applications that require the ability to understand natural language. For carefully edited texts, current algorithms now obtain good results. However, for user-generated content such as tweets and contributions to Internet fora, these methods are not adequate, for a variety of reasons including spelling mistakes, grammatical mistakes, unusual tokenization, partial utterances, and interruptions. Likewise, the analysis of spoken language faces enormous challenges. One important respect in which current methods break down is that they take the input very literally. Disfluencies, small mistakes, or unexpected interruptions in the input often lead to serious problems. In contrast, humans understand such utterances without problems and are often not even aware of a spelling or grammatical mistake in the input.

We propose to study a model of language analysis in which the purpose of the parser is to provide the analysis of the 'intended' utterance, which obviously is closely related to the observed input, but might be slightly different. The relation between the observed sentence and the intended sentence is modeled by a kernel function on input string pairs. Such a kernel function accounts for different kinds of noise, modeling errors such as disfluencies, false starts, word swaps, etc. More concretely, this kernel function can be thought of as a weighted finite-state transducer, mapping an observed input to a weighted finite-state automaton representing a probability distribution over possible intended inputs. The parser is then supposed to pick the best parse out of the set of parses of all possible intended inputs, taking the various probabilities into account. Note that there is an obvious similarity with parsing word graphs (word lattices) produced by a speech recognizer, as well as with some earlier techniques in ill-formed input parsing. The current model combines and generalizes these ideas.
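
As a toy illustration of this setup, the sketch below approximates the weighted finite-state transducer with an explicit per-token table of intended-token alternatives, and the parser with a stub scoring function; the decoder then picks the intended sentence that maximizes the combined noise-model and parser score. All tokens, probabilities, and function names are illustrative assumptions, not the project's actual model:

```python
# Parsing over a distribution of 'intended' inputs, toy version.
import itertools
import math

# Noise kernel: for each observed token, a distribution over
# intended tokens ("" denotes deletion, e.g. of a disfluency).
NOISE = {
    "teh": [("the", 0.9), ("teh", 0.1)],
    "uh": [("", 0.8), ("uh", 0.2)],
}

def candidates(observed_tokens):
    """Enumerate (intended sentence, log-probability) pairs."""
    alts = [NOISE.get(t, [(t, 1.0)]) for t in observed_tokens]
    for choice in itertools.product(*alts):
        tokens = [t for t, _ in choice if t]   # drop deleted tokens
        logp = sum(math.log(p) for _, p in choice)
        yield tokens, logp

def parser_log_score(tokens):
    """Stub standing in for a real parser's best-parse log-score."""
    return -0.5 * len(tokens)                  # placeholder value

def best_intended(observed):
    # Pick the intended sentence maximizing the combined
    # noise-model and parser score.
    return max(candidates(observed.split()),
               key=lambda c: c[1] + parser_log_score(c[0]))

tokens, score = best_intended("uh teh cat sleeps")
print(tokens)   # ['the', 'cat', 'sleeps'] under these toy weights
```

In the full model, the enumeration above is replaced by composing the observed input with the transducer and parsing the resulting weighted lattice directly, much as in word-lattice parsing for speech recognition output.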

This project was funded by the Nuance Foundation.