Tutorial: AAAI-16: Learning and Inference in Structured Prediction Models
, Gourab Kundu
, Dan Roth
, and Vivek Srikumar
Date and Time
2:00pm-6:00pm, Feb. 13, 2016.
Many prediction problems require structured decisions. That is, the goal is
to assign values to multiple interdependent variables. The relationships
between the output variables could represent a sequence, a set of clusters, or
in the general case, a graph.
When solving these problems, it is important to make consistent decisions that take
the interdependencies among output variables into
account. Such problems are often referred to as structured prediction problems.
Over the past decades, many structured prediction models have been proposed and
studied, and success has been
demonstrated in a range of applications, including natural language
processing, information extraction, computer vision and computational
However, the high computational cost often
limits both models' expressive power and the size of the data that can be handled.
Therefore, designing efficient inference and learning algorithms for these models
is a key challenge for structured prediction.
In this tutorial, we will focus on recent developments in discriminative structured
prediction models such as Structured SVMs and Structured Perceptron.
Beyond introducing the algorithmic approaches in this domain, we
will discuss ideas that result in significant improvements both in the learning and in the
inference stages of these algorithms.
In particular, we will discuss the use of caching techniques to
reuse computations and methods for decomposing complex structures, along with learning
procedures that exploit these decompositions to simplify the learning stage. We will also
present a recently proposed formulation that captures similarities between
structured labels by using distributed representations.
Participants will learn about
current trends in learning and inference for structured prediction models,
recent tools developed in this area, and how they can be applied to AI applications.
Introduction [60 min]
We will present several AI applications with structured outputs to motivate
the need for structured prediction models. We then present Constrained Conditional Models (CCMs),
a framework that is used to model interdependencies between output variables using constraints and
features. We then discuss how to formulate an inference problem as an Integer Linear Program.
We will also describe several paradigms to learn the parameters of a CCM.
Necessary background about structured prediction will be provided in this section.
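As a rough sketch of the inference problem described above, assuming a linear scoring function: the symbols w (weight vector), phi (feature function), rho_k (constraint penalties), d_k (constraint violation measures), z, c, A and b are notation introduced here for illustration and do not come from the tutorial text. The first line is the constrained scoring objective; the second is the generic 0-1 Integer Linear Program it is reduced to once each candidate output part is given a binary indicator variable.

```latex
\hat{\mathbf{y}} \;=\; \arg\max_{\mathbf{y} \in \mathcal{Y}(\mathbf{x})}
  \; \mathbf{w}^{\top} \phi(\mathbf{x}, \mathbf{y})
  \;-\; \sum_{k} \rho_k \, d_k(\mathbf{x}, \mathbf{y})

\max_{\mathbf{z} \in \{0,1\}^{n}} \; \mathbf{c}^{\top} \mathbf{z}
  \quad \text{subject to} \quad A\,\mathbf{z} \le \mathbf{b}
```

Here each entry of z indicates whether one output part (for example, one label assignment) is selected, c collects the corresponding scores, and the linear constraints encode both structural consistency and any hard declarative constraints.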
Efficient Learning for Structured Prediction Models [45 min]
We will first discuss global learning vs. local learning. We then describe several structured learning approaches such as
Structured SVMs and Structured Perceptron.
Next, we describe efficient learning algorithms for Structured SVMs based on a
dual coordinate descent method.
Finally, we will present methods that make use of amortized inference.
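To make the Structured Perceptron mentioned above concrete, here is a minimal sketch for sequence tagging. It assumes first-order emission and transition features and exact Viterbi inference; the function names, the feature templates, the tag set and the toy data are all hypothetical choices made for illustration, not part of the tutorial or of any library.

```python
from collections import defaultdict

def viterbi(words, tags, w):
    """Return the highest-scoring tag sequence under emission + transition weights."""
    # score[i][t]: best score of any tag prefix that ends at position i with tag t
    score = [{t: w.get(("emit", words[0], t), 0.0) + w.get(("trans", "<s>", t), 0.0)
              for t in tags}]
    back = []  # back[i-1][t]: best previous tag when position i carries tag t
    for i in range(1, len(words)):
        row, ptr = {}, {}
        for t in tags:
            prev = max(tags, key=lambda p: score[-1][p] + w.get(("trans", p, t), 0.0))
            row[t] = (score[-1][prev] + w.get(("trans", prev, t), 0.0)
                      + w.get(("emit", words[i], t), 0.0))
            ptr[t] = prev
        score.append(row)
        back.append(ptr)
    # Follow back-pointers from the best final tag.
    path = [max(tags, key=lambda t: score[-1][t])]
    for ptr in reversed(back):
        path.append(ptr[path[-1]])
    return path[::-1]

def features(words, tag_seq):
    """Count emission and transition features of a (sentence, tag sequence) pair."""
    feats = defaultdict(float)
    prev = "<s>"
    for word, tag in zip(words, tag_seq):
        feats[("emit", word, tag)] += 1.0
        feats[("trans", prev, tag)] += 1.0
        prev = tag
    return feats

def train(data, tags, epochs=5):
    """Structured Perceptron: on each mistake, move weights toward gold, away from prediction."""
    w = defaultdict(float)
    for _ in range(epochs):
        for words, gold in data:
            pred = viterbi(words, tags, w)
            if pred != gold:
                for f, v in features(words, gold).items():
                    w[f] += v
                for f, v in features(words, pred).items():
                    w[f] -= v
    return w

# Toy data (hypothetical): D = determiner, N = noun, V = verb.
TAGS = ["D", "N", "V"]
DATA = [
    (["the", "dog", "barks"], ["D", "N", "V"]),
    (["the", "cat", "sleeps"], ["D", "N", "V"]),
    (["dogs", "bark"], ["N", "V"]),
]
weights = train(DATA, TAGS)
print(viterbi(["the", "dog", "barks"], TAGS, weights))
```

The update is "global" in the sense discussed above: inference is run over the whole sequence before any weight changes, so the transition weights are learned jointly with the emission weights rather than from independent per-token classifiers.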
Amortized Inference for Structured Prediction Models [45 min]
We will describe a recently developed technique, amortized inference, for speeding up inference for
structured prediction models by caching previous inference samples.
We will also discuss how to further improve the amortized inference techniques by
incorporating a dual decomposition approach which decomposes the output
structure and makes use of Lagrangian relaxation methods.
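The caching idea can be sketched as follows, assuming inference is a 0-1 maximization problem whose feasible set stays fixed while the objective coefficients change from instance to instance. The class and function names are hypothetical, the brute-force solver only stands in for a real ILP solver, and the reuse test is one simple sufficient condition chosen for illustration rather than the tutorial's exact formulation: a cached solution is reused when every coefficient of a selected variable did not decrease and every coefficient of an unselected variable did not increase, in which case the cached solution remains optimal.

```python
from itertools import product

def solve_ilp(c, feasible):
    """Brute-force 0-1 ILP (stand-in for a real solver): maximize c.y over feasible y."""
    best, best_val = None, float("-inf")
    for y in product((0, 1), repeat=len(c)):
        if feasible(y):
            val = sum(ci * yi for ci, yi in zip(c, y))
            if val > best_val:
                best, best_val = y, val
    return best

class AmortizedSolver:
    """Caches (objective, solution) pairs and reuses a solution when it is provably optimal."""

    def __init__(self, feasible):
        self.feasible = feasible   # predicate defining the shared feasible set
        self.cache = []            # list of (objective coefficients, optimal solution)
        self.solver_calls = 0      # how often the underlying solver was actually invoked

    @staticmethod
    def _reusable(c_old, y_old, c_new):
        # (2y - 1) is +1 for selected variables and -1 for unselected ones, so the
        # test requires selected coefficients to go up (or stay) and the rest to go down.
        return all((2 * y - 1) * (cn - co) >= 0
                   for co, y, cn in zip(c_old, y_old, c_new))

    def solve(self, c):
        for c_old, y_old in self.cache:
            if len(c_old) == len(c) and self._reusable(c_old, y_old, c):
                return y_old       # cache hit: no solver call needed
        self.solver_calls += 1
        y = solve_ilp(c, self.feasible)
        if y is not None:          # do not cache infeasible problems
            self.cache.append((tuple(c), y))
        return y

# Hypothetical feasible set: exactly one of three labels must be chosen.
solver = AmortizedSolver(lambda y: sum(y) == 1)
first = solver.solve([1.0, 3.0, 2.0])    # solved from scratch
second = solver.solve([0.5, 4.0, 1.0])   # winner's score rose, others fell: reused
third = solver.solve([5.0, 3.0, 2.0])    # a loser's score rose: must solve again
print(first, second, third, solver.solver_calls)
```

The design choice here is to trade a cheap linear-time check over the cache for an expensive solver call; the more redundancy there is across inference problems, the larger the saving.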
Distributed Representation for Structured Prediction [30 min]
We will present a recently proposed structured learning formulation
that models the meaning of labels using real-valued vectors.
We will also describe inference and learning algorithms in this model.
Structured Prediction Software [15 min]
We will introduce IllinoisSL, a Java-based discriminative structured learning
library. We will use the problem of part-of-speech tagging as a running example and
demonstrate how to implement a sequential tagging model using the library.
Conclusion and Future Research Directions [15 min]
Structured prediction models are widely used in AI.
Therefore, designing efficient learning and inference algorithms for
them could have a great impact. We will conclude the tutorial by presenting some
challenges and potential research topics in designing and applying structured
prediction models.
is a post-doctoral researcher at Microsoft Research. He will be joining the Department of Computer Science at the University of Virginia as an Assistant Professor in Fall 2016. His research interests are in designing practical machine learning techniques for large and complex data, and applying them to real-world applications. He has worked on various topics in Machine Learning and Natural Language Processing, including large-scale classification, structured learning, co-reference resolution, and relation extraction. He has also been involved in developing machine learning packages such as LIBLINEAR, Vowpal Wabbit, and Illinois-SL. He was awarded the KDD Best Paper Award in 2010 and won the Yahoo! Key Scientific Challenges Award in 2011.
Gourab Kundu is a research staff member at IBM Research.
He is broadly interested in all aspects of machine learning and natural language processing. He has publications in top-tier machine learning and natural language processing conferences, including a best student paper at CoNLL 2011.
Dan Roth is a Professor in the Department of Computer Science and the Beckman
Institute at the University of Illinois at Urbana-Champaign and a University
of Illinois Scholar.
Roth is a Fellow of the AAAS, ACM, AAAI and ACL, for his contributions to
Machine Learning and to Natural Language Processing. He has published
broadly in machine learning, natural language processing, knowledge
representation and reasoning, and learning theory, and has developed advanced
machine learning based tools for natural language applications that are being
used widely by the research community and commercially.
Roth is the Editor-in-Chief of the Journal of Artificial Intelligence Research
(JAIR) and has served on the editorial board of several of the major journals
in his research areas. He was the program chair of AAAI'11, ACL'03 and
CoNLL'02 and serves regularly as an area chair and senior program committee
member in the major conferences in his research areas.
Prof. Roth received his B.A. summa cum laude in Mathematics from the
Technion, Israel, and his Ph.D. in Computer Science from Harvard University.
Vivek Srikumar is an assistant professor in the School of Computing at
the University of Utah. His research interests are in the areas of
machine learning and natural language processing in the context of
structured learning and prediction. His research has primarily been
driven by questions arising from the need to learn structured
representations of text using little or indirect supervision and to
scale inference to large problems. His work has been published in
various AI, NLP and machine learning venues and recently received the
best paper award at EMNLP. Previously, he obtained his Ph.D. from the
University of Illinois at Urbana-Champaign in 2013 and was a
post-doctoral scholar at Stanford University.