Tutorial: AAAI-20: Recent Advances in Transferable Representation Learning


Muhao Chen, Kai-Wei Chang and Dan Roth.

Date and Time

2:00pm-6:00pm, Feb. 7, 2020.

Goal of Tutorial:

This tutorial targets AI researchers and practitioners who are interested in applying deep learning techniques to cross-domain decision making tasks. These include tasks that involve multilingual and cross-lingual natural language processing, domain-specific knowledge, and different data modalities. This tutorial will provide the audience with a holistic view of (i) a wide selection of representation learning methods for unlabeled text, multi-relational and multimedia data, (ii) techniques for aligning and transferring knowledge across multiple representations with limited supervision, and (iii) a wide range of AI applications using these techniques in natural language understanding, knowledge bases, and computational biology. We will conclude the tutorial by outlining future research directions in this area.


Many AI tasks require cross-domain decision making. For example, many NLP tasks involve predictions across multiple languages, where different languages can be treated as different domains; in AI-aided biomedical studies, predicting the side effects of drugs often proceeds in parallel with modeling the interactions of proteins and organisms. To enable machine learning models to solve such cross-domain tasks, a prerequisite is to extract the characteristics and relations of data components in different domains, and to capture their associations in a unified representation scheme. To meet this demand, recent advances in representation learning often map unlabeled data from different domains into shared embedding spaces. In this way, cross-domain knowledge transfer can be realized by vector collocation or transformation. Such transferable representations have seen success in a range of AI applications involving cross-domain decision making. However, frontier research in this area faces two key challenges. One is to efficaciously extract features from specific domains with very few learning resources. The other is to precisely align and transfer knowledge with minimal supervision, since the alignment information that connects different domains is often insufficient and noisy.
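As a concrete illustration of transfer via a learned transformation, the sketch below aligns two embedding spaces with an orthogonal Procrustes mapping, a common choice in cross-lingual word embedding alignment. The embeddings here are a hypothetical toy seed dictionary, not data from the tutorial itself:

```python
import numpy as np

# Hypothetical toy setting: a 5-word seed dictionary with 4-dimensional
# embeddings in a "source" space X and a "target" space Y.
rng = np.random.default_rng(0)
X = rng.normal(size=(5, 4))                        # source-domain embeddings
true_W = np.linalg.qr(rng.normal(size=(4, 4)))[0]  # hidden orthogonal map
Y = X @ true_W                                     # target-domain embeddings

# Orthogonal Procrustes: find the orthogonal W minimizing ||XW - Y||_F,
# which has a closed-form solution via the SVD of X^T Y.
U, _, Vt = np.linalg.svd(X.T @ Y)
W = U @ Vt

# After mapping, source vectors collocate with their target counterparts,
# so nearest-neighbor search across the spaces becomes meaningful.
aligned = X @ W
```

The closed-form solution makes this a popular building block when only a small seed lexicon is available for supervision.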

In this tutorial, we will comprehensively review recent developments of transferable representation learning methods, with a focus on those for text, multi-relational and multimedia data. Beyond introducing the intra-domain embedding learning approaches, we will discuss various semi-supervised, weakly supervised, multi-view and self-supervised learning techniques to connect multiple domain-specific embedding representations. We will also compare retrofitting and joint learning processes for both intra-domain embedding learning and cross-domain alignment learning. In addition, we will discuss how obtained transferable representations can be utilized to address low-resource and label-less learning tasks. Participants will learn about recent trends and emerging challenges in this topic, representative tools and learning resources to obtain ready-to-use models, and how related models and techniques benefit real-world AI applications.

Tutorial Outline

Introduction [30 min]

We motivate the need for transferable representation learning by introducing several application scenarios where knowledge transfer is needed for decision making in low-resource domains. We also identify the key technical challenges from three perspectives: transferring across languages, domains, and modalities.

The Basics of Embeddings and Cross-X Embeddings [15 min]

We first provide a general overview of embedding learning methods for structured and unstructured data in different domains. On top of that, we discuss how domain-specific embedding spaces can be associated using retrofitting or joint learning methods.
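As one illustration of the retrofitting family of methods, the sketch below implements a simple retrofitting-style update that pulls each embedding toward its neighbors in a lexical graph while anchoring it to its original vector. The words, vectors, and graph are hypothetical toy data:

```python
import numpy as np

def retrofit(embeddings, neighbors, alpha=1.0, beta=1.0, iters=10):
    """Iteratively move each vector toward its graph neighbors while
    staying close to the original embedding (a common retrofitting
    update rule; alpha weights the anchor, beta the neighbors)."""
    q = {w: v.copy() for w, v in embeddings.items()}
    for _ in range(iters):
        for w, nbrs in neighbors.items():
            if not nbrs:
                continue
            # weighted average of the original vector and neighbor vectors
            num = alpha * embeddings[w] + beta * sum(q[n] for n in nbrs)
            q[w] = num / (alpha + beta * len(nbrs))
    return q

# Hypothetical toy lexicon: "cat" and "feline" are linked synonyms.
emb = {"cat": np.array([1.0, 0.0]), "feline": np.array([0.0, 1.0])}
graph = {"cat": ["feline"], "feline": ["cat"]}
new = retrofit(emb, graph)
# The linked vectors end up closer together than they started.
```

In contrast to this post-hoc retrofitting, joint learning would optimize the embedding and alignment objectives simultaneously during training.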

Transferable Representation Learning for Multilingual Natural Language Processing (Part I) [60 min]

We discuss how transferable representations are incorporated into various multilingual NLP tasks. We demonstrate how knowledge transfer allows NLP models trained on high-resource languages to be transferred to low-resource language tasks.

Transferable Representation Learning for Multilingual Natural Language Processing (Part II) [15 min]

Taking dependency parsing systems as an example, we demonstrate how adversarial training can be used to obtain language-invariant semantic representations, which help a parsing system trained on English adapt effectively to other languages.
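A core mechanism behind such adversarial training is a gradient-reversal layer placed between the encoder and a language discriminator: the discriminator tries to identify the input language, while the reversed gradient pushes the encoder to erase language-identifying signal. A minimal sketch of that layer (a generic illustration, not the specific system presented in the tutorial):

```python
import numpy as np

class GradReverse:
    """Gradient-reversal layer: identity in the forward pass, negated
    (and scaled) gradient in the backward pass. Placed between an
    encoder and a language discriminator, it makes the encoder ascend
    the discriminator's loss, encouraging language-invariant features."""

    def __init__(self, lam=1.0):
        self.lam = lam  # scaling factor for the reversed gradient

    def forward(self, x):
        return x                        # features pass through unchanged

    def backward(self, grad_output):
        return -self.lam * grad_output  # flip the gradient for the encoder

grl = GradReverse(lam=0.5)
x = np.array([1.0, -2.0, 3.0])
assert np.allclose(grl.forward(x), x)          # identity forward
g = grl.backward(np.array([0.1, 0.2, -0.3]))   # negated, scaled backward
```

With this layer in place, standard gradient descent trains the discriminator normally while the encoder receives the opposite signal, so the two are optimized adversarially in a single backward pass.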

Coffee Break

Multimodal Representations and Transfer [30 min]

We demonstrate how multimodal contextualized language representation models obtain signals from both text and images, and help downstream models understand commonsense concepts across language and vision.

Transferable Representation Learning for Multi-Relational Data [45 min]

We present recent research efforts in a number of tasks that apply joint representation learning on multi-relational data. From the methodology perspective, we discuss how entities can be aligned based on multi-view learning on diverse entity profiles in different modalities, and how rank-based boosting techniques can be deployed to integrate knowledge from multiple and possibly inconsistent views. From the application perspective, we investigate representative systems that utilize transferable multi-relational embeddings to address knowledge base integration, entity typing, protein-protein interaction prediction and polypharmacy side effect identification.
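As a small illustration of multi-relational embedding scoring, the sketch below uses a TransE-style translational score, under which a triple (h, r, t) is plausible when h + r lies near t; alignment methods for multiple knowledge graphs typically build transformations on top of such embedding spaces. The vectors are hypothetical toy data:

```python
import numpy as np

def transe_score(h, r, t):
    """TransE-style plausibility score: a triple (head, relation, tail)
    is plausible when the tail embedding is close to head + relation,
    i.e. when ||h + r - t|| is small (higher score = more plausible)."""
    return -np.linalg.norm(h + r - t)

# Hypothetical toy embeddings: one relation, one plausible and one
# implausible candidate tail entity.
h = np.array([1.0, 0.0])        # head entity
r = np.array([0.0, 1.0])        # relation as a translation vector
t_good = np.array([1.0, 1.0])   # satisfies h + r ≈ t
t_bad = np.array([-1.0, 0.0])   # far from h + r

# The plausible tail scores strictly higher than the implausible one.
good, bad = transe_score(h, r, t_good), transe_score(h, r, t_bad)
```

Applications such as knowledge base integration then learn an additional cross-graph transformation so that aligned entities in different graphs score highly under each other's relations.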

Conclusions and Future Research Directions [15 min]

Transferable representation learning impacts a wide spectrum of data-driven and knowledge-driven AI tasks. We conclude the tutorial by presenting some challenges and potential research topics in designing transferable representation learning models for data with complex structures, and emerging challenges in trustworthiness, fairness and AI for good.


  • Tutorial syllabus
  • Instructors' bio:

Muhao Chen is currently a postdoctoral fellow in the Department of Computer and Information Science at the University of Pennsylvania. He received a Ph.D. in Computer Science from UCLA in 2019. His research focuses on data-driven machine learning approaches for structured and unstructured data, and on extending their applications to natural language understanding, knowledge base construction, computational biology and medical informatics. In particular, he is interested in developing knowledge-aware learning systems that generalize well and require minimal supervision. His work has led to over 30 publications in leading conferences and journals. His dissertation research was awarded a UCLA Dissertation Fellowship.

    Kai-Wei Chang is an assistant professor in the Department of Computer Science at the University of California Los Angeles. His research interests include designing robust machine learning methods for large and complex data and building language processing models for social good applications. Chang has published broadly in machine learning, natural language processing, and artificial intelligence conferences. His awards include the EMNLP Best Long Paper Award (2017), the KDD Best Paper Award (2010), and the Okawa Research Grant Award (2018). Kai-Wei has given tutorials at NAACL, FAT, AAAI, EMNLP on various research topics.

Dan Roth is the Eduardo D. Glandt Distinguished Professor in the Department of Computer and Information Science, University of Pennsylvania, and a Fellow of the AAAS, ACM, AAAI, and the ACL. In 2017 Roth was awarded the John McCarthy Award, the highest award the AI community gives to mid-career AI researchers. Roth was recognized for major conceptual and theoretical advances in the modeling of natural language understanding, machine learning, and reasoning. Roth has published broadly in machine learning, natural language processing, knowledge representation and reasoning, and learning theory, and has developed advanced machine learning based tools for natural language applications that are being used widely. Roth has given tutorials on these and other topics at all major ACL and AAAI conferences. Until February 2017 Roth was the Editor-in-Chief of the Journal of Artificial Intelligence Research (JAIR). He was the program chair of AAAI'11, ACL'03 and CoNLL'02, and serves regularly as an area chair and senior program committee member in the major conferences in his research areas. Prof. Roth received his B.A. summa cum laude in Mathematics from the Technion, Israel, and his Ph.D. in Computer Science from Harvard University in 1995.