NLP Highlights show

Summary: Welcome to the NLP Highlights podcast, where we invite researchers to talk about their work in various areas of natural language processing. The hosts are members of the AllenNLP team at the Allen Institute for AI. All views expressed belong to the hosts and guests and do not represent their employers.

  • Artist: Allen Institute for Artificial Intelligence
  • Copyright: All rights reserved

Podcasts:

 126 - Optimizing Continuous Prompts for Generation, with Lisa Li | File Type: audio/mpeg | Duration: 00:47:38

We invited Lisa Li to talk about her recent work, Prefix-Tuning: Optimizing Continuous Prompts for Generation. Prefix tuning is a lightweight alternative to finetuning: the idea is to tune only a fixed-length, task-specific continuous vector while keeping the pretrained transformer parameters frozen. We discussed how prefix tuning compares with finetuning and other efficient alternatives on two tasks in various experimental settings, and in what scenarios prefix tuning is preferable. Lisa is a PhD student at Stanford University. Lisa's webpage: https://xiangli1999.github.io/ The hosts for this episode are Pradeep Dasigi and Ana Marasović.

 125 - VQA for Real Users, with Danna Gurari | File Type: audio/mpeg | Duration: 00:42:10

How can we build Visual Question Answering systems for real users? For this episode, we chatted with Danna Gurari about her work on building datasets and models for VQA for people who are blind. We talked about the differences between existing datasets and VizWiz, a dataset built by Gurari et al., and the resulting algorithmic changes. We also discussed the unsolved challenges in this field, and the new tasks they result in. Danna Gurari is an Assistant Professor as well as Founding Director of the Image and Video Computing group in the School of Information at the University of Texas at Austin (UT-Austin). VizWiz project page: https://vizwiz.org/ The hosts for this episode are Ana Marasović and Pradeep Dasigi.

 124 - Semantic Machines and Task-Oriented Dialog, with Jayant Krishnamurthy and Hao Fang | File Type: audio/mpeg | Duration: 00:45:37

We invited Jayant Krishnamurthy and Hao Fang, researchers at Microsoft Semantic Machines, to discuss their platform for building task-oriented dialog systems and their recent TACL paper on the topic. The paper introduces a new formalism for task-oriented dialog that effectively handles references and revisions in complex dialog, and a large, realistic dataset that uses this formalism. Leaderboard associated with the dataset: https://microsoft.github.io/task_oriented_dialogue_as_dataflow_synthesis/ Jayant's Twitter handle: https://twitter.com/jayantkrish Hao's Twitter handle: https://twitter.com/hfang90

 123 - Robust NLP, with Robin Jia | File Type: audio/mpeg | Duration: 00:47:59

In this episode, Robin Jia talks about how to build robust NLP systems. We discuss the different senses in which a system can be robust, reasons to care about system robustness, and the challenges involved in evaluating robustness of NLP models. We talk about how to build certifiably robust models through interval bound propagation and discrete encoding functions, as well as how to modify data collection procedures through active learning for more robust model development. Robin Jia is currently a visiting researcher at Facebook AI Research, and will be an assistant professor in the Department of Computer Science at the University of Southern California starting Fall 2021.

 122 - Statutory Reasoning in Tax Law, with Nils Holzenberger | File Type: audio/mpeg | Duration: 00:46:18

We invited Nils Holzenberger, a PhD student at JHU, to talk about a dataset on statutory reasoning in tax law that Holzenberger et al. released recently. This dataset includes difficult textual entailment and question answering problems that involve reasoning about how sections of tax law apply to specific cases. They also released a Prolog solver that fully solves the problems, and showed that learned models using dense representations of text perform poorly. We discussed why this is the case, and how one can train models to solve these challenges. Project webpage: https://nlp.jhu.edu/law/

 121 - Language and the Brain, with Alona Fyshe | File Type: audio/mpeg | Duration: 00:42:38

We invited Alona Fyshe to talk about the link between NLP and the human brain. We began by talking about what we currently know about the connection between representations used in NLP and representations recorded in the brain. We also discussed how different brain imaging techniques compare to each other. We then dove into experiments investigating how hidden states of LSTM language models correlate with EEG brain imaging data on three types of language inputs: well-formed grammatical sentences, pseudo-word sentences preserving syntax but not semantics, and word lists preserving neither. We talked about the kinds of conclusions that can be drawn from these correlations and concluded by discussing avenues for future work.

 120 - Evaluation of Text Generation, with Asli Celikyilmaz | File Type: audio/mpeg | Duration: 00:55:13

We invited Asli Celikyilmaz for this episode to talk about evaluation of text generation systems. We discussed the challenges in evaluating generated text, and covered human and automated metrics, with a discussion of recent developments in learning metrics. We also talked about some open research questions, including the difficulties in evaluating factual correctness of generated text. Asli Celikyilmaz is a Principal Researcher at Microsoft Research. Link to a survey co-authored by Asli on this topic: https://arxiv.org/abs/2006.14799

 119 - Social NLP, with Diyi Yang | File Type: audio/mpeg | Duration: 00:53:32

In this episode, Diyi Yang gives us an overview of using NLP models for social applications, including understanding social relationships, processes, roles, and power. As NLP systems are getting used more and more in the real world, they additionally have increasing social impacts that must be studied. We talk about how to get started in this field, what datasets exist and are commonly used, and potential ethical issues. We additionally cover two of Diyi's recent papers, on neutralizing subjective bias in text, and on modeling persuasiveness in text. Diyi Yang is an assistant professor in the School of Interactive Computing at Georgia Tech.

 118 - Coreference Resolution, with Marta Recasens | File Type: audio/mpeg | Duration: 00:47:30

In this episode, we talked about Coreference Resolution with Marta Recasens, a Research Scientist at Google. We discussed the complexity involved in resolving references in language, the simplification of the problem that the NLP community has focused on by talking about specific datasets, and the complex coreference phenomena that are not yet captured in those datasets. We also briefly talked about how coreference is handled in languages other than English, and how some of the notions we have about modeling coreference phenomena in English do not necessarily transfer to other languages. We ended the discussion by talking about large language models, and to what extent they might be good at handling coreference.

 117 - Interpreting NLP Model Predictions, with Sameer Singh | File Type: audio/mpeg | Duration: 00:56:56

We interviewed Sameer Singh for this episode and discussed recent work in interpreting NLP model predictions, particularly instance-level interpretations. We started out by talking about why it is important to interpret model outputs and why it is a hard problem. We then dove into the details of three kinds of interpretation techniques: attribution-based methods, interpretation using influence functions, and generating explanations. Towards the end, we spent some time discussing how explanations of model behavior can be evaluated, and some limitations and potential concerns in evaluation methods. Sameer Singh is an Assistant Professor of Computer Science at the University of California, Irvine. Some of the techniques discussed in this episode have been implemented in the AllenNLP Interpret framework (details and demo here: https://allennlp.org/interpret).

 116 - Grounded Language Understanding, with Yonatan Bisk | File Type: audio/mpeg | Duration: 00:59:28

We invited Yonatan Bisk to talk about grounded language understanding. We started off by discussing an overview of the topic, its research goals, and the challenges involved. In the latter half of the conversation, we talked about ALFRED (Shridhar et al., 2019), a grounded instruction following benchmark that simulates training a robot butler. The current best models built for this benchmark perform very poorly compared to humans. We discussed why that might be, and what could be done to improve their performance. Yonatan Bisk is currently an assistant professor at the Language Technologies Institute at Carnegie Mellon University. The data and the leaderboard for ALFRED can be accessed here: https://askforalfred.com/.

 115 - AllenNLP, interviewing Matt Gardner | File Type: audio/mpeg | Duration: 00:33:25

In this special episode, Carissa Schoenick, a program manager and communications director at AI2, interviewed Matt Gardner about AllenNLP. We chatted about the origins of AllenNLP, the early challenges in building it, and the design decisions behind the library. Given the release of AllenNLP 1.0 this week, we asked Matt what users can expect from the new release and what improvements the AllenNLP team is working on for future versions.

 114 - Behavioral Testing of NLP Models, with Marco Tulio Ribeiro | File Type: audio/mpeg | Duration: 00:43:32

We invited Marco Tulio Ribeiro, a Senior Researcher at Microsoft, to talk about evaluating NLP models using behavioral testing, a framework borrowed from software engineering. Marco describes three kinds of black-box tests that check whether NLP models satisfy certain necessary conditions. While breaking the standard IID assumption, this framework presents a way to evaluate whether NLP systems are ready for real-world use. We also discuss what capabilities can be tested using this framework, how one can come up with good tests, and the need for an evolving set of behavioral tests for NLP systems. Marco’s homepage: https://homes.cs.washington.edu/~marcotcr/

 113 - Managing Industry Research Teams, with Fernando Pereira | File Type: audio/mpeg | Duration: 00:42:22

We invited Fernando Pereira, a VP and Distinguished Engineer at Google, where he leads NLU and ML research, to talk about managing NLP research teams in industry. Topics we discussed include prioritizing research against product development and effective collaboration with product teams, dealing with potential research interest mismatch between individuals and the company, managing publications, hiring new researchers, and diversity and inclusion.

 112 - Alignment of Multilingual Contextual Representations, with Steven Cao | File Type: audio/mpeg | Duration: 00:33:15

We invited Steven Cao to talk about his paper on multilingual alignment of contextual word embeddings. We started by discussing how multilingual transformers work in general, and then focused on Steven’s work on aligning word representations. The core idea is to start from a list of words automatically aligned from parallel corpora and to ensure the representations of the aligned words are similar to each other while not moving too far away from their original representations. We discussed the experiments on the XNLI dataset in the paper, the analysis, and the decision to do the alignment at the word level, comparing it to other possibilities such as aligning word pieces or higher-level encoded representations in transformers. Paper: https://openreview.net/forum?id=r1xCMyBtPS Steven Cao’s webpage: https://stevenxcao.github.io/
