May 22, 2022 to May 27, 2022

Microsoft at ACL 2022

Location: Hybrid | Dublin, Ireland

All times are in PDT (UTC -7)

Sunday, May 22, 2022

  • 01:30-05:00 Tutorial

    Knowledge-Augmented Methods for Natural Language Processing [cutting-edge]

    Presenters: Chenguang Zhu, Yichong Xu, Xiang Ren, Bill Yuchen Lin, Meng Jiang, Wenhao Yu

    Knowledge in NLP has been a rising trend, especially after the advent of large-scale pre-trained models. NLP models with attention to knowledge can i) access an unlimited amount of external information; ii) delegate the task of storing knowledge from their parameter space to knowledge sources; iii) obtain up-to-date information; iv) make prediction results more explainable via selected knowledge. In this tutorial, we will introduce the key steps in integrating knowledge into NLP, including knowledge grounding from text, knowledge representation and fusing. We will also introduce recent state-of-the-art applications in fusing knowledge into language understanding, language generation and commonsense reasoning.

  • 01:30-05:00 Tutorial

    Learning with Limited Text Data [cutting-edge]

    Presenters: Diyi Yang (2021 Microsoft Research Faculty Fellow), Ankur P Parikh, Colin Raffel

    Natural Language Processing (NLP) has achieved great progress in the past decade on the basis of neural models, which often make use of large amounts of labeled data to achieve state-of-the-art performance. The dependence on labeled data prevents NLP models from being applied to low-resource settings and languages because of the time, money, and expertise that are often required to label massive amounts of textual data. Consequently, the ability to learn with limited labeled data is crucial for deploying neural systems to real-world NLP applications. Recently, numerous approaches have been explored to alleviate the need for labeled data in NLP, such as data augmentation and semi-supervised learning. This tutorial aims to provide a systematic and up-to-date overview of these methods in order to help researchers and practitioners understand the landscape of approaches and the challenges associated with learning from limited labeled data, an emerging topic in the computational linguistics community. We will consider applications to a wide variety of NLP tasks (including text classification, generation, and structured prediction) and will highlight current challenges and future directions.

  • 06:00-10:00 Tutorial

    Non-Autoregressive Sequence Generation [cutting-edge]

    Speakers: Jiatao Gu, Xu Tan

    Non-autoregressive sequence generation (NAR) attempts to generate the entire or partial output sequences in parallel to speed up the generation process and avoid potential issues (e.g., label bias, exposure bias) in autoregressive generation. While it has received much research attention and has been applied in many sequence generation tasks in natural language and speech, naive NAR models still face many challenges in closing the performance gap with state-of-the-art autoregressive models because of a lack of modeling power. In this tutorial, we will provide a thorough introduction and review of non-autoregressive sequence generation, in four sections: 1) Background, which covers the motivation of NAR generation, the problem definition, the evaluation protocol, and the comparison with standard autoregressive generation approaches. 2) Method, which includes different aspects: model architecture, objective function, training data, learning paradigm, and additional inference tricks. 3) Application, which covers different tasks in text and speech generation, and some advanced topics in applications. 4) Conclusion, in which we describe several research challenges and discuss the potential future research directions. We hope this tutorial can serve both academic researchers and industry practitioners working on non-autoregressive sequence generation.

Monday, May 23, 2022

Tuesday, May 24, 2022

Wednesday, May 25, 2022

Thursday, May 26, 2022

  • 00:50-09:50 Workshop

    7th Workshop on Representation Learning for NLP

    Keynote Speaker: Monojit Choudhury
    Program Committee: Lijun Wu, Yue Chen

    The 7th Workshop on Representation Learning for NLP aims to continue the success of the Repl4NLP workshop series, with the 1st Workshop on Representation Learning for NLP having received about 50 submissions and over 250 attendees (the second most attended collocated event at ACL’16 after WMT). The workshop was introduced as a synthesis of several years of independent *CL workshops focusing on vector space models of meaning, compositionality, and the application of deep neural networks and spectral methods to NLP. It provides a forum for discussing recent advances on these topics, as well as future research directions in linguistically motivated vector-based models in NLP. The workshop will take place in a hybrid setting and, as in previous years, will feature interdisciplinary keynotes, paper presentations, posters, as well as a panel discussion.

  • 01:00-09:05 Workshop

    The 2nd DialDoc workshop on Document-grounded Dialogue and Conversational Question Answering

    Invited Speaker: Michel Galley

    The DialDoc Workshop focuses on document-grounded dialogue and conversational question answering. There is a vast amount of document content created every day by human writers to communicate with human readers for sharing knowledge, ranging from encyclopedias to social benefits. Making the document content accessible to users via conversational systems and scaling it to various domains could be a meaningful yet challenging task. There are significant individual research threads that show promise in handling heterogeneous knowledge embedded in documents for building conversational systems, including (1) unstructured content, such as text passages; (2) semi-structured content, such as tables or lists; (3) multi-modal content, such as images and videos along with text descriptions, and so on. The purpose of this workshop is to invite researchers and practitioners to bring their individual perspectives on the subject of document-grounded dialogue and conversational question answering to advance the field in a joint effort.

Friday, May 27, 2022

  • 01:00-09:00 Workshop

    The Second Workshop on Language Technology for Equality, Diversity and Inclusion (LT-EDI-2022)

    Organizers: Bharathi Raja Chakravarthi, Bharathi B, John P. McCrae, Manel Zarrouk, Kalika Bali, Paul Buitelaar
    Invited Speaker: Su Lin Blodgett
    Program Committee: Monojit Choudhury

    Equality, Diversity and Inclusion (EDI) is an important agenda across every field [1] throughout the world. Language, as a major part of communication, should be inclusive and treat everyone with equality. Today’s large internet community uses language technology (LT), which has a direct impact on people across the globe. EDI is crucial to ensure everyone is valued and included, so it is necessary to build LT that serves this purpose. Recent results have shown that big data and deep learning are entrenching existing biases and that some algorithms are even naturally biased due to problems such as ‘regression to the mode’. Our focus is on creating LT that will be more inclusive with respect to gender [2], race [3], sexual orientation [4], and persons with disabilities [5,6]. The workshop will focus on creating speech and language technology to address EDI not only in English, but also in less resourced languages.

  • 01:00-10:00 Workshop

    The 2nd Workshop on Human Evaluation of NLP Systems (HumEval 2022)

    Program Committee: Tom Kocmi

    With this workshop, we wish to create a forum for current human evaluation research and future directions, a space for researchers working with human evaluations to exchange ideas and begin to address the issues human evaluation in NLP faces from many points of view, including experimental design, meta-evaluation and reproducibility.

  • 01:00-10:00 Workshop

    The Second Workshop on Combating Online Hostile Posts in Regional Languages during Emergency Situation (CONSTRAINT 2022)

    Program Committee: Anoop Kunchukuttan, Monojit Choudhury

    The increasing accessibility of the Internet has dramatically changed the way we consume information. The ease of social media usage not only encourages individuals to freely express their opinion (freedom of speech) but also provides content polluters with ecosystems to spread hostile posts (hate speech, fake news, cyberbullying, propaganda, etc.). Such hostile activities are expected to increase manifold during emergencies such as presidential elections and the spread of the COVID-19 pandemic. Most such hostile posts are written in regional languages, and can therefore easily evade online surveillance engines that are mostly trained on posts written in resource-rich languages such as English and Chinese. As a result, regions such as Asia, Africa, and South America, where low-resource regional languages are used for day-to-day communication, suffer from the lack of tools, benchmark datasets and learning techniques. Other countries such as Italy and Spain, where the languages used (pseudo-low-resource) are not as well equipped with sophisticated computational resources as English, might also be facing the same issues.

  • 01:00-10:00 Workshop

    The 4th Workshop on NLP for Conversational AI

    Organizers: Bing Liu, Alexandros Papangelis, Stefan Ultes, Abhinav Rastogi, Yun-Nung (Vivian) Chen, Georgios Spithourakis, Elnaz Nouri, Weiyan Shi
    Invited Speaker: Michael Tjalve

    Following the success of the 3rd NLP for Conversational AI workshop at EMNLP, “The 4th NLP4ConvAI” will be a one-day workshop, co-located with ACL 2022 in Dublin. The goal of this workshop is to bring together researchers and practitioners to discuss impactful research problems in this area, share findings from real-world applications, and generate ideas for future research directions. The workshop will include keynotes, posters, and panel sessions. In keynote talks, senior technical leaders from industry and academia will share insights on the latest developments in the field. We would like to encourage researchers and students to share their prospects and latest discoveries. There will also be a panel discussion with noted conversational AI leaders focused on the state of the field, future directions, and open problems across academia and industry.

  • 01:00-10:00 Workshop

    Workshop on Multilingual Multimodal Learning

    Invited Speaker: Lei Ji

    Multilingual multimodal research focuses on collecting resources, developing models, and evaluating systems that need to jointly reason over multilingual text and multimodal inputs, including images, videos, texts, and knowledge bases. Multilingual multimodal NLP presents new and unique challenges. First, it is one of the areas that suffer the most from language imbalance issues. Texts in most multimodal datasets are usually only available in high-resource languages. Second, multilingual multimodal research provides opportunities to investigate culture-related phenomena. On top of the language imbalance issue in text-based corpora and models, the data of additional modalities (e.g. images or videos) are mostly collected from North American and Western European sources (and their worldviews). As a result, multimodal models do not capture our world’s multicultural diversity and do not generalise to out-of-distribution data from minority cultures. The interplay of the two issues leads to extremely poor performance of multilingual multimodal systems in real-life scenarios. This workshop encourages and promotes research efforts towards more inclusive multimodal technologies and tools to assess them.