Comparison of Diverse Decoding Methods from Conditional Language Models.
Daphne Ippolito*, Reno Kriz*, João Sedoc, Maria Kustikova, Chris Callison-Burch.
The 57th Annual Meeting of the Association for Computational Linguistics (ACL 2019), 2019.
Abstract
While conditional language models have greatly improved in their ability to output high-quality natural language, many NLP applications benefit from being able to generate a diverse set of candidate sequences. Diverse decoding strategies aim to cover as much of the space of high-quality outputs as possible within a candidate list of a given size, leading to improvements for tasks that re-rank and combine candidate outputs. Standard decoding methods, such as beam search, optimize for generating high-likelihood sequences rather than diverse ones, though recent work has focused on increasing diversity in these methods. In this work, we perform an extensive survey of decoding-time strategies for generating diverse outputs from conditional language models. We also show how diversity can be improved without sacrificing quality by over-sampling additional candidates, then filtering to the desired number.
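A minimal sketch of the over-sample-then-filter idea from this abstract, under assumed details (the toy candidates, scores, and bigram-novelty criterion below are illustrative stand-ins, not the paper's exact procedure): sample more candidates than needed, then greedily keep the k that contribute the most unseen bigrams, breaking ties by model score.

def ngrams(tokens, n=2):
    # All n-grams in a token sequence, as a set for novelty checks.
    return {tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)}

def oversample_then_filter(candidates, k):
    # candidates: list of (tokens, log_prob) pairs, typically more than k of them.
    remaining = list(candidates)
    selected, covered = [], set()
    while remaining and len(selected) < k:
        # Prefer the candidate adding the most unseen bigrams; break ties by score.
        best = max(remaining, key=lambda c: (len(ngrams(c[0]) - covered), c[1]))
        remaining.remove(best)
        selected.append(best)
        covered |= ngrams(best[0])
    return selected

# Toy usage: likelihood alone would keep the two near-duplicate candidates;
# the filter instead keeps one of them plus the distinct third candidate.
cands = [("the cat sat on the mat".split(), -1.2),
         ("the cat sat on a mat".split(), -1.3),
         ("a dog lay on the rug".split(), -2.0)]
print(oversample_then_filter(cands, k=2))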
BibTeX
@inproceedings{ippolito2019comparison,
title={Comparison of Diverse Decoding Methods from Conditional Language Models},
author={Daphne Ippolito and Reno Kriz and Jo{\~a}o Sedoc and Maria Kustikova and Chris Callison-Burch},
booktitle={Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics},
year={2019},
address={Florence, Italy},
publisher={Association for Computational Linguistics}
}
Complexity-Weighted Loss and Diverse Reranking for Sentence Simplification.
Reno Kriz, João Sedoc, Marianna Apidianaki, Carolina Zheng, Gaurav Kumar, Eleni Miltsakaki, and Chris Callison-Burch.
The 2019 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2019), 2019.
Abstract
Sentence simplification is the task of rewriting texts so they are easier to understand. Recent research has applied sequence-to-sequence (Seq2Seq) models to this task, focusing largely on training-time improvements via reinforcement learning and memory augmentation. One of the main problems with applying generic Seq2Seq models for simplification is that these models tend to copy directly from the original sentence, resulting in outputs that are relatively long and complex. We aim to alleviate this issue through the use of two main techniques. First, we incorporate content word complexities, as predicted with a leveled word complexity model, into our loss function during training. Second, we generate a large set of diverse candidate simplifications at test time, and rerank these to promote fluency, adequacy, and simplicity. Here, we measure simplicity through a novel sentence complexity model. These extensions allow our models to perform competitively with state-of-the-art systems while generating simpler sentences. We report standard automatic and human evaluation metrics.
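A minimal sketch of a complexity-weighted loss, under assumed details: per-token negative log-likelihood reweighted so that simpler target words count for more. The weighting scheme and the complexity scores below are illustrative assumptions; in the paper they come from a leveled word complexity model applied to content words.

import math

def complexity_weighted_nll(token_probs, complexities, alpha=1.0):
    # token_probs[i]: model probability assigned to target token i.
    # complexities[i]: predicted complexity in [0, 1], where 0 means simple.
    total = 0.0
    for p, c in zip(token_probs, complexities):
        weight = 1.0 + alpha * (1.0 - c)  # assumed scheme: simpler words weigh more
        total += -weight * math.log(p)
    return total / len(token_probs)

# Toy usage: equal model confidence on a simple and a complex target word,
# but the simple word contributes more to the loss.
print(complexity_weighted_nll([0.5, 0.5], complexities=[0.1, 0.9]))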
BibTeX
@inproceedings{kriz2019complexity,
title={Complexity-Weighted Loss and Diverse Reranking for Sentence Simplification},
author={Reno Kriz and Jo{\~a}o Sedoc and Marianna Apidianaki and Carolina Zheng and Gaurav Kumar and Eleni Miltsakaki and Chris Callison-Burch},
booktitle={Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
year={2019},
address={Minneapolis, Minnesota},
publisher={Association for Computational Linguistics}
}
Simplification Using Paraphrases and Context-based Lexical Substitution.
Reno Kriz, Eleni Miltsakaki, Marianna Apidianaki, and Chris Callison-Burch.
The 2018 Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2018), 2018.
Abstract
Lexical simplification involves identifying complex words or phrases that need to be simplified, and recommending simpler meaning-preserving substitutes that can be more easily understood. We propose a complex word identification (CWI) model that exploits both lexical and contextual features, and a simplification mechanism which relies on a word-embedding lexical substitution model to replace the detected complex words with simpler paraphrases. We compare our CWI and lexical simplification models to several baselines, and evaluate the performance of our simplification system against human judgments. The results show that our models are able to detect complex words with higher accuracy than other commonly used methods, and propose good simplification substitutes in context. They also highlight the limited contribution of context features for CWI, which nonetheless improve simplification compared to context-unaware models.
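A minimal sketch of the two-stage pipeline this abstract describes, with toy stand-ins: a frequency-threshold identifier in place of the feature-based CWI model, and a small paraphrase table ranked by frequency in place of the embedding-based, context-aware substitution model. All names and numbers below are illustrative.

WORD_FREQ = {"help": 9000, "assist": 400, "use": 9500, "utilize": 300}
PARAPHRASES = {"assist": ["help", "aid"], "utilize": ["use", "employ"]}

def is_complex(word, freq_threshold=1000):
    # Stand-in CWI: treat rare words as complex.
    return WORD_FREQ.get(word, 0) < freq_threshold

def best_substitute(word):
    # Stand-in ranking: prefer the most frequent paraphrase; keep the word if none.
    options = PARAPHRASES.get(word, [])
    return max(options, key=lambda w: WORD_FREQ.get(w, 0), default=word)

def simplify(sentence):
    return " ".join(best_substitute(w) if is_complex(w) else w
                    for w in sentence.split())

print(simplify("please utilize this tool to assist users"))
# -> "please use this tool to help users"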
BibTeX
@inproceedings{Kriz-et-al:2018:NAACL,
author={Reno Kriz and Eleni Miltsakaki and Marianna Apidianaki and Chris Callison-Burch},
title={Simplification Using Paraphrases and Context-based Lexical Substitution},
booktitle={Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies},
month={June},
year={2018},
address={New Orleans, Louisiana}
}
Learning Translations via Images with a Massively Multilingual Image Dataset.
John Hewitt, Daphne Ippolito, Brendan Callahan, Reno Kriz, Derry Wijaya, and Chris Callison-Burch.
The 56th Annual Meeting of the Association for Computational Linguistics (ACL 2018), 2018.
Abstract
We conduct the most comprehensive study to date into translating words via images. To facilitate research on the task, we introduce a large-scale multilingual corpus of images, each labeled with the word it represents. Past datasets have been limited to only a few high-resource languages and unrealistically easy translation settings. In contrast, we have collected by far the largest available dataset for this task, with images for approximately 10,000 words in each of 100 languages. We run experiments on a dozen high-resource languages and 20 low-resource languages, demonstrating the effect of word concreteness and part-of-speech on translation quality. To improve image-based translation, we introduce a novel method of predicting word concreteness from images, which improves on a previous state-of-the-art unsupervised technique. This allows us to predict when image-based translation may be effective, enabling consistent improvements to a state-of-the-art text-based word translation system.
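A minimal sketch of translating a word via images, under assumed details: each word is paired with feature vectors of its images (the paper extracts features with a convolutional network; the 2-D vectors and the best-match averaging rule below are toy assumptions), and a source word maps to the target word whose image set looks most similar.

import math

def cosine(u, v):
    # Cosine similarity between two feature vectors.
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

def image_similarity(imgs_a, imgs_b):
    # Assumed scoring rule: average each source image's best match on the target side.
    return sum(max(cosine(a, b) for b in imgs_b) for a in imgs_a) / len(imgs_a)

def translate(src_imgs, tgt_lexicon):
    # tgt_lexicon: {target_word: list of image feature vectors}.
    return max(tgt_lexicon, key=lambda w: image_similarity(src_imgs, tgt_lexicon[w]))

# Toy usage: images of the source word resemble the images labeled "perro".
dog_imgs = [[0.9, 0.1], [0.8, 0.2]]
print(translate(dog_imgs, {"perro": [[0.85, 0.15]], "gato": [[0.1, 0.9]]}))
# -> "perro"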
BibTeX
@inproceedings{Hewitt-et-al:2018:ACL,
author = {John Hewitt and Daphne Ippolito and Brendan Callahan and Reno Kriz and Derry Wijaya and Chris Callison-Burch},
title = {Learning Translations via Images with a Massively Multilingual Image Dataset},
booktitle = {Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics},
month = {July},
year = {2018},
address = {Melbourne, Australia}
}