Today's cs.CL listing contains 10 papers in total.
Knowledge Graphs (1 paper)
[1]: A Survey on Graph Neural Networks for Knowledge Graph Completion
Authors: Siddhant Arora
Notes: 10 pages, 2 figures
Link: https://arxiv.org/abs/2007.12374
Abstract: Knowledge Graphs are increasingly popular for a variety of downstream tasks such as Question Answering and Information Retrieval. However, Knowledge Graphs are often incomplete, which leads to poor performance. As a result, there has been a lot of interest in the task of Knowledge Base Completion. More recently, Graph Neural Networks have been used to capture the structural information inherently stored in these Knowledge Graphs and have been shown to achieve state-of-the-art performance across a variety of datasets. In this survey, we examine the strengths and weaknesses of the proposed methods and identify open research problems in this area that require further investigation.
Information Retrieval (1 paper)
[1]: ZSCRGAN: A GAN-based Expectation Maximization Model for Zero-Shot Retrieval of Images from Textual Descriptions
Authors: Anurag Roy, Vinay Kumar Verma, Kripabandhu Ghosh, Saptarshi Ghosh
Link: https://arxiv.org/abs/2007.12212
Abstract: Most existing algorithms for cross-modal Information Retrieval are based on a supervised train-test setup, where a model learns to align the mode of the query (e.g., text) to the mode of the documents (e.g., images) from a given training set. Such a setup assumes that the training set contains an exhaustive representation of all possible classes of queries. In reality, a retrieval model may need to be deployed on previously unseen classes, which implies a zero-shot IR setup. In this paper, we propose a novel GAN-based model for zero-shot text-to-image retrieval. Given a textual description as the query, our model can retrieve relevant images in a zero-shot setup. The proposed model is trained using an Expectation-Maximization framework. Experiments on multiple benchmark datasets show that our proposed model comfortably outperforms several state-of-the-art zero-shot text-to-image retrieval models, as well as zero-shot classification and hashing models adapted for retrieval.
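To make the retrieval step concrete, here is a minimal sketch of zero-shot text-to-image retrieval: a text embedding is projected into image-feature space and gallery images are ranked by cosine similarity. This is not the authors' implementation; the "generator", the feature dimensions, and the random data are all hypothetical placeholders.

```python
# Minimal retrieval-step sketch (NOT the ZSCRGAN code). A trained generator that
# maps text embeddings into image-feature space is faked by a random projection.
import numpy as np

rng = np.random.default_rng(0)
TEXT_DIM, IMG_DIM, N_IMAGES = 300, 2048, 1000

# Hypothetical stand-ins: in the paper these would come from the trained GAN
# generator and from a CNN image encoder, respectively.
text_to_image = rng.normal(size=(TEXT_DIM, IMG_DIM))     # fake "generator" (linear map)
image_features = rng.normal(size=(N_IMAGES, IMG_DIM))    # pre-extracted image gallery

def retrieve(text_embedding: np.ndarray, top_k: int = 5) -> np.ndarray:
    """Rank gallery images by cosine similarity to the generated image-space vector."""
    query = text_embedding @ text_to_image                # project text into image space
    query /= np.linalg.norm(query)
    gallery = image_features / np.linalg.norm(image_features, axis=1, keepdims=True)
    scores = gallery @ query                              # cosine similarities
    return np.argsort(-scores)[:top_k]                    # indices of the best matches

print(retrieve(rng.normal(size=TEXT_DIM)))
```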
Automatic Summarization (1 paper)
[1]: SummEval: Re-evaluating Summarization Evaluation
Authors: Alexander R. Fabbri, Wojciech Kryściński, Bryan McCann, Richard Socher, Dragomir Radev
Notes: 10 pages, 4 tables, 1 figure
Link: https://arxiv.org/abs/2007.12626
Abstract: The scarcity of comprehensive, up-to-date studies on evaluation metrics for text summarization and the lack of consensus regarding evaluation protocols continue to inhibit progress. We address the existing shortcomings of summarization evaluation methods along five dimensions: 1) we re-evaluate 12 automatic evaluation metrics in a comprehensive and consistent fashion using neural summarization model outputs along with expert and crowd-sourced human annotations; 2) we consistently benchmark 23 recent summarization models using the aforementioned automatic evaluation metrics; 3) we assemble the largest collection of summaries generated by models trained on the CNN/DailyMail news dataset and share it in a unified format; 4) we implement and share a toolkit that provides an extensible and unified API for evaluating summarization models across a broad range of automatic metrics; 5) we assemble and share the largest and most diverse (in terms of model types) collection of human judgments of model-generated summaries on the CNN/DailyMail dataset, annotated by both expert judges and crowd-source workers. We hope that this work will help promote a more complete evaluation protocol for text summarization and advance research on developing evaluation metrics that better correlate with human judgments.
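The core of re-evaluating an automatic metric is checking how well its scores agree with human judgments on the same summaries. The snippet below is an illustrative sketch of that meta-evaluation step with made-up scores; it does not use the SummEval toolkit's own API.

```python
# Illustrative metric meta-evaluation: how well does an automatic metric's
# ranking of summaries agree with human judgments? All scores here are made up.
from scipy.stats import kendalltau, pearsonr

human_scores = [4.5, 3.0, 2.5, 4.0, 1.5, 3.5]         # e.g., expert "relevance" ratings
metric_scores = [0.42, 0.35, 0.30, 0.39, 0.21, 0.33]  # e.g., ROUGE scores for the same summaries

tau, tau_p = kendalltau(metric_scores, human_scores)
r, r_p = pearsonr(metric_scores, human_scores)
print(f"Kendall tau = {tau:.2f} (p = {tau_p:.3f}); Pearson r = {r:.2f} (p = {r_p:.3f})")
```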
Sentiment Analysis (1 paper)
[1]: JUNLP@SemEval-2020 Task 9: Sentiment Analysis of Hindi-English Code-Mixed Data
Authors: Avishek Garain, Sainik Kumar Mahata, Dipankar Das
Link: https://arxiv.org/abs/2007.12561
Abstract: Code-mixing is a phenomenon which arises mainly in multilingual societies. Multilingual speakers who are well versed in their native languages and also speak English tend to code-mix, using English-based phonetic typing and inserting anglicisms into their main language. This linguistic phenomenon poses a great challenge to conventional NLP tasks such as Sentiment Analysis, Machine Translation, and Text Summarization, to name a few. In this work, we focus on working out a plausible solution for Code-Mixed Sentiment Analysis. This work was done as part of our participation in the SemEval-2020 SentiMix Task, where we focused on the sentiment analysis of English-Hindi code-mixed sentences. Our username for the submission was "sainik.mahata" and our team name was "JUNLP". We used feature extraction algorithms in conjunction with traditional machine learning methods such as SVR, tuned via grid search, in an attempt to solve the task. Our approach achieved an F1-score of 66.2% when evaluated using the metrics prepared by the organizers of the task.
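A rough sketch of the recipe described in the abstract, under the assumption of a TF-IDF feature extractor feeding an SVR whose hyper-parameters are chosen by grid search. The toy sentences and the parameter grid are illustrative, not the authors' exact configuration.

```python
# TF-IDF features + SVR, tuned with grid search (illustrative sketch only).
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVR

# Toy code-mixed sentences; the real data would be the SentiMix Hindi-English set.
texts = [
    "movie bahut accha tha, loved it",
    "kya bakwas film hai, waste of time",
    "theek thaak hi thi, nothing special",
    "acting shaandaar thi, brilliant work",
    "bilkul bekaar story, very boring",
    "average si movie, okay okay",
]
labels = [1.0, -1.0, 0.0, 1.0, -1.0, 0.0]   # positive / negative / neutral mapped to numbers

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(ngram_range=(1, 2))),   # feature extraction
    ("svr", SVR()),                                    # support vector regression
])
param_grid = {"svr__C": [0.1, 1.0, 10.0], "svr__kernel": ["linear", "rbf"]}

search = GridSearchCV(pipeline, param_grid, cv=2)      # grid search over SVR settings
search.fit(texts, labels)
print(search.best_params_)
print(search.predict(["film accha tha"]))              # > 0 leans positive, < 0 negative
```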
Models (1 paper)
[1]: Named entity recognition in chemical patents using ensemble of contextual language models
Authors: Jenny Copara, Nona Naderi, Julien Knafou, Patrick Ruch, Douglas Teodoro
Link: https://arxiv.org/abs/2007.12569
Abstract: Chemical patent documents describe a broad range of applications and hold key information, such as chemical compounds, reactions, and specific properties. However, this key information needs to be made accessible before it can be used in downstream tasks. Text mining provides a means to extract relevant information from chemical patents through information extraction techniques. As part of the Information Extraction task of the Cheminformatics Elsevier Melbourne University challenge, in this work we study the effectiveness of contextualized language models for extracting reaction information from chemical patents. We compare transformer architectures trained on a generic corpus with models specialised in chemistry patents, and propose a new model based on a combination of existing architectures. Our best model, based on the ensemble approach, achieves an exact F1-score of 92.30% and a relaxed F1-score of 96.24%. We show that an ensemble of contextualized language models provides an effective method for extracting information from chemical patents. As a next step, we will investigate the effect of transformer language models pre-trained on chemical patents.
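One simple way to ensemble token-level NER predictions is majority voting over the tags several models assign to the same token. The sketch below illustrates that generic idea; the tag set and model outputs are hypothetical, and the paper's actual combination scheme may differ.

```python
# Majority-vote ensembling of per-token NER tags (generic sketch, not the paper's code).
from collections import Counter

def majority_vote(tag_sequences: list[list[str]]) -> list[str]:
    """Combine per-model tag sequences (same tokenization) token by token."""
    ensembled = []
    for token_tags in zip(*tag_sequences):
        most_common_tag, _ = Counter(token_tags).most_common(1)[0]
        ensembled.append(most_common_tag)
    return ensembled

# Hypothetical outputs of three taggers for the tokens of one patent sentence.
model_a = ["O", "B-REACTION_STEP", "I-REACTION_STEP", "O", "B-YIELD_PERCENT"]
model_b = ["O", "B-REACTION_STEP", "O",               "O", "B-YIELD_PERCENT"]
model_c = ["O", "B-REACTION_STEP", "I-REACTION_STEP", "O", "O"]

print(majority_vote([model_a, model_b, model_c]))
# -> ['O', 'B-REACTION_STEP', 'I-REACTION_STEP', 'O', 'B-YIELD_PERCENT']
```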
Others (5 papers)
[1]: FiSSA at SemEval-2020 Task 9: Fine-tuned For Feelings
Authors: Bertelt Braaksma, Richard Scholtens, Stan van Suijlekom, Remy Wang, Ahmet Üstün
Notes: In Proceedings of the 14th International Workshop on Semantic Evaluation (SemEval-2020), Barcelona, Spain, December. Association for Computational Linguistics
Link: https://arxiv.org/abs/2007.12544
Abstract: In this paper, we present our approach for sentiment classification on Spanish-English code-mixed social media data in SemEval-2020 Task 9. We investigate the performance of various pre-trained Transformer models using different fine-tuning strategies. We explore both monolingual and multilingual models with the standard fine-tuning method. Additionally, we propose a custom model that we fine-tune in two steps: once with a language modeling objective, and once with a task-specific objective. Although two-step fine-tuning improves sentiment classification performance over the base model, the large multilingual XLM-RoBERTa model achieves the best macro F1-score, with 0.498 on development data and 0.739 on test data. With this score, our team jupitter placed tenth overall in the competition.
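The two-step fine-tuning idea can be sketched with the Transformers library roughly as follows: first adapt the pre-trained encoder with a masked-LM objective on task-domain text, then fine-tune the adapted weights with the classification objective. The model name, toy data, and hyper-parameters below are illustrative assumptions, not the paper's exact setup.

```python
# Two-step fine-tuning sketch: (1) masked-LM adaptation, (2) task-specific fine-tuning.
from datasets import Dataset
from transformers import (AutoModelForMaskedLM, AutoModelForSequenceClassification,
                          AutoTokenizer, DataCollatorForLanguageModeling,
                          Trainer, TrainingArguments)

model_name = "xlm-roberta-base"   # an illustrative multilingual encoder
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Toy code-mixed examples; the real data would be the SentiMix Spanish-English set.
texts = ["que cute this song", "no me gusta this traffic at all"]
labels = [2, 0]                   # e.g., 0 = negative, 1 = neutral, 2 = positive

def tokenize(batch):
    return tokenizer(batch["text"], truncation=True, max_length=64)

lm_ds = Dataset.from_dict({"text": texts}).map(tokenize, batched=True, remove_columns=["text"])
clf_ds = Dataset.from_dict({"text": texts, "labels": labels}).map(
    tokenize, batched=True, remove_columns=["text"])

# Step 1: fine-tune with a masked language modeling objective on task-domain text.
lm_model = AutoModelForMaskedLM.from_pretrained(model_name)
Trainer(
    model=lm_model,
    args=TrainingArguments(output_dir="lm-adapted", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=lm_ds,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm_probability=0.15),
).train()
lm_model.save_pretrained("lm-adapted")

# Step 2: fine-tune the adapted encoder with the task-specific (classification) objective.
clf_model = AutoModelForSequenceClassification.from_pretrained("lm-adapted", num_labels=3)
Trainer(
    model=clf_model,
    args=TrainingArguments(output_dir="clf", num_train_epochs=1,
                           per_device_train_batch_size=2),
    train_dataset=clf_ds,
    tokenizer=tokenizer,          # enables dynamic padding of classification batches
).train()
```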
[2]: MULTISEM at SemEval-2020 Task 3: Fine-tuning BERT for Lexical Meaning
Authors: Aina Garí Soler, Marianna Apidianaki
Notes: 8 pages, 2 tables. Accepted at the 14th International Workshop on Semantic Evaluation (SemEval-2020)
Link: https://arxiv.org/abs/2007.12432
Abstract: We present the MULTISEM systems submitted to SemEval 2020 Task 3: Graded Word Similarity in Context (GWSC). We experiment with injecting semantic knowledge into pre-trained BERT models through fine-tuning on lexical semantic tasks related to GWSC. We use existing semantically annotated datasets and propose to approximate similarity through automatically generated lexical substitutes in context. We participate in both GWSC subtasks and address two languages, English and Finnish. Our best English models occupy the third and fourth positions in the ranking for the two subtasks. Performance is lower for the Finnish models, which are mid-ranked in their respective subtasks, highlighting the important role of data availability for fine-tuning.
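One way to approximate in-context similarity with lexical substitutes, assuming the substitutes come from a masked language model's top predictions for the masked target word, is to compare the substitute sets produced for the word in two contexts. The sketch below uses Jaccard overlap as a crude similarity proxy and is not the authors' exact procedure.

```python
# Similarity via lexical substitutes: mask the target word, collect the masked LM's
# top predictions in each context, and compare the two substitute sets.
from transformers import pipeline

fill_mask = pipeline("fill-mask", model="bert-base-uncased")

def substitutes(sentence: str, target: str, k: int = 10) -> set:
    masked = sentence.replace(target, fill_mask.tokenizer.mask_token, 1)
    return {pred["token_str"].strip() for pred in fill_mask(masked, top_k=k)}

def substitute_overlap(sent_a: str, sent_b: str, target: str) -> float:
    subs_a, subs_b = substitutes(sent_a, target), substitutes(sent_b, target)
    return len(subs_a & subs_b) / len(subs_a | subs_b)   # Jaccard similarity

print(substitute_overlap("He sat on the bank of the river.",
                         "She deposited money at the bank.", "bank"))
```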
[3]: IDS at SemEval-2020 Task 10: Does Pre-trained Language Model Know What to Emphasize?
Authors: Jaeyoul Shin, Taeuk Kim, Sang-goo Lee
Link: https://arxiv.org/abs/2007.12390
Abstract: We propose a novel method that determines which words in written text for visual media deserve to be emphasized, relying only on information from the self-attention distributions of pre-trained language models (PLMs). With extensive experiments and analyses, we show that 1) our zero-shot approach is superior to a reasonable baseline that adopts TF-IDF, and 2) there exist several attention heads in PLMs specialized for emphasis selection, confirming that PLMs are capable of recognizing important words in sentences.
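A minimal sketch of the underlying idea: score each token of a sentence by how much attention it receives in one self-attention head of a pre-trained language model, and emphasize the top-scoring tokens. The layer/head choice and the scoring below are arbitrary illustrations, not the heads the paper identifies.

```python
# Zero-shot emphasis scoring from self-attention (illustrative, not the paper's exact method).
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

sentence = "Never give up on your dreams"
inputs = tokenizer(sentence, return_tensors="pt")
with torch.no_grad():
    attentions = model(**inputs).attentions          # tuple of [batch, heads, seq, seq] per layer

layer, head = 9, 3                                   # an arbitrary, illustrative choice
received = attentions[layer][0, head].sum(dim=0)     # total attention each token receives

tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
ranking = sorted(zip(tokens, received.tolist()), key=lambda x: -x[1])
print(ranking[:5])                                    # most "attended-to" tokens
```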
[4]: IR-BERT: Leveraging BERT for Semantic Search in Background Linking for News Articles
Authors: Anup Anand Deshmukh, Udhav Sethi
Notes: 6 pages, 6 figures
Link: https://arxiv.org/abs/2007.12603
Abstract: This work describes our two approaches for the background linking task of the TREC 2020 News Track. The main objective of this task is to recommend a list of relevant articles that the reader should refer to in order to understand the context and gain background information about the query article. Our first approach focuses on building an effective search query by combining weighted keywords extracted from the query document, and uses BM25 for retrieval. The second approach leverages the capability of SBERT (Reimers et al.) to learn contextual representations of the query in order to perform semantic search over the corpus. We empirically show that employing a language model benefits our approach in understanding the context as well as the background of the query article. The proposed approaches are evaluated on the TREC 2018 Washington Post dataset, and our best model outperforms the TREC median as well as the highest-scoring model of 2018 in terms of the nDCG@5 metric. We further propose a diversity measure to evaluate the effectiveness of the various approaches in retrieving a diverse set of documents. This would potentially motivate researchers to work on introducing diversity into their recommended lists. We have open-sourced our implementation on GitHub and plan to submit our runs for the background linking task at TREC 2020.
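The semantic-search side of the second approach can be sketched as follows: encode the query article and candidate articles with an SBERT model and rank candidates by cosine similarity. The checkpoint name and toy corpus are placeholders, not the configuration used in the paper.

```python
# SBERT semantic search sketch: embed query and candidates, rank by cosine similarity.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")    # any SBERT checkpoint works here

query_article = "Protests continue in the capital over the new education bill."
corpus = [
    "Lawmakers debate the education bill amid growing opposition.",
    "Local team wins the championship after a dramatic final.",
    "A history of education reform and the movements behind it.",
]

query_emb = model.encode(query_article, normalize_embeddings=True)
corpus_emb = model.encode(corpus, normalize_embeddings=True)

scores = corpus_emb @ query_emb                     # cosine similarity (embeddings normalized)
for idx in np.argsort(-scores):
    print(f"{scores[idx]:.3f}  {corpus[idx]}")
```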
[5]: The Lottery Ticket Hypothesis for Pre-trained BERT Networks
Authors: Tianlong Chen, Jonathan Frankle, Shiyu Chang, Sijia Liu, Yang Zhang, Zhangyang Wang, Michael Carbin
Link: https://arxiv.org/abs/2007.12223
Abstract: In natural language processing (NLP), enormous pre-trained models like BERT have become the standard starting point for training on a range of downstream tasks, and similar trends are emerging in other areas of deep learning. In parallel, work on the lottery ticket hypothesis has shown that models for NLP and computer vision contain smaller matching subnetworks capable of training in isolation to full accuracy and transferring to other tasks. In this work, we combine these observations to assess whether such trainable, transferable subnetworks exist in pre-trained BERT models. For a range of downstream tasks, we indeed find matching subnetworks at 40% to 90% sparsity. We find these subnetworks at (pre-trained) initialization, a deviation from prior NLP research where they emerge only after some amount of training. Subnetworks found on the masked language modeling task (the same task used to pre-train the model) transfer universally; those found on other tasks transfer in a limited fashion, if at all. As large-scale pre-training becomes an increasingly central paradigm in deep learning, our results demonstrate that the main lottery ticket observations remain relevant in this context. Code is available at this https URL.
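As a loose illustration of what a sparse "matching subnetwork" looks like, the sketch below applies one-shot, per-layer magnitude pruning to the Linear layers of a pre-trained BERT. The paper itself finds subnetworks with iterative magnitude pruning and rewinding, so this is only a simplified stand-in.

```python
# Toy per-layer magnitude pruning of a pre-trained BERT (not the paper's IMP procedure).
import torch
from transformers import AutoModel

model = AutoModel.from_pretrained("bert-base-uncased")
sparsity = 0.6                                      # fraction of weights to remove per layer

for module in model.modules():
    if isinstance(module, torch.nn.Linear):
        weight = module.weight.data
        k = int(sparsity * weight.numel())
        if k == 0:
            continue
        threshold = weight.abs().flatten().kthvalue(k).values
        mask = (weight.abs() > threshold).float()    # 1 = keep, 0 = prune
        module.weight.data.mul_(mask)

kept = sum((m.weight != 0).sum().item() for m in model.modules()
           if isinstance(m, torch.nn.Linear))
total = sum(m.weight.numel() for m in model.modules() if isinstance(m, torch.nn.Linear))
print(f"kept {kept / total:.1%} of Linear weights")
```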