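The full list of ACL 2021 accepted papers has been released. A couple of days ago I compiled the ACL 2021 main conference papers together with their PDF links; for details, see: ACL 2021 Main Conference Paper Roundup and Categorization.
At the request of some readers, I have also categorized the ACL 2021 Findings papers and attached the corresponding paper links.
The papers fall into 10 categories: (1) pretrained language models and applications (40 papers); (2) representation learning (8 papers); (3) question answering and retrieval (20 papers); (4) text generation (41 papers); (5) summarization (19 papers); (6) few-shot learning (10 papers); (7) dialogue (18 papers); (8) sentiment and emotion analysis (12 papers); (9) information extraction (46 papers); (10) other (47 papers).
Compiling this list took real effort, so please follow, share, and like. You are also welcome to follow me on Zhihu at 「刘聪NLP」, and anyone with questions is welcome to message me privately on WeChat.
Our motto is "as long as you live, never stop learning", so even though it is the weekend, we still have to grind.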
I. Pretrained Language Models and Applications
Long
(1)LV-BERT: Exploiting Layer Variety for BERT
https://arxiv.org/abs/2106.11740
(2)Joint Optimization of Tokenization and Downstream Model
https://arxiv.org/abs/2105.12410
(3)How does Attention Affect the Model?
(4)AMBERT: A Pre-trained Language Model with Multi-Grained Tokenization
https://arxiv.org/abs/2008.11869
(5)Adapt-and-Distill: Developing Small, Fast and Effective Pretrained Language Models for Domains
https://arxiv.org/abs/2106.13474
(6)On Commonsense Cues in BERT for Solving Commonsense Tasks
https://arxiv.org/abs/2008.03945
(7)MusicBERT: Symbolic Music Understanding with Large-Scale Pre-Training
https://arxiv.org/abs/2106.05630
(8)RealFormer: Transformer Likes Residual Attention
https://arxiv.org/abs/2012.11747
(9)Out of Order: How important is the sequential order of words in a sentence in Natural Language Understanding tasks?
https://arxiv.org/abs/2012.15180
(10)LICHEE: Improving Language Model Pre-training with Multi-grained Tokenization
(11)K-Adapter: Infusing Knowledge into Pre-Trained Models with Adapters
https://arxiv.org/abs/2002.01808
(12)BERT-Defense: A Probabilistic Model Based on BERT to Combat Cognitively Inspired Orthographic Adversarial Attacks
https://arxiv.org/abs/2106.01452
(13)CiteWorth: Cite-Worthiness Detection for Improved Scientific Document Understanding
https://arxiv.org/abs/2105.10912
(14)MiniLMv2: Multi-Head Self-Attention Relation Distillation for Compressing Pretrained Transformers
https://arxiv.org/abs/2012.15828
(15)Attending via both Fine-tuning and Compressing
(16)On the Interplay Between Fine-tuning and Composition in Transformers
https://arxiv.org/abs/2105.14668
(17)Training ELECTRA Augmented with Multi-word Selection
https://arxiv.org/abs/2106.00139
(18)Latent Reasoning for Low-Resource Question Generation
https://arxiv.org/abs/2106.07285
(19)Defending Pre-trained Language Models from Adversarial Word Substitutions Without Performance Sacrifice
https://arxiv.org/abs/2105.14553
(20)BERT-Proof Syntactic Structures: Investigating Errors in Discontinuous Constituency Parsing
(21)BERT Busters: Outlier Dimensions that Disrupt Transformers
https://arxiv.org/abs/2105.06990
(22)“We will Reduce Taxes” - Identifying Election Pledges with Language Models
(23)Are Larger Pretrained Language Models Uniformly Better? Comparing Performance at the Instance Level
https://arxiv.org/abs/2105.06020
(24)Memory-Efficient Differentiable Transformer Architecture Search
https://arxiv.org/abs/2105.14669
(25)MLMLM: Link Prediction with Mean Likelihood Masked Language Model
https://arxiv.org/abs/2009.07058
(26)Learning to Sample Replacements for ELECTRA Pre-Training
https://arxiv.org/abs/2106.13715
(27)Using Social and Linguistic Information to Adapt Pretrained Representations for Political Perspective Identification
(28)Fingerprinting Fine-tuned Language Models in the Wild
https://arxiv.org/abs/2106.01703
(29)EBERT: Efficient BERT Inference with Dynamic Structured Pruning
(30)Language Models Use Monotonicity to Assess NPI Licensing
https://arxiv.org/abs/2105.13818
Short
(31)Improving BERT with Syntax-aware Local Attention
https://arxiv.org/abs/2012.15150
(32)BertGCN: Transductive Text Classification by Combining GNN and BERT
https://arxiv.org/abs/2105.05727
(33)Fusing Label Embedding into BERT: An Efficient Improvement for Text Classification
(34)MA-BERT: Learning Representation by Incorporating Multi-Attribute Knowledge in Transformers
(35)Enhancing Language Generation with Effective Checkpoints of Pre-trained Language Model
(36)DoT: An efficient Double Transformer for NLP tasks with tables
https://arxiv.org/abs/2106.00479
(37)Effective Attention Sheds Light On Interpretability
https://arxiv.org/abs/2105.08855
(38)On the Distribution, Sparsity, and Inference-time Quantization of Attention Values in Transformers
https://arxiv.org/abs/2106.01335
(39)One Teacher is Enough? Pre-trained Language Model Distillation from Multiple Teachers
https://arxiv.org/abs/2106.01023
(40)Task-adaptive Pre-training of Language Models with Word Embedding Regularization
II. Representation Learning
Long
(1)Incorporating Global Information in Local Attention for Knowledge Representation Learning
(2)An Evaluation of Disentangled Representation Learning for Texts
(3)Evaluating Word Embeddings with Categorical Modularity
https://arxiv.org/abs/2106.00877
(4)Language-Mediated, Object-Centric Representation Learning
https://arxiv.org/abs/2012.15814
(5)Biomedical Interpretable Entity Representations
https://arxiv.org/abs/2106.09502
(6)Verb Sense Clustering using Contextualized Word Representations for Semantic Frame Induction
https://arxiv.org/abs/2105.13465
(7)Disentangled Code Representation Learning for Multiple Programming Languages
Short
(8)RetroGAN: A Cyclic Post-Specialization System for Improving Out-of-Knowledge and Rare Word Representations
III. Question Answering and Retrieval
Long
(1)Explainable Inference Over Grounding-Abstract Chains for Science Questions
http://publications.idiap.ch/downloads/papers/2021/Thayaparan_ACL-IJCNLP_2021.pdf
(2)REPT: Bridging Language Models and Machine Reading Comprehension via Retrieval-Based Pre-training
https://arxiv.org/abs/2105.04201
(3)Deep Cognitive Reasoning Network for Multi-hop Question Answering over Knowledge Graphs
(4)Contrastive Fine-tuning Improves Robustness for Neural Rankers
https://arxiv.org/abs/2105.12932
(5)TellMeWhy: A Dataset for Answering Why-Questions in Narratives
https://arxiv.org/abs/2106.06132
(6)Weakly Supervised Pre-Training for Multi-Hop Retriever
https://arxiv.org/abs/2106.09983
(7)Why Machine Reading Comprehension Models Learn Shortcuts?
https://arxiv.org/abs/2106.01024
(8)Dynamic Semantic Graph Construction and Reasoning for Explainable Multi-hop Science Question Answering
https://arxiv.org/abs/2105.11776
(9)GCRC: A New Challenging MRC Dataset from Gaokao Chinese for Explainable Evaluation
(10)Knowing More About Questions Can Help: Improving Calibration in Question Answering
https://arxiv.org/abs/2106.01494
(11)PAIR: Leveraging Passage-Centric Similarity Relation for Improving Dense Passage Retrieval
(12)Knowledge-Empowered Representation Learning for Chinese Medical Reading Comprehension: Task, Model and Resources
https://arxiv.org/abs/2008.10327
(13)Self-Supervised Document Similarity Ranking via Contextualized Language Models and Hierarchical Inference
https://arxiv.org/abs/2106.01186
(14)Disfl-QA: A Benchmark Dataset for Understanding Disfluencies in Question Answering
https://arxiv.org/abs/2106.04016
(15)Leveraging Abstract Meaning Representation for Knowledge Base Question Answering
https://arxiv.org/abs/2012.01707
(16)Cluster-Former: Clustering-based Sparse Transformer for Question Answering
https://arxiv.org/abs/2009.06097
Short
(17)Reader-Guided Passage Reranking for Open-Domain Question Answering
https://arxiv.org/abs/2101.00294
(18)Benchmarking Robustness of Machine Reading Comprehension Models
https://arxiv.org/abs/2004.14004
(19)Fusing Context Into Knowledge Graph for Commonsense Question Answering
https://arxiv.org/abs/2012.04808
(20)Answer Generation for Retrieval-based Question Answering Systems
https://arxiv.org/abs/2106.00955
IV. Text Generation
Long
(1)Generate, Prune, Select: A Pipeline for Counterspeech Generation against Online Hate Speech
https://arxiv.org/abs/2106.01625
(2)Contrastive Attention for Automatic Chest X-ray Report Generation
https://arxiv.org/abs/2106.06965
(3)GLGE: A New General Language Generation Evaluation Benchmark
https://arxiv.org/abs/2011.11928
(4)Keep the Primary, Rewrite the Secondary: A Two-Stage Approach for Paraphrase Generation
https://aclanthology.org/2020.acl-main.535.pdf
(5)CoMAE: A Multi-factor Hierarchical Framework for Empathetic Response Generation
https://arxiv.org/abs/2105.08316
(6)UniKeyphrase: A Unified Extraction and Generation Framework for Keyphrase Prediction
https://arxiv.org/abs/2106.04847
(7)Towards Knowledge-Grounded Counter Narrative Generation for Hate Speech
https://arxiv.org/abs/2106.11783
(8)Promoting Graph Awareness in Linearized Graph-to-Text Generation
https://arxiv.org/abs/2012.15793
(9)Unsupervised Knowledge Selection for Dialogue Generation
(10)On-the-Fly Attention Modulation for Neural Generation
https://arxiv.org/abs/2101.00371
(11)Detecting Hallucinated Content in Conditional Neural Sequence Generation
https://arxiv.org/abs/2011.02593
(12)PsyQA: A Chinese Dataset for Generating Long Counseling Text for Mental Health Support
https://arxiv.org/abs/2106.01702
(13)Learning to Generate Questions by Learning to Recover Answer-containing Sentences
https://openreview.net/forum?id=PRr_3HPakQ
(14)Few-shot Knowledge Graph-to-Text Generation with Pretrained Language Models
https://arxiv.org/abs/2106.01623
(15)Counter-Argument Generation by Attacking Weak Premises
(16)Automatic Document Sketching: Generating Drafts from Analogous Texts
https://arxiv.org/abs/2106.07192
(17)REAM: An Enhancement Approach to Reference-based Evaluation Metrics for Open-domain Dialog Generation
https://arxiv.org/abs/2105.14488
(18)JointGT: Graph-Text Joint Representation Learning for Text Generation from Knowledge Graphs
https://arxiv.org/abs/2106.10502
(19)Automatic Text Simplification for Social Good: Progress and Challenges
(20)Knowledge-Grounded Dialogue Generation with Term-level De-noising
(21)Latent Reasoning for Low-Resource Question Generation
(22)Provably Secure Generative Linguistic Steganography
https://arxiv.org/abs/2106.02011
(23)Generating Informative Conclusions for Argumentative Texts
https://arxiv.org/abs/2106.01064
(24)ProofWriter: Generating Implications, Proofs, and Abductive Statements over Natural Language
https://arxiv.org/abs/2012.13048
(25)A Non-Autoregressive Edit-Based Approach to Controllable Text Simplification
(26)Logic-Consistency Text Generation from Semantic Parses
(27)An Investigation of Suitability of Pre-Trained Language Models for Dialogue Generation – Avoiding Discrepancies
(28)He is very intelligent, she is very beautiful? On Mitigating Social Biases in Language Modelling and Generation
(29)Investigating Memorization of Conspiracy Theories in Text Generation
https://arxiv.org/abs/2101.00379
(30)Cross-Domain Review Generation for Aspect-Based Sentiment Analysis
(31)Sketch and Refine: Towards Faithful and Informative Table-to-Text Generation
https://arxiv.org/abs/2105.14778
(32)TILGAN: Transformer-based Implicit Latent GAN for Diverse and Coherent Text Generation
(33)Generalized Supervised Attention for Text Generation
(34)Elaborative Simplification: Content Addition and Explanation Generation in Text Simplification
https://arxiv.org/abs/2010.10035
Short
(35)Investigating Text Simplification Evaluation
(36)Grammar-Based Patches Generation for Automated Program Repair
(37)Structure-Aware Pre-Training for Table-to-Text Generation
(38)Stylized Story Generation with Style-Guided Planning
https://arxiv.org/abs/2105.08625
(39)Retrieval Enhanced Model for Commonsense Generation
https://arxiv.org/abs/2105.11174
(40)Summary Grounded Conversation Generation
https://arxiv.org/abs/2106.03337
(41)BioGen: Generating Biography Summary under Table Guidance on Wikipedia
V. Summarization
Long
(1)Entity-Aware Abstractive Multi-Document Summarization
(2)TransSum: Translating Aspect and Sentiment Embeddings for Self-Supervised Opinion Summarization
(3)Code Summarization with Structure-induced Transformer
https://arxiv.org/abs/2012.14710
(4)Improving Unsupervised Extractive Summarization with Facet-Aware Modeling
(5)Contrastive Aligned Joint Learning for Multilingual Summarization
(6)Learning Sequential and Structural Information for Source Code Summarization
(7)A Joint Model for Structure-based News Genre Classification with Application to Text Summarization
(8)To Point or Not to Point: Understanding How Abstractive Summarizers Paraphrase Text
https://arxiv.org/abs/2106.01581
(9)AgreeSum: Agreement-Oriented Multi-Document Summarization
https://arxiv.org/abs/2106.02278
(10)How well do you know your summarization datasets?
https://arxiv.org/abs/2106.11388
(11)XL-Sum: Large-Scale Multilingual Abstractive Summarization for 44 Languages
https://arxiv.org/abs/2106.13822
(12)Word Graph Guided Summarization for Radiology Findings
(13)DialogSum: A Real-Life Scenario Dialogue Summarization Dataset
https://arxiv.org/abs/2105.06762
(14)Controllable Abstractive Dialogue Summarization with Sketch Supervision
https://arxiv.org/abs/2105.14064
Short
(15)LenAtten: An Effective Length Controlling Unit For Text Summarization
https://arxiv.org/abs/2106.00316
(16)GO FIGURE: A Meta Evaluation of Factuality in Summarization
https://arxiv.org/abs/2010.12834
(17)Is Human Scoring the Best Criteria for Summary Evaluation?
https://arxiv.org/abs/2012.14602
(18)Improve Query Focused Abstractive Summarization by Incorporating Answer Relevance
https://arxiv.org/abs/2105.12969
(19)Highlight-Transformer: Leveraging Key Phrase Aware Attention to Improve Abstractive Multi-Document Summarization
VI. Few-Shot Learning
Long
(1)Minimax and Neyman-Pearson Meta-Learning for Outlier Languages
https://arxiv.org/abs/2106.01051
(2)Meta-Learning Adversarial Domain Adaptation Network for Few-Shot Text Classification
(3)Bi-Granularity Contrastive Learning for Post-Training in Few-Shot Scene
https://arxiv.org/abs/2106.02327
(4)Don’t Miss the Labels: Label-semantic Augmented Meta-Learner for Few-Shot Text Classification
(5)Learning to Bridge Metric Spaces: Few-shot Joint Learning of Intent Detection and Slot Filling
https://arxiv.org/abs/2106.07343
(6)Reordering Examples Helps during Priming-based Few-Shot Learning
https://arxiv.org/abs/2106.01751
Short
(7)Frustratingly Simple Few-Shot Slot Tagging
(8)Zero-shot Medical Entity Retrieval without Annotation: Learning From Rich Knowledge Graph Semantics
https://arxiv.org/abs/2105.12682
(9)Enhancing Zero-shot and Few-shot Stance Detection with Commonsense Knowledge Graph
(10)Few-Shot Upsampling for Protest Size Detection
https://arxiv.org/abs/2105.11260
VII. Dialogue
Long
(1)Retrieve & Memorize: Dialog Policy Learning with Multi-Action Memory
https://arxiv.org/abs/2106.02317
(2)Dialogue in the Wild: Learning from a Deployed Role-Playing Game with Humans and Bots
(3)Scheduled Dialog Policy Learning: An Automatic Curriculum Learning Framework for Task-oriented Dialog System
(4)Exploring the Role of Context in Utterance-level Emotion, Act and Intent Classification in Conversations: An Empirical Study
https://declare-lab.net/assets/pdfs/dialogue-understanding-acl2021-findings.pdf
(5)HyKnow: End-to-End Task-Oriented Dialog Modeling with Hybrid Knowledge Management
https://arxiv.org/abs/2105.06041
(6)Gaussian Process based Deep Dyna-Q approach for Dialogue Policy Learning
(7)High-Quality Dialogue Diversification by Intermittent Short Extension Ensembles
https://arxiv.org/abs/2106.00891
(8)GRICE: A Grammar-based Dataset for Recovering Implicature and Conversational rEasoning
(9)Dialogue-oriented Pre-training
https://arxiv.org/abs/2106.00420
(10)Decoupled Dialogue Modeling and Semantic Parsing for Multi-Turn Text-to-SQL
https://arxiv.org/abs/2106.02282
(11)Probabilistic Graph Reasoning for Natural Proof Generation
https://arxiv.org/abs/2012.14827
(12)Synthesizing Adversarial Negative Responses for Robust Response Ranking and Evaluation
https://arxiv.org/abs/2106.05894
(13)Enhancing the Open-Domain Dialogue Evaluation in Latent Space
(14)Slot Transferability for Cross-domain Slot Filling
(15)What Did You Refer to? Evaluating Co-References in Dialogue
Short
(16)Assessing Dialogue Systems with Distribution Distances
https://arxiv.org/abs/2105.02573
(17)Improving Automated Evaluation of Open Domain Dialog via Diverse Reference Augmentation
https://arxiv.org/abs/2106.02833
(18)Constraint based Knowledge Base Distillation in End-to-End Task Oriented Dialogs
https://www.cse.iitd.ac.in/~mausam/papers/aclfindings21.pdf
VIII. Sentiment and Emotion Analysis
Long
(1)DNN-driven Gradual Machine Learning for Aspect-term Sentiment Analysis
https://chenbenben.org/paper/driven-DNN-GML.pdf
(2)Dynamic and Multi-Channel Graph Convolutional Networks for Aspect-Based Sentiment Analysis
(3)Making Flexible Use of Subtasks: A Multiplex Interaction Network for Unified Aspect-based Sentiment Analysis
(4)Detecting Domain Polarity-Changes of Words in a Sentiment Lexicon
https://arxiv.org/abs/2004.14357
(5)Multi-Task Learning and Adapted Knowledge Models for Emotion-Cause Extraction
https://arxiv.org/abs/2106.09790
(6)Who Blames or Endorses Whom? Entity-to-Entity Directed Sentiment Extraction in News Text
https://arxiv.org/abs/2106.01033
(7)Automatically Select Emotion for Response via Personality-affected Emotion Transition
Short
(8)Boundary Detection with BERT for Span-level Emotion Cause Analysis
(9)Exploiting Position Bias for Robust Aspect Sentiment Classification
https://arxiv.org/abs/2105.14210
(10)Jointly Identifying Rhetoric and Implicit Emotions via Multi-Task Learning
(11)UserAdapter: Few-Shot User Learning in Sentiment Analysis
(12)Modulating Language Models with Emotions
IX. Information Extraction
Long
(1)Few-Shot Event Detection with Prototypical Amortized Conditional Random Field
https://arxiv.org/abs/2012.02353
(2)From What to Why: Improving Relation Extraction with Rationale Graph
(3)CasEE: A Joint Learning Framework with Cascade Decoding for Overlapping Event Extraction
(4)Spatial Dependency Parsing for Semi-Structured Document Information Extraction
https://arxiv.org/abs/2005.00642
(5)SIRE: Separate Intra- and Inter-sentential Reasoning for Document-level Relation Extraction
https://arxiv.org/abs/2106.01709
(6)KGPool: Dynamic Knowledge Graph Context Selection for Relation Extraction
https://arxiv.org/abs/2106.00459
(7)A Dialogue-based Information Extraction System for Medical Insurance Assessment
(8)Manual Evaluation Matters: Reviewing Test Protocols of Distantly Supervised Relation Extraction
https://arxiv.org/abs/2105.09543
(9)Zero-shot Label-Aware Event Trigger and Argument Classification
(11)MRN: A Locally and Globally Mention-Based Reasoning Network for Document-Level Relation Extraction
(12)Semantic and Syntactic Enhanced Aspect Sentiment Triplet Extraction
https://arxiv.org/abs/2106.03315
(13)Target-oriented Fine-tuning for Zero-Resource Named Entity Recognition
https://aclanthology.org/2020.repl4nlp-1.1.pdf
(14)Event Detection as Graph Parsing
(15)Toward Fully Exploiting Heterogeneous Corpus: A Decoupled Named Entity Recognition Model with Two-stage Training
(16)Discriminative Reasoning for Document-level Relation Extraction
https://arxiv.org/abs/2106.01562
(17)Template-Based Named Entity Recognition Using BART
https://arxiv.org/abs/2106.01760
(18)Trade the Event: Corporate Events Detection for News-Based Event-Driven Trading
https://arxiv.org/abs/2105.12825
(19)Improving Event Causality Identification via Self-Supervised Representation Learning on External Causal Statement
https://arxiv.org/abs/2106.01654
(20)DocOIE: A Document-level Context-Aware Dataset for OpenIE
https://arxiv.org/abs/2105.04271
(21)Adaptive Knowledge-Enhanced Bayesian Meta-Learning for Few-shot Event Detection
https://arxiv.org/abs/2105.09509
(22)A Multi-Level Attention Model for Evidence-Based Fact Checking
https://arxiv.org/abs/2106.00950
(23)Relation Extraction with Type-aware Map Memories of Word Dependencies
(24)H-FND: Hierarchical False-Negative Denoising for Distant Supervision Relation Extraction
https://arxiv.org/abs/2012.03536
(25)Paths to Relation Extraction through Semantic Structure
(26)GrantRel: Grant Information Extraction via Joint Entity and Relation Extraction
(27)Understanding Feature Focus in Multitask Settings for Lexico-semantic Relation Identification
(28)HacRED: A Large-Scale Relation Extraction Dataset Toward Hard Cases in Practical Applications
(29)Adjacency List Oriented Relational Fact Extraction via Adaptive Multi-task Learning
https://arxiv.org/abs/2106.01559
(30)Effective Cascade Dual-Decoder Model for Joint Entity and Relation Extraction
https://arxiv.org/abs/2106.14163
(31)Named Entity Recognition through Deep Representation Learning and Weak Supervision
(32)The Utility and Interplay of Gazetteers and Entity Segmentation for Named Entity Recognition in English
(33)Unsupervised Domain Adaptation for Event Detection using Domain-specific Adapters
(34)HySPA: Hybrid Span Generation for Scalable Text-to-Graph Extraction
https://arxiv.org/abs/2106.15838
(35)Corpus-Level Evaluation for Event QA: The IndiaPoliceEvents Corpus Covering the 2002 Gujarat Violence
https://arxiv.org/abs/2105.12936
(36)Constrained Labeled Data Generation for Low-Resource Named Entity Recognition
(37)Modeling Event-Pair Relations in External Knowledge Graphs for Script Reasoning
(38)Revisiting the Evaluation of End-to-end Event Extraction
(39)Named Entity Recognition via Noise Aware Training Mechanism with Data Filter
(40)A Multi-Task Approach for Improving Biomedical Named Entity Recognition by Incorporating Multi-Granularity information
Short
(41)Relation Classification with Entity Type Restriction
https://arxiv.org/abs/2105.08393
(42)Injecting Knowledge Base Information into End-to-End Joint Entity and Relation Extraction and Coreference Resolution
(43)Event Extraction from Historical Texts: A New Dataset for Black Rebellions
(44)A Neural Edge-Editing Approach for Document-Level Relation Graph Extraction
https://arxiv.org/abs/2106.09900
(45)Neural Entity Recognition with Gazetteer based Fusion
https://arxiv.org/abs/2105.13225
(46)Enhancing Dialogue-based Relation Extraction by Speaker and Trigger Words Prediction
X. Other
Long
(1)LUX (Linguistic aspects Under eXamination): Discourse Analysis for Automatic Fake News Classification
(2)The Authors Matter: Understanding and Mitigating Implicit Bias in Deep Text Classification
https://arxiv.org/abs/2105.02778
(3)SyGNS: A Systematic Generalization Testbed Based on Natural Language Semantics
https://arxiv.org/abs/2106.01077
(4)WikiTableT: A Large-Scale Data-to-Text Dataset for Generating Wikipedia Article Sections
https://arxiv.org/abs/2012.14919
(5)Enhancing Transformers with Gradient Boosted Decision Trees for NLI Fine-Tuning
https://arxiv.org/abs/2105.03791
(6)Empirical Error Modeling Improves Robustness of Noisy Neural Sequence Labeling
https://arxiv.org/abs/2105.11872
(7)SOLID: A Large-Scale Semi-Supervised Dataset for Offensive Language Identification
https://arxiv.org/abs/2004.14454
(8)A Survey of Data Augmentation Approaches for NLP
https://arxiv.org/abs/2105.03075
(9)Sensei: Self-Supervised Sensor Name Segmentation
https://arxiv.org/abs/2101.00130
(10)Addressing Inquiries about History: An Efficient and Practical Framework for Evaluating Open-domain Chatbot Consistency
https://arxiv.org/abs/2106.02228
(11)Enhancing Label Correlation Feedback in Multi-Label Text Classification via Multi-Task Learning
https://arxiv.org/abs/2106.03103
(12)Controlling Text Edition by Changing Answers of Specific Questions
https://arxiv.org/abs/2105.11018
(13)Global Attention Decoder for Chinese Spelling Error Correction
(14)Improving Gradient-based Adversarial Training for Text Classification by Contrastive Learning and Auto-Encoder
(15)KACC: A Multi-task Benchmark for Knowledge Abstraction, Concretization and Completion
https://arxiv.org/abs/2004.13631
(16)A Query-Driven Topic Model
https://arxiv.org/abs/2106.07346
(17)Structured Refinement for Sequential Labeling
(18)Empowering Language Understanding with Counterfactual Reasoning
https://arxiv.org/abs/2106.03046
(19)Correcting Chinese Spelling Errors with Phonetic Pre-training
(20)Dynamic Connected Networks for Chinese Spelling Check
(21)PLATO-2: Towards Building an Open-Domain Chatbot via Curriculum Learning
https://arxiv.org/abs/2006.16779
(22)Joint Multi-Decoder Framework with Hierarchical Pointer Network for Frame Semantic Parsing
(23)On Sparsifying Encoder Outputs in Sequence-to-Sequence Models
https://arxiv.org/abs/2004.11854
(24)Inspecting the concept knowledge graph encoded by modern language models
https://arxiv.org/abs/2105.13471
(25)How Good Is NLP? A Sober Look at NLP Tasks through the Lens of Social Impact
https://arxiv.org/abs/2106.02359
(26)Detecting Bot-Generated Text by Characterizing Linguistic Accommodation in Human-Bot Interactions
https://arxiv.org/abs/2106.01170
(27)Does Robustness Improve Fairness? Approaching Fairness with Word Substitution Robustness Methods for Text Classification
https://arxiv.org/abs/2106.10826
(28)Substructure Substitution: Structured Data Augmentation for NLP
https://arxiv.org/abs/2101.00411
(29)Not Far Away, Not So Close: Sample Efficient Nearest Neighbour Data Augmentation via MiniMax
https://arxiv.org/abs/2105.13608
(30)The interplay between language similarity and script on a novel multi-layer Algerian dialect corpus
https://arxiv.org/abs/2105.07400
(31)Explaining NLP Models via Minimal Contrastive Editing (MiCE)
https://arxiv.org/abs/2012.13985
(32)LEWIS: Levenshtein Editing for Unsupervised Text Style Transfer
https://arxiv.org/abs/2105.08206
(33)Analyzing Stereotypes in Generative Text Inference Tasks
(34)Unsupervised Label Refinement Improves Dataless Text Classification
https://arxiv.org/abs/2012.04194
(35)Prompting Contrastive Explanations for Commonsense Reasoning Tasks
https://arxiv.org/abs/2106.06823
(36)Marked Attribute Bias in Natural Language Inference
(37)Effective Batching for Recurrent Neural Network Grammars
https://arxiv.org/abs/2105.14822
(38)DocNLI: A Large-scale Dataset for Document-level Natural Language Inference
https://arxiv.org/abs/2106.09449
Short
(39)CoDesc: A Large Code-Description Parallel Dataset
https://arxiv.org/abs/2105.14220
(40)Decoupling Adversarial Training for Fair NLP
(41)Better Robustness by More Coverage: Adversarial and Mixup Data Augmentation for Robust Finetuning
https://arxiv.org/abs/2012.15699
(42)SSMix: Saliency-Based Span Mixup for Text Classification
https://arxiv.org/abs/2106.08062
(43)Grammatical Error Correction as GAN-like Sequence Labeling
https://arxiv.org/abs/2105.14209
(44)Figurative Language in Recognizing Textual Entailment
https://arxiv.org/abs/2106.01195
(45)Modeling the Unigram Distribution
https://arxiv.org/abs/2106.02289
(46)Sequence Models for Computational Etymology of Borrowings
(47)Do Grammatical Error Correction Models Realize Grammatical Generalization?