Artificial Intelligence Track 0311 - LingLab

Uploaded: 2021-03-26 17:57:17

Source: 言语语言病理学 (Speech-Language Pathology)

Today's cs.AI listing contains 32 papers in total.

Artificial Intelligence (20 papers)

[1]: Using Cognitive Models to Train Warm Start Reinforcement Learning Agents for Human-Computer Interactions
Authors: Chao Zhang, Shihan Wang, Henk Aarts, Mehdi Dastani
Link: https://arxiv.org/abs/2103.06160
Abstract: Reinforcement learning (RL) agents in human-computer interaction applications require repeated user interactions before they can perform well. To address this "cold start" problem, we propose a novel approach of using cognitive models to pre-train RL agents before they are applied to real users. After briefly reviewing relevant cognitive models, we present our general methodological approach, followed by two case studies from our previous and ongoing projects. We hope this position paper stimulates conversations between RL, HCI, and cognitive science researchers in order to explore the full potential of the approach.

[2]: The whole brain architecture approach: Accelerating the development of artificial general intelligence by referring to the brain
Authors: Hiroshi Yamakawa
Comments: 28 pages, 10 figures, preprint submitted to Neural Networks
Link: https://arxiv.org/abs/2103.06123
Abstract: The vastness of the design space created by the combination of a large number of computational mechanisms, including machine learning, is an obstacle to creating an artificial general intelligence (AGI). Brain-inspired AGI development, in other words, cutting down the design space to look more like a biological brain, which is an existing model of a general intelligence, is a promising plan for solving this problem. However, it is difficult for an individual to design a software program that corresponds to the entire brain because the neuroscientific data required to understand the architecture of the brain are extensive and complicated. The whole-brain architecture approach divides the brain-inspired AGI development process into the task of designing the brain reference architecture (BRA) -- the flow of information and the diagram of corresponding components -- and the task of developing each component using the BRA. This is called BRA-driven development. Another difficulty lies in the extraction of the operating principles necessary for reproducing the cognitive-behavioral function of the brain from neuroscience data. Therefore, this study proposes the Structure-constrained Interface Decomposition (SCID) method, which is a hypothesis-building method for creating a hypothetical component diagram consistent with neuroscientific findings. The application of this approach has begun for building various regions of the brain. Moving forward, we will examine methods of evaluating the biological plausibility of brain-inspired software. This evaluation will also be used to prioritize different computational mechanisms, which should be merged, associated with the same regions of the brain.

[3]: A Two-stage Framework and Reinforcement Learning-based Optimization Algorithms for Complex Scheduling Problems
Authors: Yongming He, Guohua Wu, Yingwu Chen, Witold Pedrycz
Link: https://arxiv.org/abs/2103.05847
Abstract: There hardly exists a general solver that is efficient for scheduling problems due to their diversity and complexity. In this study, we develop a two-stage framework, in which reinforcement learning (RL) and traditional operations research (OR) algorithms are combined to efficiently deal with complex scheduling problems. The scheduling problem is solved in two stages, including a finite Markov decision process (MDP) and a mixed-integer programming process, respectively. This offers a novel and general paradigm that combines RL with OR approaches to solving scheduling problems, which leverages the respective strengths of RL and OR: The MDP narrows down the search space of the original problem through an RL method, while the mixed-integer programming process is settled by an OR algorithm. These two stages are performed iteratively and interactively until the termination criterion has been met. Under this idea, two implementation versions of the combination methods of RL and OR are put forward. The agile Earth observation satellite scheduling problem is selected as an example to demonstrate the effectiveness of the proposed scheduling framework and methods. The convergence and generalization capability of the methods are verified by the performance of training scenarios, while the efficiency and accuracy are tested in 50 untrained scenarios. The results show that the proposed algorithms could stably and efficiently obtain satisfactory scheduling schemes for agile Earth observation satellite scheduling problems. In addition, it can be found that RL-based optimization algorithms have stronger scalability than non-learning algorithms. This work reveals the advantage of combining reinforcement learning methods with heuristic methods or mathematical programming methods for solving complex combinatorial optimization problems.

[4]: A New K means Grey Wolf Algorithm for Engineering Problems
Authors: Hardi M. Mohammed, Zrar Kh. Abdul, Tarik A. Rashid, Abeer Alsadoon, Nebojsa Bacanin
Comments: 15 pages. World Journal of Engineering, 2021
Link: https://arxiv.org/abs/2103.05760
Abstract: Purpose: The development of metaheuristic algorithms has grown as researchers use them extensively in the fields of business, science, and engineering. One of the common metaheuristic optimization algorithms is Grey Wolf Optimization (GWO). The algorithm imitates the searching and attacking behavior of grey wolves. The main purpose of this paper is to overcome the GWO problem of becoming trapped in local optima.
Design/Methodology/Approach: In this paper, the K-means clustering algorithm is used to enhance the performance of the original Grey Wolf Optimization by dividing the population into different parts. The proposed algorithm is called K-means clustering Grey Wolf Optimization (KMGWO).
Findings: Results illustrate that the efficiency of KMGWO is superior to that of GWO. To evaluate the performance of KMGWO, it is applied to solve 10 CEC2019 benchmark test functions. Results prove that KMGWO is better than GWO. KMGWO is also compared to Cat Swarm Optimization (CSO), Whale Optimization Algorithm-Bat Algorithm (WOA-BAT), and WOA, and KMGWO achieves the first rank in terms of performance. Statistical results proved that KMGWO achieved a higher significant value than the other algorithms. KMGWO is also used to solve a pressure vessel design problem, where it produced superior results.
Originality/Value: Results prove that KMGWO is superior to GWO. KMGWO is also compared to Cat Swarm Optimization (CSO), Whale Optimization Algorithm-Bat Algorithm (WOA-BAT), WOA, and GWO, and KMGWO achieved the first rank in terms of performance. KMGWO is also used to solve a classical engineering problem, where it is superior.

[5]: On complementing end-to-end human motion predictors with planning
Authors: Liting Sun, Xiaogang Jia, Anca D. Dragan
Link: https://arxiv.org/abs/2103.05661
Abstract: High capacity end-to-end approaches for human motion prediction have the ability to represent subtle nuances in human behavior, but struggle with robustness to out-of-distribution inputs and tail events. Planning-based prediction, on the other hand, can reliably output decent-but-not-great predictions: it is much more stable in the face of distribution shift, but it has high inductive bias, missing important aspects that drive human decisions, and ignoring cognitive biases that make human behavior suboptimal. In this work, we analyze one family of approaches that strive to get the best of both worlds: use the end-to-end predictor on common cases, but do not rely on it for tail events / out-of-distribution inputs -- switch to the planning-based predictor there. We contribute an analysis of different approaches for detecting when to make this switch, using an autonomous driving domain. We find that promising approaches based on ensembling or generative modeling of the training distribution might not be reliable, but that there are very simple methods which can perform surprisingly well -- including training a classifier to pick up on tell-tale issues in predicted trajectories.

[6]: Quantum machine learning with differential privacy
Authors: William M Watkins, Samuel Yen-Chi Chen, Shinjae Yoo
Link: https://arxiv.org/abs/2103.06232
Abstract: Quantum machine learning (QML) can complement the growing trend of using learned models for a myriad of classification tasks, from image recognition to natural speech processing. A quantum advantage arises due to the intractability of quantum operations on a classical computer. Many datasets used in machine learning are crowd sourced or contain some private information. To the best of our knowledge, no current QML models are equipped with privacy-preserving features, which raises concerns as it is paramount that models do not expose sensitive information. Thus, privacy-preserving algorithms need to be implemented with QML. One solution is to make the machine learning algorithm differentially private, meaning the effect of a single data point on the training dataset is minimized. Differentially private machine learning models have been investigated, but differential privacy has yet to be studied in the context of QML. In this study, we develop a hybrid quantum-classical model that is trained to preserve privacy using a differentially private optimization algorithm. This marks the first proof-of-principle demonstration of privacy-preserving QML. The experiments demonstrate that differentially private QML can protect user-sensitive information without diminishing model accuracy. Although the quantum model is simulated and tested on a classical computer, it demonstrates potential to be efficiently implemented on near-term quantum devices (noisy intermediate-scale quantum [NISQ]). The approach's success is illustrated via the classification of spatially classed two-dimensional datasets and a binary MNIST classification. This implementation of privacy-preserving QML will ensure confidentiality and accurate learning on NISQ technology.

[7]: Automatic Speaker Independent Dysarthric Speech Intelligibility Assessment System
Authors: Ayush Tripathi, Swapnil Bhosale, Sunil Kumar Kopparapu
Comments: 29 pages, 2 figures, Computer Speech & Language 2021
Link: https://arxiv.org/abs/2103.06157
Abstract: Dysarthria is a condition which hampers the ability of an individual to control the muscles that play a major role in speech delivery. The loss of fine control over muscles that assist the movement of lips, vocal cords, tongue and diaphragm results in abnormal speech delivery. One can assess the severity level of dysarthria by analyzing the intelligibility of speech spoken by an individual. Continuous intelligibility assessment helps speech language pathologists not only study the impact of medication but also allows them to plan personalized therapy. It helps the clinicians immensely if the intelligibility assessment system is reliable, automatic, simple for (a) patients to undergo and (b) clinicians to interpret. Lack of availability of dysarthric data has resulted in development of speaker dependent automatic intelligibility assessment systems which require patients to speak a large number of utterances. In this paper, we propose (a) a cost minimization procedure to select an optimal (small) number of utterances that need to be spoken by the dysarthric patient, (b) four different speaker independent intelligibility assessment systems which require the patient to speak a small number of words, and (c) the assessment score is close to the perceptual score that the Speech Language Pathologist (SLP) can relate to. The need for a small number of utterances to be spoken by the patient and the score being relatable to the SLP benefits both the dysarthric patient and the clinician from a usability perspective.

[8]: GraphBreak: Tool for Network Community based Regulatory Medicine, Gene co-expression, Linkage Disequilibrium analysis, functional annotation and more
Authors: Abhishek Narain Singh
Link: https://arxiv.org/abs/2103.06145
Abstract: Graph network science is becoming increasingly popular, notably from a big-data perspective where understanding individual entities for individual functional roles is complex and time consuming. It is likely that when a set of genes is regulated by a set of genetic variants, the gene set is recruited for a common or related functional purpose. Grouping and extracting communities from a network of associations becomes critical to understand system complexity, thus prioritizing genes for disease and functional associations. Workload is reduced when studying entities one at a time. For this, we present GraphBreak, a suite of tools for community detection applications, such as for gene co-expression, protein interaction, regulation network, etc. Although developed for the use case of eQTLs regulatory genomic network community study -- results shown with our analysis with sample eQTL data -- GraphBreak can be deployed for other studies if input data has been fed in the requisite format, including but not limited to gene co-expression networks, protein-protein interaction network, signaling pathway and metabolic network. GraphBreak showed critical use case value in its downstream analysis for disease association of communities detected. If all independent steps of community detection and analysis are a step-by-step sub-part of the algorithm, GraphBreak can be considered a new algorithm for community based functional characterization. Combination of various algorithmic implementation modules into a single script for this purpose illustrates GraphBreak's novelty. Compared to other similar tools, with GraphBreak we can better detect communities with over-representation of their member genes for statistical association with diseases, and therefore target genes which can be prioritized for drug-positioning or drug-re-positioning as the case may be.

[9]: Semantically Constrained Memory Allocation (SCMA) for Embedding in Efficient Recommendation Systems
Authors: Aditya Desai, Yanzhou Pan, Kuangyuan Sun, Li Chou, Anshumali Shrivastava
Link: https://arxiv.org/abs/2103.06124
Abstract: Deep learning-based models are utilized to achieve state-of-the-art performance for recommendation systems. A key challenge for these models is to work with millions of categorical classes or tokens. The standard approach is to learn end-to-end, dense latent representations or embeddings for each token. The resulting embeddings require large amounts of memory that blow up with the number of tokens. Training and inference with these models create storage and memory bandwidth bottlenecks, leading to significant computing and energy consumption when deployed in practice. To this end, we present the problem of Memory Allocation under budget for embeddings and propose a novel formulation of memory shared embedding, where memory is shared in proportion to the overlap in semantic information. Our formulation admits a practical and efficient randomized solution with Locality sensitive hashing based Memory Allocation (LMA). We demonstrate a significant reduction in the memory footprint while maintaining performance. In particular, our LMA embeddings achieve the same performance compared to standard embeddings with a 16× reduction in memory footprint. Moreover, LMA achieves an average improvement of over 0.003 AUC across different memory regimes than standard DLRM models on Criteo and Avazu datasets.

[10]: BCFNet: A Balanced Collaborative Filtering Network with Attention Mechanism
Authors: Chang-Dong Wang, Zi-Yuan Hu, Jin Huang, Zhi-Hong Deng, Ling Huang, Jian-Huang Lai, Philip S. Yu
Link: https://arxiv.org/abs/2103.06105
Abstract: Collaborative Filtering (CF) based recommendation methods have been widely studied, which can be generally categorized into two types, i.e., representation learning-based CF methods and matching function learning-based CF methods. Representation learning tries to learn a common low dimensional space for the representations of users and items. In this case, a user and item match better if they have higher similarity in that common space. Matching function learning tries to directly learn the complex matching function that maps user-item pairs to matching scores. Although both methods are well developed, they suffer from two fundamental flaws, i.e., the representation learning resorts to applying a dot product which has limited expressiveness on the latent features of users and items, while the matching function learning has weakness in capturing low-rank relations. To overcome such flaws, we propose a novel recommendation model named Balanced Collaborative Filtering Network (BCFNet), which has the strengths of the two types of methods. In addition, an attention mechanism is designed to better capture the hidden information within implicit feedback and strengthen the learning ability of the neural network. Furthermore, a balance module is designed to alleviate the over-fitting issue in DNNs. Extensive experiments on eight real-world datasets demonstrate the effectiveness of the proposed model.

[11]: Impacts of the Numbers of Colors and Shapes on Outlier Detection: from Automated to User Evaluation
Authors: Loann Giovannangeli, Romain Giot, David Auber, Romain Bourqui
Link: https://arxiv.org/abs/2103.06084
Abstract: The design of efficient representations is well established as a fruitful way to explore and analyze complex or large data. In these representations, data are encoded with various visual attributes depending on the needs of the representation itself. To make coherent design choices about visual attributes, the visual search field proposes guidelines based on the human brain's perception of features. However, information visualization representations frequently need to depict more data than the amount these guidelines have been validated on. Since then, the information visualization community has extended these guidelines to a wider parameter space.
This paper contributes to this theme by extending visual search theories to an information visualization context. We consider a visual search task where subjects are asked to find an unknown outlier in a grid of randomly laid out distractors. Stimuli are defined by color and shape features for the purpose of visually encoding categorical data. The experimental protocol is made of a parameter space reduction step (i.e., sub-sampling) based on a machine learning model, and a user evaluation to measure capacity limits and validate hypotheses. The results show that the major difficulty factor is the number of visual attributes that are used to encode the outlier. When redundantly encoded, the display heterogeneity has no effect on the task. When encoded with one attribute, the difficulty depends on that attribute's heterogeneity until its capacity limit (7 for color, 5 for shape) is reached. Finally, when encoded with two attributes simultaneously, performance drops drastically even with minor heterogeneity.

[12]: Designing Disaggregated Evaluations of AI Systems: Choices, Considerations, and Tradeoffs
Authors: Solon Barocas, Anhong Guo, Ece Kamar, Jacquelyn Krones, Meredith Ringel Morris, Jennifer Wortman Vaughan, Duncan Wadsworth, Hanna Wallach
Link: https://arxiv.org/abs/2103.06076
Abstract: Several pieces of work have uncovered performance disparities by conducting "disaggregated evaluations" of AI systems. We build on these efforts by focusing on the choices that must be made when designing a disaggregated evaluation, as well as some of the key considerations that underlie these design choices and the tradeoffs between these considerations. We argue that a deeper understanding of the choices, considerations, and tradeoffs involved in designing disaggregated evaluations will better enable researchers, practitioners, and the public to understand the ways in which AI systems may be underperforming for particular groups of people.

[13]: Improving Sequential Recommendation with Attribute-augmented Graph Neural Networks
Authors: Xinzhou Dong, Beihong Jin, Wei Zhuo, Beibei Li, Taofeng Xue
Link: https://arxiv.org/abs/2103.05923
Abstract: Many practical recommender systems provide item recommendations for different users only via mining user-item interactions while totally ignoring the rich attribute information of items that users interact with. In this paper, we propose an attribute-augmented graph neural network model named Murzim. Murzim takes as input the graphs constructed from the user-item interaction sequences and corresponding item attribute sequences. By combining the GNNs with node aggregation and an attention network, Murzim can capture user preference patterns, generate embeddings for user-item interaction sequences, and then generate recommendations through next-item prediction. We conduct extensive experiments on multiple datasets. Experimental results show that Murzim outperforms several state-of-the-art methods in terms of recall and MRR, which illustrates that Murzim can make use of item attribute information to produce better recommendations. At present, Murzim has been deployed in MX Player, one of India's largest streaming platforms, and is recommending videos for tens of thousands of users.

[14]: Fast and flexible: Human program induction in abstract reasoning tasks
Authors: Aysja Johnson, Wai Keen Vong, Brenden M. Lake, Todd M. Gureckis
Comments: 7 pages, 7 figures, 1 table
Link: https://arxiv.org/abs/2103.05823
Abstract: The Abstraction and Reasoning Corpus (ARC) is a challenging program induction dataset that was recently proposed by Chollet (2019). Here, we report the first set of results collected from a behavioral study of humans solving a subset of tasks from ARC (40 out of 1000). Although this subset of tasks contains considerable variation, our results showed that humans were able to infer the underlying program and generate the correct test output for a novel test input example, with an average of 80% of tasks solved per participant, and with 65% of tasks being solved by more than 80% of participants. Additionally, we find interesting patterns of behavioral consistency and variability within the action sequences during the generation process, the natural language descriptions to describe the transformations for each task, and the errors people made. Our findings suggest that people can quickly and reliably determine the relevant features and properties of a task to compose a correct solution. Future modeling work could incorporate these findings, potentially by connecting the natural language descriptions we collected here to the underlying semantics of ARC.

[15]: ZYELL-NCTU NetTraffic-1.0: A Large-Scale Dataset for Real-World Network Anomaly Detection
Authors: Lei Chen, Shao-En Weng, Chu-Jun Peng, Hong-Han Shuai, Wen-Huang Cheng
Comments: 2 pages, 3 tables, 1 figure
Link: https://arxiv.org/abs/2103.05767
Abstract: Network security has long been an active research topic. One critical issue is improving the anomaly detection capability of intrusion detection systems (IDSs), such as firewalls. However, existing network anomaly datasets are out of date (i.e., collected many years ago) or IP-anonymized, making the data characteristics differ from today's networks. Therefore, this work introduces a new, large-scale, and real-world dataset, ZYELL-NCTU NetTraffic-1.0, which is collected from the raw output of firewalls in a real network, with the objective of advancing the development of network security research.

[16]: Embodied Continual Learning Across Developmental Time Via Developmental Braitenberg Vehicles
Authors: Bradly Alicea, Rishabh Chakrabarty, Akshara Gopi, Anson Lim, Jesse Parent
Comments: 14 pages, 3 figures
Link: https://arxiv.org/abs/2103.05753
Abstract: There is much to learn through synthesis of Developmental Biology, Cognitive Science and Computational Modeling. One lesson we can learn from this perspective is that the initialization of intelligent programs cannot solely rely on manipulation of numerous parameters. Our path forward is to present a design for developmentally-inspired learning agents based on the Braitenberg Vehicle. Using these agents to exemplify artificial embodied intelligence, we move closer to modeling embodied experience and morphogenetic growth as components of cognitive developmental capacity. We consider various factors regarding biological and cognitive development which influence the generation of adult phenotypes and the contingency of available developmental pathways. These mechanisms produce emergent connectivity with shifting weights and adaptive network topography, thus illustrating the importance of developmental processes in training neural networks. This approach provides a blueprint for adaptive agent behavior that might result from a developmental approach: namely by exploiting critical periods or growth and acquisition, an explicitly embodied network architecture, and a distinction between the assembly of neural networks and active learning on these networks.

[17]: Analyzing Human Models that Adapt Online
Authors: Andrea Bajcsy, Anand Siththaranjan, Claire J. Tomlin, Anca D. Dragan
Comments: ICRA 2021
Link: https://arxiv.org/abs/2103.05746
Abstract: Predictive human models often need to adapt their parameters online from human data. This raises previously ignored safety-related questions for robots relying on these models, such as what the model could learn online and how quickly it could learn it. For instance, when will the robot have a confident estimate of a nearby human's goal? Or, what parameter initializations guarantee that the robot can learn the human's preferences in a finite number of observations? To answer such analysis questions, our key idea is to model the robot's learning algorithm as a dynamical system where the state is the current model parameter estimate and the control is the human data the robot observes. This enables us to leverage tools from reachability analysis and optimal control to compute the set of hypotheses the robot could learn in finite time, as well as the worst and best-case time it takes to learn them. We demonstrate the utility of our analysis tool in four human-robot domains, including autonomous driving and indoor navigation.

[18]: The AI Arena: A Framework for Distributed Multi-Agent Reinforcement Learning
Authors: Edward W. Staley, Corban G. Rivera, Ashley J. Llorens
Link: https://arxiv.org/abs/2103.05737
Abstract: Advances in reinforcement learning (RL) have resulted in recent breakthroughs in the application of artificial intelligence (AI) across many different domains. An emerging landscape of development environments is making powerful RL techniques more accessible for a growing community of researchers. However, most existing frameworks do not directly address the problem of learning in complex operating environments, such as dense urban settings or defense-related scenarios, that incorporate distributed, heterogeneous teams of agents. To help enable AI research for this important class of applications, we introduce the AI Arena: a scalable framework with flexible abstractions for distributed multi-agent reinforcement learning. The AI Arena extends the OpenAI Gym interface to allow greater flexibility in learning control policies across multiple agents with heterogeneous learning strategies and localized views of the environment. To illustrate the utility of our framework, we present experimental results that demonstrate performance gains due to a distributed multi-agent learning approach over commonly-used RL techniques in several different learning environments.

[19]: Automatic code generation from sketches of mobile applications in end-user development using Deep Learning
Authors: Daniel Baulé, Christiane Gresse von Wangenheim, Aldo von Wangenheim, Jean C. R. Hauck, Edson C. Vargas Júnior
Comments: 18 pages
Link: https://arxiv.org/abs/2103.05704
Abstract: A common need for mobile application development by end-users or in computing education is to transform a sketch of a user interface into wireframe code using App Inventor, a popular block-based programming environment. As this task is challenging and time-consuming, we present the Sketch2aia approach that automates this process. Sketch2aia employs deep learning to detect the most frequent user interface components and their position on a hand-drawn sketch, creating an intermediate representation of the user interface, and then automatically generates the App Inventor code of the wireframe. The approach achieves an average user interface component classification accuracy of 87.72%, and results of a preliminary user evaluation indicate that it generates wireframes that closely mirror the sketches in terms of visual similarity. The approach has been implemented as a web tool and can be used to support the end-user development of mobile applications effectively and efficiently as well as the teaching of user interface design in K-12.

[20]: A Gradient Estimator for Time-Varying Electrical Networks with Non-Linear Dissipation
Authors: Jack Kendall
Comments: 12 pages, 0 figures
Link: https://arxiv.org/abs/2103.05636
Abstract: We propose a method for extending the technique of equilibrium propagation for estimating gradients in fixed-point neural networks to the more general setting of directed, time-varying neural networks by modeling them as electrical circuits. We use electrical circuit theory to construct a Lagrangian capable of describing deep, directed neural networks modeled using nonlinear capacitors and inductors, linear resistors and sources, and a special class of nonlinear dissipative elements called fractional memristors. We then derive an estimator for the gradient of the physical parameters of the network, such as synapse conductances, with respect to an arbitrary loss function. This estimator is entirely local, in that it only depends on information locally available to each synapse. We conclude by suggesting methods for extending these results to networks of biologically plausible neurons, e.g. Hodgkin-Huxley neurons.
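
Editor's sketch: the abstract above builds on equilibrium propagation for fixed-point networks. As a rough illustration of that baseline technique only (not the paper's time-varying circuit estimator), the toy below uses a single state variable with a hypothetical quadratic energy and a squared-error cost. The energy function, the names relax, ep_gradient and true_gradient, and all numeric values are assumptions made for this example. It settles the state once without nudging and once with a small nudge beta toward the target, then estimates the parameter gradient from the difference of the two settled states.

```python
"""Toy equilibrium propagation on a one-unit network (illustrative assumptions only)."""


def relax(theta, x, y, beta, steps=2000, lr=0.1):
    """Settle the state s to a fixed point of F = E + beta * C by gradient descent.

    E(s) = 0.5 * s**2 - theta * x * s   (hypothetical energy, chosen for this sketch)
    C(s) = 0.5 * (s - y)**2             (squared-error cost)
    """
    s = 0.0
    for _ in range(steps):
        dF_ds = (s - theta * x) + beta * (s - y)
        s -= lr * dF_ds
    return s


def ep_gradient(theta, x, y, beta=1e-4):
    """Estimate dJ/dtheta from the free and the weakly nudged fixed points."""
    s_free = relax(theta, x, y, beta=0.0)
    s_nudged = relax(theta, x, y, beta=beta)
    # For this energy, dF/dtheta = -x * s, evaluated at each fixed point.
    return ((-x * s_nudged) - (-x * s_free)) / beta


def true_gradient(theta, x, y):
    """Analytic gradient of J(theta) = 0.5 * (theta * x - y)**2 for comparison."""
    return x * (theta * x - y)


if __name__ == "__main__":
    print(ep_gradient(0.5, 2.0, 3.0), true_gradient(0.5, 2.0, 3.0))
```

In this toy the estimate depends only on the settled state and the local input x, which mirrors the locality property the abstract emphasizes; the two printed values should agree up to an error that shrinks with beta.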

Cross-listed with CV (8 papers)

[1]: A Relational-learning Perspective to Multi-label Chest X-ray Classification
Authors: Anjany Sekuboyina, Daniel Oñoro-Rubio, Jens Kleesiek, Brandon Malone
Link: https://arxiv.org/abs/2103.06220
Abstract: Multi-label classification of chest X-ray images is frequently performed using discriminative approaches, i.e. learning to map an image directly to its binary labels. Such approaches make it challenging to incorporate auxiliary information such as annotation uncertainty or a dependency among the labels. Building towards this, we propose a novel knowledge graph reformulation of multi-label classification, which not only readily increases predictive performance of an encoder but also serves as a general framework for introducing new domain knowledge.
Specifically, we construct a multi-modal knowledge graph out of the chest X-ray images and their labels and pose multi-label classification as a link prediction problem. Incorporating auxiliary information can then simply be achieved by adding additional nodes and relations among them. When tested on a publicly-available radiograph dataset (CheXpert), our relational-reformulation using a naive knowledge graph outperforms the state-of-art by achieving an area-under-ROC curve of 83.5%, an improvement of ~1% over a purely discriminative approach.

[2]: Towards Learning an Unbiased Classifier from Biased Data via Conditional Adversarial Debiasing
Chinese title: 有偏数据条件对抗性减损学习无偏分类器的研究
Authors: Christian Reimers, Paul Bodesheim, Jakob Runge, Joachim Denzler
Link: https://arxiv.org/abs/2103.06179
Abstract: Bias in classifiers is a severe issue of modern deep learning methods, especially for their application in safety- and security-critical areas. Often, the bias of a classifier is a direct consequence of a bias in the training dataset, frequently caused by the co-occurrence of relevant features and irrelevant ones. To mitigate this issue, we require learning algorithms that prevent the propagation of bias from the dataset into the classifier. We present a novel adversarial debiasing method, which addresses a feature that is spuriously connected to the labels of training images but statistically independent of the labels for test images. Thus, the automatic identification of relevant features during training is perturbed by irrelevant features. This is the case in a wide range of bias-related problems for many computer vision tasks, such as automatic skin cancer detection or driver assistance. We argue by a mathematical proof that our approach is superior to existing techniques for the abovementioned bias. Our experiments show that our approach performs better than state-of-the-art techniques on a well-known benchmark dataset with real-world images of cats and dogs.

[3]: MixMo: Mixing Multiple Inputs for Multiple Outputs via Deep Subnetworks
Chinese title: MixMo：通过深子网混合多个输入以获得多个输出
Authors: Alexandre Rame, Remy Sun, Matthieu Cord
Comments: 8 pages, 10 figures, 6 tables
Link: https://arxiv.org/abs/2103.06132
Abstract: Recent strategies achieved ensembling for free by fitting concurrently diverse subnetworks inside a single base network. The main idea during training is that each subnetwork learns to classify only one of the multiple inputs simultaneously provided. However, the question of how these multiple inputs should be mixed has not been studied yet. In this paper, we introduce MixMo, a new generalized framework for learning multi-input multi-output deep subnetworks. Our key motivation is to replace the suboptimal summing operation hidden in previous approaches by a more appropriate mixing mechanism. For that purpose, we draw inspiration from successful mixed sample data augmentations. We show that binary mixing in features - particularly with patches from CutMix - enhances results by making subnetworks stronger and more diverse. We improve state of the art on the CIFAR-100 and Tiny-ImageNet classification datasets. In addition to being easy to implement and adding no cost at inference, our models outperform much costlier data augmented deep ensembles. We open a new line of research complementary to previous works, as we operate in features and better leverage the expressiveness of large networks.

[4]: Deep Sensing of Urban Waterlogging
Chinese title: 城市内涝的深度感知
Authors: Shi-Wei Lo
Comments: 10 pages, 10 figures, under submitting and patenting
Link: https://arxiv.org/abs/2103.05927
Abstract: In the monsoon season, sudden flood events occur frequently in urban areas, which hamper the social and economic activities and may threaten the infrastructure and lives. The use of an efficient large-scale waterlogging sensing and information system can provide valuable real-time disaster information to facilitate disaster management and enhance awareness of the general public to alleviate losses during and after flood disasters. Therefore, in this study, a visual sensing approach driven by deep neural networks and information and communication technology was developed to provide an end-to-end mechanism to realize waterlogging sensing and event-location mapping. The use of a deep sensing system in the monsoon season in Taiwan was demonstrated, and waterlogging events were predicted on the island-wide scale. The system could sense approximately 2379 vision sources through an internet of video things framework and transmit the event-location information in 5 min. The proposed approach can sense waterlogging events at a national scale and provide an efficient and highly scalable alternative to conventional waterlogging sensing methods.

[5]: RL-CSDia: Representation Learning of Computer Science Diagrams
Chinese title: 计算机科学图表的表征学习
Authors: Shaowei Wang, LingLing Zhang, Xuan Luo, Yi Yang, Xin Hu, Jun Liu
Link: https://arxiv.org/abs/2103.05900
Abstract: Recent studies on computer vision mainly focus on natural images that express real-world scenes. They achieve outstanding performance on diverse tasks such as visual question answering. Diagram is a special form of visual expression that frequently appears in the education field and is of great significance for learners to understand multimodal knowledge. Current research on diagrams preliminarily focuses on natural disciplines such as Biology and Geography, whose expressions are still similar to natural images. Another type of diagrams such as from Computer Science is composed of graphics containing complex topologies and relations, and research on this type of diagrams is still blank. The main challenges of graphic diagrams understanding are the rarity of data and the confusion of semantics, which are mainly reflected in the diversity of expressions. In this paper, we construct a novel dataset of graphic diagrams named Computer Science Diagrams (CSDia). It contains more than 1,200 diagrams and exhaustive annotations of objects and relations. Considering the visual noises caused by the various expressions in diagrams, we introduce the topology of diagrams to parse topological structure. After that, we propose Diagram Parsing Net (DPN) to represent the diagram from three branches: topology, visual feature, and text, and apply the model to the diagram classification task to evaluate the ability of diagrams understanding. The results show the effectiveness of the proposed DPN on diagrams understanding.

[6]: Limitations of Post-Hoc Feature Alignment for Robustness
Chinese title: 后置特征对齐在鲁棒性方面的局限性
Authors: Collin Burns, Jacob Steinhardt
Comments: Accepted to CVPR 2021
Link: https://arxiv.org/abs/2103.05898
Abstract: Feature alignment is an approach to improving robustness to distribution shift that matches the distribution of feature activations between the training distribution and test distribution. A particularly simple but effective approach to feature alignment involves aligning the batch normalization statistics between the two distributions in a trained neural network. This technique has received renewed interest lately because of its impressive performance on robustness benchmarks. However, when and why this method works is not well understood. We investigate the approach in more detail and identify several limitations. We show that it only significantly helps with a narrow set of distribution shifts and we identify several settings in which it even degrades performance. We also explain why these limitations arise by pinpointing why this approach can be so effective in the first place. Our findings call into question the utility of this approach and Unsupervised Domain Adaptation more broadly for improving robustness in practice.

[7]: AutoDO: Robust AutoAugment for Biased Data with Label Noise via Scalable Probabilistic Implicit Differentiation
Chinese title: AutoDO：基于可伸缩概率隐式微分的标签噪声有偏数据鲁棒自增强
Authors: Denis Gudovskiy, Luca Rigazio, Shun Ishizaka, Kazuki Kozuka, Sotaro Tsukizawa
Comments: Accepted to CVPR 2021. Preprint
Link: https://arxiv.org/abs/2103.05863
Abstract: AutoAugment has sparked an interest in automated augmentation methods for deep learning models. These methods estimate image transformation policies for train data that improve generalization to test data. While recent papers evolved in the direction of decreasing policy search complexity, we show that those methods are not robust when applied to biased and noisy data. To overcome these limitations, we reformulate AutoAugment as a generalized automated dataset optimization (AutoDO) task that minimizes the distribution shift between test data and distorted train dataset. In our AutoDO model, we explicitly estimate a set of per-point hyperparameters to flexibly change distribution of train data. In particular, we include hyperparameters for augmentation, loss weights, and soft-labels that are jointly estimated using implicit differentiation. We develop a theoretical probabilistic interpretation of this framework using Fisher information and show that its complexity scales linearly with the dataset size. Our experiments on SVHN, CIFAR-10/100, and ImageNet classification show up to 9.3% improvement for biased datasets with label noise compared to prior methods and, importantly, up to 36.6% gain for underrepresented SVHN classes.

[8]: A Multi-resolution Approach to Expression Recognition in the Wild
Chinese title: 一种多分辨率的野外表情识别方法
Authors: Fabio Valerio Massoli, Donato Cafarelli, Giuseppe Amato, Fabrizio Falchi
Link: https://arxiv.org/abs/2103.05723
Abstract: Facial expressions play a fundamental role in human communication. Indeed, they typically reveal the real emotional status of people beyond the spoken language. Moreover, the comprehension of human affect based on visual patterns is a key ingredient for any human-machine interaction system and, for such reasons, the task of Facial Expression Recognition (FER) draws both scientific and industrial interest. In the recent years, Deep Learning techniques reached very high performance on FER by exploiting different architectures and learning paradigms. In such a context, we propose a multi-resolution approach to solve the FER task. We ground our intuition on the observation that often faces images are acquired at different resolutions. Thus, directly considering such property while training a model can help achieve higher performance on recognizing facial expressions. To our aim, we use a ResNet-like architecture, equipped with Squeeze-and-Excitation blocks, trained on the Affect-in-the-Wild 2 dataset. Not being available a test set, we conduct tests and models selection by employing the validation set only on which we achieve more than 90% accuracy on classifying the seven expressions that the dataset comprises.

Cross-listed with NLP (4 papers)

[1]: Relational Weight Priors in Neural Networks for Abstract Pattern Learning and Language Modelling
Chinese title: 用于抽象模式学习和语言建模的神经网络中的关系权值先验
Authors: Radha Kopparti, Tillman Weyde
Comments: 29 pages
Link: https://arxiv.org/abs/2103.06198
Abstract: Deep neural networks have become the dominant approach in natural language processing (NLP). However, in recent years, it has become apparent that there are shortcomings in systematicity that limit the performance and data efficiency of deep learning in NLP. These shortcomings can be clearly shown in lower-level artificial tasks, mostly on synthetic data. Abstract patterns are the best known examples of a hard problem for neural networks in terms of generalisation to unseen data. They are defined by relations between items, such as equality, rather than their values. It has been argued that these low-level problems demonstrate the inability of neural networks to learn systematically. In this study, we propose Embedded Relation Based Patterns (ERBP) as a novel way to create a relational inductive bias that encourages learning equality and distance-based relations for abstract patterns. ERBP is based on Relation Based Patterns (RBP), but modelled as a Bayesian prior on network weights and implemented as a regularisation term in otherwise standard network learning. ERBP is easy to integrate into standard neural networks and does not affect their learning capacity. In our experiments, ERBP priors lead to almost perfect generalisation when learning abstract patterns from synthetic noise-free sequences. ERBP also improves natural language models on the word and character level and pitch prediction in melodies with RNN, GRU and LSTM networks. We also find improvements in the more complex tasks of learning of graph edit distance and compositional sentence entailment. ERBP consistently improves over RBP and over standard networks, showing that it enables abstract pattern learning which contributes to performance in natural language tasks.

[2]: DeepCPCFG: Deep Learning and Context Free Grammars for End-to-End Information Extraction
Chinese title: DeepCPCFG：用于端到端信息提取的深度学习和上下文无关语法
Authors: Freddy C. Chua, Nigel P. Duffy
Link: https://arxiv.org/abs/2103.05908
Abstract: We combine deep learning and Conditional Probabilistic Context Free Grammars (CPCFG) to create an end-to-end system for extracting structured information from complex documents. For each class of documents, we create a CPCFG that describes the structure of the information to be extracted. Conditional probabilities are modeled by deep neural networks. We use this grammar to parse 2-D documents to directly produce structured records containing the extracted information. This system is trained end-to-end with (Document, Record) pairs. We apply this approach to extract information from scanned invoices achieving state-of-the-art results.

[3]: ELLA: Exploration through Learned Language Abstraction
Chinese title: 通过学习语言抽象进行探索
Authors: Suvir Mirchandani, Siddharth Karamcheti, Dorsa Sadigh
Comments: 13 pages, 8 figures
Link: https://arxiv.org/abs/2103.05825
Abstract: Building agents capable of understanding language instructions is critical to effective and robust human-AI collaboration. Recent work focuses on training these instruction following agents via reinforcement learning in environments with synthetic language; however, these instructions often define long-horizon, sparse-reward tasks, and learning policies requires many episodes of experience. To this end, we introduce ELLA: Exploration through Learned Language Abstraction, a reward shaping approach that correlates high-level instructions with simpler low-level instructions to enrich the sparse rewards afforded by the environment. ELLA has two key elements: 1) A termination classifier that identifies when agents complete low-level instructions, and 2) A relevance classifier that correlates low-level instructions with success on high-level tasks. We learn the termination classifier offline from pairs of instructions and terminal states. Notably, in departure from prior work in language and abstraction, we learn the relevance classifier online, without relying on an explicit decomposition of high-level instructions to low-level instructions. On a suite of complex grid world environments with varying instruction complexities and reward sparsity, ELLA shows a significant gain in sample efficiency across several environments compared to competitive language-based reward shaping and no-shaping methods.

[4]: An Amharic News Text classification Dataset
Chinese title: 阿姆哈拉语新闻文本分类数据集
Authors: Israel Abebe Azime, Nebil Mohammed
Link: https://arxiv.org/abs/2103.05639
Abstract: In NLP, text classification is one of the primary problems we try to solve and its uses in language analyses are indisputable. The lack of labeled training data made it harder to do these tasks in low resource languages like Amharic. The task of collecting, labeling, annotating, and making valuable this kind of data will encourage junior researchers, schools, and machine learning practitioners to implement existing classification models in their language. In this short paper, we aim to introduce the Amharic text classification dataset that consists of more than 50k news articles that were categorized into 6 classes. This dataset is made available with easy baseline performances to encourage studies and better performance experiments.

The Chinese text is machine-translated and provided for reference only.
