
HunFlair2 in a cross-corpus evaluation of biomedical named entity recognition and normalization tools

Authors: Mario Sänger, Samuele Garda, Xing David Wang, Leon Weber-Genzel, Pia Droop, Benedikt Fuchs, Alan Akbik, Ulf Leser

Abstract: With the exponential growth of the life science literature, biomedical text mining (BTM) has become an essential technology for accelerating the extraction of insights from publications. Identifying named entities (e.g., diseases, drugs, or genes) in texts and their linkage to reference knowledge bases are crucial steps in BTM pipelines to enable information aggregation from different documents. However, tools for these two steps are rarely applied in the same context in which they were developed. Instead, they are applied in the wild, i.e., on application-dependent text collections different from those used for the tools' training, varying, e.g., in focus, genre, style, and text type. This raises the question of whether the reported performance of BTM tools can be trusted for downstream applications. Here, we report on the results of a carefully designed cross-corpus benchmark for named entity extraction, where tools were applied systematically to corpora not used during their training. Based on a survey of 28 published systems, we selected five for an in-depth analysis on three publicly available corpora encompassing four different entity types. Comparison between tools results in a mixed picture and shows that, in a cross-corpus setting, the performance is significantly lower than the one reported in an in-corpus setting. HunFlair2 showed the best performance on average, being closely followed by PubTator. Our results indicate that users of BTM tools should expect diminishing performances when applying them in the wild compared to original publications and show that further research is necessary to make BTM tools more robust.

Submitted 20 February, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

arXiv:2202.12795

Equilibrium Aggregation: Encoding Sets via Optimization

Authors: Sergey Bartunov, Fabian B. Fuchs, Timothy Lillicrap

Abstract: Processing sets or other unordered, potentially variable-sized inputs in neural networks is usually handled by aggregating a number of input tensors into a single representation. While a number of aggregation methods already exist, from simple sum pooling to multi-head attention, they are limited in their representational power from both theoretical and empirical perspectives. In search of a principally more powerful aggregation strategy, we propose an optimization-based method called Equilibrium Aggregation. We show that many existing aggregation methods can be recovered as special cases of Equilibrium Aggregation and that it is provably more efficient in some important cases. Equilibrium Aggregation can be used as a drop-in replacement in many existing architectures and applications. We validate its efficiency on three different tasks: median estimation, class counting, and molecular property prediction. In all experiments, Equilibrium Aggregation achieves higher performance than the other aggregation techniques we test.

Submitted 3 July, 2022; v1 submitted 25 February, 2022; originally announced February 2022.

Comments: Published at UAI 2022

arXiv:2107.01959

Universal Approximation of Functions on Sets

Authors: Edward Wagstaff, Fabian B. Fuchs, Martin Engelcke, Michael A. Osborne, Ingmar Posner

Abstract: Modelling functions of sets, or equivalently, permutation-invariant functions, is a long-standing challenge in machine learning. Deep Sets is a popular method which is known to be a universal approximator for continuous set functions. We provide a theoretical analysis of Deep Sets which shows that this universal approximation property is only guaranteed if the model's latent space is sufficiently high-dimensional. If the latent space is even one dimension lower than necessary, there exist piecewise-affine functions for which Deep Sets performs no better than a naïve constant baseline, as judged by worst-case error. Deep Sets may be viewed as the most efficient incarnation of the Janossy pooling paradigm. We identify this paradigm as encompassing most currently popular set-learning methods. Based on this connection, we discuss the implications of our results for set learning more broadly, and identify some open questions on the universality of Janossy pooling in general.

Submitted 5 July, 2021; originally announced July 2021.

Comments: 54 pages, 13 figures

arXiv:2105.09016

E(n) Equivariant Normalizing Flows

Authors: Victor Garcia Satorras, Emiel Hoogeboom, Fabian B. Fuchs, Ingmar Posner, Max Welling

Abstract: This paper introduces a generative model equivariant to Euclidean symmetries: E(n) Equivariant Normalizing Flows (E-NFs). To construct E-NFs, we take the discriminative E(n) graph neural networks and integrate them as a differential equation to obtain an invertible equivariant function: a continuous-time normalizing flow. We demonstrate that E-NFs considerably outperform baselines and existing methods from the literature on particle systems such as DW4 and LJ13, and on molecules from QM9 in terms of log-likelihood. To the best of our knowledge, this is the first flow that jointly generates molecule features and positions in 3D.

Submitted 14 January, 2022; v1 submitted 19 May, 2021; originally announced May 2021.

Comments: Accepted at Neural Information Processing Systems (NeurIPS 2021)

arXiv:2102.13419

Iterative SE(3)-Transformers

Authors: Fabian B. Fuchs, Edward Wagstaff, Justas Dauparas, Ingmar Posner

Abstract: When manipulating three-dimensional data, it is possible to ensure that rotational and translational symmetries are respected by applying so-called SE(3)-equivariant models. Protein structure prediction is a prominent example of a task which displays these symmetries. Recent work in this area has successfully made use of an SE(3)-equivariant model, applying an iterative SE(3)-equivariant attention mechanism. Motivated by this application, we implement an iterative version of the SE(3)-Transformer, an SE(3)-equivariant attention-based model for graph data. We address the additional complications which arise when applying the SE(3)-Transformer in an iterative fashion, compare the iterative and single-pass versions on a toy problem, and consider why an iterative model may be beneficial in some problem settings. We make the code for our implementation available to the community.

Submitted 16 March, 2021; v1 submitted 26 February, 2021; originally announced February 2021.

arXiv:2006.10503

SE(3)-Transformers: 3D Roto-Translation Equivariant Attention Networks

Authors: Fabian B. Fuchs, Daniel E. Worrall, Volker Fischer, Max Welling

Abstract: We introduce the SE(3)-Transformer, a variant of the self-attention module for 3D point clouds and graphs, which is equivariant under continuous 3D roto-translations. Equivariance is important to ensure stable and predictable performance in the presence of nuisance transformations of the data input. A positive corollary of equivariance is increased weight-tying within the model. The SE(3)-Transformer leverages the benefits of self-attention to operate on large point clouds and graphs with varying number of points, while guaranteeing SE(3)-equivariance for robustness. We evaluate our model on a toy N-body particle simulation dataset, showcasing the robustness of the predictions under rotations of the input. We further achieve competitive performance on two real-world datasets, ScanObjectNN and QM9. In all cases, our model outperforms a strong, non-equivariant attention baseline and an equivariant model without attention.

Submitted 24 November, 2020; v1 submitted 18 June, 2020; originally announced June 2020.

arXiv:1907.12887

End-to-end Recurrent Multi-Object Tracking and Trajectory Prediction with Relational Reasoning

Authors: Fabian B. Fuchs, Adam R. Kosiorek, Li Sun, Oiwi Parker Jones, Ingmar Posner

Abstract: The majority of contemporary object-tracking approaches do not model interactions between objects. This contrasts with the fact that objects' paths are not independent: a cyclist might abruptly deviate from a previously planned trajectory in order to avoid colliding with a car. Building upon HART, a neural class-agnostic single-object tracker, we introduce a multi-object tracking method MOHART capable of relational reasoning. Importantly, the entire system, including the understanding of interactions and relations between objects, is class-agnostic and learned simultaneously in an end-to-end fashion. We explore a number of relational reasoning architectures and show that permutation-invariant models outperform non-permutation-invariant alternatives. We also find that architectures using a single permutation invariant operation like DeepSets, despite, in theory, being universal function approximators, are nonetheless outperformed by a more complex architecture based on multi-headed attention. The latter better accounts for complex physical interactions in a challenging toy experiment. Further, we find that modelling interactions leads to consistent performance gains in tracking as well as future trajectory prediction on three real-world datasets (MOTChallenge, UA-DETRAC, and Stanford Drone dataset), particularly in the presence of ego-motion, occlusions, crowded scenes, and faulty sensor inputs.

Submitted 28 September, 2020; v1 submitted 12 July, 2019; originally announced July 2019.

arXiv:1901.09006

On the Limitations of Representing Functions on Sets

Authors: Edward Wagstaff, Fabian B. Fuchs, Martin Engelcke, Ingmar Posner, Michael Osborne

Abstract: Recent work on the representation of functions on sets has considered the use of summation in a latent space to enforce permutation invariance. In particular, it has been conjectured that the dimension of this latent space may remain fixed as the cardinality of the sets under consideration increases. However, we demonstrate that the analysis leading to this conjecture requires mappings which are highly discontinuous and argue that this is only of limited practical use. Motivated by this observation, we prove that an implementation of this model via continuous mappings (as provided by e.g. neural networks or Gaussian processes) actually imposes a constraint on the dimensionality of the latent space. Practical universal function representation for set inputs can only be achieved with a latent dimension at least the size of the maximum number of input elements.

Submitted 7 October, 2019; v1 submitted 25 January, 2019; originally announced January 2019.

Comments: Published at the International Conference on Machine Learning (2019)

arXiv:1806.05502

Scrutinizing and De-Biasing Intuitive Physics with Neural Stethoscopes

Authors: Fabian B. Fuchs, Oliver Groth, Adam R. Kosiorek, Alex Bewley, Markus Wulfmeier, Andrea Vedaldi, Ingmar Posner

Abstract: Visually predicting the stability of block towers is a popular task in the domain of intuitive physics. While previous work focusses on prediction accuracy, a one-dimensional performance measure, we provide a broader analysis of the learned physical understanding of the final model and how the learning process can be guided. To this end, we introduce neural stethoscopes as a general purpose framework for quantifying the degree of importance of specific factors of influence in deep neural networks as well as for actively promoting and suppressing information as appropriate. In doing so, we unify concepts from multitask learning as well as training with auxiliary and adversarial losses. We apply neural stethoscopes to analyse the state-of-the-art neural network for stability prediction. We show that the baseline model is susceptible to being misled by incorrect visual cues. This leads to a performance breakdown to the level of random guessing when training on scenarios where visual cues are inversely correlated with stability. Using stethoscopes to promote meaningful feature extraction increases performance from 51% to 90% prediction accuracy. Conversely, training on an easy dataset where visual cues are positively correlated with stability, the baseline model learns a bias leading to poor performance on a harder dataset. Using an adversarial stethoscope, the network is successfully de-biased, leading to a performance increase from 66% to 88%.

Submitted 6 September, 2019; v1 submitted 14 June, 2018; originally announced June 2018.

arXiv:1804.08018

ShapeStacks: Learning Vision-Based Physical Intuition for Generalised Object Stacking

Authors: Oliver Groth, Fabian B. Fuchs, Ingmar Posner, Andrea Vedaldi

Abstract: Physical intuition is pivotal for intelligent agents to perform complex tasks. In this paper we investigate the passive acquisition of an intuitive understanding of physical principles as well as the active utilisation of this intuition in the context of generalised object stacking. To this end, we provide: a simulation-based dataset featuring 20,000 stack configurations composed of a variety of elementary geometric primitives richly annotated regarding semantics and structural stability. We train visual classifiers for binary stability prediction on the ShapeStacks data and scrutinise their learned physical intuition. Due to the richness of the training data our approach also generalises favourably to real-world scenarios achieving state-of-the-art stability prediction on a publicly available benchmark of block towers. We then leverage the physical intuition learned by our model to actively construct stable stacks and observe the emergence of an intuitive notion of stackability - an inherent object affordance - induced by the active stacking task. Our approach performs well even in challenging conditions where it considerably exceeds the stack height observed during training or in cases where initially unstable structures must be stabilised via counterbalancing.

Submitted 6 July, 2018; v1 submitted 21 April, 2018; originally announced April 2018.

Comments: revised version to appear at ECCV 2018

arXiv:1412.2958

doi 10.1371/journal.pone.0133185

Physical forces between humans and how humans attract and repel each other based on their social interactions in an online world

Authors: Stefan Thurner, Benedikt Fuchs

Abstract: Physical interactions between particles are the result of the exchange of gauge bosons. Human interactions are mediated by the exchange of messages, goods, money, promises, hostilities, etc. While in the physical world interactions and their associated forces have immediate dynamical consequences (Newton's law) the situation is not clear for human interactions. Here we study the acceleration between humans who interact through the exchange of messages, goods and hostilities in a massive multiplayer online game. For this game we have complete information about all interactions (exchange events) between about 1/2 million players, and about their trajectories (positions) in a metric space of the game universe at any point in time. We derive the interaction potentials for communication, trade and attacks and show that they are harmonic in nature. Individuals who exchange messages and trade goods generally attract each other and start to separate immediately after exchange events stop. The interaction potential for attacks mirrors the usual "hit-and-run" tactics of aggressive players. By measuring interaction intensities as a function of distance, velocity and acceleration, we show that "forces" between players are directly related to the number of exchange events. The power-law of the likelihood for interactions vs. distance is in accordance with previous real world empirical work. We show that the obtained potentials can be understood with a simple model assuming an exchange-driven force in combination with a distance dependent exchange rate.

Submitted 9 December, 2014; originally announced December 2014.

Comments: 7 pages, 7 figures

arXiv:1407.2006

doi 10.1016/j.physa.2014.09.056

Interevent time distributions of human multi-level activity in a virtual world

Authors: Olesya Mryglod, Benedikt Fuchs, Michael Szell, Yurij Holovatch, Stefan Thurner

Abstract: Studying human behaviour in virtual environments provides extraordinary opportunities for a quantitative analysis of social phenomena with levels of accuracy that approach those of the natural sciences. In this paper we use records of player activities in the massive multiplayer online game Pardus over 1,238 consecutive days, and analyze dynamical features of sequences of actions of players. We build on previous work where temporal structures of human actions of the same type were quantified, and extend it to provide an empirical understanding of human actions of different types. This study of multi-level human activity can be seen as a dynamic counterpart of static multiplex network analysis. We show that the interevent time distributions of actions in the Pardus universe follow highly non-trivial distribution functions, from which we extract action-type specific characteristic "decay constants". We discuss characteristic features of interevent time distributions, including periodic patterns on different time scales, bursty dynamics, and various functional forms on different time scales. We comment on gender differences of players in emotional actions, and find that while males and females act similarly when performing some positive actions, females are slightly faster for negative actions. We also observe effects of the age of players: more experienced players are generally faster in making decisions about engaging and terminating in enmity and friendship, respectively.

Submitted 8 July, 2014; originally announced July 2014.

Comments: 19 pages

Journal ref: Physica A 419, 681-690 (2014)

arXiv:1403.3228

Fractal multi-level organisation of human groups in a virtual world

Authors: Benedikt Fuchs, Didier Sornette, Stefan Thurner

Abstract: Humans are fundamentally social. They have progressively dominated their environment by the strength and creativity provided by and within their grouping. It is well recognised that human groups are highly structured, and the anthropological literature has loosely classified them according to their size and function, such as support cliques, sympathy groups, bands, cognitive groups, tribes, linguistic groups and so on. Recently, combining data on human grouping patterns in a comprehensive and systematic study, Zhou et al. identified a quantitative discrete hierarchy of group sizes with a preferred scaling ratio close to $3$, which was later confirmed for hunter-gatherer groups and for other mammalian societies. Using high precision large scale Internet-based social network data, we extend these early findings on a very large data set. We analyse the organisational structure of a complete, multi-relational, large social multiplex network of a human society consisting of about 400,000 odd players of a massive multiplayer online game for which we know all about the group memberships of every player. Remarkably, the online players exhibit the same type of structured hierarchical layers as the societies studied by anthropologists, where each of these layers is three to four times the size of the lower layer. Our findings suggest that the hierarchical organisation of human society is deeply nested in human psychology.

Submitted 13 March, 2014; originally announced March 2014.

Comments: 9 pages, 3 figures

arXiv:1309.6740

Detection of the elite structure in a virtual multiplex social system by means of a generalized $K$-core

Authors: Bernat Corominas-Murtra, Benedikt Fuchs, Stefan Thurner

Abstract: Elites are subgroups of individuals within a society that have the ability and means to influence, lead, govern, and shape societies. Members of elites are often well connected individuals, which enables them to impose their influence on many and to quickly gather, process, and spread information. Here we argue that elites are not only composed of highly connected individuals, but also of intermediaries connecting hubs to form a cohesive and structured elite-subgroup at the core of a social network. For this purpose we present a generalization of the $K$-core algorithm that allows one to identify a social core that is composed of well-connected hubs together with their 'connectors'. We show the validity of the idea in the framework of a virtual world defined by a massive multiplayer online game, on which we have complete information of various social networks. Exploiting this multiplex structure, we find that the hubs of the generalized $K$-core identify those individuals that are high social performers in terms of a series of indicators that are available in the game. In addition, using a combined strategy which involves the generalized $K$-core and the recently introduced $M$-core, the elites of the different 'nations' present in the game are perfectly identified as modules of the generalized $K$-core. Interesting sudden shifts in the composition of the elite cores are observed at deep levels. We show that elite detection with the traditional $K$-core is not possible in a reliable way. The proposed method might be useful in a series of more general applications, such as community detection.

Submitted 22 December, 2014; v1 submitted 26 September, 2013; originally announced September 2013.

Comments: 13 figures, 3 tables, 19 pages. Accepted for publication in PLoS ONE

Showing 1–14 of 14 results for author: Fuchs, B