The Tracking Machine Learning challenge: Throughput phase

Authors: Sabrina Amrouche, Laurent Basara, Paolo Calafiura, Dmitry Emeliyanov, Victor Estrade, Steven Farrell, Cécile Germain, Vladimir Vava Gligorov, Tobias Golling, Sergey Gorbunov, Heather Gray, Isabelle Guyon, Mikhail Hushchyn, Vincenzo Innocente, Moritz Kiehn, Marcel Kunze, Edward Moyse, David Rousseau, Andreas Salzburger, Andrey Ustyuzhanin, Jean-Roch Vlimant

Abstract: This paper reports on the second "Throughput" phase of the Tracking Machine Learning (TrackML) challenge on the Codalab platform. As in the first "Accuracy" phase, the participants had to solve a difficult experimental problem linked to accurately tracking the trajectories of particles such as those created at the Large Hadron Collider (LHC): given O($10^5$) points, the participants had to connect them into O($10^4$) individual groups that represent the particle trajectories, which are approximately helical. While in the first phase only the accuracy mattered, the goal of this second phase was a compromise between the accuracy and the speed of inference. Both were measured on the Codalab platform, where the participants had to upload their software. The best three participants had solutions with good accuracy and speed an order of magnitude faster than the state of the art when the challenge was designed. Although the core algorithms were less diverse than in the first phase, a diversity of techniques have been used and are described in this paper. The performance of the algorithms is analysed in depth and lessons are derived.

Submitted 14 May, 2021; v1 submitted 3 May, 2021; originally announced May 2021.

Comments: submitted to Computing and Software for Big Science

arXiv:cs/0501096 [pdf, ps, other]

Transforming and Enriching Documents for the Semantic Web

Authors: Dietmar Roesner, Manuela Kunze, Sylke Kroetzsch

Abstract: We suggest employing techniques from Natural Language Processing (NLP) and Knowledge Representation (KR) to transform existing documents into documents amenable to the Semantic Web. Semantic Web documents have at least part of their semantics and pragmatics marked up explicitly in a manner that is both machine processable and human readable. XML and its related standards (XSLT, RDF, Topic Maps etc.) are the unifying platform for the tools and methodologies developed for different application scenarios.

Submitted 31 January, 2005; originally announced January 2005.

Comments: 10 pages, 1 figure

ACM Class: H.3.1; I.2.7

Journal ref: KI (1), 2004

arXiv:cs/0501095 [pdf, ps, other]

Context Related Derivation of Word Senses

Authors: Manuela Kunze, Dietmar Roesner

Abstract: Real applications of natural language document processing are very often confronted with domain specific lexical gaps during the analysis of documents of a new domain. This paper describes an approach for the derivation of domain specific concepts for the extension of an existing ontology. As resources we need an initial ontology and a partially processed corpus of a domain. We exploit the specific characteristic of the sublanguage in the corpus. Our approach is based on syntactical structures (noun phrases) and compound analyses to extract information required for the extension of GermaNet's lexical resources.

Submitted 31 January, 2005; originally announced January 2005.

Comments: 5 pages, 2 figures

ACM Class: I.2.7; I.2.6

Journal ref: in Proceedings of Ontolex-Workshop 2004

arXiv:cs/0501094 [pdf, ps, other]

Corpus based Enrichment of GermaNet Verb Frames

Authors: Manuela Kunze, Dietmar Roesner

Abstract: Lexical semantic resources, like WordNet, are often used in real applications of natural language document processing. For example, we integrated GermaNet in our document suite XDOC for the processing of German forensic autopsy protocols. In addition to the hypernymy and synonymy relations, we want to adapt GermaNet's verb frames for our analysis. In this paper we outline an approach for the domain related enrichment of GermaNet verb frames by corpus based syntactic and co-occurrence data analyses of real documents.

Submitted 1 February, 2005; v1 submitted 31 January, 2005; originally announced January 2005.

Comments: 4 pages

ACM Class: I.2.7; I.2.6

Journal ref: in Proceedings of LREC 2004

arXiv:cs/0501093 [pdf, ps, other]

Transforming Business Rules Into Natural Language Text

Authors: Manuela Kunze, Dietmar Roesner

Abstract: The aim of the project presented in this paper is to design a system for an NLG architecture, which supports the documentation process of eBusiness models. A major task is to enrich the formal description of an eBusiness model with additional information needed in an NLG task.

Submitted 31 January, 2005; originally announced January 2005.

Comments: 3 pages

ACM Class: I.2.7

Journal ref: in Proceedings of IWCS-6, 2005

arXiv:cs/0501089 [pdf, ps, other]

Issues in Exploiting GermaNet as a Resource in Real Applications

Authors: Manuela Kunze, Dietmar Roesner

Abstract: This paper reports on experiments with GermaNet as a resource within domain specific document analysis. The main question to be answered is: What is the coverage of GermaNet in a specific domain? We report on the results of a field test of GermaNet for analyses of autopsy protocols and present a sketch of the integration of GermaNet inside XDOC. Our remarks will contribute to a GermaNet user's wish list.

Submitted 31 January, 2005; originally announced January 2005.

Comments: 10 pages, 3 figures

ACM Class: H.3.1; I.2.7

arXiv:cs/0501086 [pdf]

Clever Search: A WordNet Based Wrapper for Internet Search Engines

Authors: Peter M. Kruse, Andre Naujoks, Dietmar Roesner, Manuela Kunze

Abstract: This paper presents an approach to enhance search engines with information about word senses available in WordNet. The approach exploits information about the conceptual relations within the lexical-semantic net. In the wrapper for search engines presented, WordNet information is used to specify the user's request or to classify the results of a publicly available web search engine, like Google, Yahoo, etc.

Submitted 31 January, 2005; originally announced January 2005.

ACM Class: H.3.3; H.5.2

Journal ref: Proceedings of 2nd GermaNet Workshop 2005

arXiv:cs/0304036 [pdf, ps, other]

An Approach for Resource Sharing in Multilingual NLP

Authors: Manuela Kunze, Chun Xiao

Abstract: In this paper we describe an approach for the analysis of documents in German and English with a shared pool of resources. For the analysis of German documents we use a document suite, which supports the user in tasks like information retrieval and information extraction. The core of the document suite is based on our tool XDOC. Now we want to exploit these methods for the analysis of English documents as well. For this aim we need a multilingual presentation format of the resources. These resources must be transformed into a unified format, in which we can set additional information about linguistic characteristics of the language depending on the analyzed documents. In this paper we describe our approach for such an exchange model for multilingual resources based on XML.

Submitted 23 April, 2003; originally announced April 2003.

Comments: poster

ACM Class: H.3.1; I.2.7

Journal ref: STAIRS 2002 - STarting Artificial Intelligence Researchers Symposium at the ECAI 2002. Lyon, France. ISBN 158603 259 3. IOS Press Amsterdam, p. 123-124

arXiv:cs/0304035 [pdf, ps, other]

Exploiting Sublanguage and Domain Characteristics in a Bootstrapping Approach to Lexicon and Ontology Creation

Authors: Dietmar Roesner, Manuela Kunze

Abstract: It is very costly to build up lexical resources and domain ontologies. Especially when confronted with a new application domain, lexical gaps and a poor coverage of domain concepts are a problem for the successful exploitation of natural language document analysis systems that need and exploit such knowledge sources. In this paper we report on ongoing experiments with 'bootstrapping techniques' for lexicon and ontology creation.

Submitted 23 April, 2003; originally announced April 2003.

ACM Class: H.3.1; I.2.7

Journal ref: Workshop-Proceedings of the OntoLex 2002 - Ontologies and Lexical Knowledge Bases at the LREC 2002, p. 68-73

arXiv:cs/0304029 [pdf, ps, other]

An XML based Document Suite

Authors: Dietmar Roesner, Manuela Kunze

Abstract: We report on the current state of development of a document suite and its applications. This collection of tools for the flexible and robust processing of documents in German is based on the use of XML as a unifying formalism for encoding input and output data as well as process information. It is organized in modules with limited responsibilities that can easily be combined into pipelines to solve complex tasks. Strong emphasis is laid on a number of techniques to deal with lexical and conceptual gaps that are typical when starting a new application.

Submitted 22 April, 2003; originally announced April 2003.

ACM Class: I.2.7; H.3.1

Journal ref: Proceedings of COLING 2002; p. 1278-1282

Showing 1–10 of 10 results for author: Kunze, M