Search | arXiv e-print repository

doi 10.1016/j.ascom.2014.08.003

Virtual Observatory Publishing with DaCHS

Authors: Markus Demleitner, Margarida Castro Neves, Florian Rothmaier, Joachim Wambsganss

Abstract: The Data Center Helper Suite DaCHS is an integrated publication package for building Virtual Observatory (VO) and Web services, supporting the entire workflow from ingestion to data mapping to service definition. It implements all major data discovery, data access, and registry protocols defined by the VO. DaCHS in this sense works as glue between data produced by the data providers and the standard protocols and formats defined by the VO. This paper discusses central elements of the design of the package and gives two case studies of how VO protocols are implemented using DaCHS' concepts.

Submitted 25 August, 2014; originally announced August 2014.

MSC Class: 68U35

arXiv:1407.3083 [pdf, other]

The Virtual Observatory Registry

Authors: Markus Demleitner, Gretchen Greene, Pierre Le Sidaner, Raymond L. Plante

Abstract: In the Virtual Observatory (VO), the Registry provides the mechanism with which users and applications discover and select resources -- typically, data and services -- that are relevant for a particular scientific problem. Even though the VO adopted technologies in particular from the bibliographic community where available, building the Registry system involved a major standardisation effort, involving about a dozen interdependent standard texts. This paper discusses the server-side aspects of the standards and their application, as regards the functional components (registries), the resource records in both format and content, the exchange of resource records between registries (harvesting), as well as the creation and management of the identifiers used in the system based on the notion of authorities. Registry record authors, registry operators or even advanced users thus receive a big picture serving as a guideline through the body of relevant standard texts. To complete this picture, we also mention common usage patterns and open issues as appropriate.

Submitted 11 July, 2014; originally announced July 2014.

MSC Class: 68U35

arXiv:1402.4750 [pdf]

doi 10.5479/ADS/bib/2013ivoa.spec.1129D

IVOA Recommendation: DALI: Data Access Layer Interface Version 1.0

Authors: Patrick Dowler, Markus Demleitner, Mark Taylor, Doug Tody

Abstract: This document describes the Data Access Layer Interface (DALI). DALI defines the base web service interface common to all Data Access Layer (DAL) services. This standard defines the behaviour of common resources, the meaning and use of common parameters, success and error responses, and DAL service registration. The goal of this specification is to define the common elements that are shared across DAL services in order to foster consistency across concrete DAL service specifications and to enable standard re-usable client and service implementations and libraries to be written and widely adopted.

Submitted 19 February, 2014; originally announced February 2014.

Report number: REC-DALI-1.0-20131129

arXiv:1402.4742 [pdf]

doi 10.5479/ADS/bib/2012ivoa.spec.0827D

IVOA Recommendation: TAPRegExt: a VOResource Schema Extension for Describing TAP Services

Authors: Markus Demleitner, Patrick Dowler, Ray Plante, Guy Rixon, Mark Taylor

Abstract: This document describes an XML encoding standard for metadata about services implementing the table access protocol TAP [TAP], referred to as TAPRegExt. Instance documents are part of the service's registry record or can be obtained from the service itself. They deliver information to both humans and software on the languages, output formats, and upload methods supported by the service, as well as data models implemented by the exposed tables, optional language features, and certain limits enforced by the service.

Submitted 19 February, 2014; originally announced February 2014.

Report number: REC-TAPRegExt-1.0-20120827

arXiv:0909.4789 [pdf]

doi 10.1002/asi.20096

The Bibliometric Properties of Article Readership Information

Authors: Michael J. Kurtz, Guenther Eichhorn, Alberto Accomazzi, Carolyn S. Grant, Markus Demleitner, Stephen S. Murray, Nathalie Martimbeau, Barbara Elwell

Abstract: The NASA Astrophysics Data System (ADS), along with astronomy's journals and data centers (a collaboration dubbed URANIA), has developed a distributed on-line digital library which has become the dominant means by which astronomers search, access and read their technical literature. Digital libraries such as the NASA Astrophysics Data System permit the easy accumulation of a new type of bibliometric measure, the number of electronic accesses ("reads") of individual articles. We explore various aspects of this new measure. We examine the obsolescence function as measured by actual reads, and show that it can be well fit by the sum of four exponentials with very different time constants. We compare the obsolescence function as measured by readership with the obsolescence function as measured by citations. We find that the citation function is proportional to the sum of two of the components of the readership function. This proves that the normative theory of citation is true in the mean. We further examine in detail the similarities and differences between the citation rate, the readership rate and the total citations for individual articles, and discuss some of the causes. Using the number of reads as a bibliometric measure for individuals, we introduce the read-cite diagram to provide a two-dimensional view of an individual's scientific productivity. We develop a simple model to account for an individual's reads and cites and use it to show that the position of a person in the read-cite diagram is a function of age, innate productivity, and work history. We show the age biases of both reads and cites, and develop two new bibliometric measures which have substantially less age bias than citations.

Submitted 25 September, 2009; originally announced September 2009.

Comments: ADS bibcode: 2005JASIS..56..111K. This is the second paper (the first is "Worldwide Use and Impact of the NASA Astrophysics Data System Digital Library") from the original article "The NASA Astrophysics Data System: Sociology, Bibliometrics, and Impact", which went on-line in the summer of 2003.

Journal ref: The Journal of the American Society for Information Science and Technology, Vol. 56, p. 111 (2005)

arXiv:0909.4786 [pdf]

doi 10.1002/asi.20095

Worldwide Use and Impact of the NASA Astrophysics Data System Digital Library

Authors: Michael J. Kurtz, Guenther Eichhorn, Alberto Accomazzi, Carolyn Grant, Markus Demleitner, Stephen S. Murray

Abstract: By combining data from the text, citation, and reference databases with data from the ADS readership logs we have been able to create Second Order Bibliometric Operators, a customizable class of collaborative filters which permits substantially improved accuracy in literature queries. Using the ADS usage logs along with membership statistics from the International Astronomical Union and data on the population and gross domestic product (GDP) we develop an accurate model for world-wide basic research where the number of scientists in a country is proportional to the GDP of that country, and the amount of basic research done by a country is proportional to the number of scientists in that country times that country's per capita GDP. We introduce the concept of utility time to measure the impact of the ADS/URANIA and the electronic astronomical library on astronomical research. We find that in 2002 it amounted to the equivalent of 736 FTE researchers, or $250 Million, or the astronomical research done in France. Subject headings: digital libraries; bibliometrics; sociology of science; information retrieval

Submitted 25 September, 2009; originally announced September 2009.

Comments: ADS bibcode: 2005JASIS..56...36K. This is a portion ("The Bibliometric Properties of Article Readership Information" is the other part) of the article "The NASA Astrophysics Data System: Sociology, Bibliometrics, and Impact", which went on-line in the summer of 2003.

Journal ref: The Journal of the American Society for Information Science and Technology, Vol. 56, p. 36 (2005)

arXiv:cs/0610011 [pdf, ps, other]

Creation and use of Citations in the ADS

Authors: Alberto Accomazzi, Gunther Eichhorn, Michael J. Kurtz, Carolyn S. Grant, Edwin Henneken, Markus Demleitner, Donna Thompson, Elizabeth Bohlen, Stephen S. Murray

Abstract: With over 20 million records, the ADS citation database is regularly used by researchers and librarians to measure the scientific impact of individuals, groups, and institutions. In addition to the traditional sources of citations, the ADS has recently added references extracted from the arXiv e-prints on a nightly basis. We review the procedures used to harvest and identify the reference data used in the creation of citations, the policies and procedures that we follow to avoid double-counting and to eliminate contributions which may not be scholarly in nature. Finally, we describe how users and institutions can easily obtain quantitative citation data from the ADS, both interactively and via web-based programming tools. The ADS is available at http://ads.harvard.edu.

Submitted 3 October, 2006; originally announced October 2006.

Comments: 9 pages; to be published in the proceedings of the conference "Library and Information Services V," June 2006, Cambridge, MA, USA

arXiv:cs/0511002 [pdf, ps, other]

Bibliographic Classification using the ADS Databases

Authors: Alberto Accomazzi, Michael J. Kurtz, Guenther Eichhorn, Edwin Henneken, Carolyn S. Grant, Markus Demleitner, Stephen S. Murray

Abstract: We discuss two techniques used to characterize bibliographic records based on their similarity to and relationship with the contents of the NASA Astrophysics Data System (ADS) databases. The first method has been used to classify input text as being relevant to one or more subject areas based on an analysis of the frequency distribution of its individual words. The second method has been used to classify existing records as being relevant to one or more databases based on the distribution of the papers citing them. Both techniques have proven to be valuable tools in assigning new and existing bibliographic records to different disciplines within the ADS databases.

Submitted 31 October, 2005; originally announced November 2005.

Comments: Latex, 4 pages, 1 Figure. To be published in the Proceedings of the Conference "Astronomical Data Analysis Software & Systems XV" held October 2-5, 2005, in San Lorenzo de El Escorial, Spain

arXiv:cs/0503029 [pdf, ps, other]

doi 10.1016/j.ipm.2005.03.010

The Effect of Use and Access on Citations

Authors: Michael J. Kurtz, Guenther Eichhorn, Alberto Accomazzi, Carolyn Grant, Markus Demleitner, Edwin Henneken, Stephen S. Murray

Abstract: It has been shown (S. Lawrence, 2001, Nature, 411, 521) that journal articles which have been posted without charge on the internet are more heavily cited than those which have not been. Using data from the NASA Astrophysics Data System (ads.harvard.edu) and from the ArXiv e-print archive at Cornell University (arXiv.org) we examine the causes of this effect.

Submitted 14 March, 2005; originally announced March 2005.

Comments: Accepted for publication in Information Processing & Management, special issue on scientometrics

ACM Class: H.3.7

Journal ref: Inform Process Manag 41:1395-1402 (2005)

arXiv:cs/0401028 [pdf, ps, other]

Automated Resolution of Noisy Bibliographic References

Authors: Markus Demleitner, Michael Kurtz, Alberto Accomazzi, Günther Eichhorn, Carolyn S. Grant, Steven S. Murray

Abstract: We describe a system used by the NASA Astrophysics Data System to identify bibliographic references obtained from scanned article pages by OCR methods with records in a bibliographic database. We analyze the process generating the noisy references and conclude that the three-step procedure of correcting the OCR results, parsing the corrected string and matching it against the database provides unsatisfactory results. Instead, we propose a method that allows a controlled merging of correction, parsing and matching, inspired by dependency grammars. We also report on the effectiveness of various heuristics that we have employed to improve recall.

Submitted 27 January, 2004; originally announced January 2004.

Comments: 10 pages, 1 figure; accepted for publication in the proceedings of the 2004 Meeting of the International Federation of Classification Societies

ACM Class: H.3.7; H.3.2

Showing 1–10 of 10 results for author: Demleitner, M