-
Synthetic Electroretinogram Signal Generation Using Conditional Generative Adversarial Network for Enhancing Classification of Autism Spectrum Disorder
Authors:
Mikhail Kulyabin,
Paul A. Constable,
Aleksei Zhdanov,
Irene O. Lee,
David H. Skuse,
Dorothy A. Thompson,
Andreas Maier
Abstract:
The electroretinogram (ERG) is a clinical test that records the retina's electrical response to light. The ERG is a promising way to study different neurodevelopmental and neurodegenerative disorders, including autism spectrum disorder (ASD) - a neurodevelopmental condition that impacts language, communication, and reciprocal social interactions. However, in heterogeneous populations, such as ASD, where the ability to collect large datasets is limited, the application of artificial intelligence (AI) is complicated. Synthetic ERG signals generated from real ERG recordings carry information similar to that of natural ERGs and, therefore, could be used as an extension for natural data to increase datasets so that AI applications can be fully utilized. As proof of principle, this study presents a Generative Adversarial Network capable of generating synthetic ERG signals of children with ASD and typically developing control individuals. We applied a Time Series Transformer and Visual Transformer with Continuous Wavelet Transform to enhance classification results on the extended synthetic signals dataset. This approach may support classification models in related psychiatric conditions where the ERG may help classify disorders.
Submitted 11 July, 2024;
originally announced July 2024.
-
Assessing your Observatory's Impact: Best Practices in Establishing and Maintaining Observatory Bibliographies
Authors:
Observatory Bibliographers Collaboration,
Raffaele D'Abrusco,
Monique Gomez,
Uta Grothkopf,
Sharon Hunt,
Ruth Kneale,
Mika Konuma,
Jenny Novacescu,
Luisa Rebull,
Elena Scire,
Erin Scott,
Donna Thompson,
Lance Utley,
Christopher Wilkinson,
Sherry Winkelman
Abstract:
Observatories need to measure and evaluate the scientific output and overall impact of their facilities. An observatory bibliography consists of the papers published using that observatory's data, typically gathered by searching the major journals for relevant keywords. Recently, the volume of literature and methods by which the publications pool is evaluated have increased. Efficient and standardized procedures are necessary to assign meaningful metadata; enable user-friendly retrieval; and provide the opportunity to derive reports, statistics, and visualizations to impart a deeper understanding of the research output. In 2021, a group of observatory bibliographers from around the world convened online to continue the discussions presented in Lagerstrom (2015). We worked to extract general guidelines from our experiences, techniques, and lessons learnt. The paper explores the development, application, and current status of telescope bibliographies and future trends. This paper briefly describes the methodologies employed in constructing databases, along with the various bibliometric techniques used to analyze and interpret them. We explain reasons for non-standardization and why it is essential for each observatory to identify metadata and metrics that are meaningful for them; caution against the (over-)use of comparisons among facilities that are, ultimately, not comparable through bibliometrics; and highlight the benefits of telescope bibliographies, both for researchers within the astronomical community and for stakeholders beyond the specific observatories. There is tremendous diversity in the ways bibliographers track publications and maintain databases, due to parameters such as resources, type of observatory, historical practices, and reporting requirements to funders and outside agencies. However, there are also common sets of Best Practices.
Submitted 28 July, 2024; v1 submitted 29 December, 2023;
originally announced January 2024.
-
Experimenting with Large Language Models and vector embeddings in NASA SciX
Authors:
Sergi Blanco-Cuaresma,
Ioana Ciucă,
Alberto Accomazzi,
Michael J. Kurtz,
Edwin A. Henneken,
Kelly E. Lockhart,
Felix Grezes,
Thomas Allen,
Golnaz Shapurian,
Carolyn S. Grant,
Donna M. Thompson,
Timothy W. Hostetler,
Matthew R. Templeton,
Shinyi Chen,
Jennifer Koch,
Taylor Jacovich,
Daniel Chivvis,
Fernanda de Macedo Alves,
Jean-Claude Paquin,
Jennifer Bartlett,
Mugdha Polimera,
Stephanie Jarmak
Abstract:
Open-source Large Language Models enable projects such as NASA SciX (i.e., NASA ADS) to think out of the box and try alternative approaches for information retrieval and data augmentation, while respecting data copyright and users' privacy. However, when large language models are directly prompted with questions without any context, they are prone to hallucination. At NASA SciX we have developed an experiment where we created semantic vectors for our large collection of abstracts and full-text content, and we designed a prompt system to ask questions using contextual chunks from our system. Based on a non-systematic human evaluation, the experiment shows a lower degree of hallucination and better responses when using Retrieval Augmented Generation. Further exploration is required to design new features and data augmentation processes at NASA SciX that leverage this technology while respecting the high level of trust and quality that the project holds.
Submitted 21 December, 2023;
originally announced December 2023.
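The retrieval step described in the abstract above (embed the text chunks, find those closest to a question, and place them in the prompt as context) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the bag-of-words `embed` function stands in for a learned embedding model, and the names `retrieve` and `build_prompt` are hypothetical and not part of NASA SciX.

```python
import math
from collections import Counter


def embed(text):
    """Toy bag-of-words vector (assumption: a real system would use a learned embedding model)."""
    return Counter(text.lower().split())


def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question."""
    q = embed(question)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(question, chunks, k=2):
    """Assemble a prompt that restricts the model to the retrieved context."""
    context = "\n".join(f"- {c}" for c in retrieve(question, chunks, k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


if __name__ == "__main__":
    corpus = [
        "astroBERT is a language model for astronomy text",
        "Kubernetes orchestrates containers",
        "Sturm sequences count real roots",
    ]
    print(build_prompt("what is the astroBERT language model", corpus, k=1))
```

Supplying retrieved chunks as context is what distinguishes this from prompting a model with a bare question, which the abstract identifies as prone to hallucination.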
-
Improving astroBERT using Semantic Textual Similarity
Authors:
Felix Grezes,
Thomas Allen,
Sergi Blanco-Cuaresma,
Alberto Accomazzi,
Michael J. Kurtz,
Golnaz Shapurian,
Edwin Henneken,
Carolyn S. Grant,
Donna M. Thompson,
Timothy W. Hostetler,
Matthew R. Templeton,
Kelly E. Lockhart,
Shinyi Chen,
Jennifer Koch,
Taylor Jacovich,
Pavlos Protopapas
Abstract:
The NASA Astrophysics Data System (ADS) is an essential tool for researchers that allows them to explore the astronomy and astrophysics scientific literature, but it has yet to exploit recent advances in natural language processing. At ADASS 2021, we introduced astroBERT, a machine learning language model tailored to the text used in astronomy papers in ADS. In this work we:
- announce the first public release of the astroBERT language model;
- show how astroBERT improves over existing public language models on astrophysics specific tasks;
- and detail how ADS plans to harness the unique structure of scientific papers, the citation graph and citation context, to further improve astroBERT.
Submitted 29 November, 2022;
originally announced December 2022.
-
Sturm's Theorem with Endpoints
Authors:
Philippe Pébay,
J. Maurice Rojas,
David C. Thompson
Abstract:
Sturm's Theorem is a fundamental 19th century result relating the number of real roots of a polynomial $f$ in an interval to the number of sign alternations in a sequence of polynomial division-like calculations. We provide a short direct proof of Sturm's Theorem, including the numerically vexing case (ignored in many published accounts) where an interval endpoint is a root of $f$.
Submitted 16 August, 2022;
originally announced August 2022.
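As a worked illustration of the Sturm's Theorem entry above, the following sketch counts distinct real roots in a half-open interval $(a, b]$ by comparing sign alternations of the Sturm sequence at the two endpoints. It is the textbook construction, not the proof given in the paper. It assumes $f$ is square-free, uses exact rational arithmetic, and drops zero values when counting sign changes, which is what lets it handle an endpoint that is itself a root. The function names are chosen for this example only.

```python
from fractions import Fraction


def trim(p):
    """Drop leading zero coefficients (coefficients are highest degree first)."""
    i = 0
    while i < len(p) - 1 and p[i] == 0:
        i += 1
    return p[i:]


def derivative(p):
    """Formal derivative of a polynomial given highest degree first."""
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])] or [Fraction(0)]


def remainder(a, b):
    """Remainder of polynomial long division a / b (b must be non-zero)."""
    a, b = list(a), trim(b)
    while len(a) >= len(b) and any(a):
        factor = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= factor * b[i]
        a.pop(0)  # the leading coefficient is now zero
    return trim(a) if a else [Fraction(0)]


def sturm_sequence(f):
    """p0 = f, p1 = f', p_{k+1} = -rem(p_{k-1}, p_k), stopping before the zero polynomial."""
    f = trim([Fraction(c) for c in f])
    seq = [f]
    nxt = trim(derivative(f))
    while any(nxt):
        seq.append(nxt)
        nxt = [-c for c in remainder(seq[-2], seq[-1])]
    return seq


def evaluate(p, x):
    """Horner evaluation with exact rational arithmetic."""
    acc = Fraction(0)
    for c in p:
        acc = acc * x + c
    return acc


def sign_changes(seq, x):
    """Number of sign alternations in the sequence at x, ignoring zeros."""
    values = [v for v in (evaluate(p, x) for p in seq) if v != 0]
    return sum(1 for s, t in zip(values, values[1:]) if (s > 0) != (t > 0))


def count_roots(f, a, b):
    """Distinct real roots of a square-free f in the half-open interval (a, b]."""
    seq = sturm_sequence(f)
    return sign_changes(seq, Fraction(a)) - sign_changes(seq, Fraction(b))


if __name__ == "__main__":
    f = [1, 0, -1, 0]  # x^3 - x, with roots -1, 0, 1
    print(count_roots(f, -2, 2))
    print(count_roots(f, 0, 2))  # left endpoint 0 is a root and is excluded
```

Using `Fraction` avoids the floating-point sign errors that make the endpoint case numerically delicate.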
-
Bias amplification in experimental social networks is reduced by resampling
Authors:
Mathew D. Hardy,
Bill D. Thompson,
P. M. Krafft,
Thomas L. Griffiths
Abstract:
Large-scale social networks are thought to contribute to polarization by amplifying people's biases. However, the complexity of these technologies makes it difficult to identify the mechanisms responsible and to evaluate mitigation strategies. Here we show under controlled laboratory conditions that information transmission through social networks amplifies motivational biases on a simple perceptual decision-making task. Participants in a large behavioral experiment showed increased rates of biased decision-making when part of a social network relative to asocial participants, across 40 independently evolving populations. Drawing on techniques from machine learning and Bayesian statistics, we identify a simple adjustment to content-selection algorithms that is predicted to mitigate bias amplification. This algorithm generates a sample of perspectives from within an individual's network that is more representative of the population as a whole. In a second large experiment, this strategy reduced bias amplification while maintaining the benefits of information sharing.
Submitted 5 October, 2022; v1 submitted 15 August, 2022;
originally announced August 2022.
-
Counterfactual harm
Authors:
Jonathan G. Richens,
Rory Beard,
Daniel H. Thompson
Abstract:
To act safely and ethically in the real world, agents must be able to reason about harm and avoid harmful actions. However, to date there is no statistical method for measuring harm and factoring it into algorithmic decisions. In this paper we propose the first formal definition of harm and benefit using causal models. We show that any factual definition of harm must violate basic intuitions in certain scenarios, and show that standard machine learning algorithms that cannot perform counterfactual reasoning are guaranteed to pursue harmful policies following distributional shifts. We use our definition of harm to devise a framework for harm-averse decision making using counterfactual objective functions. We demonstrate this framework on the problem of identifying optimal drug doses using a dose-response model learned from randomized control trial data. We find that the standard method of selecting doses using treatment effects results in unnecessarily harmful doses, while our counterfactual approach allows us to identify doses that are significantly less harmful without sacrificing efficacy.
Submitted 2 November, 2022; v1 submitted 27 April, 2022;
originally announced April 2022.
-
Web accessibility trends and implementation in dynamic web applications
Authors:
Timothy W. Hostetler,
Shinyi Chen,
Sergi Blanco-Cuaresma,
Alberto Accomazzi,
Michael J. Kurtz,
Carolyn S. Grant,
Edwin Henneken,
Donna M. Thompson,
Roman Chyla,
Golnaz Shapurian,
Matthew R. Templeton,
Kelly E. Lockhart,
Nemanja Martinovic,
Stephen McDonald,
Felix Grezes
Abstract:
The NASA Astrophysics Data System (ADS), a critical research service for the astrophysics community, strives to provide the most accessible and inclusive environment for the discovery and exploration of the astronomical literature. Part of this goal involves creating a digital platform that can accommodate everybody, including those with disabilities that would benefit from alternative ways to present the information provided by the website. NASA ADS follows the official Web Content Accessibility Guidelines (WCAG) standard for ensuring accessibility of all its applications, striving to exceed this standard where possible. Through the use of both internal audits and external expert review based on these guidelines, we have identified many areas for improving accessibility in our current web application, and have implemented a number of updates to the UI as a result of this. We present an overview of some current web accessibility trends, discuss our experience incorporating these trends in our web application, and discuss the lessons learned and recommendations for future projects.
Submitted 1 February, 2022;
originally announced February 2022.
-
Building astroBERT, a language model for Astronomy & Astrophysics
Authors:
Felix Grezes,
Sergi Blanco-Cuaresma,
Alberto Accomazzi,
Michael J. Kurtz,
Golnaz Shapurian,
Edwin Henneken,
Carolyn S. Grant,
Donna M. Thompson,
Roman Chyla,
Stephen McDonald,
Timothy W. Hostetler,
Matthew R. Templeton,
Kelly E. Lockhart,
Nemanja Martinovic,
Shinyi Chen,
Chris Tanner,
Pavlos Protopapas
Abstract:
The existing search tools for exploring the NASA Astrophysics Data System (ADS) can be quite rich and empowering (e.g., similar and trending operators), but researchers are not yet allowed to fully leverage semantic search. For example, a query for "results from the Planck mission" should be able to distinguish between all the various meanings of Planck (person, mission, constant, institutions and more) without further clarification from the user. At ADS, we are applying modern machine learning and natural language processing techniques to our dataset of recent astronomy publications to train astroBERT, a deeply contextual language model based on research at Google. Using astroBERT, we aim to enrich the ADS dataset and improve its discoverability, and in particular we are developing our own named entity recognition tool. We present here our preliminary results and lessons learned.
Submitted 1 December, 2021;
originally announced December 2021.
-
SaLinA: Sequential Learning of Agents
Authors:
Ludovic Denoyer,
Alfredo de la Fuente,
Song Duong,
Jean-Baptiste Gaya,
Pierre-Alexandre Kamienny,
Daniel H. Thompson
Abstract:
SaLinA is a simple library that makes implementing complex sequential learning models easy, including reinforcement learning algorithms. It is built as an extension of PyTorch: algorithms coded with SaLinA can be understood in a few minutes by PyTorch users and modified easily. Moreover, SaLinA naturally works with multiple CPUs and GPUs at train and test time, thus being a good fit for large-scale training use cases. In comparison to existing RL libraries, SaLinA has a very low adoption cost and captures a large variety of settings (model-based RL, batch RL, hierarchical RL, multi-agent RL, etc.). SaLinA does not target only RL practitioners; it aims to provide sequential learning capabilities to any deep learning programmer.
Submitted 15 October, 2021;
originally announced October 2021.
-
A Tensor-Based Formulation of Hetero-functional Graph Theory
Authors:
Amro M. Farid,
Dakota Thompson,
Wester Schoonenberg
Abstract:
Recently, hetero-functional graph theory (HFGT) has developed as a means to mathematically model the structure of large-scale complex flexible engineering systems. It does so by fusing concepts from network science and model-based systems engineering (MBSE). For the former, it utilizes multiple graph-based data structures to support a matrix-based quantitative analysis. For the latter, HFGT inherits the heterogeneity of conceptual and ontological constructs found in model-based systems engineering including system form, system function, and system concept. These diverse conceptual constructs indicate multi-dimensional rather than two-dimensional relationships. This paper provides the first tensor-based treatment of hetero-functional graph theory. In particular, it addresses the "system concept" and the hetero-functional adjacency matrix from the perspective of tensors and introduces the hetero-functional incidence tensor as a new data structure. The tensor-based formulation described in this work makes a stronger tie between HFGT and its ontological foundations in MBSE. Finally, the tensor-based formulation facilitates several analytical results that provide an understanding of the relationships between HFGT and multi-layer networks.
Submitted 12 October, 2022; v1 submitted 14 January, 2021;
originally announced January 2021.
-
Agile methodologies in teams with highly creative and autonomous members
Authors:
Sergi Blanco-Cuaresma,
Alberto Accomazzi,
Michael J. Kurtz,
Edwin Henneken,
Carolyn S. Grant,
Donna M. Thompson,
Roman Chyla,
Stephen McDonald,
Golnaz Shapurian,
Timothy W. Hostetler,
Matthew R. Templeton,
Kelly E. Lockhart,
Kris Bukovi
Abstract:
The Agile manifesto encourages us to value individuals and interactions over processes and tools, while Scrum, the most adopted Agile development methodology, is essentially based on roles, events, artifacts, and the rules that bind them together (i.e., processes). Moreover, it is generally proclaimed that whenever a Scrum project does not succeed, it is because Scrum was not implemented correctly and not because Scrum may have its own flaws. This grants irrefutability to the methodology, discouraging deviations to fit the actual needs and peculiarities of the developers. In particular, the members of the NASA ADS team are highly creative and autonomous, and their motivation can be affected if their freedom is too strongly constrained. We present our experience following Agile principles, reusing certain Scrum elements and seeking the satisfaction of the team members, while rapidly reacting/keeping the project in line with our stakeholders' expectations.
Submitted 10 September, 2020;
originally announced September 2020.
-
The Hetero-functional Graph Theory Toolbox
Authors:
Dakota Thompson,
Prabhat Hegde,
Wester C. H. Schoonenberg,
Inas Khayal,
Amro M. Farid
Abstract:
In the 20th century, newly invented technical artifacts were connected to form large-scale complex engineering systems. Furthermore, the interactions found within these networked systems have grown in both degree and heterogeneity. Consequently, these already complex engineering systems have converged in what is now called systems-of-systems. The analysis, design, planning, and operation of these engineering systems from a holistic perspective has necessitated ever-more sophisticated modeling techniques. Despite significant advancements in model-based systems engineering and network science, these seemingly disparate fields have experienced similar limitations in addressing the complexity of engineering systems. Hetero-Functional Graph Theory (HFGT) has emerged as a means to address some of these limitations. This paper serves as a user guide to a recently developed Hetero-functional Graph Theory Toolbox which facilitates the computation of HFGT mathematical models. It is written in the MATLAB language and has been tested with v9.6 (R2019a). It is openly available on GitHub together with a sample input file for straightforward re-use. The paper details the syntax and semantics of the input file, the principal data structure of the toolbox, and the functions used to construct and populate this data structure. The toolbox has been fully validated against several peer-reviewed HFGT publications.
Submitted 2 October, 2020; v1 submitted 8 May, 2020;
originally announced May 2020.
-
Learning medical triage from clinicians using Deep Q-Learning
Authors:
Albert Buchard,
Baptiste Bouvier,
Giulia Prando,
Rory Beard,
Michail Livieratos,
Dan Busbridge,
Daniel Thompson,
Jonathan Richens,
Yuanzhao Zhang,
Adam Baker,
Yura Perov,
Kostis Gourgoulias,
Saurabh Johri
Abstract:
Medical Triage is of paramount importance to healthcare systems, allowing for the correct orientation of patients and allocation of the necessary resources to treat them adequately. While reliable decision-tree methods exist to triage patients based on their presentation, those trees implicitly require human inference and are not immediately applicable in a fully automated setting. On the other hand, learning triage policies directly from experts may correct for some of the limitations of hard-coded decision-trees. In this work, we present a Deep Reinforcement Learning approach (a variant of Deep Q-Learning) to triage patients using curated clinical vignettes. The dataset, consisting of 1374 clinical vignettes, was created by medical doctors to represent real-life cases. Each vignette is associated with an average of 3.8 expert triage decisions given by medical doctors relying solely on medical history. We show that this approach is on a par with human performance, yielding safe triage decisions in 94% of cases, and matching expert decisions in 85% of cases. The trained agent learns when to stop asking questions, acquires optimized decision policies requiring less evidence than supervised approaches, and adapts to the novelty of a situation by asking for more information. Overall, we demonstrate that a Deep Reinforcement Learning approach can learn effective medical triage policies directly from expert decisions, without requiring expert knowledge engineering. This approach is scalable and can be deployed in healthcare settings or geographical regions with distinct triage specifications, or where trained experts are scarce, to improve decision making in the early stage of care.
Submitted 24 June, 2020; v1 submitted 28 March, 2020;
originally announced March 2020.
-
Fast and Accurate Retrieval of Methane Concentration from Imaging Spectrometer Data Using Sparsity Prior
Authors:
Markus D. Foote,
Philip E. Dennison,
Andrew K. Thorpe,
David R. Thompson,
Siraput Jongaramrungruang,
Christian Frankenberg,
Sarang C. Joshi
Abstract:
The strong radiative forcing by atmospheric methane has stimulated interest in identifying natural and anthropogenic sources of this potent greenhouse gas. Point sources are important targets for quantification, and anthropogenic targets have potential for emissions reduction. Methane point source plume detection and concentration retrieval have been previously demonstrated using data from the Airborne Visible InfraRed Imaging Spectrometer Next Generation (AVIRIS-NG). Current quantitative methods have tradeoffs between computational requirements and retrieval accuracy, creating obstacles for processing real-time data or large datasets from flight campaigns. We present a new computationally efficient algorithm that applies sparsity and an albedo correction to matched filter retrieval of trace gas concentration-pathlength. The new algorithm was tested using AVIRIS-NG data acquired over several point source plumes in Ahmedabad, India. The algorithm was validated using simulated AVIRIS-NG data including synthetic plumes of known methane concentration. Sparsity and albedo correction together reduced the root mean squared error of retrieved methane concentration-pathlength enhancement by 60.7% compared with a previous robust matched filter method. Background noise was reduced by a factor of 2.64. The new algorithm was able to process the entire 300 flightline 2016 AVIRIS-NG India campaign in just over 8 hours on a desktop computer with GPU acceleration.
Submitted 5 March, 2020;
originally announced March 2020.
-
Masking schemes for universal marginalisers
Authors:
Divya Gautam,
Maria Lomeli,
Kostis Gourgoulias,
Daniel H. Thompson,
Saurabh Johri
Abstract:
We consider the effect of structure-agnostic and structure-dependent masking schemes when training a universal marginaliser (arXiv:1711.00695) in order to learn conditional distributions of the form $P(x_i |\mathbf x_{\mathbf b})$, where $x_i$ is a given random variable and $\mathbf x_{\mathbf b}$ is some arbitrary subset of all random variables of the generative model of interest. In other words, we mimic the self-supervised training of a denoising autoencoder, where a dataset of unlabelled data is used as partially observed input and the neural approximator is optimised to minimise reconstruction loss. We focus on studying the underlying process of the partially observed data---how good is the neural approximator at learning all conditional distributions when the observation process at prediction time differs from the masking process during training? We compare networks trained with different masking schemes in terms of their predictive performance and generalisation properties.
Submitted 16 January, 2020;
originally announced January 2020.
-
Learning Radiative Transfer Models for Climate Change Applications in Imaging Spectroscopy
Authors:
Shubhankar Deshpande,
Brian D. Bue,
David R. Thompson,
Vijay Natraj,
Mario Parente
Abstract:
According to a recent investigation, an estimated 33-50% of the world's coral reefs have undergone degradation, believed to be as a result of climate change. A strong driver of climate change and the subsequent environmental impact are greenhouse gases such as methane. However, the exact relation climate change has to the environmental condition cannot be easily established. Remote sensing methods are increasingly being used to quantify and draw connections between rapidly changing climatic conditions and environmental impact. A crucial part of this analysis is processing spectroscopy data using radiative transfer models (RTMs) which is a computationally expensive process and limits their use with high volume imaging spectrometers. This work presents an algorithm that can efficiently emulate RTMs using neural networks leading to a multifold speedup in processing time, and yielding multiple downstream benefits.
Submitted 8 June, 2019;
originally announced June 2019.
-
Fundamentals of effective cloud management for the new NASA Astrophysics Data System
Authors:
Sergi Blanco-Cuaresma,
Alberto Accomazzi,
Michael J. Kurtz,
Edwin Henneken,
Carolyn S. Grant,
Donna M. Thompson,
Roman Chyla,
Stephen McDonald,
Golnaz Shapurian,
Timothy W. Hostetler,
Matthew R. Templeton,
Kelly E. Lockhart,
Kris Bukovi,
Nathan Rapport
Abstract:
The new NASA Astrophysics Data System (ADS) is designed with a serviceoriented architecture (SOA) that consists of multiple customized Apache Solr search engine instances plus a collection of microservices, containerized using Docker, and deployed in Amazon Web Services (AWS). For complex systems, like the ADS, this loosely coupled architecture can lead to a more scalable, reliable and resilient system if some fundamental questions are addressed. After having experimented with different AWS environments and deployment methods, we decided in December 2017 to go with Kubernetes as our container orchestration. Defining the best strategy to properly setup Kubernetes has shown to be challenging: automatic scaling services and load balancing traffic can lead to errors whose origin is difficult to identify, monitoring and logging the activity that happens across multiple layers for a single request needs to be carefully addressed, and the best workflow for a Continuous Integration and Delivery (CI/CD) system is not self-evident. We present here how we tackle these challenges and our plans for the future.
△ Less
Submitted 16 January, 2019;
originally announced January 2019.
-
New ADS Functionality for the Curator
Authors:
Alberto Accomazzi,
Michael J. Kurtz,
Edwin A. Henneken,
Carolyn S. Grant,
Donna M. Thompson,
Roman Chyla,
Steven McDonald,
Taylor J. Shaulis,
Sergi Blanco-Cuaresma,
Golnaz Shapurian,
Timothy W. Hostetler,
Matthew R. Templeton
Abstract:
In this paper we provide an update concerning the operations of the NASA Astrophysics Data System (ADS), its services and user interface, and the content currently indexed in its database. As the primary information system used by researchers in Astronomy, the ADS aims to provide a comprehensive index of all scholarly resources appearing in the literature. With the current effort in our community to support data and software citations, we discuss what steps the ADS is taking to provide the needed infrastructure in collaboration with publishers and data providers. A new API provides access to the ADS search interface, metrics, and libraries allowing users to programmatically automate discovery and curation tasks. The new ADS interface supports a greater integration of content and services with a variety of partners, including ORCID claiming, indexing of SIMBAD objects, and article graphics from a variety of publishers. Finally, we highlight how librarians can facilitate the ingest of gray literature that they curate into our system.
Submitted 23 October, 2017;
originally announced October 2017.
-
Aggregation and Linking of Observational Metadata in the ADS
Authors:
Alberto Accomazzi,
Michael J. Kurtz,
Edwin A. Henneken,
Carolyn S. Grant,
Donna M. Thompson,
Roman Chyla,
Alexandra Holachek,
Jonathan Elliott
Abstract:
We discuss current efforts behind the curation of observing proposals, archive bibliographies, and data links in the NASA Astrophysics Data System (ADS). The primary data in the ADS is the bibliographic content from scholarly articles in Astronomy and Physics, which ADS aggregates from publishers, arXiv and conference proceeding sites. This core bibliographic information is then further enriched by ADS via the generation of citations and usage data, and through the aggregation of external resources from astronomy data archives and libraries. Important sources of such additional information are the metadata describing observing proposals and high level data products, which, once ingested in ADS, become easily discoverable and citeable by the science community. Bibliographic studies have shown that the integration of links between data archives and the ADS provides greater visibility to data products and increased citations to the literature associated with them.
Submitted 28 January, 2016;
originally announced January 2016.
-
Teaching natural deduction in the right order with Natural Deduction Planner
Authors:
Jeremy Seligman,
Declan Thompson
Abstract:
We describe a strategy-based approach to teaching natural deduction using a notation that emphasises the order in which deductions are constructed, together with a LaTeX package and Java app to aid in the production of teaching resources and classroom demonstrations. Our approach is aimed at students with little exposure to mathematical method and has been developed while teaching undergraduate classes for philosophy students over the last ten years.
Submitted 13 July, 2015;
originally announced July 2015.
-
ADS 2.0: new architecture, API and services
Authors:
Roman Chyla,
Alberto Accomazzi,
Alexandra Holachek,
Carolyn S. Grant,
Jonathan Elliott,
Edwin A. Henneken,
Donna M. Thompson,
Michael J. Kurtz,
Stephen S. Murray,
Vladimir Sudilovsky
Abstract:
The ADS platform is undergoing the biggest rewrite of its 20-year history. While several components have been added to its architecture over the past couple of years, this talk will concentrate on the underpinnings of ADS's search layer and its API. To illustrate the design of the components in the new system, we will show how the new ADS user interface is built exclusively on top of the API using RESTful web services. Taking one step further, we will discuss how we plan to expose the treasure trove of information hosted by ADS (10 million records and fulltext for much of the Astronomy and Physics refereed literature) to partners interested in using this API. This will provide you (and your intelligent applications) with access to ADS's underlying data to enable the extraction of new knowledge and the ingestion of these results back into the ADS. Using this framework, researchers could run controlled experiments with content extraction, machine learning, natural language processing, etc. In this talk, we will discuss what is already implemented, what will be available soon, and where we are going next.
Submitted 19 March, 2015;
originally announced March 2015.
-
ADS: The Next Generation Search Platform
Authors:
Alberto Accomazzi,
Michael J. Kurtz,
Edwin A. Henneken,
Roman Chyla,
James Luker,
Carolyn S. Grant,
Donna M. Thompson,
Alexandra Holachek,
Rahul Dave,
Stephen S. Murray
Abstract:
Four years after the last LISA meeting, the NASA Astrophysics Data System (ADS) finds itself in the middle of major changes to the infrastructure and contents of its database. In this paper we highlight a number of features of great importance to librarians and discuss the additional functionality that we are currently developing. Starting in 2011, the ADS started to systematically collect, parse and index full-text documents for all the major publications in Physics and Astronomy as well as many smaller Astronomy journals and arXiv e-prints, for a total of over 3.5 million papers. Our citation coverage has doubled since 2010 and now consists of over 70 million citations. We are normalizing the affiliation information in our records and, in collaboration with the CfA library and NASA, we have started collecting and linking funding sources with papers in our system. At the same time, we are undergoing major technology changes in the ADS platform which affect all aspects of the system and its operations. We have rolled out and are now enhancing a new high-performance search engine capable of performing full-text as well as metadata searches using an intuitive query language which supports fielded, unfielded and functional searches. We are currently able to index acknowledgments, affiliations, citations, funding sources, and to the extent that these metadata are available to us they are now searchable under our new platform. The ADS private library system is being enhanced to support reading groups, collaborative editing of lists of papers, tagging, and a variety of privacy settings when managing one's paper collection. While this effort is still ongoing, some of its benefits are already available through the ADS Labs user interface and API at http://adslabs.org/adsabs/
Submitted 13 March, 2015;
originally announced March 2015.
-
Computational Analysis of Perfect-Information Position Auctions
Authors:
David R. M Thompson,
Kevin Leyton-Brown
Abstract:
After experimentation with other designs, the major search engines converged on the weighted, generalized second-price auction (wGSP) for selling keyword advertisements. Notably, this convergence occurred before position auctions were well understood (or, indeed, widely studied) theoretically. While much progress has been made since, theoretical analysis is still not able to settle the question of why search engines found wGSP preferable to other position auctions. We approach this question in a new way, adopting a new analytical paradigm we dub "computational mechanism analysis." By sampling position auction games from a given distribution, encoding them in a computationally efficient representation language, computing their Nash equilibria, and then calculating economic quantities of interest, we can quantitatively answer questions that theoretical methods have not. We considered seven widely studied valuation models from the literature and three position auction variants (generalized first price, unweighted generalized second price, and wGSP). We found that wGSP consistently showed the best ads of any position auction, measured both by social welfare and by relevance (expected number of clicks). Even in models where wGSP was already known to have bad worst-case efficiency, we found that it almost always performed well on average. In contrast, we found that revenue was extremely variable across auction mechanisms, and was highly sensitive to equilibrium selection, the preference model, and the valuation distribution.
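For readers unfamiliar with the mechanism named in this abstract, the following is a minimal sketch of a single wGSP round under the usual textbook definition: bidders are ranked by weight times bid, and each winner pays per click the smallest amount that would keep its position. It is not taken from the paper. The function name `wgsp_outcome`, the absence of a reserve price, and the tie-breaking by bidder index are all assumptions made for illustration.

```python
from typing import List, Tuple


def wgsp_outcome(
    bids: List[float], weights: List[float], num_slots: int
) -> List[Tuple[int, float]]:
    """Return (bidder_index, price_per_click) for each filled slot, top slot first.

    Assumes strictly positive weights and no reserve price (illustrative only).
    """
    # Rank bidders by weighted bid; Python's stable sort breaks ties by index.
    order = sorted(
        range(len(bids)), key=lambda i: weights[i] * bids[i], reverse=True
    )
    outcome = []
    for slot in range(min(num_slots, len(order))):
        winner = order[slot]
        if slot + 1 < len(order):
            # Pay the lowest bid that still outranks the next bidder.
            runner_up = order[slot + 1]
            price = weights[runner_up] * bids[runner_up] / weights[winner]
        else:
            # No competitor below: price is zero without a reserve.
            price = 0.0
        outcome.append((winner, price))
    return outcome


if __name__ == "__main__":
    print(wgsp_outcome([4.0, 3.0, 1.0], [0.5, 1.0, 1.0], num_slots=2))
```

Setting every weight to 1 reduces this to the unweighted generalized second-price variant mentioned in the abstract.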
Submitted 4 August, 2014;
originally announced August 2014.
-
Computing and Using Metrics in the ADS
Authors:
Edwin A. Henneken,
Alberto Accomazzi,
Michael J. Kurtz,
Carolyn S. Grant,
Donna Thompson,
Jay Luker,
Roman Chyla,
Alexandra Holachek,
Stephen S. Murray
Abstract:
Finding measures for research impact, be it for individuals, institutions, instruments or projects, has gained a lot of popularity. More papers than ever are being written on new impact measures, and problems with existing measures are being pointed out on a regular basis. Funding agencies require impact statistics in their reports, job candidates incorporate them in their resumes, and publication metrics have even been used in at least one recent court case. To support this need for research impact indicators, the SAO/NASA Astrophysics Data System (ADS) has developed a service which provides a broad overview of various impact measures. In this presentation we discuss how the ADS can be used to quench the thirst for impact measures. We will also discuss a couple of the lesser known indicators in the metrics overview and the main issues to be aware of when compiling publication-based metrics in the ADS, namely author name ambiguity and citation incompleteness.
Submitted 17 June, 2014;
originally announced June 2014.
-
ADS Labs - Supporting Information Discovery in Science Education
Authors:
Edwin A. Henneken,
Donna Thompson
Abstract:
The SAO/NASA Astrophysics Data System (ADS) is an open access digital library portal for researchers in astronomy and physics, operated by the Smithsonian Astrophysical Observatory (SAO) under a NASA grant, successfully serving the professional science community for two decades. Currently there are about 55,000 frequent users (100+ queries per year), and up to 10 million infrequent users per year. Access by the general public now accounts for about half of all ADS use, demonstrating the vast reach of the content in our databases. The visibility and use of content in the ADS can be measured by the fact that there are over 17,000 links from Wikipedia pages to ADS content, a figure comparable to the number of links that Wikipedia has to OCLC's WorldCat catalog. The ADS, through its holdings and innovative techniques available in ADS Labs (http://adslabs.org), offers an environment for information discovery that is unlike any other service currently available to the astrophysics community. Literature discovery and review are important components of science education, aiding the process of preparing for a class, project, or presentation. The ADS has been recognized as a rich source of information for the science education community in astronomy, thanks to its collaborations within the astronomy community, publishers and projects like ComPADRE. One element that makes the ADS uniquely relevant for the science education community is the availability of powerful tools to explore aspects of the astronomy literature as well as the relationship between topics, people, observations and scientific papers. The other element is the extensive repository of scanned literature, a significant fraction of which consists of historical literature.
Submitted 2 October, 2012;
originally announced October 2012.
-
Finding Your Literature Match -- A Recommender System
Authors:
Edwin A. Henneken,
Michael J. Kurtz,
Alberto Accomazzi,
Carolyn Grant,
Donna Thompson,
Elizabeth Bohlen,
Giovanni Di Milia,
Jay Luker,
Stephen S. Murray
Abstract:
The universe of potentially interesting, searchable literature is expanding continuously. Besides the normal expansion, there is an additional influx of literature because of interdisciplinary boundaries becoming more and more diffuse. Hence, the need for accurate, efficient and intelligent search tools is bigger than ever. Even with a sophisticated search engine, looking for information can still result in overwhelming results. An overload of information has the intrinsic danger of scaring visitors away, and any organization, for-profit or not-for-profit, in the business of providing scholarly information wants to capture and keep the attention of its target audience. Publishers and search engine engineers alike will benefit from a service that is able to provide visitors with recommendations that closely meet their interests. Providing visitors with special deals, new options and highlights may be interesting to a certain degree, but what makes more sense (especially from a commercial point of view) than to let visitors do most of the work by the mere action of making choices? Hiring psychics is not an option, so a technological solution is needed to recommend items that a visitor is likely to be looking for. In this presentation we will introduce such a solution and argue that it is practically feasible to incorporate this approach into a useful addition to any information retrieval system with enough usage.
Submitted 13 May, 2010;
originally announced May 2010.
-
Use of Astronomical Literature - A Report on Usage Patterns
Authors:
Edwin A. Henneken,
Michael J. Kurtz,
Alberto Accomazzi,
Carolyn S. Grant,
Donna Thompson,
Elizabeth Bohlen,
Stephen S. Murray
Abstract:
In this paper we present a number of metrics for usage of the SAO/NASA Astrophysics Data System (ADS). Since the ADS is used by the entire astronomical community, these are indicative of how the astronomical literature is used. We will show how the use of the ADS has changed both quantitatively and qualitatively. We will also show that different types of users access the system in different ways. Finally, we show how use of the ADS has evolved over the years in various regions of the world.
The ADS is funded by NASA Grant NNG06GG68G.
Submitted 3 October, 2008; v1 submitted 1 August, 2008;
originally announced August 2008.
-
Finding Astronomical Communities Through Co-readership Analysis
Authors:
Edwin A. Henneken,
Michael J. Kurtz,
Guenther Eichhorn,
Alberto Accomazzi,
Carolyn S. Grant,
Donna Thompson,
Elizabeth Bohlen,
Stephen S. Murray
Abstract:
Whenever a large group of people are engaged in an activity, communities will form. The nature of these communities depends on the relationship considered. In the group of people who regularly use scholarly literature, a relationship like "person i and person j have cited the same paper" might reveal communities of people working in a particular field. On this poster, we will investigate the relationship "person i and person j have read the same paper". Using the data logs of the NASA/Smithsonian Astrophysics Data System (ADS), we first determine the population that will participate by requiring that a user queries the ADS at a certain rate. Next, we apply the relationship to this population. The result of this will be an abstract "relationship space", which we will describe in terms of various "representations". Examples of such "representations" are the projection of co-read vectors onto Principal Components and the spectral density of the co-read network. We will show that the co-read relationship results in structure, we will describe this structure and we will provide a first attempt at the classification of this structure in terms of astronomical communities.
The ADS is funded by NASA Grant NNG06GG68G.
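The co-read relationship in this abstract can be illustrated with a small sketch that counts, for every pair of users, how many papers both have read. This is only an illustration and not the authors' pipeline. The input format (a mapping from a user identifier to a set of paper identifiers) and the name `co_read_counts` are assumptions, and the activity-rate filter and the principal-component step mentioned in the abstract are left out.

```python
from collections import Counter
from itertools import combinations
from typing import Dict, Set, Tuple


def co_read_counts(reads: Dict[str, Set[str]]) -> Counter:
    """Count shared papers for each unordered pair of users.

    Keys are (user_i, user_j) with user_i < user_j; pairs with no
    paper in common are omitted.
    """
    counts: Counter = Counter()
    for user_i, user_j in combinations(sorted(reads), 2):
        shared = len(reads[user_i] & reads[user_j])
        if shared:
            counts[(user_i, user_j)] = shared
    return counts


if __name__ == "__main__":
    # Hypothetical usage log: three users, three papers.
    log = {"a": {"p1", "p2"}, "b": {"p1"}, "c": {"p3"}}
    print(co_read_counts(log))
```

The resulting pair counts can be read as the weighted edges of a co-read network, which is the kind of object a community analysis would then operate on.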
Submitted 5 January, 2007;
originally announced January 2007.
-
Paper to Screen: Processing Historical Scans in the ADS
Authors:
Donna M. Thompson,
Alberto Accomazzi,
Guenther Eichhorn,
Carolyn Grant,
Edwin Henneken,
Michael J. Kurtz,
Elizabeth Bohlen,
Stephen S. Murray
Abstract:
The NASA Astrophysics Data System in conjunction with the Wolbach Library at the Harvard-Smithsonian Center for Astrophysics is working on a project to microfilm historical observatory publications. The microfilm is then scanned for inclusion in the ADS. The ADS currently contains over 700,000 scanned pages of volumes of historical literature. Many of these volumes lack clear pagination or other bibliographic data that are necessary to take advantage of the searching capabilities of the ADS. This paper will address some of the interesting challenges that needed to be resolved during the processing of the Observatory Reports included in the ADS.
Submitted 5 October, 2006;
originally announced October 2006.
-
Data in the ADS -- Understanding How to Use it Better
Authors:
Carolyn S. Grant,
Alberto Accomazzi,
Donna Thompson,
Edwin Henneken,
Guenther Eichhorn,
Michael J. Kurtz,
Stephen S. Murray
Abstract:
The Smithsonian/NASA ADS Abstract Service contains a wealth of data for astronomers and librarians alike, yet the vast majority of usage consists of rudimentary searches. Hints on how to obtain more focused search results by using more of the various capabilities of the ADS are presented, including searching by affiliation. We also discuss the classification of articles by content and by referee status.
The ADS is funded by NASA Grant NNG06GG68G-16613687.
Submitted 5 October, 2006;
originally announced October 2006.
-
Creation and use of Citations in the ADS
Authors:
Alberto Accomazzi,
Gunther Eichhorn,
Michael J. Kurtz,
Carolyn S. Grant,
Edwin Henneken,
Markus Demleitner,
Donna Thompson,
Elizabeth Bohlen,
Stephen S. Murray
Abstract:
With over 20 million records, the ADS citation database is regularly used by researchers and librarians to measure the scientific impact of individuals, groups, and institutions. In addition to the traditional sources of citations, the ADS has recently added references extracted from the arXiv e-prints on a nightly basis. We review the procedures used to harvest and identify the reference data used in the creation of citations, and the policies and procedures that we follow to avoid double-counting and to eliminate contributions which may not be scholarly in nature. Finally, we describe how users and institutions can easily obtain quantitative citation data from the ADS, both interactively and via web-based programming tools.
The ADS is available at http://ads.harvard.edu.
Submitted 3 October, 2006;
originally announced October 2006.
-
Connectivity in the Astronomy Digital Library
Authors:
Günther Eichhorn,
Alberto Accomazzi,
Carolyn S. Grant,
Edwin A. Henneken,
Donna M. Thompson,
Michael J. Kurtz,
Stephen S. Murray
Abstract:
The Astrophysics Data System (ADS) provides an extensive system of links between the literature and other on-line information. Recently, the journals of the American Astronomical Society (AAS) and a group of NASA data centers have collaborated to provide more links between on-line data obtained by space missions and the on-line journals. Authors can now specify which data sets they have used in their article. This information is used by the participants to provide the links between the literature and the data.
The ADS is available at: http://ads.harvard.edu
Submitted 2 October, 2006;
originally announced October 2006.
-
Full Text Searching in the Astrophysics Data System
Authors:
Günther Eichhorn,
Alberto Accomazzi,
Carolyn S. Grant,
Edwin A. Henneken,
Donna M. Thompson,
Michael J. Kurtz,
Stephen S. Murray
Abstract:
The Smithsonian/NASA Astrophysics Data System (ADS) provides a search system for the astronomy and physics scholarly literature. All major and many smaller astronomy journals that were published on paper have been scanned back to volume 1 and are available through the ADS free of charge. All scanned pages have been converted to text and can be searched through the ADS Full Text Search System. In addition, searches can be fanned out to several external search systems to include the literature published in electronic form. Results from the different search systems are combined into one results list.
The ADS Full Text Search System is available at: http://adsabs.harvard.edu/fulltext_service.html
Submitted 5 October, 2006; v1 submitted 2 October, 2006;
originally announced October 2006.
-
E-prints and Journal Articles in Astronomy: a Productive Co-existence
Authors:
Edwin A. Henneken,
Michael J. Kurtz,
Simeon Warner,
Paul Ginsparg,
Guenther Eichhorn,
Alberto Accomazzi,
Carolyn S. Grant,
Donna Thompson,
Elizabeth Bohlen,
Stephen S. Murray
Abstract:
Are the e-prints (electronic preprints) from the arXiv repository being used instead of the journal articles? In this paper we show that the e-prints have not undermined the usage of journal papers in the astrophysics community. As soon as the journal article is published, the astronomical community prefers to read the journal article and the use of e-prints through the NASA Astrophysics Data System drops to zero. This suggests that the majority of astronomers have access to institutional subscriptions and that they choose to read the journal article when given the choice. Within the NASA Astrophysics Data System they are given this choice, because the e-print and the journal article are treated equally, since both are just one click away. In other words, the e-prints have not undermined journal use in the astrophysics community and thus currently do not pose a financial threat to the publishers. We present readership data for the arXiv category "astro-ph" and the 4 core journals in astronomy (Astrophysical Journal, Astronomical Journal, Monthly Notices of the Royal Astronomical Society and Astronomy & Astrophysics). Furthermore, we show that the half-life (the point where the use of an article drops to half the use of a newly published article) for an e-print is shorter than for a journal paper.
The ADS is funded by NASA Grant NNG06GG68G. arXiv receives funding from NSF award #0404553
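The half-life definition given in this abstract (the point where use drops to half the use of a newly published article) can be made concrete with a short sketch. It assumes usage is available as a list of read counts indexed by article age in equal time steps, which is an assumption for illustration and not a description of the authors' data or method. The function name `usage_half_life` is likewise hypothetical.

```python
from typing import Optional, Sequence


def usage_half_life(reads_by_age: Sequence[float]) -> Optional[int]:
    """Return the first age index at which reads fall to half the initial reads.

    reads_by_age[0] is the use of the newly published article.
    Returns None if the series is empty, starts at zero, or never
    drops to the half-use threshold within the observed window.
    """
    if not reads_by_age or reads_by_age[0] <= 0:
        return None
    threshold = reads_by_age[0] / 2.0
    for age, reads in enumerate(reads_by_age):
        if reads <= threshold:
            return age
    return None


if __name__ == "__main__":
    # Hypothetical monthly read counts for one article.
    print(usage_half_life([100, 80, 55, 50, 30]))
```

Comparing this value for an e-print series and a journal-article series is one simple way to express the claim that the e-print half-life is shorter.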
Submitted 22 September, 2006;
originally announced September 2006.
-
The Future of Technical Libraries
Authors:
Michael J. Kurtz,
Guenther Eichhorn,
Alberto Accomazzi,
Carolyn Grant,
Edwin Henneken,
Donna Thompson,
Elizabeth Bohlen,
Stephen S. Murray
Abstract:
Technical libraries are currently experiencing very rapid change. In the near future their mission will change, their physical nature will change, and the skills of their employees will change. While some will not be able to make these changes, and will fail, others will lead us into a new era.
Submitted 28 September, 2006;
originally announced September 2006.
-
myADS-arXiv - a Tailor-Made, Open Access, Virtual Journal
Authors:
E. Henneken,
M. J. Kurtz,
G. Eichhorn,
A. Accomazzi,
C. S. Grant,
D. Thompson,
E. Bohlen,
S. S. Murray
Abstract:
The myADS-arXiv service provides the scientific community with a one-stop shop for staying up-to-date with a researcher's field of interest. The service provides a powerful and unique filter on the enormous amount of bibliographic information added to the ADS on a daily basis. It also provides a complete view with the most relevant papers available in the subscriber's field of interest. With this service, the subscriber will get to know the latest developments, popular trends and the most important papers. This makes the service unique not only from a technical point of view, but also from a content point of view. On this poster we will argue why myADS-arXiv is a tailor-made, open access, virtual journal and we will illustrate its unique character.
Submitted 4 August, 2006;
originally announced August 2006.
-
Effect of E-printing on Citation Rates in Astronomy and Physics
Authors:
Edwin A. Henneken,
Michael J. Kurtz,
Guenther Eichhorn,
Alberto Accomazzi,
Carolyn Grant,
Donna Thompson,
Stephen S. Murray
Abstract:
In this report we examine the change in citation behavior since the introduction of the arXiv e-print repository (Ginsparg, 2001). It has been observed that papers that initially appear as arXiv e-prints get cited more than papers that do not (Lawrence, 2001; Brody et al., 2004; Schwarz & Kennicutt, 2004; Kurtz et al., 2005a, Metcalfe, 2005). Using the citation statistics from the NASA-Smithsonian Astrophysics Data System (ADS; Kurtz et al., 1993, 2000), we confirm the findings from other studies, we examine the average citation rate to e-printed papers in the Astrophysical Journal, and we show that for a number of major astronomy and physics journals the most important papers are submitted to the arXiv e-print repository first.
Submitted 5 June, 2006; v1 submitted 13 April, 2006;
originally announced April 2006.