-
GenoML: Automated Machine Learning for Genomics
Authors:
Mary B. Makarious,
Hampton L. Leonard,
Dan Vitale,
Hirotaka Iwaki,
David Saffo,
Lana Sargent,
Anant Dadu,
Eduardo Salmerón Castaño,
John F. Carter,
Melina Maleknia,
Juan A. Botia,
Cornelis Blauwendraat,
Roy H. Campbell,
Sayed Hadi Hashemi,
Andrew B. Singleton,
Mike A. Nalls,
Faraz Faghri
Abstract:
GenoML is a Python package automating machine learning workflows for genomics (genetics and multi-omics) with an open science philosophy. Genomics data require significant domain expertise to clean, pre-process, harmonize, and quality-control. Furthermore, tuning, validation, and interpretation involve taking into account the biology and possibly the limitations of the underlying data collection, protocols, and technology. GenoML's mission is to bring machine learning for genomics and clinical data to non-experts by developing an easy-to-use tool that automates the full development, evaluation, and deployment process. Emphasis is put on open science to make workflows easily accessible, replicable, and transferable within the scientific community. Source code and documentation are available at https://genoml.com.
Submitted 4 March, 2021;
originally announced March 2021.
-
Evaluating the Effect of Timeline Shape on Visualization Task Performance
Authors:
Sara Di Bartolomeo,
Aditeya Pandey,
Aristotelis Leventidis,
David Saffo,
Uzma Haque Syeda,
Elin Carstensdottir,
Magy Seif El-Nasr,
Michelle A. Borkin,
Cody Dunne
Abstract:
Timelines are commonly represented on a horizontal line, which is not necessarily the most effective way to visualize temporal event sequences. However, few experiments have evaluated how timeline shape influences task performance. We present the design and results of a controlled experiment run on Amazon Mechanical Turk (n=192) in which we evaluate how timeline shape affects task completion time, correctness, and user preference. We tested 12 combinations of 4 shapes -- horizontal line, vertical line, circle, and spiral -- and 3 data types -- recurrent, non-recurrent, and mixed event sequences. We found good evidence that timeline shape meaningfully affects user task completion time but not correctness and that users have a strong shape preference. Building on our results, we present design guidelines for creating effective timeline visualizations based on user task and data types. A free copy of this paper, the evaluation stimuli and data, and code are available at https://osf.io/qr5yu/
Submitted 12 May, 2020;
originally announced May 2020.
-
Two Dimensions for Organizing Immersive Analytics: Toward a Taxonomy for Facet and Position
Authors:
David Saffo,
Sara Di Bartolomeo,
Caglar Yildirim,
Cody Dunne
Abstract:
As immersive analytics continues to grow as a discipline, so too should its underlying methodological support. Taxonomies play an important role for information visualization and human-computer interaction. They provide an organization of the techniques used in a particular domain that better enables researchers to describe their work, discover existing methods, and identify gaps in the literature. Existing taxonomies in related fields do not capture or describe the unique paradigms employed in immersive analytics. We conceptualize a taxonomy that organizes immersive analytics according to two dimensions: spatial and visual presentation. Each intersection of this taxonomy represents a unique design paradigm which, when thoroughly explored, can aid in the design and research of new immersive analytic applications.
Submitted 12 May, 2020;
originally announced May 2020.
-
Data Comets: Designing a Visualization Tool for Analyzing Autonomous Aerial Vehicle Logs with Grounded Evaluation
Authors:
David Saffo,
Aristotelis Leventidis,
Twinkle Jain,
Michelle A. Borkin,
Cody Dunne
Abstract:
Autonomous unmanned aerial vehicles are complex systems of hardware, software, and human input. Understanding this complexity is key to their development and operation. Information visualizations already exist for exploring flight logs but comprehensive analyses currently require several disparate and custom tools. This design study helps address the pain points faced by autonomous unmanned aerial vehicle developers and operators. We contribute: a spiral development process model for grounded evaluation visualization development focused on progressively broadening target user involvement and refining user goals; a demonstration of the model as part of developing a deployed and adopted visualization system; a data and task abstraction for developers and operators performing post-flight analysis of autonomous unmanned aerial vehicle logs; the design and implementation of DATA COMETS, an open-source and web-based interactive visualization tool for post-flight log analysis incorporating temporal, geospatial, and multivariate data; and the results of a summative evaluation of the visualization system and our abstractions based on in-the-wild usage. A free copy of this paper and source code are available at osf.io/h4p7g
Submitted 12 May, 2020;
originally announced May 2020.