-
A survey on Big Data and Machine Learning for Chemistry
Authors:
Jose F Rodrigues Jr,
Larisa Florea,
Maria C F de Oliveira,
Dermot Diamond,
Osvaldo N Oliveira Jr
Abstract:
Herein we review aspects of leading-edge research and innovation in chemistry which exploits big data and machine learning (ML), two computer science fields that combine to yield machine intelligence. ML can accelerate the solution of intricate chemical problems and even solve problems that otherwise would not be tractable. But the potential benefits of ML come at the cost of big data production; that is, the algorithms, in order to learn, demand large volumes of data of various natures and from different sources, from materials properties to sensor data. In the survey, we propose a roadmap for future developments, with emphasis on materials discovery and chemical sensing, and within the context of the Internet of Things (IoT), both prominent research fields for ML in the context of big data. In addition to providing an overview of recent advances, we elaborate upon the conceptual and practical limitations of big data and ML applied to chemistry, outlining processes, discussing pitfalls, and reviewing cases of success and failure.
Submitted 23 April, 2019;
originally announced April 2019.
-
BoWFire: Detection of Fire in Still Images by Integrating Pixel Color and Texture Analysis
Authors:
Daniel Y. T. Chino,
Letricia P. S. Avalhais,
Jose F. Rodrigues Jr.,
Agma J. M. Traina
Abstract:
Emergency events involving fire are potentially harmful, demanding fast and precise decision making. The use of crowdsourced images and videos in crisis management systems can aid in these situations by providing more information than verbal/textual descriptions. Due to the usual high volume of data, automatic solutions need to discard non-relevant content without losing relevant information. There are several methods for fire detection in video using color-based models. However, they are not adequate for still image processing, because they can suffer from high false-positive rates. These methods also rely on parameters with little physical meaning, which makes fine tuning a difficult task. In this context, we propose a novel fire detection method for still images that uses classification based on color features combined with texture classification on superpixel regions. Our method uses fewer parameters than previous works, easing the process of fine tuning. Results show the effectiveness of our method in reducing false positives while its precision remains comparable to that of state-of-the-art methods.
Submitted 10 June, 2015;
originally announced June 2015.
-
Reviewing Data Visualization: an Analytical Taxonomical Study
Authors:
Jose F. Rodrigues Jr.,
Agma J. M. Traina,
Maria Cristina F. de Oliveira,
Caetano Traina Jr
Abstract:
This paper presents an analytical taxonomy that can suitably describe, rather than simply classify, techniques for data presentation. Unlike previous works, we do not consider particular aspects of visualization techniques, but rather their mechanisms and their foundations in visual perception. Instead of just fitting visualization research to a classification system, our aim is to better understand its process. To do so, we start from elementary concepts to reach a model that can describe how visualization techniques work and how they convey meaning.
Submitted 9 June, 2015;
originally announced June 2015.
-
The Spatial-Perceptual Design Space: a new comprehension for Data Visualization
Authors:
Jose F. Rodrigues Jr,
Agma J. M. Traina,
Maria C. F. Oliveira,
Caetano Traina Jr
Abstract:
We revisit the design space of visualizations, aiming to identify and relate its components. To this end, we establish a model to examine the process through which visualizations become expressive for users. This model has led us to a taxonomy oriented to human visual perception, a conceptualization that provides natural criteria for delineating a novel understanding of the visualization design space. The new organization of concepts that we introduce is our main contribution: a grammar for visualization design based on a review of former works and of classical and state-of-the-art techniques. Thus, the paper is presented as a survey whose structure introduces a new conceptualization for the space of techniques concerning visual analysis.
Submitted 28 May, 2015;
originally announced May 2015.
-
Large Graph Analysis in the GMine System
Authors:
Jose F. Rodrigues Jr.,
Hanghang Tong,
Jia-Yu Pan,
Agma J. M. Traina,
Caetano Traina Jr.,
Christos Faloutsos
Abstract:
Current applications have produced graphs on the order of hundreds of thousands of nodes and millions of edges. To take advantage of such graphs, one must be able to find patterns, outliers and communities. These tasks are better performed in an interactive environment, where human expertise can guide the process. For large graphs, though, there are some challenges: the excessive processing requirements are prohibitive, and drawing hundreds of thousands of nodes results in cluttered images that are hard to comprehend. To cope with these problems, we propose an innovative framework suited for any kind of tree-like graph visual design. GMine integrates (a) a representation for graphs organized as hierarchies of partitions - the concepts of SuperGraph and Graph-Tree; and (b) a graph summarization methodology - CEPS. Our graph representation deals with the problem of tracing the connection aspects of a graph hierarchy with sublinear complexity, allowing one to grasp the neighborhood of a single node or of a group of nodes in a single click. As a proof of concept, the visual environment of GMine is instantiated as a system in which large graphs can be investigated globally and locally.
Submitted 28 May, 2015;
originally announced May 2015.