-
Extracting and visualizing a new classification system for Colombia's National Administrative Department of Statistics. A visual analytics framework case study
Authors:
Pierre Raimbaud,
Jaime Camilo Espitia Castillo,
John Guerra-Gomez
Abstract:
In a world filled with data, a nation is expected to make decisions informed by data. However, countries first need to collect and publish such data in a way that is meaningful for both citizens and policy makers. A good thematic classification could be instrumental in helping users navigate and find the right resources in a rich data repository such as the one collected by Colombia's National Administrative Department of Statistics (DANE). The Visual Analytics Framework is a methodology for conducting visual analysis developed by T. Munzner et al. [T. Munzner, Visualization Analysis and Design, A K Peters Visualization Series, 1, 2014] that could help with this task. This paper presents a case study applying this framework to help the DANE better visualize its data repository and present a more understandable classification of it. It describes the three main analysis tasks identified, the proposed solutions, and the collection of insights generated from them.
Submitted 29 January, 2024;
originally announced January 2024.
-
3D human pose estimation with adaptive receptive fields and dilated temporal convolutions
Authors:
Michael Shin,
Eduardo Castillo,
Irene Font Peradejordi,
Shobhna Jayaraman
Abstract:
In this work, we demonstrate that receptive fields in 3D pose estimation can be effectively specified using optical flow. We introduce adaptive receptive fields, a simple and effective method to aid receptive field selection in pose estimation models based on optical flow inference. We contrast the performance of a benchmark state-of-the-art model running on fixed receptive fields with their adaptive field counterparts. By using a reduced receptive field, our model can process slow-motion sequences (10x longer) 23% faster than the benchmark model running at regular speed. The reduction in computational cost is achieved while producing a pose prediction accuracy to within 0.36% of the benchmark model.
Submitted 28 May, 2020;
originally announced May 2020.
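As a rough illustration of why a reduced receptive field lowers cost, the sketch below computes how many input frames a stack of dilated temporal convolutions covers. The abstract does not state the architecture, so the kernel size and dilation schedules here are assumptions chosen for illustration, and the function name is hypothetical. This is only the standard receptive-field arithmetic for stride-1 dilated convolutions, not the authors' optical-flow-based selection method.

```python
def receptive_field(kernel_size, dilations):
    """Frames seen by one output of a stack of stride-1 dilated 1D convolutions."""
    frames = 1
    for dilation in dilations:
        # Each layer widens the window by (kernel_size - 1) * dilation frames.
        frames += (kernel_size - 1) * dilation
    return frames


# Assumed schedules: a deeper stack versus a shallower (reduced) one.
full = receptive_field(3, [1, 3, 9, 27, 81])
reduced = receptive_field(3, [1, 3, 9])
print(full, reduced)
```

Dropping the two most dilated layers in this assumed configuration shrinks the temporal window by a factor of nine, which is the kind of saving a smaller receptive field can offer on slow-moving sequences.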
-
The Panacea Threat Intelligence and Active Defense Platform
Authors:
Adam Dalton,
Ehsan Aghaei,
Ehab Al-Shaer,
Archna Bhatia,
Esteban Castillo,
Zhuo Cheng,
Sreekar Dhaduvai,
Qi Duan,
Md Mazharul Islam,
Younes Karimi,
Amir Masoumzadeh,
Brodie Mather,
Sashank Santhanam,
Samira Shaikh,
Tomek Strzalkowski,
Bonnie J. Dorr
Abstract:
We describe Panacea, a system that supports natural language processing (NLP) components for active defenses against social engineering attacks. We deploy a pipeline of human language technology, including Ask and Framing Detection, Named Entity Recognition, Dialogue Engineering, and Stylometry. Panacea processes modern message formats through a plug-in architecture to accommodate innovative approaches for message analysis, knowledge representation and dialogue generation. The novelty of the Panacea system is that it uses NLP for cyber defense and engages the attacker using bots, both to elicit evidence that can be attributed to the attacker and to waste the attacker's time and resources.
Submitted 20 April, 2020;
originally announced April 2020.
-
A social Network Analysis of the Operations Research/Industrial Engineering Faculty Hiring Network
Authors:
Enrique del Castillo,
Adam Meyers,
Peng Chen
Abstract:
We study the U.S. Operations Research/Industrial-Systems Engineering (ORIE) faculty hiring network, consisting of origin and destination data for 1,179 faculty together with attribute data from 83 ORIE departments. A social network analysis of faculty hires can reveal important patterns in an academic field, such as the existence of a hierarchy or sociological aspects such as the presence of communities of departments. We first statistically test for the existence of a linear hierarchy in the network and for its steepness. We find a near linear hierarchical order of the departments, proposing a new index for hiring networks, which we contrast with other indicators of hierarchy, including published rankings. A single index cannot capture the full structure of a complex network, however, so we next fit a latent exponential random graph model (ERGM) to the network, which is able to reproduce its main observed characteristics: high incidence of self-hiring, skewed out-degree distribution, low density and clustering. Finally, we use the latent variables in the ERGM to simplify the network to one where faculty hires take place among three groups of departments. We contrast our findings with those reported for other related disciplines, Computer Science and Business.
Submitted 8 March, 2018; v1 submitted 28 February, 2018;
originally announced March 2018.
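The descriptive characteristics named in this abstract (self-hiring incidence, out-degree, density) can be computed from origin and destination pairs along the following lines. This is a minimal sketch on a made-up toy edge list: the department labels, the data, and the `hiring_summary` helper are all hypothetical, and it does not attempt the hierarchy tests or the ERGM fit.

```python
from collections import Counter

# Hypothetical hires: (PhD-granting department, hiring department).
hires = [
    ("A", "A"), ("A", "B"), ("A", "C"),
    ("B", "B"), ("B", "C"), ("C", "A"),
    ("A", "B"),
]


def hiring_summary(hires):
    """Return (self-hire rate, directed density, out-degree counts)."""
    departments = {dept for pair in hires for dept in pair}
    n = len(departments)
    # Share of hires where a department hired its own graduate.
    self_hire_rate = sum(o == d for o, d in hires) / len(hires)
    # Density over distinct directed ties, ignoring self-loops.
    ties = {(o, d) for o, d in hires if o != d}
    density = len(ties) / (n * (n - 1)) if n > 1 else 0.0
    # Out-degree: graduates placed at other departments (repeat hires counted).
    out_degree = Counter(o for o, d in hires if o != d)
    return self_hire_rate, density, out_degree


self_hire_rate, density, out_degree = hiring_summary(hires)
print(round(self_hire_rate, 3), round(density, 3), dict(out_degree))
```

In this toy data department A places most of the graduates, which is the kind of skewed out-degree pattern the abstract describes.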
-
Efficient image deployment in cloud environments
Authors:
Álvaro López García,
Enol Fernández del Castillo
Abstract:
The biggest overhead for the instantiation of a virtual machine in a cloud infrastructure is the time spent in transferring the image of the virtual machine into the physical node that executes it. This overhead becomes larger for requests composed of several virtual machines to be started concurrently, and the illusion of flexibility and elasticity usually associated with the cloud computing model may vanish. This poses a problem for both the resource providers and the software developers, since tackling those overheads is not a trivial issue.
In this work we implement and evaluate several improvements for the virtual machine image distribution problem in a cloud infrastructure and propose a method based on BitTorrent and local caching of the virtual machine images that reduces the transfer time when large requests are made.
Submitted 21 November, 2017;
originally announced November 2017.
-
Standards for enabling heterogeneous IaaS cloud federations
Authors:
Álvaro López García,
Enol Fernández del Castillo,
Pablo Orviz Fernández
Abstract:
The technology market is continuing a rapid growth phase in which different resource providers and Cloud Management Frameworks are positioning themselves to provide ad-hoc solutions (in terms of management interfaces, information discovery or billing), trying to differentiate themselves from competitors. As a result, these solutions remain incompatible with one another when addressing more complex scenarios like federated clouds. Grasping the interoperability problems present in current infrastructures is therefore a must, tackled by studying how existing and emerging standards could enhance user experience in the cloud ecosystem. In this paper we review the current open challenges in Infrastructure as a Service cloud interoperability and federation, and point to the potential standards that should alleviate these problems.
Submitted 21 November, 2017;
originally announced November 2017.
-
Automatic Response Assessment in Regions of Language Cortex in Epilepsy Patients Using ECoG-based Functional Mapping and Machine Learning
Authors:
Harish RaviPrakash,
Milena Korostenskaja,
Eduardo Castillo,
Ki Lee,
James Baumgartner,
Ulas Bagci
Abstract:
Brain regions responsible for language and cognitive functions in epilepsy patients should be accurately localized prior to surgery. Electrocorticography (ECoG)-based Real Time Functional Mapping (RTFM) has been shown to be a safer alternative to electrical cortical stimulation mapping (ESM), which is currently the clinical/gold standard. Conventional methods for analyzing RTFM signals are based on statistical comparison of signal power at certain frequency bands. Compared to the gold standard (ESM), they have limited accuracy when assessing channel responses.
In this study, we address the accuracy limitation of the current RTFM signal estimation methods by analyzing the full frequency spectrum of the signal and replacing signal power estimation methods with machine learning algorithms, specifically random forest (RF), as a proof of concept. We train RF with the power spectral density of the time-series RTFM signal in a supervised learning framework where ground truth labels are obtained from the ESM. Results obtained from RTFM of six adult patients in a strictly controlled experimental setup reveal a state-of-the-art detection accuracy of $\approx 78\%$ for the language comprehension task, an improvement of $23\%$ over the conventional RTFM estimation method. To the best of our knowledge, this is the first study exploring the use of machine learning approaches for determining RTFM signal characteristics, and using the whole frequency band for better region localization. Our results demonstrate the feasibility of a machine learning based RTFM signal analysis method over the full spectrum becoming a clinical routine in the near future.
Submitted 6 August, 2017; v1 submitted 26 May, 2017;
originally announced June 2017.
-
Truss Analysis Discussion and Interpretation Using Linear Systems of Equalities and Inequalities
Authors:
R. Mínguez,
E. Castillo,
R. Pruneda,
C. Solares
Abstract:
This paper shows the complementary roles of mathematical and engineering points of view when dealing with truss analysis problems involving systems of linear equations and inequalities. After the compatibility condition and the mathematical structure of the general solution of a system of linear equations are discussed, the truss analysis problem is used to illustrate its multiple mathematical and engineering aspects, including an analysis of the compatibility conditions, a physical interpretation of the general solution, and the generators of the resulting affine space. Next, the compatibility and the mathematical structure of the general solution of linear systems of inequalities are analyzed, and the truss analysis problem is revisited by adding some inequality constraints and discussing how they affect the resulting general solution and many other aspects of it. Finally, some conclusions are drawn.
Submitted 27 January, 2015;
originally announced January 2015.
-
Analysis of Scientific Cloud Computing requirements
Authors:
Álvaro López García,
Enol Fernández del Castillo
Abstract:
While the requirements of enterprise and web applications have driven the development of Cloud computing, some of its key features, such as customized environments and rapid elasticity, could also benefit scientific applications. However, neither virtualization techniques nor Cloud-like access to resources is common in scientific computing centers due to the negative perception of the impact that virtualization techniques introduce.
In this paper we discuss the feasibility of the IaaS cloud model to satisfy some of the computational science requirements and the main drawbacks that need to be addressed by cloud resource providers so that the maximum benefit can be obtained from a given cloud infrastructure.
Submitted 22 June, 2015; v1 submitted 24 September, 2013;
originally announced September 2013.
-
Error Estimation in Approximate Bayesian Belief Network Inference
Authors:
Enrique F. Castillo,
Remco R. Bouckaert,
Jose M. Sarabia,
Cristina Solares
Abstract:
We can perform inference in Bayesian belief networks by enumerating instantiations with high probability, thus approximating the marginals. In this paper, we present a method for determining the fraction of instantiations that has to be considered such that the absolute error in the marginals does not exceed a predefined value. The method is based on extreme value theory. Essentially, the proposed method uses the reversed generalized Pareto distribution to model probabilities of instantiations below a given threshold. Based on this distribution, we can estimate the maximal absolute error incurred when instantiations with probability smaller than u are disregarded.
Submitted 20 February, 2013;
originally announced February 2013.
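The idea of approximating a marginal from only the high-probability instantiations can be sketched on a toy network small enough to enumerate fully. Everything below is an assumption for illustration: the chain A -> B -> C, its conditional probability tables, the threshold value, and the helper names. The sketch measures the discarded mass by brute force, which bounds the absolute error, whereas the paper estimates that quantity with extreme value theory without enumerating everything.

```python
from itertools import product

# Hypothetical CPTs for a chain A -> B -> C of binary variables.
P_A = {0: 0.9, 1: 0.1}
P_B_GIVEN_A = {0: {0: 0.95, 1: 0.05}, 1: {0: 0.2, 1: 0.8}}
P_C_GIVEN_B = {0: {0: 0.99, 1: 0.01}, 1: {0: 0.3, 1: 0.7}}


def joint(a, b, c):
    """Probability of one full instantiation (a, b, c)."""
    return P_A[a] * P_B_GIVEN_A[a][b] * P_C_GIVEN_B[b][c]


def marginal_c1(threshold):
    """Approximate P(C=1) using only instantiations with probability >= threshold.

    Returns (approximate marginal, total probability mass that was discarded).
    """
    kept = 0.0
    discarded = 0.0
    for a, b, c in product((0, 1), repeat=3):
        p = joint(a, b, c)
        if p >= threshold:
            if c == 1:
                kept += p
        else:
            discarded += p
    return kept, discarded


exact, _ = marginal_c1(0.0)            # threshold 0 keeps every instantiation
approx, discarded = marginal_c1(0.01)  # u = 0.01 is an arbitrary choice
print(exact, approx, discarded)
```

The absolute error of the approximation can never exceed the discarded mass, since every omitted term is one of the disregarded instantiations.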
-
Tail Sensitivity Analysis in Bayesian Networks
Authors:
Enrique F. Castillo,
Cristina Solares,
Patricia Gomez
Abstract:
The paper presents an efficient method for simulating the tails of a target variable Z=h(X) which depends on a set of basic variables X=(X_1, ..., X_n). To this aim, variables X_i, i=1, ..., n are sequentially simulated in such a manner that Z=h(x_1, ..., x_{i-1}, X_i, ..., X_n) is guaranteed to be in the tail of Z. When this method is difficult to apply, an alternative method is proposed, which leads to a low rejection proportion of sample values, when compared with the Monte Carlo method. Both methods are shown to be very useful to perform a sensitivity analysis of Bayesian networks, when very large confidence intervals for the marginal/conditional probabilities are required, as in reliability or risk analysis. The methods are shown to behave best when all scores coincide. The required modifications for this to occur are discussed. The methods are illustrated with several examples and one example of application to a real case is used to illustrate the whole process.
Submitted 13 February, 2013;
originally announced February 2013.
-
Marginalizing in Undirected Graph and Hypergraph Models
Authors:
Enrique F. Castillo,
Juan Ferrándiz,
Pilar Sanmartin
Abstract:
Given an undirected graph G or hypergraph X model for a given set of variables V, we introduce two marginalization operators for obtaining the undirected graph G_A or hypergraph H_A associated with a given subset A ⊂ V such that the marginal distribution of A factorizes according to G_A or H_A, respectively. Finally, we illustrate the method by its application to some practical examples. With them we show that hypergraph models allow defining a finer factorization or performing a more precise conditional independence analysis than undirected graph models.
Submitted 30 January, 2013;
originally announced January 2013.
-
Phenomenology Tools on Cloud Infrastructures using OpenStack
Authors:
I. Campos,
E. Fernandez del Castillo,
S. Heinemeyer,
A. Lopez-Garcia,
F. v. d. Pahlen
Abstract:
We present a new environment for computations in particle physics phenomenology employing recent developments in cloud computing. In this environment users can create and manage "virtual" machines on which the phenomenology codes/tools can be deployed easily in an automated way. We analyze the performance of this environment based on "virtual" machines versus the utilization of "real" physical hardware. In this way we provide a qualitative result for the influence of the host operating system on the performance of a representative set of applications for phenomenology calculations.
Submitted 17 March, 2013; v1 submitted 19 December, 2012;
originally announced December 2012.