Search | arXiv e-print repository

Fading memory and the convolution theorem

Authors: Juan-Pablo Ortega, Florian Rossmannek

Abstract: Several topological and analytical notions of continuity and fading memory for causal and time-invariant filters are introduced, and the relations between them are analysed. A significant generalization of the convolution theorem that establishes the equivalence between the fading memory property and the availability of convolution representations of linear filters is proved. This result extends a previous such characterization to a complete array of weighted norms in the definition of the fading memory property. Additionally, the main theorem shows that the availability of convolution representations can be characterized, at least when the codomain is finite-dimensional, not only by the fading memory property but also by the reunion of two purely topological notions that are called minimal continuity and minimal fading memory property. Finally, when the input space and the codomain of a linear functional are Hilbert spaces, it is shown that minimal continuity and the minimal fading memory property guarantee the existence of interesting embeddings of the associated reproducing kernel Hilbert spaces and approximation results of solutions of kernel regressions in the presence of finite data sets.

Submitted 14 August, 2024; originally announced August 2024.

arXiv:2407.02631 [pdf, other]

Nollywood: Let's Go to the Movies!

Authors: John E. Ortega, Ibrahim Said Ahmad, William Chen

Abstract: Nollywood, based on the idea of Bollywood from India, is a series of outstanding movies that originate from Nigeria. Unfortunately, while the movies are in English, they are hard to understand for many native speakers due to the dialect of English that is spoken. In this article, we accomplish two goals: (1) create a phonetic sub-title model that is able to translate Nigerian English speech to American English and (2) use the most advanced toxicity detectors to discover how toxic the speech is. Our aim is to highlight the text in these videos, which is often ignored for lack of dialectal understanding due to the fact that many people in Nigeria speak a native language like Hausa at home.

Submitted 2 July, 2024; originally announced July 2024.

Comments: 8 pages, 4 figures, 2 tables

arXiv:2407.02424 [pdf, other]

A Pattern Language for Machine Learning Tasks

Authors: Benjamin Rodatz, Ian Fan, Tuomas Laakkonen, Neil John Ortega, Thomas Hoffman, Vincent Wang-Mascianica

Abstract: Idealised as universal approximators, learners such as neural networks can be viewed as "variable functions" that may become one of a range of concrete functions after training. In the same way that equations constrain the possible values of variables in algebra, we may view objective functions as constraints on the behaviour of learners. We extract the equivalences perfectly optimised objective functions impose, calling them "tasks". For these tasks, we develop a formal graphical language that allows us to: (1) separate the core tasks of a behaviour from its implementation details; (2) reason about and design behaviours model-agnostically; and (3) simply describe and unify approaches in machine learning across domains. As proof-of-concept, we design a novel task that enables converting classifiers into generative models we call "manipulators", which we implement by directly translating task specifications into code. The resulting models exhibit capabilities such as style transfer and interpretable latent-space editing, without the need for custom architectures, adversarial training or random sampling. We formally relate the behaviour of manipulators to GANs, and empirically demonstrate their competitive performance with VAEs. We report on experiments across vision and language domains aiming to characterise manipulators as approximate Bayesian inversions of discriminative classifiers.

Submitted 2 July, 2024; originally announced July 2024.

MSC Class: 18M30; 68T01 ACM Class: I.2.6

arXiv:2404.08717 [pdf, ps, other]

State-Space Systems as Dynamic Generative Models

Authors: Juan-Pablo Ortega, Florian Rossmannek

Abstract: A probabilistic framework to study the dependence structure induced by deterministic discrete-time state-space systems between input and output processes is introduced. General sufficient conditions are formulated under which output processes exist and are unique once an input process has been fixed, a property that in the deterministic state-space literature is known as the echo state property. When those conditions are satisfied, the given state-space system becomes a generative model for probabilistic dependences between two sequence spaces. Moreover, those conditions guarantee that the output depends continuously on the input when using the Wasserstein metric. The output processes whose existence is proved are shown to be causal in a specific sense and to generalize those studied in purely deterministic situations. The results in this paper constitute a significant stochastic generalization of sufficient conditions for the deterministic echo state property to hold, in the sense that the stochastic echo state property can be satisfied under contractivity conditions that are strictly weaker than those in deterministic situations. This means that state-space systems can induce a purely probabilistic dependence structure between input and output sequence spaces even when there is no functional relation between those two spaces.

Submitted 12 April, 2024; originally announced April 2024.

MSC Class: 37H05; 37N35; 62M10; 68T05

arXiv:2403.10070 [pdf, other]

A Structure-Preserving Kernel Method for Learning Hamiltonian Systems

Authors: Jianyu Hu, Juan-Pablo Ortega, Daiying Yin

Abstract: A structure-preserving kernel ridge regression method is presented that allows the recovery of potentially high-dimensional and nonlinear Hamiltonian functions out of datasets made of noisy observations of Hamiltonian vector fields. The method proposes a closed-form solution that yields excellent numerical performances that surpass other techniques proposed in the literature in this setup. From the methodological point of view, the paper extends kernel regression methods to problems in which loss functions involving linear functions of gradients are required and, in particular, a differential reproducing property and a Representer Theorem are proved in this context. The relation between the structure-preserving kernel estimator and the Gaussian posterior mean estimator is analyzed. A full error analysis is conducted that provides convergence rates using fixed and adaptive regularization parameters. The good performance of the proposed estimator is illustrated with various numerical experiments.

Submitted 15 March, 2024; originally announced March 2024.

arXiv:2401.06019 [pdf, other]

doi 10.1117/12.2679734

Automatic UAV-based Airport Pavement Inspection Using Mixed Real and Virtual Scenarios

Authors: Pablo Alonso, Jon Ander Iñiguez de Gordoa, Juan Diego Ortega, Sara García, Francisco Javier Iriarte, Marcos Nieto

Abstract: Runway and taxiway pavements are exposed to high stress during their projected lifetime, which inevitably leads to a decrease in their condition over time. To make sure that airport pavement condition ensures uninterrupted and resilient operations, it is of utmost importance to monitor their condition and conduct regular inspections. UAV-based inspection is recently gaining importance due to its wide-range monitoring capabilities and reduced cost. In this work, we propose a vision-based approach to automatically identify pavement distress using images captured by UAVs. The proposed method is based on Deep Learning (DL) to segment defects in the image. The DL architecture leverages the low computational capacities of embedded systems in UAVs by using an optimised implementation of EfficientNet feature extraction and Feature Pyramid Network segmentation. To deal with the lack of annotated data for training, we have developed a synthetic dataset generation methodology to extend available distress datasets. We demonstrate that the use of a mixed dataset composed of synthetic and real training images yields better results when testing the training models in real application scenarios.

Submitted 11 January, 2024; originally announced January 2024.

Comments: 12 pages, 6 figures, published in proceedings of 15th International Conference on Machine Vision (ICMV)

Journal ref: Proc. SPIE 12701, Fifteenth International Conference on Machine Vision (ICMV 2022), 1270118

arXiv:2312.15850 [pdf]

High Efficiency Inference Accelerating Algorithm for NOMA-based Mobile Edge Computing

Authors: Xin Yuan, Ning Li, Tuo Zhang, Muqing Li, Yuwen Chen, Jose Fernan Martinez Ortega, Song Guo

Abstract: Splitting the inference model between device, edge server, and cloud can improve the performance of EI greatly. Additionally, non-orthogonal multiple access (NOMA), which is one of the key supporting technologies of B5G/6G, can achieve massive connections and high spectrum efficiency. Motivated by the benefits of NOMA, integrating NOMA with model split in MEC to further reduce the inference latency becomes attractive. However, NOMA-based communication during split inference has not been properly considered in previous works. Therefore, in this paper, we integrate NOMA into split inference in MEC, and propose an effective communication and computing resource allocation algorithm to accelerate model inference at the edge. Specifically, when the mobile user has a large model inference task to be calculated in the NOMA-based MEC, it will take the energy consumption of both device and edge server and the inference latency into account to find the optimal model split strategy, subchannel allocation strategy (uplink and downlink), and transmission power allocation strategy (uplink and downlink). Since the minimum inference delay and energy consumption cannot be satisfied simultaneously, and the variables of subchannel allocation and model split are discrete, the gradient descent (GD) algorithm is adopted to find the optimal tradeoff between them. Moreover, the loop iteration GD approach (Li-GD) is proposed to reduce the complexity of the GD algorithm that is caused by the discrete parameters. Additionally, the properties of the proposed algorithm are also investigated, which demonstrate the effectiveness of the proposed algorithms.

Submitted 25 December, 2023; originally announced December 2023.

Comments: 13 pages, 11 figures

arXiv:2310.19270 [pdf, other]

Invariant kernels on Riemannian symmetric spaces: a harmonic-analytic approach

Authors: Nathael Da Costa, Cyrus Mostajeran, Juan-Pablo Ortega, Salem Said

Abstract: This work aims to prove that the classical Gaussian kernel, when defined on a non-Euclidean symmetric space, is never positive-definite for any choice of parameter. To achieve this goal, the paper develops new geometric and analytical arguments. These provide a rigorous characterization of the positive-definiteness of the Gaussian kernel, which is complete but for a limited number of scenarios in low dimensions that are treated by numerical computations. Chief among these results are the $L^p$-Godement theorems (where $p = 1,2$), which provide verifiable necessary and sufficient conditions for a kernel defined on a symmetric space of non-compact type to be positive-definite. A celebrated theorem, sometimes called the Bochner-Godement theorem, already gives such conditions and is far more general in its scope, but is especially hard to apply. Beyond the connection with the Gaussian kernel, the new results in this work lay out a blueprint for the study of invariant kernels on symmetric spaces, bringing forth specific harmonic analysis tools that suggest many future applications.

Submitted 30 October, 2023; originally announced October 2023.

MSC Class: 43A35; 43A85; 43A90; 46E22; 53C35; 53Z50

arXiv:2310.13821 [pdf, other]

Geometric Learning with Positively Decomposable Kernels

Authors: Nathael Da Costa, Cyrus Mostajeran, Juan-Pablo Ortega, Salem Said

Abstract: Kernel methods are powerful tools in machine learning. Classical kernel methods are based on positive-definite kernels, which map data spaces into reproducing kernel Hilbert spaces (RKHS). For non-Euclidean data spaces, positive-definite kernels are difficult to come by. In this case, we propose the use of reproducing kernel Krein space (RKKS) based methods, which require only kernels that admit a positive decomposition. We show that one does not need to access this decomposition in order to learn in RKKS. We then investigate the conditions under which a kernel is positively decomposable. We show that invariant kernels admit a positive decomposition on homogeneous spaces under tractable regularity assumptions. This makes them much easier to construct than positive-definite kernels, providing a route for learning with kernels for non-Euclidean data. By the same token, this provides theoretical foundations for RKKS-based methods in general.

Submitted 29 July, 2024; v1 submitted 20 October, 2023; originally announced October 2023.

arXiv:2310.03639 [pdf, ps, other]

Evaluating Self-Supervised Speech Representations for Indigenous American Languages

Authors: Chih-Chen Chen, William Chen, Rodolfo Zevallos, John E. Ortega

Abstract: The application of self-supervision to speech representation learning has garnered significant interest in recent years, due to its scalability to large amounts of unlabeled data. However, much progress, both in terms of pre-training and downstream evaluation, has remained concentrated in monolingual models that only consider English. Few models consider other languages, and even fewer consider indigenous ones. In our submission to the New Language Track of the ASRU 2023 ML-SUPERB Challenge, we present an ASR corpus for Quechua, an indigenous South American Language. We benchmark the efficacy of large SSL models on Quechua, along with 6 other indigenous languages such as Guarani and Bribri, on low-resource ASR. Our results show surprisingly strong performance by state-of-the-art SSL models, showing the potential generalizability of large-scale models to real-world data.

Submitted 8 October, 2023; v1 submitted 5 October, 2023; originally announced October 2023.

arXiv:2309.06745 [pdf, other]

VEATIC: Video-based Emotion and Affect Tracking in Context Dataset

Authors: Zhihang Ren, Jefferson Ortega, Yifan Wang, Zhimin Chen, Yunhui Guo, Stella X. Yu, David Whitney

Abstract: Human affect recognition has been a significant topic in psychophysics and computer vision. However, the currently published datasets have many limitations. For example, most datasets contain frames that contain only information about facial expressions. Due to the limitations of previous datasets, it is very hard to either understand the mechanisms for affect recognition of humans or generalize well on common cases for computer vision models trained on those datasets. In this work, we introduce a brand new large dataset, the Video-based Emotion and Affect Tracking in Context Dataset (VEATIC), that can conquer the limitations of the previous datasets. VEATIC has 124 video clips from Hollywood movies, documentaries, and home videos with continuous valence and arousal ratings of each frame via real-time annotation. Along with the dataset, we propose a new computer vision task to infer the affect of the selected character via both context and character information in each video frame. Additionally, we propose a simple model to benchmark this new computer vision task. We also compare the performance of the pretrained model using our dataset with other similar datasets. Experiments show the competing results of our pretrained model via VEATIC, indicating the generalizability of VEATIC. Our dataset is available at https://veatic.github.io.

Submitted 14 September, 2023; v1 submitted 13 September, 2023; originally announced September 2023.

arXiv:2309.00380 [pdf, other]

Learning multi-modal generative models with permutation-invariant encoders and tighter variational bounds

Authors: Marcel Hirt, Domenico Campolo, Victoria Leong, Juan-Pablo Ortega

Abstract: Devising deep latent variable models for multi-modal data has been a long-standing theme in machine learning research. Multi-modal Variational Autoencoders (VAEs) have been a popular generative model class that learns latent representations that jointly explain multiple modalities. Various objective functions for such models have been suggested, often motivated as lower bounds on the multi-modal data log-likelihood or from information-theoretic considerations. To encode latent variables from different modality subsets, Product-of-Experts (PoE) or Mixture-of-Experts (MoE) aggregation schemes have been routinely used and shown to yield different trade-offs, for instance, regarding their generative quality or consistency across multiple modalities. In this work, we consider a variational bound that can tightly approximate the data log-likelihood. We develop more flexible aggregation schemes that generalize PoE or MoE approaches by combining encoded features from different modalities based on permutation-invariant neural networks. Our numerical experiments illustrate trade-offs for multi-modal variational bounds and various aggregation schemes. We show that tighter variational bounds and more flexible aggregation models can become beneficial when one wants to approximate the true joint distribution over observed modalities and latent variables in identifiable models.

Submitted 18 April, 2024; v1 submitted 1 September, 2023; originally announced September 2023.

arXiv:2305.01457 [pdf, other]

Memory of recurrent networks: Do we compute it right?

Authors: Giovanni Ballarin, Lyudmila Grigoryeva, Juan-Pablo Ortega

Abstract: Numerical evaluations of the memory capacity (MC) of recurrent neural networks reported in the literature often contradict well-established theoretical bounds. In this paper, we study the case of linear echo state networks, for which the total memory capacity has been proven to be equal to the rank of the corresponding Kalman controllability matrix. We shed light on various reasons for the inaccurate numerical estimations of the memory, and we show that these issues, often overlooked in the recent literature, are of an exclusively numerical nature. More explicitly, we prove that when the Krylov structure of the linear MC is ignored, a gap between the theoretical MC and its empirical counterpart is introduced. As a solution, we develop robust numerical approaches by exploiting a result of MC neutrality with respect to the input mask matrix. Simulations show that the memory curves that are recovered using the proposed methods fully agree with the theory.

Submitted 2 May, 2023; originally announced May 2023.

Comments: 31 pages, 6 figures

arXiv:2304.08649 [pdf, other]

Classification of US Supreme Court Cases using BERT-Based Techniques

Authors: Shubham Vatsal, Adam Meyers, John E. Ortega

Abstract: Models based on bidirectional encoder representations from transformers (BERT) produce state of the art (SOTA) results on many natural language processing (NLP) tasks such as named entity recognition (NER), part-of-speech (POS) tagging etc. An interesting phenomenon occurs when classifying long documents such as those from the US supreme court where BERT-based models can be considered difficult to use on a first-pass or out-of-the-box basis. In this paper, we experiment with several BERT-based classification techniques for US supreme court decisions or supreme court database (SCDB) and compare them with the previous SOTA results. We then compare our results specifically with SOTA models for long documents. We compare our results for two classification tasks: (1) a broad classification task with 15 categories and (2) a fine-grained classification task with 279 categories. Our best result produces an accuracy of 80% on the 15 broad categories and 60% on the fine-grained 279 categories which marks an improvement of 8% and 28% respectively from previously reported SOTA results.

Submitted 24 July, 2023; v1 submitted 17 April, 2023; originally announced April 2023.

arXiv:2304.00490 [pdf, ps, other]

Infinite-dimensional reservoir computing

Authors: Lukas Gonon, Lyudmila Grigoryeva, Juan-Pablo Ortega

Abstract: Reservoir computing approximation and generalization bounds are proved for a new concept class of input/output systems that extends the so-called generalized Barron functionals to a dynamic context. This new class is characterized by the readouts with a certain integral representation built on infinite-dimensional state-space systems. It is shown that this class is very rich and possesses useful features and universal approximation properties. The reservoir architectures used for the approximation and estimation of elements in the new class are randomly generated echo state networks with either linear or ReLU activation functions. Their readouts are built using randomly generated neural networks in which only the output layer is trained (extreme learning machines or random feature neural networks). The results in the paper yield a fully implementable recurrent neural network-based learning algorithm with provable convergence guarantees that do not suffer from the curse of dimensionality.

Submitted 2 April, 2023; originally announced April 2023.

arXiv:2302.10623 [pdf, other]

The Gaussian kernel on the circle and spaces that admit isometric embeddings of the circle

Authors: Nathaël Da Costa, Cyrus Mostajeran, Juan-Pablo Ortega

Abstract: On Euclidean spaces, the Gaussian kernel is one of the most widely used kernels in applications. It has also been used on non-Euclidean spaces, where it is known that there may be (and often are) scale parameters for which it is not positive definite. Hope remains that this kernel is positive definite for many choices of parameter. However, we show that the Gaussian kernel is not positive definite on the circle for any choice of parameter. This implies that on metric spaces in which the circle can be isometrically embedded, such as spheres, projective spaces and Grassmannians, the Gaussian kernel is not positive definite for any parameter.

Submitted 21 February, 2023; originally announced February 2023.

arXiv:2302.07912 [pdf, other]

Meeting the Needs of Low-Resource Languages: The Value of Automatic Alignments via Pretrained Models

Authors: Abteen Ebrahimi, Arya D. McCarthy, Arturo Oncevay, Luis Chiruzzo, John E. Ortega, Gustavo A. Giménez-Lugo, Rolando Coto-Solano, Katharina Kann

Abstract: Large multilingual models have inspired a new class of word alignment methods, which work well for the model's pretraining languages. However, the languages most in need of automatic alignment are low-resource and, thus, not typically included in the pretraining data. In this work, we ask: How do modern aligners perform on unseen languages, and are they better than traditional methods? We contribute gold-standard alignments for Bribri–Spanish, Guarani–Spanish, Quechua–Spanish, and Shipibo-Konibo–Spanish. With these, we evaluate state-of-the-art aligners with and without model adaptation to the target language. Finally, we also evaluate the resulting alignments extrinsically through two downstream tasks: named entity recognition and part-of-speech tagging. We find that although transformer-based methods generally outperform traditional models, the two classes of approach remain competitive with each other.

Submitted 15 February, 2023; originally announced February 2023.

Comments: EACL 2023

arXiv:2212.14641 [pdf, other]

Reservoir kernels and Volterra series

Authors: Lukas Gonon, Lyudmila Grigoryeva, Juan-Pablo Ortega

Abstract: A universal kernel is constructed whose sections approximate any causal and time-invariant filter in the fading memory category with inputs and outputs in a finite-dimensional Euclidean space. This kernel is built using the reservoir functional associated with a state-space representation of the Volterra series expansion available for any analytic fading memory filter. It is hence called the Volterra reservoir kernel. Even though the state-space representation and the corresponding reservoir feature map are defined on an infinite-dimensional tensor algebra space, the kernel map is characterized by explicit recursions that are readily computable for specific data sets when employed in estimation problems using the representer theorem. We showcase the performance of the Volterra reservoir kernel in a popular data science application in relation to bitcoin price prediction.

Submitted 30 December, 2022; originally announced December 2022.

Comments: 10 pages, 2 figures, 1 table

arXiv:2212.02384 [pdf, ps, other]

Addressing Distribution Shift at Test Time in Pre-trained Language Models

Authors: Ayush Singh, John E. Ortega

Abstract: State-of-the-art pre-trained language models (PLMs) outperform other models when applied to the majority of language processing tasks. However, PLMs have been found to degrade in performance under distribution shift, a phenomenon that occurs when data at test-time does not come from the same distribution as the source training set. Equally as challenging is the task of obtaining labels in real-time due to issues like long-labeling feedback loops. The lack of adequate methods that address the aforementioned challenges constitutes the need for approaches that continuously adapt the PLM to a distinct distribution. Unsupervised domain adaptation adapts a source model to an unseen as well as unlabeled target domain. While some techniques such as data augmentation can adapt models in several scenarios, they have only been sparsely studied for addressing the distribution shift problem. In this work, we present an approach (MEMO-CL) that improves the performance of PLMs at test-time under distribution shift. Our approach takes advantage of the latest unsupervised techniques in data augmentation and adaptation to minimize the entropy of the PLM's output distribution. MEMO-CL operates on a batch of augmented samples from a single observation in the test set. The technique introduced is unsupervised, domain-agnostic, easy to implement, and requires no additional data. Our experiments result in a 3% improvement over current test-time adaptation baselines.

Submitted 5 December, 2022; originally announced December 2022.

Comments: Accepted to 2nd International Workshop on Practical Deep Learning in the Wild at AAAI 2023

ACM Class: I.2.7; I.2.6

arXiv:2209.07946 [pdf, other]

doi 10.1016/j.physd.2023.133744

Transport in reservoir computing

Authors: G Manjunath, Juan-Pablo Ortega

Abstract: Reservoir computing systems are constructed using a driven dynamical system in which external inputs can alter the evolving states of a system. These paradigms are used in information processing, machine learning, and computation. A fundamental question that needs to be addressed in this framework is the statistical relationship between the input and the system states. This paper provides conditions that guarantee the existence and uniqueness of asymptotically invariant measures for driven systems and shows that their dependence on the input process is continuous when the set of input and output processes are endowed with the Wasserstein distance. The main tool in these developments is the characterization of those invariant measures as fixed points of naturally defined Foias operators that appear in this context and which have been profusely studied in the paper. Those fixed points are obtained by imposing a newly introduced stochastic state contractivity on the driven system that is readily verifiable in examples. Stochastic state contractivity can be satisfied by systems that are not state-contractive, which is a need typically evoked to guarantee the echo state property in reservoir computing. As a result, it may actually be satisfied even if the echo state property is not present.

Submitted 16 September, 2022; originally announced September 2022.

Comments: 33 pages, 5 figures

arXiv:2206.10924 [pdf]

doi 10.5121/csit.2022.121013

Enhancing Networking Cipher Algorithms with Natural Language

Authors: John E. Ortega

Abstract: This work provides a survey of several networking cipher algorithms and proposes a method for integrating natural language processing (NLP) as a protective agent for them. Two main proposals are covered for the use of NLP in networking. First, NLP is considered as the weakest link in a networking encryption model; and, second, as a hefty deterrent when combined as an extra layer over what could be considered a strong type of encryption -- the stream cipher. This paper summarizes how languages can be integrated into symmetric encryption as a way to assist in the encryption of vulnerable streams that may be found under attack due to the natural frequency distribution of letters or words in a local language stream.

Submitted 22 June, 2022; originally announced June 2022.

Comments: 12 pages, David C. Wyld et al. (Eds): CONEDU, CSITA, MLCL, ISPR, NATAP, ARIN - 2022 pp. 43-54, 2022. CS & IT - CSCP 2022 DOI: 10.5121/csit.2022.121013

Journal ref: David C. Wyld et al. (Eds): CONEDU, CSITA, MLCL, ISPR, NATAP, ARIN - 2022, pp. 43-54, 2022. CS & IT

arXiv:2205.06556 [pdf]

Virtual passengers for real car solutions: synthetic datasets

Authors: Paola Natalia Canas, Juan Diego Ortega, Marcos Nieto, Oihana Otaegui

Abstract: Strategies that include the generation of synthetic data are beginning to be viable as obtaining real data can be logistically complicated, very expensive or slow. Not only the capture of the data can lead to complications, but also its annotation. To achieve high-fidelity data for training intelligent systems, we have built a 3D scenario and set-up to resemble reality as closely as possible. With our approach, it is possible to configure and vary parameters to add randomness to the scene and, in this way, allow variation in data, which is so important in the construction of a dataset. Besides, the annotation task is already included in the data generation exercise, rather than being a post-capture task, which can save a lot of resources. We present the process and concept of synthetic data generation in an automotive context, specifically for driver and passenger monitoring purposes, as an alternative to real data capturing.

Submitted 13 May, 2022; originally announced May 2022.

Comments: 9 pages, 6 figures, 14th ITS European Congress

arXiv:2205.03791 [pdf, other]

doi 10.9734/ARJOM/2022/v18i530377

Harmonic Centrality and Centralization of Some Graph Products

Authors: Jose Mari E. Ortega, Rolito G. Eballe

Abstract: Harmonic centrality calculates the importance of a node in a network by adding the inverse of the geodesic distances of this node to all the other nodes. Harmonic centralization, on the other hand, is the graph-level centrality score based on the node-level harmonic centrality. In this paper, we present some results on both the harmonic centrality and harmonic centralization of graphs resulting from some graph products such as Cartesian and direct products of the path $P_2$ with any of the path $P_m$, cycle $C_m$, and fan $F_m$ graphs.

Submitted 8 May, 2022; originally announced May 2022.

Comments: 10 pages, 3 figures

MSC Class: 05C12; 05C82; 91D30

Journal ref: Asian Research Journal of Mathematics, 18(5): 42-51, 2022; Article no.ARJOM.86514

arXiv:2205.01987 [pdf, ps, other]

ON-TRAC Consortium Systems for the IWSLT 2022 Dialect and Low-resource Speech Translation Tasks

Authors: Marcely Zanon Boito, John Ortega, Hugo Riguidel, Antoine Laurent, Loïc Barrault, Fethi Bougares, Firas Chaabani, Ha Nguyen, Florentin Barbier, Souhir Gahbiche, Yannick Estève

Abstract: This paper describes the ON-TRAC Consortium translation systems developed for two challenge tracks featured in the Evaluation Campaign of IWSLT 2022: low-resource and dialect speech translation. For the Tunisian Arabic-English dataset (low-resource and dialect tracks), we build an end-to-end model as our joint primary submission, and compare it against cascaded models that leverage a large fine-tuned wav2vec 2.0 model for ASR. Our results show that in our settings pipeline approaches are still very competitive, and that with the use of transfer learning, they can outperform end-to-end models for speech translation (ST). For the Tamasheq-French dataset (low-resource track) our primary submission leverages intermediate representations from a wav2vec 2.0 model trained on 234 hours of Tamasheq audio, while our contrastive model uses a French phonetic transcription of the Tamasheq audio as input in a Conformer speech translation architecture jointly trained on automatic speech recognition, ST and machine translation losses. Our results highlight that self-supervised models trained on smaller sets of target data are more effective to low-resource end-to-end ST fine-tuning, compared to large off-the-shelf models. Results also illustrate that even approximate phonetic transcriptions can improve ST scores.

Submitted 4 May, 2022; originally announced May 2022.

Comments: IWSLT 2022 system paper

arXiv:2204.04381 [pdf]

doi 10.17654/0974165822023

Harmonic Centralization of Some Graph Families

Authors: Jose Mari E. Ortega, Rolito G. Eballe

Abstract: Centrality describes the importance of nodes in a graph and is modeled by various measures. Its global analogue, called centralization, is a general formula for calculating a graph-level centrality score based on the node-level centrality measure. The latter enables us to compare graphs based on the extent to which the connections of a given network are concentrated on a single vertex or group of vertices. One of the measures of centrality in social network analysis is harmonic centrality. It sums the inverse of the geodesic distances of each node to other nodes where it is 0 if there is no path from one node to another, with the sum normalized by dividing it by $m-1$, where $m$ is the number of nodes of the graph. In this paper, we present some results regarding the harmonic centralization of some important families of graphs with the hope that formulas generated herein will be of use when one determines the harmonic centralization of more complex graphs.

Submitted 2 May, 2022; v1 submitted 8 April, 2022; originally announced April 2022.

Comments: 21 pages, 5 figures. arXiv admin note: text overlap with arXiv:2111.12239

MSC Class: 05C12; 05C82; 91D30

Journal ref: Advances and Applications in Discrete Mathematics, Volume 31, 2022, Pages 13-33

arXiv:2111.12239 [pdf, ps, other]

doi 10.5281/zenodo.6396942

Harmonic Centrality in Some Graph Families

Authors: Jose Mari E. Ortega, Rolito G. Eballe

Abstract: One of the more recent measures of centrality in social network analysis is the normalized harmonic centrality. A variant of the closeness centrality, harmonic centrality sums the inverse of the geodesic distances of each node to other nodes where it is 0 if there is no path from one node to another. It is then normalized by dividing it by m-1, where m is the number of nodes of the graph. In this paper, we present notions regarding the harmonic centrality of some important classes of graphs.

Submitted 3 April, 2022; v1 submitted 23 November, 2021; originally announced November 2021.

Comments: 13 pages, 5 figures

MSC Class: 05C12; 05C82; 91D30

Journal ref: Advances and Applications in Mathematical Sciences, Volume 21, Issue 5, March 2022, Pages 2581-2598
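
Note: the normalized harmonic centrality defined in the abstract of this entry can be illustrated with a short Python sketch. It is not taken from the paper. The function name `harmonic_centrality`, the adjacency-dictionary input, and the restriction to unweighted graphs (so that geodesic distances come from a breadth-first search) are assumptions made here for illustration.

```python
from collections import deque


def harmonic_centrality(adj, node):
    """Normalized harmonic centrality of `node` in an unweighted graph.

    `adj` (a hypothetical input format) maps every vertex to an iterable
    of its neighbours. Unreachable vertices contribute 0 to the sum.
    """
    # Breadth-first search gives geodesic distances from `node`.
    dist = {node: 0}
    queue = deque([node])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)

    m = len(adj)  # number of nodes of the graph
    if m < 2:
        return 0.0

    # Sum of inverse distances to all other reachable nodes,
    # normalized by m - 1 as described in the abstract.
    total = sum(1.0 / d for v, d in dist.items() if v != node)
    return total / (m - 1)


# Path graph on three vertices: 0 - 1 - 2
path3 = {0: [1], 1: [0, 2], 2: [1]}
centre = harmonic_centrality(path3, 1)    # (1 + 1) / 2
endpoint = harmonic_centrality(path3, 0)  # (1 + 1/2) / 2
```

On the three-vertex path, the middle vertex gets the maximum score of 1 and each endpoint gets 0.75. An isolated vertex gets 0 because no other vertex is reachable from it.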

arXiv:2111.03472 [pdf]

doi 10.1007/s10044-009-0151-4

BiosecurID: a multimodal biometric database

Authors: Julian Fierrez, Javier Galbally, Javier Ortega-Garcia, Manuel R Freire, Fernando Alonso-Fernandez, Daniel Ramos, Doroteo Torre Toledano, Joaquin Gonzalez-Rodriguez, Juan A Siguenza, Javier Garrido-Salas, E Anguiano, Guillermo Gonzalez-de-Rivera, Ricardo Ribalda, Marcos Faundez-Zanuy, JA Ortega, Valentín Cardeñoso-Payo, A Viloria, Carlos E Vivaracho, Q Isaac Moro, Juan J Igarza, J Sanchez, Inmaculada Hernaez, Carlos Orrite-Urunuela, Francisco Martinez-Contreras, Juan José Gracia-Roche

Abstract: A new multimodal biometric database, acquired in the framework of the BiosecurID project, is presented together with the description of the acquisition setup and protocol. The database includes eight unimodal biometric traits, namely: speech, iris, face (still images, videos of talking faces), handwritten signature and handwritten text (on-line dynamic signals, off-line scanned images), fingerprints (acquired with two different sensors), hand (palmprint, contour-geometry) and keystroking. The database comprises 400 subjects and presents features such as: realistic acquisition scenario, balanced gender and population distributions, availability of information about particular demographic groups (age, gender, handedness), acquisition of replay attacks for speech and keystroking, skilled forgeries for signatures, and compatibility with other existing databases. All these characteristics make it very useful in research and development of unimodal and multimodal biometric systems.

Submitted 2 November, 2021; originally announced November 2021.

Comments: Published at Pattern Analysis and Applications journal

arXiv:2110.13242 [pdf, other]

2D Grid Map Generation for Deep-Learning-based Navigation Approaches

Authors: Gabriel O. Flores-Aquino, Jheison Duvier Díaz Ortega, Ricardo Yahir Almazan Arvizu, Raúl López Muñoz, O. Octavio Gutierrez-Frias, J. Irving Vasquez-Gomez

Abstract: In the last decade, autonomous navigation for robotics has been leveraged by deep learning and other approaches based on machine learning. These approaches have demonstrated significant advantages in robotics performance. But they have the disadvantage that they require a lot of data to infer knowledge. In this paper, we present an algorithm for building 2D maps with attributes that make them useful for training and testing machine-learning-based approaches. The maps are based on dungeons environments where several random rooms are built and then those rooms are connected. In addition, we provide a dataset with 10,000 maps produced by the proposed algorithm and a description with extensive information for algorithm evaluation. Such information includes validation of path existence, the best path, distances, among other attributes. We believe that these maps and their related information can be very useful for robotics enthusiasts and researchers who want to test deep learning approaches. The dataset is available at https://github.com/gbriel21/map2D_dataSet.git

Submitted 4 December, 2021; v1 submitted 25 October, 2021; originally announced October 2021.

Comments: 6 pages, 4 figures, conference, dataset

arXiv:2108.06598 [pdf, other]

Findings of the LoResMT 2021 Shared Task on COVID and Sign Language for Low-resource Languages

Authors: Atul Kr. Ojha, Chao-Hong Liu, Katharina Kann, John Ortega, Sheetal Shatam, Theodorus Fransen

Abstract: We present the findings of the LoResMT 2021 shared task which focuses on machine translation (MT) of COVID-19 data for both low-resource spoken and sign languages. The organization of this task was conducted as part of the fourth workshop on technologies for machine translation of low resource languages (LoResMT). Parallel corpora is presented and publicly available which includes the following directions: English$\leftrightarrow$Irish, English$\leftrightarrow$Marathi, and Taiwanese Sign language$\leftrightarrow$Traditional Chinese. Training data consists of 8112, 20933 and 128608 segments, respectively. There are additional monolingual data sets for Marathi and English that consist of 21901 segments. The results presented here are based on entries from a total of eight teams. Three teams submitted systems for English$\leftrightarrow$Irish while five teams submitted systems for English$\leftrightarrow$Marathi. Unfortunately, there were no systems submissions for the Taiwanese Sign language$\leftrightarrow$Traditional Chinese task. Maximum system performance was computed using BLEU and follow as 36.0 for English--Irish, 34.6 for Irish--English, 24.2 for English--Marathi, and 31.3 for Marathi--English.

Submitted 18 August, 2021; v1 submitted 14 August, 2021; originally announced August 2021.

Comments: 10 pages

arXiv:2108.05024 [pdf, other]

doi 10.1088/1361-6544/ace492

Learning strange attractors with reservoir systems

Authors: Lyudmila Grigoryeva, Allen Hart, Juan-Pablo Ortega

Abstract: This paper shows that the celebrated Embedding Theorem of Takens is a particular case of a much more general statement according to which, randomly generated linear state-space representations of generic observations of an invertible dynamical system carry in their wake an embedding of the phase space dynamics into the chosen Euclidean state space. This embedding coincides with a natural generalized synchronization that arises in this setup and that yields a topological conjugacy between the state-space dynamics driven by the generic observations of the dynamical system and the dynamical system itself. This result provides additional tools for the representation, learning, and analysis of chaotic attractors and sheds additional light on the reservoir computing phenomenon that appears in the context of recurrent neural networks.

Submitted 11 August, 2021; originally announced August 2021.

Comments: 36 pages, 11 figures

arXiv:2104.08726 [pdf, other]

AmericasNLI: Evaluating Zero-shot Natural Language Understanding of Pretrained Multilingual Models in Truly Low-resource Languages

Authors: Abteen Ebrahimi, Manuel Mager, Arturo Oncevay, Vishrav Chaudhary, Luis Chiruzzo, Angela Fan, John Ortega, Ricardo Ramos, Annette Rios, Ivan Meza-Ruiz, Gustavo A. Giménez-Lugo, Elisabeth Mager, Graham Neubig, Alexis Palmer, Rolando Coto-Solano, Ngoc Thang Vu, Katharina Kann

Abstract: Pretrained multilingual models are able to perform cross-lingual transfer in a zero-shot setting, even for languages unseen during pretraining. However, prior work evaluating performance on unseen languages has largely been limited to low-level, syntactic tasks, and it remains unclear if zero-shot learning of high-level, semantic tasks is possible for unseen languages. To explore this question, we present AmericasNLI, an extension of XNLI (Conneau et al., 2018) to 10 indigenous languages of the Americas. We conduct experiments with XLM-R, testing multiple zero-shot and translation-based approaches. Additionally, we explore model adaptation via continued pretraining and provide an analysis of the dataset by considering hypothesis-only models. We find that XLM-R's zero-shot performance is poor for all 10 languages, with an average performance of 38.62%. Continued pretraining offers improvements, with an average accuracy of 44.05%. Surprisingly, training on poorly translated data by far outperforms all other methods with an accuracy of 48.72%.

Submitted 16 March, 2022; v1 submitted 18 April, 2021; originally announced April 2021.

Comments: Accepted to ACL 2022

arXiv:2010.14615 [pdf, ps, other]

Discrete-time signatures and randomness in reservoir computing

Authors: Christa Cuchiero, Lukas Gonon, Lyudmila Grigoryeva, Juan-Pablo Ortega, Josef Teichmann

Abstract: A new explanation of geometric nature of the reservoir computing phenomenon is presented. Reservoir computing is understood in the literature as the possibility of approximating input/output systems with randomly chosen recurrent neural systems and a trained linear readout layer. Light is shed on this phenomenon by constructing what is called strongly universal reservoir systems as random projections of a family of state-space systems that generate Volterra series expansions. This procedure yields a state-affine reservoir system with randomly generated coefficients in a dimension that is logarithmically reduced with respect to the original system. This reservoir system is able to approximate any element in the fading memory filters class just by training a different linear readout for each different filter. Explicit expressions for the probability distributions needed in the generation of the projected reservoir system are stated and bounds for the committed approximation error are provided.

Submitted 17 September, 2020; originally announced October 2020.

Comments: 14 pages

arXiv:2010.12047 [pdf, ps, other]

Fading memory echo state networks are universal

Authors: Lukas Gonon, Juan-Pablo Ortega

Abstract: Echo state networks (ESNs) have been recently proved to be universal approximants for input/output systems with respect to various $L ^p$-type criteria. When $1\leq p< \infty$, only $p$-integrability hypotheses need to be imposed, while in the case $p=\infty$ a uniform boundedness hypotheses on the inputs is required. This note shows that, in the last case, a universal family of ESNs can be constructed that contains exclusively elements that have the echo state and the fading memory properties. This conclusion could not be drawn with the results and methods available so far in the literature.

Submitted 22 October, 2020; originally announced October 2020.

Comments: 6 pages letter

arXiv:2008.12085 [pdf, ps, other]

doi 10.1007/978-3-030-66823-5_23

DMD: A Large-Scale Multi-Modal Driver Monitoring Dataset for Attention and Alertness Analysis

Authors: Juan Diego Ortega, Neslihan Kose, Paola Cañas, Min-An Chao, Alexander Unnervik, Marcos Nieto, Oihana Otaegui, Luis Salgado

Abstract: Vision is the richest and most cost-effective technology for Driver Monitoring Systems (DMS), especially after the recent success of Deep Learning (DL) methods. The lack of sufficiently large and comprehensive datasets is currently a bottleneck for the progress of DMS development, crucial for the transition of automated driving from SAE Level-2 to SAE Level-3. In this paper, we introduce the Driver Monitoring Dataset (DMD), an extensive dataset which includes real and simulated driving scenarios: distraction, gaze allocation, drowsiness, hands-wheel interaction and context data, in 41 hours of RGB, depth and IR videos from 3 cameras capturing face, body and hands of 37 drivers. A comparison with existing similar datasets is included, which shows the DMD is more extensive, diverse, and multi-purpose. The usage of the DMD is illustrated by extracting a subset of it, the dBehaviourMD dataset, containing 13 distraction activities, prepared to be used in DL training processes. Furthermore, we propose a robust and real-time driver behaviour recognition system targeting a real-world application that can run on cost-efficient CPU-only platforms, based on the dBehaviourMD. Its performance is evaluated with different types of fusion strategies, which all reach enhanced accuracy still providing real-time response.

Submitted 27 August, 2020; originally announced August 2020.

Comments: Accepted to ECCV 2020 workshop - Assistive Computer Vision and Robotics

arXiv:2007.12141 [pdf, ps, other]

Dimension reduction in recurrent networks by canonicalization

Authors: Lyudmila Grigoryeva, Juan-Pablo Ortega

Abstract: Many recurrent neural network machine learning paradigms can be formulated using state-space representations. The classical notion of canonical state-space realization is adapted in this paper to accommodate semi-infinite inputs so that it can be used as a dimension reduction tool in the recurrent networks setup. The so-called input forgetting property is identified as the key hypothesis that guarantees the existence and uniqueness (up to system isomorphisms) of canonical realizations for causal and time-invariant input/output systems with semi-infinite inputs. Additionally, the notion of optimal reduction coming from the theory of symmetric Hamiltonian systems is implemented in our setup to construct canonical realizations out of input forgetting but not necessarily canonical ones. These two procedures are studied in detail in the framework of linear fading memory input/output systems. Finally, the notion of implicit reduction using reproducing kernel Hilbert spaces (RKHS) is introduced which allows, for systems with linear readouts, to achieve dimension reduction without the need to actually compute the reduced spaces introduced in the first part of the paper.

Submitted 11 August, 2021; v1 submitted 23 July, 2020; originally announced July 2020.

Comments: 31 pages

arXiv:2004.11234 [pdf, other]

doi 10.1016/j.physd.2020.132721

Memory and forecasting capacities of nonlinear recurrent networks

Authors: Lukas Gonon, Lyudmila Grigoryeva, Juan-Pablo Ortega

Abstract: The notion of memory capacity, originally introduced for echo state and linear networks with independent inputs, is generalized to nonlinear recurrent networks with stationary but dependent inputs. The presence of dependence in the inputs makes natural the introduction of the network forecasting capacity, that measures the possibility of forecasting time series values using network states. Generic bounds for memory and forecasting capacities are formulated in terms of the number of neurons of the nonlinear recurrent network and the autocovariance function or the spectral density of the input. These bounds generalize well-known estimates in the literature to a dependent inputs setup. Finally, for the particular case of linear recurrent networks with independent inputs it is proved that the memory capacity is given by the rank of the associated controllability matrix, a fact that has been for a long time assumed to be true without proof by the community.

Submitted 2 September, 2020; v1 submitted 22 April, 2020; originally announced April 2020.

Comments: 27 pages, 1 figure. To appear in Physica D
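
The last claim of the abstract above, that the memory capacity of a linear recurrent network with independent inputs equals the rank of its controllability matrix, can be illustrated with a minimal sketch. The state equation x_t = A x_{t-1} + C z_t with a scalar input, the symbols A and C, and the helper names below are assumptions made for illustration; they are not taken from the paper.

```python
from fractions import Fraction


def mat_vec(A, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]


def controllability_matrix(A, C):
    """Return the matrix with columns C, AC, ..., A^(N-1) C for an N x N matrix A."""
    cols = [list(C)]
    for _ in range(len(A) - 1):
        cols.append(mat_vec(A, cols[-1]))
    # Transpose so that each computed vector becomes a column.
    return [list(row) for row in zip(*cols)]


def rank(M):
    """Rank by exact Gaussian elimination over the rationals."""
    M = [[Fraction(x) for x in row] for row in M]
    n_rows, n_cols = len(M), len(M[0])
    r = 0
    for c in range(n_cols):
        pivot = next((i for i in range(r, n_rows) if M[i][c] != 0), None)
        if pivot is None:
            continue
        M[r], M[pivot] = M[pivot], M[r]
        for i in range(r + 1, n_rows):
            factor = M[i][c] / M[r][c]
            M[i] = [a - factor * b for a, b in zip(M[i], M[r])]
        r += 1
        if r == n_rows:
            break
    return r


# Hypothetical two-neuron examples with a shared input vector C.
C = [1, 1]
A_distinct = [[Fraction(1, 2), 0], [0, Fraction(1, 3)]]  # distinct eigenvalues
A_repeated = [[Fraction(1, 2), 0], [0, Fraction(1, 2)]]  # repeated eigenvalue

rank_distinct = rank(controllability_matrix(A_distinct, C))
rank_repeated = rank(controllability_matrix(A_repeated, C))
```

In this toy setup the first network has a full-rank controllability matrix while the second does not, so under the abstract's result the second would have a strictly smaller memory capacity even though both have two neurons.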

arXiv:2004.09075 [pdf, ps, other]

On the integration of Shapley-Scarf housing markets

Authors: Rajnish Kumar, Kriti Manocha, Josue Ortega

Abstract: We study the welfare consequences of merging Shapley--Scarf markets. Market integration can lead to large welfare losses and make the vast majority of agents worse-off, but is on average welfare-enhancing and makes all agents better off ex-ante. The number of agents harmed by integration is a minority when all markets are small or agents' preferences are highly correlated.

Submitted 6 January, 2022; v1 submitted 20 April, 2020; originally announced April 2020.

Comments: Journal of Mathematical Economics, 2022

arXiv:2002.05933 [pdf, ps, other]

Approximation Bounds for Random Neural Networks and Reservoir Systems

Authors: Lukas Gonon, Lyudmila Grigoryeva, Juan-Pablo Ortega

Abstract: This work studies approximation based on single-hidden-layer feedforward and recurrent neural networks with randomly generated internal weights. These methods, in which only the last layer of weights and a few hyperparameters are optimized, have been successfully applied in a wide range of static and dynamic learning problems. Despite the popularity of this approach in empirical tasks, important theoretical questions regarding the relation between the unknown function, the weight distribution, and the approximation rate have remained open. In this work it is proved that, as long as the unknown function, functional, or dynamical system is sufficiently regular, it is possible to draw the internal weights of the random (recurrent) neural network from a generic distribution (not depending on the unknown object) and quantify the error in terms of the number of neurons and the hyperparameters. In particular, this proves that echo state networks with randomly generated weights are capable of approximating a wide class of dynamical systems arbitrarily well and thus provides the first mathematical explanation for their empirically observed success at learning dynamical systems.

Submitted 16 February, 2021; v1 submitted 14 February, 2020; originally announced February 2020.

Comments: 48 pages

arXiv:2002.03174 [pdf, ps, other]

Fairness and Efficiency in Cake-Cutting with Single-Peaked Preferences

Authors: Bhavook Bhardwaj, Rajnish Kumar, Josue Ortega

Abstract: We study the cake-cutting problem when agents have single-peaked preferences over the cake. We show that a recently proposed mechanism by Wang-Wu (2019) to obtain envy-free allocations can yield large welfare losses. Using a simplifying assumption, we characterize all Pareto optimal allocations, which have a simple structure: they are peak-preserving and non-wasteful. Finally, we provide simple alternative mechanisms that Pareto dominate that of Wang-Wu, and which achieve envy-freeness or Pareto optimality.

Submitted 4 March, 2020; v1 submitted 8 February, 2020; originally announced February 2020.

Comments: Forthcoming in Economics Letters

arXiv:1911.00153 [pdf, other]

On hybrid precoder/combiner for downlink mmWave massive MU-MIMO systems

Authors: Alvaro Javier Ortega, Raimundo Sampaio-Neto, Rodrigo Pereira David

Abstract: We propose four hybrid combiners/precoders for downlink mmWave massive MU-MIMO systems. The design of a hybrid combiner/precoder is divided into two parts, analog and digital. The system baseband model shows that the signal processed by the mobile station can be interpreted as a received signal in the presence of colored Gaussian noise; therefore, since the digital parts of the combiner and precoder have no constraints on their generation, their designs can be based on any traditional signal processing technique that takes this kind of noise into account. To the best of our knowledge, this was not considered by previous works. A more realistic and appropriate design is described in this paper. Also, the approaches adopted in the literature for designing the analog parts of the combiner/precoder do not try to avoid or even reduce the inter-user/symbol interference; they concentrate on increasing the signal-to-noise ratio (SNR). We propose a simple solution that decreases the interference while maintaining a large SNR. In addition, one of the proposed hybrid combiners reaches the maximum value of our objective function according to Hadamard's inequality. Numerical results illustrate the BER performance improvements resulting from our proposals. In addition, a simple detection approach can be used for data estimation without significant performance loss.

Submitted 31 October, 2019; originally announced November 2019.

Comments: 28 pages, 11 figures

arXiv:1910.13886 [pdf, ps, other]

Risk bounds for reservoir computing

Authors: Lukas Gonon, Lyudmila Grigoryeva, Juan-Pablo Ortega

Abstract: We analyze the practices of reservoir computing in the framework of statistical learning theory. In particular, we derive finite sample upper bounds for the generalization error committed by specific families of reservoir computing systems when processing discrete-time inputs under various hypotheses on their dependence structure. Non-asymptotic bounds are explicitly written down in terms of the multivariate Rademacher complexities of the reservoir systems and the weak dependence structure of the signals that are being handled. This allows, in particular, to determine the minimal number of observations needed in order to guarantee a prescribed estimation accuracy with high probability for a given reservoir family. At the same time, the asymptotic behavior of the devised bounds guarantees the consistency of the empirical risk minimization procedure for various hypothesis classes of reservoir functionals.

Submitted 30 October, 2019; originally announced October 2019.

Comments: 60 pages

arXiv:1908.02988 [pdf, ps, other]

Obvious Manipulations in Cake-Cutting

Authors: Josue Ortega, Erel Segal-Halevi

Abstract: In cake-cutting, strategy-proofness is a very costly requirement in terms of fairness: for n=2 it implies a dictatorial allocation, whereas for n > 2 it requires that one agent receives no cake. We show that a weaker version of this property recently suggested by Troyan and Morrill, called non-obvious manipulability, is compatible with the strong fairness property of proportionality, which guarantees that each agent receives 1/n of the cake. Both properties are satisfied by the leftmost leaves mechanism, an adaptation of the Dubins-Spanier moving knife procedure. Most other classical proportional mechanisms in the literature are obviously manipulable, including the original moving knife mechanism. Non-obvious manipulability explains why leftmost leaves is manipulated less often in practice than other proportional mechanisms.

Submitted 14 October, 2019; v1 submitted 8 August, 2019; originally announced August 2019.

arXiv:1907.03196 [pdf, other]

Multimodal Fusion with Deep Neural Networks for Audio-Video Emotion Recognition

Authors: Juan D. S. Ortega, Mohammed Senoussaoui, Eric Granger, Marco Pedersoli, Patrick Cardinal, Alessandro L. Koerich

Abstract: This paper presents a novel deep neural network (DNN) for multimodal fusion of audio, video and text modalities for emotion recognition. The proposed DNN architecture has independent and shared layers which aim to learn the representation for each modality, as well as the best combined representation to achieve the best prediction. Experimental results on the AVEC Sentiment Analysis in the Wild dataset indicate that the proposed DNN can achieve a higher level of Concordance Correlation Coefficient (CCC) than other state-of-the-art systems that perform early fusion of modalities at feature-level (i.e., concatenation) and late fusion at score-level (i.e., weighted average). The proposed DNN has achieved CCCs of 0.606, 0.534, and 0.170 on the development partition of the dataset for predicting arousal, valence and liking, respectively.

Submitted 6 July, 2019; originally announced July 2019.

arXiv:1906.10623 [pdf, other]

Emotion Recognition Using Fusion of Audio and Video Features

Authors: Juan D. S. Ortega, Patrick Cardinal, Alessandro L. Koerich

Abstract: In this paper we propose a fusion approach to continuous emotion recognition that combines visual and auditory modalities in their representation spaces to predict the arousal and valence levels. The proposed approach employs a pre-trained convolutional neural network and transfer learning to extract features from video frames that capture the emotional content. For the auditory content, a minimalistic set of parameters such as prosodic, excitation, vocal tract, and spectral descriptors are used as features. The fusion of these two modalities is carried out at a feature level, before training a single support vector regressor (SVR), or at a prediction level, after training one SVR for each modality. The proposed approach also includes preprocessing and post-processing techniques which contribute favorably to improving the concordance correlation coefficient (CCC). Experimental results for predicting spontaneous and natural emotions on the RECOLA dataset have shown that the proposed approach takes advantage of the complementary information of visual and auditory modalities and provides CCCs of 0.749 and 0.565 for arousal and valence, respectively.

Submitted 25 June, 2019; originally announced June 2019.
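
The concordance correlation coefficient (CCC) reported in the abstract above is not defined in the listing. The sketch below assumes the standard definition due to Lin, CCC = 2 cov(x, y) / (var(x) + var(y) + (mean(x) - mean(y))^2), computed with population (1/n) moments; the function name and the toy data are hypothetical, and the paper may use a different estimator or averaging scheme.

```python
def concordance_correlation(x, y):
    """Lin's concordance correlation coefficient with population (1/n) moments."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    var_x = sum((a - mean_x) ** 2 for a in x) / n
    var_y = sum((b - mean_y) ** 2 for b in y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    return 2 * cov / (var_x + var_y + (mean_x - mean_y) ** 2)


# Toy annotations and predictions (made-up numbers, not from any dataset).
gold = [1.0, 2.0, 3.0]
perfect = concordance_correlation(gold, [1.0, 2.0, 3.0])
shifted = concordance_correlation(gold, [2.0, 3.0, 4.0])
```

Unlike the Pearson correlation, the CCC penalizes a constant offset: the shifted prediction is perfectly correlated with the gold values but scores below 1.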

arXiv:1902.06094 [pdf, ps, other]

Differentiable reservoir computing

Authors: Lyudmila Grigoryeva, Juan-Pablo Ortega

Abstract: Much effort has been devoted in the last two decades to characterize the situations in which a reservoir computing system exhibits the so-called echo state (ESP) and fading memory (FMP) properties. These important features amount, in mathematical terms, to the existence and continuity of global reservoir system solutions. That research is complemented in this paper with the characterization of the differentiability of reservoir filters for very general classes of discrete-time deterministic inputs. This constitutes a novel strong contribution to the long line of research on the ESP and the FMP and, in particular, links to existing research on the input-dependence of the ESP. Differentiability has been shown in the literature to be a key feature in the learning of attractors of chaotic dynamical systems. A Volterra-type series representation for reservoir filters with semi-infinite discrete-time inputs is constructed in the analytic case using Taylor's theorem and corresponding approximation bounds are provided. Finally, it is shown as a corollary of these results that any fading memory filter can be uniformly approximated by a finite Volterra series with finite memory.

Submitted 22 March, 2019; v1 submitted 16 February, 2019; originally announced February 2019.

Comments: 60 pages

arXiv:1810.08243 [pdf, other]

doi 10.1016/j.geb.2022.01.027

Fair Cake-Cutting in Practice

Authors: Maria Kyropoulou, Josué Ortega, Erel Segal-Halevi

Abstract: Using a lab experiment, we investigate the real-life performance of envy-free and proportional cake-cutting procedures with respect to fairness and preference manipulation. We find that envy-free procedures, in particular Selfridge-Conway, are fairer and also are perceived as fairer than their proportional counterparts, despite the fact that agents very often manipulate them. Our results support the practical use of the celebrated Selfridge-Conway procedure, and more generally, of envy-free cake-cutting mechanisms. We also find that subjects learn their opponents' preferences after repeated interaction and use this knowledge to improve their allocated share of the cake. Learning reduces truth-telling behavior, but also reduces envy.

Submitted 5 February, 2022; v1 submitted 18 October, 2018; originally announced October 2018.

Journal ref: Games and Economic Behavior, 2022, https://www.sciencedirect.com/science/article/abs/pii/S0899825622000331

arXiv:1807.02621 [pdf, other]

Reservoir Computing Universality With Stochastic Inputs

Authors: Lukas Gonon, Juan-Pablo Ortega

Abstract: The universal approximation properties with respect to $L^p$-type criteria of three important families of reservoir computers with stochastic discrete-time semi-infinite inputs are shown. First, it is proved that linear reservoir systems with either polynomial or neural network readout maps are universal. More importantly, it is proved that the same property holds for two families with linear readouts, namely, trigonometric state-affine systems and echo state networks, which are the most widely used reservoir systems in applications. The linearity in the readouts is a key feature in supervised machine learning applications. It guarantees that these systems can be used in high-dimensional situations and in the presence of large datasets. The $L^p$ criteria used in this paper allow the formulation of universality results that do not necessarily impose almost sure uniform boundedness in the inputs or the fading memory property in the filter that needs to be approximated.

Submitted 7 July, 2018; originally announced July 2018.

Comments: 11 pages

arXiv:1806.00797 [pdf, other]

Echo state networks are universal

Authors: Lyudmila Grigoryeva, Juan-Pablo Ortega

Abstract: This paper shows that echo state networks are universal uniform approximants in the context of discrete-time fading memory filters with uniformly bounded inputs defined on negative infinite times. This result guarantees that any fading memory input/output system in discrete time can be realized as a simple finite-dimensional neural network-type state-space model with a static linear readout map. This approximation is valid for infinite time intervals. The proof of this statement is based on fundamental results, also presented in this work, about the topological nature of the fading memory property and about reservoir computing systems generated by continuous reservoir maps.

Submitted 26 August, 2018; v1 submitted 3 June, 2018; originally announced June 2018.

Comments: 28 pages

arXiv:1805.02039 [pdf, ps, other]

Integration in Social Networks

Authors: Josue Ortega

Abstract: We propose the notion of $k$-integration as a measure of equality of opportunity in social networks. A social network is $k$-integrated if there is a path of length at most $k$ between any two individuals, thus guaranteeing that everybody has the same network opportunities to find a job, a romantic partner, or valuable information. We compute the minimum number of bridges (i.e. edges between nodes belonging to different components) or central nodes (those which are endpoints to a bridge) required to ensure $k$-integration. The answer depends only linearly on the size of each component for $k=2$, and does not depend on the size of each component for $k \geq 3$. Our findings provide a simple and intuitive way to compare the equality of opportunity of real-life social networks.

Submitted 24 May, 2019; v1 submitted 5 May, 2018; originally announced May 2018.

MSC Class: 91D30

Journal ref: Physica A: Statistical Mechanics and its Applications; 2019
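
The definition of $k$-integration given in the abstract above (a path of length at most $k$ between any two individuals) can be checked directly with a depth-limited breadth-first search. The following is a minimal sketch; the adjacency-dictionary representation, the function name, and the example network are assumptions for illustration and do not come from the paper.

```python
from collections import deque


def is_k_integrated(adjacency, k):
    """Return True if every pair of nodes is joined by a path of length <= k.

    `adjacency` maps each node to the list of its neighbours (undirected graph).
    """
    nodes = list(adjacency)
    for source in nodes:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            if dist[u] == k:
                continue  # do not explore beyond depth k
            for v in adjacency[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        if len(dist) < len(nodes):
            return False  # some node is farther than k from `source`
    return True


# Hypothetical network: two triangles joined by a single bridge (2, 3).
network = {
    0: [1, 2],
    1: [0, 2],
    2: [0, 1, 3],
    3: [2, 4, 5],
    4: [3, 5],
    5: [3, 4],
}
```

In this example the two central nodes 2 and 3 are the endpoints of the only bridge; nodes 0 and 4 are three steps apart, so the network is 3-integrated but not 2-integrated.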

arXiv:1712.00754 [pdf, other]

Universal discrete-time reservoir computers with stochastic inputs and linear readouts using non-homogeneous state-affine systems

Authors: Lyudmila Grigoryeva, Juan-Pablo Ortega

Abstract: A new class of non-homogeneous state-affine systems is introduced for use in reservoir computing. Sufficient conditions are identified that guarantee first, that the associated reservoir computers with linear readouts are causal, time-invariant, and satisfy the fading memory property and second, that a subset of this class is universal in the category of fading memory filters with stochastic almost surely uniformly bounded inputs. This means that any discrete-time filter that satisfies the fading memory property with random inputs of that type can be uniformly approximated by elements in the non-homogeneous state-affine family.

Submitted 26 August, 2018; v1 submitted 3 December, 2017; originally announced December 2017.

Comments: 41 pages

Showing 1–50 of 60 results for author: Ortega, J