Search | arXiv e-print repository

Flexible and Effective Mixing of Large Language Models into a Mixture of Domain Experts

Authors: Rhui Dih Lee, Laura Wynter, Raghu Kiran Ganti

Abstract: We present a toolkit for creating low-cost Mixture-of-Domain-Experts (MOE) from trained models. The toolkit can be used for creating a mixture from models or from adapters. We perform extensive tests and offer guidance on defining the architecture of the resulting MOE using the toolkit. A public repository is available. We present a toolkit for creating low-cost Mixture-of-Domain-Experts (MOE) from trained models. The toolkit can be used for creating a mixture from models or from adapters. We perform extensive tests and offer guidance on defining the architecture of the resulting MOE using the toolkit. A public repository is available. △ Less

Submitted 30 August, 2024; originally announced August 2024.

arXiv:2407.09105 [pdf, other]

Enhancing Training Efficiency Using Packing with Flash Attention

Authors: Achintya Kundu, Rhui Dih Lee, Laura Wynter, Raghu Kiran Ganti, Mayank Mishra

Abstract: Padding is often used in tuning LLM models by adding special tokens to shorter training examples to match the length of the longest sequence in each batch. While this ensures uniformity for batch processing, it introduces inefficiencies by including irrelevant padding tokens in the computation and wastes GPU resources. Hugging Face SFT trainer has always offered the option to use packing to combin… ▽ More Padding is often used in tuning LLM models by adding special tokens to shorter training examples to match the length of the longest sequence in each batch. While this ensures uniformity for batch processing, it introduces inefficiencies by including irrelevant padding tokens in the computation and wastes GPU resources. Hugging Face SFT trainer has always offered the option to use packing to combine multiple training examples, allowing for maximal utilization of GPU resources. However, up till now, it did not offer proper masking of each packed training example. This capability has now been added to Hugging Face Transformers 4.44. We analyse this new feature and show the benefits across different variations of packing. △ Less

Submitted 23 August, 2024; v1 submitted 12 July, 2024; originally announced July 2024.

arXiv:2404.01353 [pdf, other]

Efficiently Distilling LLMs for Edge Applications

Authors: Achintya Kundu, Fabian Lim, Aaron Chew, Laura Wynter, Penny Chong, Rhui Dih Lee

Abstract: Supernet training of LLMs is of great interest in industrial applications as it confers the ability to produce a palette of smaller models at constant cost, regardless of the number of models (of different size / latency) produced. We propose a new method called Multistage Low-rank Fine-tuning of Super-transformers (MLFS) for parameter-efficient supernet training. We show that it is possible to ob… ▽ More Supernet training of LLMs is of great interest in industrial applications as it confers the ability to produce a palette of smaller models at constant cost, regardless of the number of models (of different size / latency) produced. We propose a new method called Multistage Low-rank Fine-tuning of Super-transformers (MLFS) for parameter-efficient supernet training. We show that it is possible to obtain high-quality encoder models that are suitable for commercial edge applications, and that while decoder-only models are resistant to a comparable degree of compression, decoders can be effectively sliced for a significant reduction in training time. △ Less

Submitted 1 April, 2024; originally announced April 2024.

Comments: This paper has been accepted for publication in NAACL 2024 (Industry Track)

arXiv:2303.15485 [pdf, other]

Transfer-Once-For-All: AI Model Optimization for Edge

Authors: Achintya Kundu, Laura Wynter, Rhui Dih Lee, Luis Angel Bathen

Abstract: Weight-sharing neural architecture search aims to optimize a configurable neural network model (supernet) for a variety of deployment scenarios across many devices with different resource constraints. Existing approaches use evolutionary search to extract models of different sizes from a supernet trained on a very large data set, and then fine-tune the extracted models on the typically small, real… ▽ More Weight-sharing neural architecture search aims to optimize a configurable neural network model (supernet) for a variety of deployment scenarios across many devices with different resource constraints. Existing approaches use evolutionary search to extract models of different sizes from a supernet trained on a very large data set, and then fine-tune the extracted models on the typically small, real-world data set of interest. The computational cost of training thus grows linearly with the number of different model deployment scenarios. Hence, we propose Transfer-Once-For-All (TOFA) for supernet-style training on small data sets with constant computational training cost over any number of edge deployment scenarios. Given a task, TOFA obtains custom neural networks, both the topology and the weights, optimized for any number of edge deployment scenarios. To overcome the challenges arising from small data, TOFA utilizes a unified semi-supervised training loss to simultaneously train all subnets within the supernet, coupled with on-the-fly architecture selection at deployment time. △ Less

Submitted 2 July, 2023; v1 submitted 27 March, 2023; originally announced March 2023.

arXiv:2202.13436 [pdf, other]

Neural-Progressive Hedging: Enforcing Constraints in Reinforcement Learning with Stochastic Programming

Authors: Supriyo Ghosh, Laura Wynter, Shiau Hong Lim, Duc Thien Nguyen

Abstract: We propose a framework, called neural-progressive hedging (NP), that leverages stochastic programming during the online phase of executing a reinforcement learning (RL) policy. The goal is to ensure feasibility with respect to constraints and risk-based objectives such as conditional value-at-risk (CVaR) during the execution of the policy, using probabilistic models of the state transitions to gui… ▽ More We propose a framework, called neural-progressive hedging (NP), that leverages stochastic programming during the online phase of executing a reinforcement learning (RL) policy. The goal is to ensure feasibility with respect to constraints and risk-based objectives such as conditional value-at-risk (CVaR) during the execution of the policy, using probabilistic models of the state transitions to guide policy adjustments. The framework is particularly amenable to the class of sequential resource allocation problems since feasibility with respect to typical resource constraints cannot be enforced in a scalable manner. The NP framework provides an alternative that adds modest overhead during the online phase. Experimental results demonstrate the efficacy of the NP framework on two continuous real-world tasks: (i) the portfolio optimization problem with liquidity constraints for financial planning, characterized by non-stationary state distributions; and (ii) the dynamic repositioning problem in bike sharing systems, that embodies the class of supply-demand matching problems. We show that the NP framework produces policies that are better than deep RL and other baseline approaches, adapting to non-stationarity, whilst satisfying structural constraints and accommodating risk measures in the resulting policies. Additional benefits of the NP framework are ease of implementation and better explainability of the policies. △ Less

Submitted 27 February, 2022; originally announced February 2022.

arXiv:2110.07275 [pdf, other]

Order Constraints in Optimal Transport

Authors: Fabian Lim, Laura Wynter, Shiau Hong Lim

Abstract: Optimal transport is a framework for comparing measures whereby a cost is incurred for transporting one measure to another. Recent works have aimed to improve optimal transport plans through the introduction of various forms of structure. We introduce novel order constraints into the optimal transport formulation to allow for the incorporation of structure. We define an efficient method for obtain… ▽ More Optimal transport is a framework for comparing measures whereby a cost is incurred for transporting one measure to another. Recent works have aimed to improve optimal transport plans through the introduction of various forms of structure. We introduce novel order constraints into the optimal transport formulation to allow for the incorporation of structure. We define an efficient method for obtaining explainable solutions to the new formulation that scales far better than standard approaches. The theoretical properties of the method are provided. We demonstrate experimentally that order constraints improve explainability using the e-SNLI (Stanford Natural Language Inference) dataset that includes human-annotated rationales as well as on several image color transfer examples. △ Less

Submitted 28 June, 2022; v1 submitted 14 October, 2021; originally announced October 2021.

Comments: To appear in Proceedings of ICML 2022. Main Paper + Supplementary

arXiv:2102.09745 [pdf, other]

Decentralized Deterministic Multi-Agent Reinforcement Learning

Authors: Antoine Grosnit, Desmond Cai, Laura Wynter

Abstract: [Zhang, ICML 2018] provided the first decentralized actor-critic algorithm for multi-agent reinforcement learning (MARL) that offers convergence guarantees. In that work, policies are stochastic and are defined on finite action spaces. We extend those results to offer a provably-convergent decentralized actor-critic algorithm for learning deterministic policies on continuous action spaces. Determi… ▽ More [Zhang, ICML 2018] provided the first decentralized actor-critic algorithm for multi-agent reinforcement learning (MARL) that offers convergence guarantees. In that work, policies are stochastic and are defined on finite action spaces. We extend those results to offer a provably-convergent decentralized actor-critic algorithm for learning deterministic policies on continuous action spaces. Deterministic policies are important in real-world settings. To handle the lack of exploration inherent in deterministic policies, we consider both off-policy and on-policy settings. We provide the expression of a local deterministic policy gradient, decentralized deterministic actor-critic algorithms and convergence guarantees for linearly-approximated value functions. This work will help enable decentralized MARL in high-dimensional action spaces and pave the way for more widespread use of MARL. △ Less

Submitted 19 February, 2021; originally announced February 2021.

arXiv:2102.09361 [pdf, other]

Efficient Reinforcement Learning in Resource Allocation Problems Through Permutation Invariant Multi-task Learning

Authors: Desmond Cai, Shiau Hong Lim, Laura Wynter

Abstract: One of the main challenges in real-world reinforcement learning is to learn successfully from limited training samples. We show that in certain settings, the available data can be dramatically increased through a form of multi-task learning, by exploiting an invariance property in the tasks. We provide a theoretical performance bound for the gain in sample efficiency under this setting. This motiv… ▽ More One of the main challenges in real-world reinforcement learning is to learn successfully from limited training samples. We show that in certain settings, the available data can be dramatically increased through a form of multi-task learning, by exploiting an invariance property in the tasks. We provide a theoretical performance bound for the gain in sample efficiency under this setting. This motivates a new approach to multi-task learning, which involves the design of an appropriate neural network architecture and a prioritized task-sampling strategy. We demonstrate empirically the effectiveness of the proposed approach on two real-world sequential resource allocation tasks where this invariance property occurs: financial portfolio optimization and meta federated learning. △ Less

Submitted 18 February, 2021; originally announced February 2021.

arXiv:2101.06171 [pdf, other]

Probabilistic Inference for Learning from Untrusted Sources

Authors: Duc Thien Nguyen, Shiau Hoong Lim, Laura Wynter, Desmond Cai

Abstract: Federated learning brings potential benefits of faster learning, better solutions, and a greater propensity to transfer when heterogeneous data from different parties increases diversity. However, because federated learning tasks tend to be large and complex, and training times non-negligible, it is important for the aggregation algorithm to be robust to non-IID data and corrupted parties. This ro… ▽ More Federated learning brings potential benefits of faster learning, better solutions, and a greater propensity to transfer when heterogeneous data from different parties increases diversity. However, because federated learning tasks tend to be large and complex, and training times non-negligible, it is important for the aggregation algorithm to be robust to non-IID data and corrupted parties. This robustness relies on the ability to identify, and appropriately weight, incompatible parties. Recent work assumes that a \textit{reference dataset} is available through which to perform the identification. We consider settings where no such reference dataset is available; rather, the quality and suitability of the parties needs to be \textit{inferred}. We do so by bringing ideas from crowdsourced predictions and collaborative filtering, where one must infer an unknown ground truth given proposals from participants with unknown quality. We propose novel federated learning aggregation algorithms based on Bayesian inference that adapt to the quality of the parties. Empirically, we show that the algorithms outperform standard and robust aggregation in federated learning on both synthetic and real data. △ Less

Submitted 15 January, 2021; originally announced January 2021.

arXiv:2009.06303 [pdf, other]

Robustness and Personalization in Federated Learning: A Unified Approach via Regularization

Authors: Achintya Kundu, Pengqian Yu, Laura Wynter, Shiau Hong Lim

Abstract: We present a class of methods for robust, personalized federated learning, called Fed+, that unifies many federated learning algorithms. The principal advantage of this class of methods is to better accommodate the real-world characteristics found in federated training, such as the lack of IID data across parties, the need for robustness to outliers or stragglers, and the requirement to perform we… ▽ More We present a class of methods for robust, personalized federated learning, called Fed+, that unifies many federated learning algorithms. The principal advantage of this class of methods is to better accommodate the real-world characteristics found in federated training, such as the lack of IID data across parties, the need for robustness to outliers or stragglers, and the requirement to perform well on party-specific datasets. We achieve this through a problem formulation that allows the central server to employ robust ways of aggregating the local models while keeping the structure of local computation intact. Without making any statistical assumption on the degree of heterogeneity of local data across parties, we provide convergence guarantees for Fed+ for convex and non-convex loss functions under different (robust) aggregation methods. The Fed+ theory is also equipped to handle heterogeneous computing environments including stragglers without additional assumptions; specifically, the convergence results cover the general setting where the number of local update steps across parties can vary. We demonstrate the benefits of Fed+ through extensive experiments across standard benchmark datasets. △ Less

Submitted 12 July, 2022; v1 submitted 14 September, 2020; originally announced September 2020.

Comments: Accepted by IEEE EDGE 2022 (16 pages, 4 figures, 2 tables)

arXiv:2007.10987 [pdf, other]

IBM Federated Learning: an Enterprise Framework White Paper V0.1

Authors: Heiko Ludwig, Nathalie Baracaldo, Gegi Thomas, Yi Zhou, Ali Anwar, Shashank Rajamoni, Yuya Ong, Jayaram Radhakrishnan, Ashish Verma, Mathieu Sinn, Mark Purcell, Ambrish Rawat, Tran Minh, Naoise Holohan, Supriyo Chakraborty, Shalisha Whitherspoon, Dean Steuer, Laura Wynter, Hifaz Hassan, Sean Laguna, Mikhail Yurochkin, Mayank Agarwal, Ebube Chuba, Annie Abay

Abstract: Federated Learning (FL) is an approach to conduct machine learning without centralizing training data in a single place, for reasons of privacy, confidentiality or data volume. However, solving federated machine learning problems raises issues above and beyond those of centralized machine learning. These issues include setting up communication infrastructure between parties, coordinating the learn… ▽ More Federated Learning (FL) is an approach to conduct machine learning without centralizing training data in a single place, for reasons of privacy, confidentiality or data volume. However, solving federated machine learning problems raises issues above and beyond those of centralized machine learning. These issues include setting up communication infrastructure between parties, coordinating the learning process, integrating party results, understanding the characteristics of the training data sets of different participating parties, handling data heterogeneity, and operating with the absence of a verification data set. IBM Federated Learning provides infrastructure and coordination for federated learning. Data scientists can design and run federated learning jobs based on existing, centralized machine learning models and can provide high-level instructions on how to run the federation. The framework applies to both Deep Neural Networks as well as ``traditional'' approaches for the most common machine learning libraries. {\proj} enables data scientists to expand their scope from centralized to federated machine learning, minimizing the learning curve at the outset while also providing the flexibility to deploy to different compute environments and design custom fusion algorithms. △ Less

Submitted 22 July, 2020; originally announced July 2020.

Comments: 17 pages

ACM Class: I.2.6; I.2.11

arXiv:2006.00778 [pdf, other]

Variational Bayesian Inference for Crowdsourcing Predictions

Authors: Desmond Cai, Duc Thien Nguyen, Shiau Hong Lim, Laura Wynter

Abstract: Crowdsourcing has emerged as an effective means for performing a number of machine learning tasks such as annotation and labelling of images and other data sets. In most early settings of crowdsourcing, the task involved classification, that is assigning one of a discrete set of labels to each task. Recently, however, more complex tasks have been attempted including asking crowdsource workers to a… ▽ More Crowdsourcing has emerged as an effective means for performing a number of machine learning tasks such as annotation and labelling of images and other data sets. In most early settings of crowdsourcing, the task involved classification, that is assigning one of a discrete set of labels to each task. Recently, however, more complex tasks have been attempted including asking crowdsource workers to assign continuous labels, or predictions. In essence, this involves the use of crowdsourcing for function estimation. We are motivated by this problem to drive applications such as collaborative prediction, that is, harnessing the wisdom of the crowd to predict quantities more accurately. To do so, we propose a Bayesian approach aimed specifically at alleviating overfitting, a typical impediment to accurate prediction models in practice. In particular, we develop a variational Bayesian technique for two different worker noise models - one that assumes workers' noises are independent and the other that assumes workers' noises have a latent low-rank structure. Our evaluations on synthetic and real-world datasets demonstrate that these Bayesian approaches perform significantly better than existing non-Bayesian approaches and are thus potentially useful for this class of crowdsourcing problems. △ Less

Submitted 1 June, 2020; v1 submitted 1 June, 2020; originally announced June 2020.

Comments: 7 pages

arXiv:2004.01387 [pdf, other]

A Deep Ensemble Multi-Agent Reinforcement Learning Approach for Air Traffic Control

Authors: Supriyo Ghosh, Sean Laguna, Shiau Hong Lim, Laura Wynter, Hasan Poonawala

Abstract: Air traffic control is an example of a highly challenging operational problem that is readily amenable to human expertise augmentation via decision support technologies. In this paper, we propose a new intelligent decision making framework that leverages multi-agent reinforcement learning (MARL) to dynamically suggest adjustments of aircraft speeds in real-time. The goal of the system is to enhanc… ▽ More Air traffic control is an example of a highly challenging operational problem that is readily amenable to human expertise augmentation via decision support technologies. In this paper, we propose a new intelligent decision making framework that leverages multi-agent reinforcement learning (MARL) to dynamically suggest adjustments of aircraft speeds in real-time. The goal of the system is to enhance the ability of an air traffic controller to provide effective guidance to aircraft to avoid air traffic congestion, near-miss situations, and to improve arrival timeliness. We develop a novel deep ensemble MARL method that can concisely capture the complexity of the air traffic control problem by learning to efficiently arbitrate between the decisions of a local kernel-based RL model and a wider-reaching deep MARL model. The proposed method is trained and evaluated on an open-source air traffic management simulator developed by Eurocontrol. Extensive empirical results on a real-world dataset including thousands of aircraft demonstrate the feasibility of using multi-agent RL for the problem of en-route air traffic control and show that our proposed deep ensemble MARL method significantly outperforms three state-of-the-art benchmark approaches. △ Less

Submitted 3 April, 2020; originally announced April 2020.

arXiv:1906.03040 [pdf, other]

FASTER: Fusion AnalyticS for public Transport Event Response

Authors: Sebastien Blandin, Laura Wynter, Hasan Poonawala, Sean Laguna, Basile Dura

Abstract: Increasing urban concentration raises operational challenges that can benefit from integrated monitoring and decision support. Such complex systems need to leverage the full stack of analytical methods, from state estimation using multi-sensor fusion for situational awareness, to prediction and computation of optimal responses. The FASTER platform that we describe in this work, deployed at nation… ▽ More Increasing urban concentration raises operational challenges that can benefit from integrated monitoring and decision support. Such complex systems need to leverage the full stack of analytical methods, from state estimation using multi-sensor fusion for situational awareness, to prediction and computation of optimal responses. The FASTER platform that we describe in this work, deployed at nation scale and handling 1.5 billion public transport trips a year, offers such a full stack of techniques for this large-scale, real-time problem. FASTER provides fine-grained situational awareness and real-time decision support with the objective of improving the public transport commuter experience. The methods employed range from statistical machine learning to agent-based simulation and mixed-integer optimization. In this work we present an overview of the challenges and methods involved, with details of the commuter movement prediction module, as well as a discussion of open problems. △ Less

Submitted 14 May, 2019; originally announced June 2019.

arXiv:1904.10180 [pdf, other]

High-frequency crowd insights for public safety and congestion control

Authors: Karthik Nandakumar, Sebastien Blandin, Laura Wynter

Abstract: We present results from several projects aimed at enabling the real-time understanding of crowds and their behaviour in the built environment. We make use of CCTV video cameras that are ubiquitous throughout the developed and developing world and as such are able to play the role of a reliable sensing mechanism. We outline the novel methods developed for our crowd insights engine, and illustrate e… ▽ More We present results from several projects aimed at enabling the real-time understanding of crowds and their behaviour in the built environment. We make use of CCTV video cameras that are ubiquitous throughout the developed and developing world and as such are able to play the role of a reliable sensing mechanism. We outline the novel methods developed for our crowd insights engine, and illustrate examples of its use in different contexts in the urban landscape. Applications of the technology range from maintaining security in public spaces to quantifying the adequacy of public transport level of service. △ Less

Submitted 23 April, 2019; originally announced April 2019.

arXiv:1903.01045 [pdf, other]

Robust commuter movement inference from connected mobile devices

Authors: Baoyang Song, Hasan Poonawala, Laura Wynter, Sebastien Blandin

Abstract: The preponderance of connected devices provides unprecedented opportunities for fine-grained monitoring of the public infrastructure. However while classical models expect high quality application-specific data streams, the promise of the Internet of Things (IoT) is that of an abundance of disparate and noisy datasets from connected devices. In this context, we consider the problem of estimation o… ▽ More The preponderance of connected devices provides unprecedented opportunities for fine-grained monitoring of the public infrastructure. However while classical models expect high quality application-specific data streams, the promise of the Internet of Things (IoT) is that of an abundance of disparate and noisy datasets from connected devices. In this context, we consider the problem of estimation of the level of service of a city-wide public transport network. We first propose a robust unsupervised model for train movement inference from wifi traces, via the application of robust clustering methods to a one dimensional spatio-temporal setting. We then explore the extent to which the demand-supply gap can be estimated from connected devices. We propose a classification model of real-time commuter patterns, including both a batch training phase and an online learning component. We describe our deployment architecture and assess our system accuracy on a large-scale anonymized dataset comprising more than 10 billion records. △ Less

Submitted 3 March, 2019; originally announced March 2019.

Comments: International Conference on Data Mining 2018

arXiv:1902.10887 [pdf, other]

Towards Robust ResNet: A Small Step but A Giant Leap

Authors: Jingfeng Zhang, Bo Han, Laura Wynter, Kian Hsiang Low, Mohan Kankanhalli

Abstract: This paper presents a simple yet principled approach to boosting the robustness of the residual network (ResNet) that is motivated by the dynamical system perspective. Namely, a deep neural network can be interpreted using a partial differential equation, which naturally inspires us to characterize ResNet by an explicit Euler method. Our analytical studies reveal that the step factor h in the Eule… ▽ More This paper presents a simple yet principled approach to boosting the robustness of the residual network (ResNet) that is motivated by the dynamical system perspective. Namely, a deep neural network can be interpreted using a partial differential equation, which naturally inspires us to characterize ResNet by an explicit Euler method. Our analytical studies reveal that the step factor h in the Euler method is able to control the robustness of ResNet in both its training and generalization. Specifically, we prove that a small step factor h can benefit the training robustness for back-propagation; from the view of forward-propagation, a small h can aid in the robustness of the model generalization. A comprehensive empirical evaluation on both vision CIFAR-10 and text AG-NEWS datasets confirms that a small h aids both the training and generalization robustness. △ Less

Submitted 1 July, 2019; v1 submitted 27 February, 2019; originally announced February 2019.

arXiv:1812.05451 [pdf, other]

A Probabilistic Model of the Bitcoin Blockchain

Authors: Marc Jourdan, Sebastien Blandin, Laura Wynter, Pralhad Deshpande

Abstract: The Bitcoin transaction graph is a public data structure organized as transactions between addresses, each associated with a logical entity. In this work, we introduce a complete probabilistic model of the Bitcoin Blockchain. We first formulate a set of conditional dependencies induced by the Bitcoin protocol at the block level and derive a corresponding fully observed graphical model of a Bitcoin… ▽ More The Bitcoin transaction graph is a public data structure organized as transactions between addresses, each associated with a logical entity. In this work, we introduce a complete probabilistic model of the Bitcoin Blockchain. We first formulate a set of conditional dependencies induced by the Bitcoin protocol at the block level and derive a corresponding fully observed graphical model of a Bitcoin block. We then extend the model to include hidden entity attributes such as the functional category of the associated logical agent and derive asymptotic bounds on the privacy properties implied by this model. At the network level, we show evidence of complex transaction-to-transaction behavior and present a relevant discriminative model of the agent categories. Performance of both the block-based graphical model and the network-level discriminative model is evaluated on a subset of the public Bitcoin Blockchain. △ Less

Submitted 6 November, 2018; originally announced December 2018.

arXiv:1810.11956 [pdf, other]

Characterizing Entities in the Bitcoin Blockchain

Authors: Marc Jourdan, Sebastien Blandin, Laura Wynter, Pralhad Deshpande

Abstract: Bitcoin has created a new exchange paradigm within which financial transactions can be trusted without an intermediary. This premise of a free decentralized transactional network however requires, in its current implementation, unrestricted access to the ledger for peer-based transaction verification. A number of studies have shown that, in this pseudonymous context, identities can be leaked based… ▽ More Bitcoin has created a new exchange paradigm within which financial transactions can be trusted without an intermediary. This premise of a free decentralized transactional network however requires, in its current implementation, unrestricted access to the ledger for peer-based transaction verification. A number of studies have shown that, in this pseudonymous context, identities can be leaked based on transaction features or off-network information. In this work, we analyze the information revealed by the pattern of transactions in the neighborhood of a given entity transaction. By definition, these features which pertain to an extended network are not directly controllable by the entity, but might enable leakage of information about transacting entities. We define a number of new features relevant to entity characterization on the Bitcoin Blockchain and study their efficacy in practice. We show that even a weak attacker with shallow data mining knowledge is able to leverage these features to characterize the entity properties. △ Less

Submitted 29 October, 2018; originally announced October 2018.

arXiv:1809.10315 [pdf, other]

Smooth Inter-layer Propagation of Stabilized Neural Networks for Classification

Authors: Jingfeng Zhang, Laura Wynter

Abstract: Recent work has studied the reasons for the remarkable performance of deep neural networks in image classification. We examine batch normalization on the one hand and the dynamical systems view of residual networks on the other hand. Our goal is in understanding the notions of stability and smoothness of the inter-layer propagation of ResNets so as to explain when they contribute to significantly… ▽ More Recent work has studied the reasons for the remarkable performance of deep neural networks in image classification. We examine batch normalization on the one hand and the dynamical systems view of residual networks on the other hand. Our goal is in understanding the notions of stability and smoothness of the inter-layer propagation of ResNets so as to explain when they contribute to significantly enhanced performance. We postulate that such stability is of importance for the trained ResNet to transfer. △ Less

Submitted 28 September, 2018; v1 submitted 26 September, 2018; originally announced September 2018.

Comments: Revised Abstract

arXiv:1703.00759 [pdf, other]

Real-time public transport service-level monitoring using passive WiFi: a spectral clustering approach for train timetable estimation

Authors: Baoyang Song, Laura Wynter

Abstract: A new area in which passive WiFi analytics have promise for delivering value is the real-time monitoring of public transport systems. One example is determining the true (as opposed to the published) timetable of a public transport system in real-time. In most cases, there are no other publicly-available sources for this information. Yet, it is indispensable for the real-time monitoring of public… ▽ More A new area in which passive WiFi analytics have promise for delivering value is the real-time monitoring of public transport systems. One example is determining the true (as opposed to the published) timetable of a public transport system in real-time. In most cases, there are no other publicly-available sources for this information. Yet, it is indispensable for the real-time monitoring of public transport service levels. Furthermore, this information, if accurate and temporally fine-grained, can be used for very low-latency incident detection. In this work, we propose using spectral clustering based on trajectories derived from passive WiFi traces of users of a public transport system to infer the true timetable and two key performance indicators of the transport service, namely public transport vehicle headway and in-station dwell time. By detecting anomalous dwell times or headways, we demonstrate that a fast and accurate real-time incident-detection procedure can be obtained. The method we introduce makes use of the advantages of the high-frequency WiFi data, which provides very low-latency, universally-accessible information, while minimizing the impact of the noise in the data. △ Less

Submitted 2 March, 2017; originally announced March 2017.

Showing 1–21 of 21 results for author: Wynter, L