Search | arXiv e-print repository

Rapid Deployment of DNNs for Edge Computing via Structured Pruning at Initialization

Authors: Bailey J. Eccles, Leon Wong, Blesson Varghese

Abstract: Edge machine learning (ML) enables localized processing of data on devices and is underpinned by deep neural networks (DNNs). However, DNNs cannot be easily run on devices due to their substantial computing, memory and energy requirements for delivering performance that is comparable to cloud-based ML. Therefore, model compression techniques, such as pruning, have been considered. Existing pruning methods are problematic for edge ML since they: (1) Create compressed models that have limited runtime performance benefits (using unstructured pruning) or compromise the final model accuracy (using structured pruning), and (2) Require substantial compute resources and time for identifying a suitable compressed DNN model (using neural architecture search). In this paper, we explore a new avenue, referred to as Pruning-at-Initialization (PaI), using structured pruning to mitigate the above problems. We develop Reconvene, a system for rapidly generating pruned models suited for edge deployments using structured PaI. Reconvene systematically identifies and prunes DNN convolution layers that are least sensitive to structured pruning. Reconvene rapidly creates pruned DNNs within seconds that are up to 16.21x smaller and 2x faster while maintaining the same accuracy as an unstructured PaI counterpart.

Submitted 22 April, 2024; originally announced April 2024.

Comments: The 24th IEEE/ACM International Symposium on Cluster, Cloud and Internet Computing

arXiv:2404.03687 [pdf, other]

DRIVE: Dual Gradient-Based Rapid Iterative Pruning

Authors: Dhananjay Saikumar, Blesson Varghese

Abstract: Modern deep neural networks (DNNs) consist of millions of parameters, necessitating high-performance computing during training and inference. Pruning is one solution that significantly reduces the space and time complexities of DNNs. Traditional pruning methods that are applied post-training focus on streamlining inference, but there are recent efforts to leverage sparsity early on by pruning before training. Pruning methods, such as iterative magnitude-based pruning (IMP), achieve up to a 90% parameter reduction while retaining accuracy comparable to the original model. However, this leads to impractical runtime as it relies on multiple train-prune-reset cycles to identify and eliminate redundant parameters. In contrast, training-agnostic early pruning methods, such as SNIP and SynFlow, offer fast pruning but fall short of the accuracy achieved by IMP at high sparsities. To bridge this gap, we present Dual Gradient-Based Rapid Iterative Pruning (DRIVE), which leverages dense training for initial epochs to counteract the randomness inherent at the initialization. Subsequently, it employs a unique dual gradient-based metric for parameter ranking. It has been experimentally demonstrated for VGG and ResNet architectures on CIFAR-10/100 and Tiny ImageNet, and ResNet on ImageNet that DRIVE consistently has superior performance over other training-agnostic early pruning methods in accuracy. Notably, DRIVE is 43$\times$ to 869$\times$ faster than IMP for pruning.

Submitted 1 April, 2024; originally announced April 2024.
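
The magnitude-based ranking that the DRIVE abstract attributes to IMP can be illustrated with a minimal sketch. This is not DRIVE's dual gradient-based metric, which the abstract does not specify; it shows only a single magnitude-pruning step of the kind an IMP-style train-prune-reset cycle would repeat. The function name `magnitude_prune_mask` and the example weights are assumptions made for illustration.

```python
def magnitude_prune_mask(weights, sparsity):
    """Return a 0/1 mask that drops the `sparsity` fraction of
    smallest-magnitude weights (one pruning step, no retraining)."""
    if not 0.0 <= sparsity < 1.0:
        raise ValueError("sparsity must be in [0, 1)")
    n_prune = int(len(weights) * sparsity)
    # Rank parameter indices from smallest to largest magnitude.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = set(order[:n_prune])
    return [0 if i in pruned else 1 for i in range(len(weights))]


# Hypothetical flattened weight vector, for illustration only.
weights = [0.5, -0.05, 0.3, 0.01, -0.9, 0.2, -0.02, 0.7, 0.1, -0.4]
mask = magnitude_prune_mask(weights, sparsity=0.5)
sparse_weights = [w * m for w, m in zip(weights, mask)]
```

In an iterative scheme this step would alternate with training and weight resetting, which is the source of the runtime cost the abstract describes.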

arXiv:2402.14139 [pdf, other]

NeuroFlux: Memory-Efficient CNN Training Using Adaptive Local Learning

Authors: Dhananjay Saikumar, Blesson Varghese

Abstract: Efficient on-device Convolutional Neural Network (CNN) training in resource-constrained mobile and edge environments is an open challenge. Backpropagation is the standard approach adopted, but it is GPU memory intensive due to its strong inter-layer dependencies that demand intermediate activations across the entire CNN model to be retained in GPU memory. This necessitates smaller batch sizes to make training possible within the available GPU memory budget, but in turn, results in substantially high and impractical training time. We introduce NeuroFlux, a novel CNN training system tailored for memory-constrained scenarios. We develop two novel opportunities: firstly, adaptive auxiliary networks that employ a variable number of filters to reduce GPU memory usage, and secondly, block-specific adaptive batch sizes, which not only cater to the GPU memory constraints but also accelerate the training process. NeuroFlux segments a CNN into blocks based on GPU memory usage and further attaches an auxiliary network to each layer in these blocks. This disrupts the typical layer dependencies under a new training paradigm - $\textit{`adaptive local learning'}$. Moreover, NeuroFlux adeptly caches intermediate activations, eliminating redundant forward passes over previously trained blocks, further accelerating the training process.
The results are twofold when compared to Backpropagation: on various hardware platforms, NeuroFlux demonstrates training speed-ups of 2.3$\times$ to 6.1$\times$ under stringent GPU memory budgets, and NeuroFlux generates streamlined models that have 10.9$\times$ to 29.4$\times$ fewer parameters.

Submitted 4 March, 2024; v1 submitted 21 February, 2024; originally announced February 2024.

Comments: Accepted to EuroSys 2024

arXiv:2312.09626 [pdf]

Exploring Gender Disparities in Bumble's Match Recommendations

Authors: Ritvik Aryan Kalra, Pratham Gupta, Ben Varghese, Nimmi Rangaswamy

Abstract: We study bias and discrimination in the context of Bumble, an online dating platform in India. Drawing on research in AI fairness and inclusion studies we analyze algorithmic bias and their propensity to reproduce bias. We conducted an experiment to identify and address the presence of bias in the matching algorithms Bumble pushes to its users in the form of profiles for potential dates in the real world. Dating apps like Bumble utilize algorithms that learn from user data to make recommendations. Even if the algorithm does not have intentions or consciousness, it is a system created and maintained by humans. We attribute moral agency of such systems to be compositely derived from algorithmic mediations, the design and utilization of these platforms. Developers, designers, and operators of dating platforms thus have a moral obligation to mitigate biases in the algorithms to create inclusive platforms that affirm diverse social identities.

Submitted 15 December, 2023; originally announced December 2023.

arXiv:2309.06973 [pdf, ps, other]

DNNShifter: An Efficient DNN Pruning System for Edge Computing

Authors: Bailey J. Eccles, Philip Rodgers, Peter Kilpatrick, Ivor Spence, Blesson Varghese

Abstract: Deep neural networks (DNNs) underpin many machine learning applications. Production quality DNN models achieve high inference accuracy by training millions of DNN parameters which has a significant resource footprint. This presents a challenge for resources operating at the extreme edge of the network, such as mobile and embedded devices that have limited computational and memory resources. To address this, models are pruned to create lightweight, more suitable variants for these devices. Existing pruning methods are unable to provide similar quality models compared to their unpruned counterparts without significant time costs and overheads or are limited to offline use cases. Our work rapidly derives suitable model variants while maintaining the accuracy of the original model. The model variants can be swapped quickly when system and network conditions change to match workload demand. This paper presents DNNShifter, an end-to-end DNN training, spatial pruning, and model switching system that addresses the challenges mentioned above. At the heart of DNNShifter is a novel methodology that prunes sparse models using structured pruning. The pruned model variants generated by DNNShifter are smaller in size and thus faster than dense and sparse model predecessors, making them suitable for inference at the edge while retaining accuracy close to that of the original dense model. DNNShifter generates a portfolio of model variants that can be swiftly interchanged depending on operational conditions.
DNNShifter produces pruned model variants up to 93x faster than conventional training methods. Compared to sparse models, the pruned model variants are up to 5.14x smaller and have a 1.67x inference latency speedup, with no compromise to sparse model accuracy. In addition, DNNShifter has up to 11.9x lower overhead for switching models and up to 3.8x lower memory utilisation than existing approaches.

Submitted 13 September, 2023; originally announced September 2023.

Comments: 14 pages, 7 figures, 5 tables

MSC Class: 68T07; ACM Class: I.2.1

arXiv:2304.05495 [pdf, other]

EcoFed: Efficient Communication for DNN Partitioning-based Federated Learning

Authors: Di Wu, Rehmat Ullah, Philip Rodgers, Peter Kilpatrick, Ivor Spence, Blesson Varghese

Abstract: Efficiently running federated learning (FL) on resource-constrained devices is challenging since they are required to train computationally intensive deep neural networks (DNN) independently. DNN partitioning-based FL (DPFL) has been proposed as one mechanism to accelerate training where the layers of a DNN (or computation) are offloaded from the device to the server. However, this creates significant communication overheads since the intermediate activation and gradient need to be transferred between the device and the server during training. While current research reduces the communication introduced by DNN partitioning using local loss-based methods, we demonstrate that these methods are ineffective in improving the overall efficiency (communication overhead and training speed) of a DPFL system. This is because they suffer from accuracy degradation and ignore the communication costs incurred when transferring the activation from the device to the server. This article proposes EcoFed - a communication efficient framework for DPFL systems. EcoFed eliminates the transmission of the gradient by developing pre-trained initialization of the DNN model on the device for the first time. This reduces the accuracy degradation seen in local loss-based methods. In addition, EcoFed proposes a novel replay buffer mechanism and implements a quantization-based compression technique to reduce the transmission of the activation.
It is experimentally demonstrated that EcoFed can reduce the communication cost by up to 133x and accelerate training by up to 21x when compared to classic FL. Compared to vanilla DPFL, EcoFed achieves a 16x communication reduction and 2.86x training time speed-up. EcoFed is available from https://github.com/blessonvar/EcoFed.

Submitted 3 January, 2024; v1 submitted 11 April, 2023; originally announced April 2023.
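
The EcoFed abstract mentions quantization-based compression of the activation but gives no details of the scheme. As a rough illustration of the general idea only, the sketch below applies plain uniform (affine) 8-bit quantization to a list of activation values. The function names, the bit width, and the sample values are assumptions and are not taken from EcoFed.

```python
def quantize(values, bits=8):
    """Map floats to integer codes in [0, 2**bits - 1] using a uniform grid
    between the minimum and maximum of `values` (must be non-empty)."""
    lo, hi = min(values), max(values)
    levels = (1 << bits) - 1
    # Fall back to a unit step when all values are identical.
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = [round((v - lo) / scale) for v in values]
    return codes, lo, scale


def dequantize(codes, lo, scale):
    """Approximately reconstruct the original floats from integer codes."""
    return [lo + c * scale for c in codes]


# Hypothetical activation values, for illustration only.
activations = [-1.0, 0.0, 0.5, 1.0]
codes, lo, scale = quantize(activations)
restored = dequantize(codes, lo, scale)
```

Sending one byte per value plus the two floats `lo` and `scale` in place of 32-bit floats would cut the payload roughly fourfold, at the cost of a reconstruction error of at most half a quantization step per value.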

arXiv:2302.12803 [pdf, other]

PiPar: Pipeline Parallelism for Collaborative Machine Learning

Authors: Zihan Zhang, Philip Rodgers, Peter Kilpatrick, Ivor Spence, Blesson Varghese

Abstract: Collaborative machine learning (CML) techniques, such as federated learning, have been proposed to train deep learning models across multiple mobile devices and a server. CML techniques are privacy-preserving as a local model that is trained on each device instead of the raw data from the device is shared with the server. However, CML training is inefficient due to low resource utilization. We identify idling resources on the server and devices due to sequential computation and communication as the principal cause of low resource utilization. A novel framework PiPar that leverages pipeline parallelism for CML techniques is developed to substantially improve resource utilization. A new training pipeline is designed to parallelize the computations on different hardware resources and communication on different bandwidth resources, thereby accelerating the training process in CML. A low overhead automated parameter selection method is proposed to optimize the pipeline, maximizing the utilization of available resources. The experimental results confirm the validity of the underlying approach of PiPar and highlight that when compared to federated learning: (i) the idle time of the server can be reduced by up to 64.1x, and (ii) the overall training time can be accelerated by up to 34.6x under varying network conditions for a collection of six small and large popular deep neural networks and four datasets without sacrificing accuracy.
It is also experimentally demonstrated that PiPar achieves performance benefits when incorporating differential privacy methods and operating in environments with heterogeneous devices and changing bandwidths.

Submitted 25 June, 2024; v1 submitted 1 December, 2022; originally announced February 2023.

arXiv:2212.04645 [pdf, other]

doi 10.1016/j.iot.2022.100674

AI-based Fog and Edge Computing: A Systematic Review, Taxonomy and Future Directions

Authors: Sundas Iftikhar, Sukhpal Singh Gill, Chenghao Song, Minxian Xu, Mohammad Sadegh Aslanpour, Adel N. Toosi, Junhui Du, Huaming Wu, Shreya Ghosh, Deepraj Chowdhury, Muhammed Golec, Mohit Kumar, Ahmed M. Abdelmoniem, Felix Cuadrado, Blesson Varghese, Omer Rana, Schahram Dustdar, Steve Uhlig

Abstract: Resource management in computing is a very challenging problem that involves making sequential decisions. Resource limitations, resource heterogeneity, the dynamic and diverse nature of workloads, and the unpredictability of fog/edge computing environments have made resource management even more challenging in the fog landscape. Recently, Artificial Intelligence (AI) and Machine Learning (ML) based solutions have been adopted to solve this problem. AI/ML methods with the capability to make sequential decisions, like reinforcement learning, seem most promising for these types of problems. But these algorithms come with their own challenges such as high variance, explainability, and online training. The continuously changing fog/edge environment dynamics require solutions that learn online, adapting to the changing computing environment. In this paper, we used standard review methodology to conduct this Systematic Literature Review (SLR) to analyze the role of AI/ML algorithms and the challenges in the applicability of these algorithms for resource management in fog/edge computing environments. Further, various machine learning, deep learning and reinforcement learning techniques for edge AI management have been discussed. Furthermore, we have presented the background and current status of AI/ML-based Fog/Edge Computing. Moreover, a taxonomy of AI/ML-based resource management techniques for fog/edge computing has been proposed, and the existing techniques have been compared based on the proposed taxonomy.
Finally, open challenges and promising future research directions have been identified and discussed in the area of AI/ML-based fog/edge computing.

Submitted 8 December, 2022; originally announced December 2022.

Comments: 49 pages, 15 figures, 10 tables

Journal ref: Preprint for Publication in Elsevier IoT Journal 2022

arXiv:2210.16083 [pdf, other]

doi 10.1109/WACV56688.2023.00634

ROMA: Run-Time Object Detection To Maximize Real-Time Accuracy

Authors: JunKyu Lee, Blesson Varghese, Hans Vandierendonck

Abstract: This paper analyzes the effects of dynamically varying video contents and detection latency on the real-time detection accuracy of a detector and proposes a new run-time accuracy variation model, ROMA, based on the findings from the analysis. ROMA is designed to select an optimal detector out of a set of detectors in real time without label information to maximize real-time object detection accuracy. ROMA utilizing four YOLOv4 detectors on an NVIDIA Jetson Nano shows real-time accuracy improvements by 4 to 37% for a scenario of dynamically varying video contents and detection latency consisting of MOT17Det and MOT20Det datasets, compared to individual YOLOv4 detectors and two state-of-the-art runtime techniques.

Submitted 28 October, 2022; originally announced October 2022.

Comments: Accepted at the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2023

arXiv:2209.02052 [pdf, other]

RX-ADS: Interpretable Anomaly Detection using Adversarial ML for Electric Vehicle CAN data

Authors: Chathurika S. Wickramasinghe, Daniel L. Marino, Harindra S. Mavikumbure, Victor Cobilean, Timothy D. Pennington, Benny J. Varghese, Craig Rieger, Milos Manic

Abstract: Recent years have brought considerable advancements in Electric Vehicles (EVs) and associated infrastructures/communications. Intrusion Detection Systems (IDS) are widely deployed for anomaly detection in such critical infrastructures. This paper presents an Interpretable Anomaly Detection System (RX-ADS) for intrusion detection in CAN protocol communication in EVs. Contributions include: 1) window based feature extraction method; 2) deep Autoencoder based anomaly detection method; and 3) adversarial machine learning based explanation generation methodology. The presented approach was tested on two benchmark CAN datasets: OTIDS and Car Hacking. The anomaly detection performance of RX-ADS was compared against the state-of-the-art approaches on these datasets: HIDS and GIDS. The RX-ADS approach presented performance comparable to the HIDS approach (OTIDS dataset) and has outperformed HIDS and GIDS approaches (Car Hacking dataset). Further, the proposed approach was able to generate explanations for detected abnormal behaviors arising from various intrusions. These explanations were later validated by information used by domain experts to detect anomalies. Other advantages of RX-ADS include: 1) the method can be trained on unlabeled data; 2) explanations help experts in understanding anomalies and root cause analysis, and also help with AI model debugging and diagnostics, ultimately improving user trust in AI systems.

Submitted 5 September, 2022; originally announced September 2022.

arXiv:2208.08764 [pdf, other]

FedComm: Understanding Communication Protocols for Edge-based Federated Learning

Authors: Gary Cleland, Di Wu, Rehmat Ullah, Blesson Varghese

Abstract: Federated learning (FL) trains machine learning (ML) models on devices using locally generated data and exchanges models without transferring raw data to a distant server. This exchange incurs a communication overhead and impacts the performance of FL training. There is limited understanding of how communication protocols specifically contribute to the performance of FL. Such an understanding is essential for selecting the right communication protocol when designing an FL system. This paper presents FedComm, a benchmarking methodology to quantify the impact of optimized application layer protocols, namely Message Queue Telemetry Transport (MQTT), Advanced Message Queuing Protocol (AMQP), and ZeroMQ Message Transport Protocol (ZMTP), and non-optimized application layer protocols, namely TCP and UDP, on the performance of FL. FedComm measures the overall performance of FL in terms of communication time and accuracy under varying computational and network stress and packet loss rates. Experiments on a lab-based testbed demonstrate that TCP outperforms UDP as a non-optimized application layer protocol with higher accuracy and shorter communication times for 4G and Wi-Fi networks. Optimized application layer protocols such as AMQP, MQTT, and ZMTP outperformed non-optimized application layer protocols in most network conditions, resulting in a 2.5x reduction in communication time compared to TCP while maintaining accuracy. The experimental results enable us to highlight a number of open research issues for further investigation.
FedComm is available for download from https://github.com/qub-blesson/FedComm.

Submitted 18 August, 2022; originally announced August 2022.

arXiv:2206.05267 [pdf, other]

CONTINUER: Maintaining Distributed DNN Services During Edge Failures

Authors: Ayesha Abdul Majeed, Peter Kilpatrick, Ivor Spence, Blesson Varghese

Abstract: Partitioning and deploying Deep Neural Networks (DNNs) across edge nodes may be used to meet performance objectives of applications. However, the failure of a single node may result in cascading failures that will adversely impact the delivery of the service and will result in failure to meet specific objectives. The impact of these failures needs to be minimised at runtime. Three techniques are explored in this paper, namely repartitioning, early-exit and skip-connection. When an edge node fails, the repartitioning technique will repartition and redeploy the DNN thus avoiding the failed nodes. The early-exit technique makes provision for a request to exit (early) before the failed node. The skip connection technique dynamically routes the request by skipping the failed nodes. This paper will leverage trade-offs in accuracy, end-to-end latency and downtime for selecting the best technique given user-defined objectives (accuracy, latency and downtime thresholds) when an edge node fails. To this end, CONTINUER is developed. Two key activities of the framework are estimating the accuracy and latency when using the techniques for distributed DNNs and selecting the best technique. It is demonstrated on a lab-based experimental testbed that CONTINUER estimates accuracy and latency when using the techniques with no more than an average error of 0.28% and 13.06%, respectively and selects the suitable technique with a low overhead of no more than 16.82 milliseconds and an accuracy of up to 99.86%.

Submitted 25 April, 2022; originally announced June 2022.

Comments: 10 pages
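
The threshold-based selection described in the CONTINUER abstract can be sketched as a simple filter-then-choose step. The abstract does not state how the framework ranks techniques that satisfy all thresholds, so the tie-breaking rule here (highest estimated accuracy) is an assumption, as are the `Estimate` type, the function name, and every number in the example.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    technique: str
    accuracy: float     # estimated accuracy, percent
    latency_ms: float   # estimated end-to-end latency
    downtime_ms: float  # estimated downtime while recovering


def select_technique(estimates, min_accuracy, max_latency_ms, max_downtime_ms):
    """Return the name of a technique that meets all user-defined thresholds,
    or None if none does. Assumed rule: prefer the highest accuracy."""
    feasible = [
        e for e in estimates
        if e.accuracy >= min_accuracy
        and e.latency_ms <= max_latency_ms
        and e.downtime_ms <= max_downtime_ms
    ]
    if not feasible:
        return None
    return max(feasible, key=lambda e: e.accuracy).technique


# Made-up estimates for the three techniques named in the abstract.
estimates = [
    Estimate("repartitioning", accuracy=92.0, latency_ms=80.0, downtime_ms=500.0),
    Estimate("early-exit", accuracy=85.0, latency_ms=40.0, downtime_ms=5.0),
    Estimate("skip-connection", accuracy=89.0, latency_ms=60.0, downtime_ms=5.0),
]
choice = select_technique(estimates, min_accuracy=88.0,
                          max_latency_ms=100.0, max_downtime_ms=50.0)
```

With these illustrative numbers, repartitioning is excluded by the downtime threshold and early-exit by the accuracy threshold, leaving skip-connection.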

arXiv:2112.00616 [pdf, other]

Roadmap for Edge AI: A Dagstuhl Perspective

Authors: Aaron Yi Ding, Ella Peltonen, Tobias Meuser, Atakan Aral, Christian Becker, Schahram Dustdar, Thomas Hiessl, Dieter Kranzlmuller, Madhusanka Liyanage, Setareh Magshudi, Nitinder Mohan, Joerg Ott, Jan S. Rellermeyer, Stefan Schulte, Henning Schulzrinne, Gurkan Solmaz, Sasu Tarkoma, Blesson Varghese, Lars Wolf

Abstract: Based on the collective input of Dagstuhl Seminar (21342), this paper presents a comprehensive discussion on AI methods and capabilities in the context of edge computing, referred to as Edge AI. In a nutshell, we envision Edge AI to provide adaptation for data-driven applications, enhance network and radio access, and allow the creation, optimization, and deployment of distributed AI/ML pipelines with given quality of experience, trust, security and privacy targets. The Edge AI community investigates novel ML methods for the edge computing environment, spanning multiple sub-fields of computer science, engineering and ICT. The goal is to share an envisioned roadmap that can bring together key actors and enablers to further advance the domain of Edge AI.

Submitted 27 November, 2021; originally announced December 2021.

Comments: for ACM SIGCOMM CCR

ACM Class: I.2.11

arXiv:2111.05190 [pdf, other]

QUDOS: Quorum-Based Cloud-Edge Distributed DNNs for Security Enhanced Industry 4.0

Authors: Kevin Wallis, Christoph Reich, Blesson Varghese, Christian Schindelhauer

Abstract: Distributed machine learning algorithms that employ Deep Neural Networks (DNNs) are widely used in Industry 4.0 applications, such as smart manufacturing. The layers of a DNN can be mapped onto different nodes located in the cloud, edge and shop floor for preserving privacy. The quality of the data that is fed into and processed through the DNN is of utmost importance for critical tasks, such as inspection and quality control. Distributed Data Validation Networks (DDVNs) are used to validate the quality of the data. However, they are prone to single points of failure when an attack occurs. This paper proposes QUDOS, an approach that enhances the security of a distributed DNN that is supported by DDVNs using quorums. The proposed approach allows individual nodes that are corrupted due to an attack to be detected or excluded when the DNN produces an output. Metrics such as corruption factor and success probability of an attack are considered for evaluating the security aspects of DNNs. A simulation study demonstrates that if the number of corrupted nodes is less than a given threshold for decision-making in a quorum, the QUDOS approach always prevents attacks. Furthermore, the study shows that increasing the size of the quorum has a better impact on security than increasing the number of layers. One merit of QUDOS is that it enhances the security of DNNs without requiring any modifications to the algorithm and can therefore be applied to other classes of problems.

Submitted 9 November, 2021; originally announced November 2021.
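
The quorum idea in the QUDOS abstract (a decision stands only when enough nodes agree, so a small number of corrupted nodes cannot force an output) can be illustrated with a generic threshold vote. This is a simplified sketch and not the QUDOS protocol itself; the function name, the vote values, and the threshold are hypothetical.

```python
from collections import Counter


def quorum_decide(votes, threshold):
    """Return the most common vote if at least `threshold` nodes agree on it,
    otherwise None (no decision is accepted)."""
    if not votes:
        return None
    value, count = Counter(votes).most_common(1)[0]
    return value if count >= threshold else None


# Five validating nodes, two of them corrupted (illustrative values only).
votes = ["valid", "valid", "valid", "forged-a", "forged-b"]
decision = quorum_decide(votes, threshold=3)
```

In this toy setting, fewer than three corrupted nodes can never assemble the three matching votes needed to push a forged value through, which mirrors the threshold condition stated in the abstract.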

arXiv:2111.01516 [pdf, other]

FedFly: Towards Migration in Edge-based Distributed Federated Learning

Authors: Rehmat Ullah, Di Wu, Paul Harvey, Peter Kilpatrick, Ivor Spence, Blesson Varghese

Abstract: Federated learning (FL) is a privacy-preserving distributed machine learning technique that trains models while keeping all the original data generated on devices locally. Since devices may be resource constrained, offloading can be used to improve FL performance by transferring computational workload from devices to edge servers. However, due to mobility, devices participating in FL may leave the network during training and need to connect to a different edge server. This is challenging because the offloaded computations from the edge server need to be migrated. In line with this assertion, we present FedFly, which is, to the best of our knowledge, the first work to migrate a deep neural network (DNN) when devices move between edge servers during FL training. Our empirical results on the CIFAR10 dataset, with both balanced and imbalanced data distribution, support our claims that FedFly can reduce training time by up to 33% when a device moves after 50% of the training is completed, and by up to 45% when 90% of the training is completed when compared to a state-of-the-art offloading approach in FL. FedFly has negligible overhead of up to two seconds and does not compromise accuracy. Finally, we highlight a number of open research issues for further investigation. FedFly can be downloaded from https://github.com/qub-blesson/FedFly.

Submitted 14 July, 2022; v1 submitted 2 November, 2021; originally announced November 2021.

Comments: 7 pages, 6 figures

arXiv:2107.04271 [pdf, other]

FedAdapt: Adaptive Offloading for IoT Devices in Federated Learning

Authors: Di Wu, Rehmat Ullah, Paul Harvey, Peter Kilpatrick, Ivor Spence, Blesson Varghese

Abstract: Applying Federated Learning (FL) on Internet-of-Things devices is necessitated by the large volumes of data they produce and growing concerns of data privacy. However, there are three challenges that need to be addressed to make FL efficient: (i) execution on devices with limited computational capabilities, (ii) accounting for stragglers due to computational heterogeneity of devices, and (iii) adaptation to the changing network bandwidths. This paper presents FedAdapt, an adaptive offloading FL framework to mitigate the aforementioned challenges. FedAdapt accelerates local training in computationally constrained devices by leveraging layer offloading of deep neural networks (DNNs) to servers. Further, FedAdapt adopts reinforcement learning based optimization and clustering to adaptively identify which layers of the DNN should be offloaded for each individual device on to a server to tackle the challenges of computational heterogeneity and changing network bandwidth. Experimental studies are carried out on a lab-based testbed and it is demonstrated that by offloading a DNN from the device to the server FedAdapt reduces the training time of a typical IoT device by over half compared to classic FL. The training time of extreme stragglers and the overall training time can be reduced by up to 57%. Furthermore, with changing network bandwidth, FedAdapt is demonstrated to reduce the training time by up to 40% when compared to classic FL, without sacrificing accuracy.

Submitted 18 May, 2022; v1 submitted 9 July, 2021; originally announced July 2021.

Comments: 13 pages

arXiv:2106.15689 [pdf, other]

NEUKONFIG: Reducing Edge Service Downtime When Repartitioning DNNs

Authors: Ayesha Abdul Majeed, Peter Kilpatrick, Ivor Spence, Blesson Varghese

Abstract: Deep Neural Networks (DNNs) may be partitioned across the edge and the cloud to improve the performance efficiency of inference. DNN partitions are determined based on operational conditions such as network speed. When operational conditions change, DNNs will need to be repartitioned to maintain the overall performance. However, repartitioning using existing approaches, such as Pause and Resume, will incur a service downtime on the edge. This paper presents the NEUKONFIG framework that identifies the service downtime incurred when repartitioning DNNs and proposes approaches for reducing edge service downtime. The proposed approaches are based on 'Dynamic Switching' in which, when the network speed changes and given an existing edge-cloud pipeline, a new edge-cloud pipeline is initialised with new DNN partitions. Incoming inference requests are switched to the new pipeline for processing data. Two dynamic switching scenarios are considered: when a second edge-cloud pipeline is always running and when a second pipeline is only initialised when the network speed changes. Experimental studies are carried out on a lab-based testbed to demonstrate that Dynamic Switching reduces the downtime by at least an order of magnitude when compared to a baseline using Pause and Resume that has a downtime of 6 seconds. A trade-off in the edge service downtime and memory required is noted. The Dynamic Switching approach that requires the same amount of memory as the baseline reduces the edge service downtime to 0.6 seconds and to less than 1 millisecond in the best case when twice the amount of memory as the baseline is available.

Submitted 29 June, 2021; originally announced June 2021.

Comments: 10 pages

arXiv:2106.12224 [pdf, other]

Revisiting the Arguments for Edge Computing Research

Authors: Blesson Varghese, Eyal de Lara, Aaron Ding, Cheol-Ho Hong, Flavio Bonomi, Schahram Dustdar, Paul Harvey, Peter Hewkin, Weisong Shi, Mark Thiele, Peter Willis

Abstract: This article argues that low latency, high bandwidth, device proliferation, sustainable digital infrastructure, and data privacy and sovereignty continue to motivate the need for edge computing research even though its initial concepts were formulated more than a decade ago.

Submitted 23 June, 2021; originally announced June 2021.

arXiv:2105.08668 [pdf, other]

doi 10.1109/ICFEC51620.2021.00015

TOD: Transprecise Object Detection to Maximise Real-Time Accuracy on the Edge

Authors: JunKyu Lee, Blesson Varghese, Roger Woods, Hans Vandierendonck

Abstract: Real-time video analytics on the edge is challenging as the computationally constrained resources typically cannot analyse video streams at full fidelity and frame rate, which results in loss of accuracy. This paper proposes a Transprecise Object Detector (TOD) which maximises the real-time object detection accuracy on an edge device by selecting an appropriate Deep Neural Network (DNN) on the fly with negligible computational overhead. TOD makes two key contributions over the state of the art: (1) TOD leverages characteristics of the video stream such as object size and speed of movement to identify networks with high prediction accuracy for the current frames; (2) it selects the best-performing network based on projected accuracy and computational demand using an effective and low-overhead decision mechanism. Experimental evaluation on a Jetson Nano demonstrates that TOD improves the average object detection precision by 34.7% over the YOLOv4-tiny-288 model on the MOT17Det dataset. In the MOT17-05 test dataset, TOD utilises only 45.1% of GPU resource and 62.7% of the GPU board power without losing accuracy, compared to the YOLOv4-416 model. We expect that TOD will maximise the application of edge devices to real-time object detection, since TOD maximises real-time object detection accuracy given edge devices according to dynamic input features without increasing inference latency in practice.

Submitted 18 May, 2021; originally announced May 2021.

arXiv:2105.02019 [pdf, other]

ScissionLite: Accelerating Distributed Deep Neural Networks Using Transfer Layer

Authors: Hyunho Ahn, Munkyu Lee, Cheol-Ho Hong, Blesson Varghese

Abstract: Industrial Internet of Things (IIoT) applications can benefit from leveraging edge computing. For example, applications underpinned by deep neural network (DNN) models can be sliced and distributed across the IIoT device and the edge of the network for improving the overall performance of inference and for enhancing privacy of the input data, such as industrial product images. However, low network performance between IIoT devices and the edge is often a bottleneck. In this study, we develop ScissionLite, a holistic framework for accelerating distributed DNN inference using the Transfer Layer (TL). The TL is a traffic-aware layer inserted between the optimal slicing point of a DNN model slice in order to decrease the outbound network traffic without a significant accuracy drop. For the TL, we implement a new lightweight down/upsampling network for performance-limited IIoT devices. In ScissionLite, we develop ScissionTL, the Preprocessor, and the Offloader for end-to-end activities for deploying DNN slices with the TL. They decide the optimal slicing point of the DNN, prepare pre-trained DNN slices including the TL, and execute the DNN slices on an IIoT device and the edge. Employing the TL for the sliced DNN models has a negligible overhead. ScissionLite improves the inference latency by up to 16 and 2.8 times when compared to execution on the local device and an existing state-of-the-art model slicing approach respectively.

Submitted 5 May, 2021; originally announced May 2021.

Comments: 10 pages

arXiv:2103.04930 [pdf, other]

AVEC: Accelerator Virtualization in Cloud-Edge Computing for Deep Learning Libraries

Authors: Jason Kennedy, Blesson Varghese, Carlos Reaño

Abstract: Edge computing offers the distinct advantage of harnessing compute capabilities on resources located at the edge of the network to run workloads of relatively weak user devices. This is achieved by offloading computationally intensive workloads, such as deep learning, from user devices to the edge. Using the edge reduces the overall communication latency of applications as workloads can be processed closer to where data is generated on user devices rather than sending them to geographically distant clouds. Specialised hardware accelerators, such as Graphics Processing Units (GPUs) available in the cloud-edge network, can enhance the performance of computationally intensive workloads that are offloaded from devices on to the edge. The underlying approach required to facilitate this is virtualization of GPUs. This paper therefore sets out to investigate the potential of GPU accelerator virtualization to improve the performance of deep learning workloads in a cloud-edge environment. The AVEC accelerator virtualization framework is proposed that incurs minimum overheads and requires no source-code modification of the workload. AVEC intercepts local calls to a GPU on a device and forwards them to an edge resource seamlessly. The feasibility of AVEC is demonstrated on a real-world application, namely OpenPose using the Caffe deep learning library. It is observed that on a lab-based experimental test-bed AVEC delivers up to 7.48x speedup despite communication overheads incurred due to data transfers.

Submitted 8 March, 2021; originally announced March 2021.

Comments: 8 pages, 13 figures

arXiv:2008.03523 [pdf, other]

Scission: Performance-driven and Context-aware Cloud-Edge Distribution of Deep Neural Networks

Authors: Luke Lockhart, Paul Harvey, Pierre Imai, Peter Willis, Blesson Varghese

Abstract: Partitioning and distributing deep neural networks (DNNs) across end-devices, edge resources and the cloud has a potential twofold advantage: preserving privacy of the input data, and reducing the ingress bandwidth demand beyond the edge. However, for a given DNN, identifying the optimal partition configuration for distributing the DNN that maximizes performance is a significant challenge. This is because the combination of potential target hardware resources that maximizes performance and the sequence of layers of the DNN that should be distributed across the target resources needs to be determined, while accounting for user-defined objectives/constraints for partitioning. This paper presents Scission, a tool for automated benchmarking of DNNs on a given set of target device, edge and cloud resources for determining optimal partitions that maximize DNN performance. The decision-making approach is context-aware by capitalizing on hardware capabilities of the target resources, their locality, the characteristics of DNN layers, and the network condition. Experimental studies are carried out on 18 DNNs. The decisions made by Scission cannot be manually made by a human given the complexity and the number of dimensions affecting the search space. The benchmarking overheads of Scission allow for responding to operational changes periodically rather than in real-time. Scission is available for public download at https://github.com/qub-blesson/Scission.

Submitted 16 December, 2020; v1 submitted 8 August, 2020; originally announced August 2020.

Comments: Accepted to IEEE/ACM UCC 2020

arXiv:2008.01814 [pdf, other]

A Case For Adaptive Deep Neural Networks in Edge Computing

Authors: Francis McNamee, Schahram Dustdar, Peter Kilpatrick, Weisong Shi, Ivor Spence, Blesson Varghese

Abstract: Edge computing offers an additional layer of compute infrastructure closer to the data source before raw data from privacy-sensitive and performance-critical applications is transferred to a cloud data center. Deep Neural Networks (DNNs) are one class of applications that are reported to benefit from collaboratively computing between the edge and the cloud. A DNN is partitioned such that specific layers of the DNN are deployed onto the edge and the cloud to meet performance and privacy objectives. However, there is limited understanding of: (a) whether and how evolving operational conditions (increased CPU and memory utilization at the edge or reduced data transfer rates between the edge and the cloud) affect the performance of already deployed DNNs, and (b) whether a new partition configuration is required to maximize performance. A DNN that adapts to changing operational conditions is referred to as an 'adaptive DNN'. This paper investigates whether there is a case for adaptive DNNs in edge computing by considering four questions: (i) Are DNNs sensitive to operational conditions? (ii) How sensitive are DNNs to operational conditions? (iii) Do individual or a combination of operational conditions equally affect DNNs? (iv) Is DNN partitioning sensitive to hardware architectures on the cloud/edge? The exploration is carried out in the context of 8 pre-trained DNN models and the results presented are from analyzing nearly 8 million data points. The results highlight that network conditions affect DNN performance more than CPU or memory related operational conditions. Repartitioning is noted to provide a performance gain in a number of cases, but a specific trend was not noted in relation to its correlation to the underlying hardware architecture. Nonetheless, the need for adaptive DNNs is confirmed.

Submitted 16 December, 2020; v1 submitted 4 August, 2020; originally announced August 2020.

arXiv:2006.12761 [pdf, other]

Benchmarking features from different radiomics toolkits / toolboxes using Image Biomarkers Standardization Initiative

Authors: Mingxi Lei, Bino Varghese, Darryl Hwang, Steven Cen, Xiaomeng Lei, Afshin Azadikhah, Bhushan Desai, Assad Oberai, Vinay Duddalwar

Abstract: There is no consensus regarding the radiomic feature terminology, the underlying mathematics, or their implementation. This creates a scenario where features extracted using different toolboxes could not be used to build or validate the same model leading to a non-generalization of radiomic results. In this study, the image biomarker standardization initiative (IBSI) established phantom and benchmark values were used to compare the variation of the radiomic features while using 6 publicly available software programs and 1 in-house radiomics pipeline. All IBSI-standardized features (11 classes, 173 in total) were extracted. The relative differences between the extracted feature values from the different software and the IBSI benchmark values were calculated to measure the inter-software agreement. To better understand the variations, features are further grouped into 3 categories according to their properties: 1) morphology, 2) statistic/histogram and 3) texture features. While a good agreement was observed for a majority of radiomics features across the various programs, relatively poor agreement was observed for morphology features. Significant differences were also found in programs that use different gray level discretization approaches. Since these programs do not include all IBSI features, the level of quantitative assessment for each category was analyzed using Venn and the UpSet diagrams and also quantified using two ad hoc metrics. Morphology features earn the lowest scores for both metrics, indicating that morphological features are not consistently evaluated among software programs. We conclude that radiomic features calculated using different software programs may not be identical and reliable. Further studies are needed to standardize the workflow of radiomic feature extraction.

Submitted 23 June, 2020; originally announced June 2020.

Comments: 21 pages, 8 figures

arXiv:2006.00342 [pdf, other]

WattsApp: Power-Aware Container Scheduling

Authors: Hemant Mehta, Paul Harvey, Omer Rana, Rajkumar Buyya, Blesson Varghese

Abstract: Containers are becoming a popular workload deployment mechanism in modern distributed systems. However, there are limited software-based methods (hardware-based methods are expensive requiring hardware level changes) for obtaining the power consumed by containers for facilitating power-aware container scheduling, an essential activity for efficient management of distributed systems. This paper presents WattsApp, a tool underpinned by a six-step software-based method for power-aware container scheduling to minimize power cap violations on a server. The proposed method relies on a neural network-based power estimation model and a power capped container scheduling technique. Experimental studies are pursued in a lab-based environment on 10 benchmarks deployed on Intel and ARM processors. The results highlight that the power estimation model has negligible overheads for data collection - nearly 90% of all data samples can be estimated with less than a 10% error, and the Mean Absolute Percentage Error (MAPE) is less than 6%. The power-aware scheduling of WattsApp is more effective than Intel's Running Average Power Limit (RAPL) based power capping for both single and multiple containers as it does not degrade the performance of all containers running on the server. The results confirm the feasibility of WattsApp.

Submitted 30 May, 2020; originally announced June 2020.

arXiv:2004.11725 [pdf, other]

A Survey on Edge Performance Benchmarking

Authors: Blesson Varghese, Nan Wang, David Bermbach, Cheol-Ho Hong, Eyal de Lara, Weisong Shi, Christopher Stewart

Abstract: Edge computing is the next Internet frontier that will leverage computing resources located near users, sensors, and data stores to provide more responsive services. Therefore, it is envisioned that a large-scale, geographically dispersed, and resource-rich distributed system will emerge and play a key role in the future Internet. However, given the loosely coupled nature of such complex systems, their operational conditions are expected to change significantly over time. In this context, the performance characteristics of such systems will need to be captured rapidly, which is referred to as performance benchmarking, for application deployment, resource orchestration, and adaptive decision-making. Edge performance benchmarking is a nascent research avenue that has started gaining momentum over the past five years. This article first reviews articles published over the past three decades to trace the history of performance benchmarking from tightly coupled to loosely coupled systems. It then systematically classifies previous research to identify the system under test, techniques analyzed, and benchmark runtime in edge performance benchmarking.

Submitted 16 December, 2020; v1 submitted 24 April, 2020; originally announced April 2020.

Comments: Accepted by ACM Computing Surveys, 16 December 2020

arXiv:2003.08305 [pdf, other]

Cross Architectural Power Modelling

Authors: Kai Chen, Peter Kilpatrick, Dimitrios S. Nikolopoulos, Blesson Varghese

Abstract: Existing power modelling research focuses on the model rather than the process for developing models. An automated power modelling process that can be deployed on different processors for developing power models with high accuracy is developed. For this, (i) an automated hardware performance counter selection method that selects counters best correlated to power on both ARM and Intel processors, (ii) a noise filter based on clustering that can reduce the mean error in power models, and (iii) a two stage power model that surmounts challenges in using existing power models across multiple architectures are proposed and developed. The key results are: (i) the automated hardware performance counter selection method achieves comparable selection to the manual method reported in the literature, (ii) the noise filter reduces the mean error in power models by up to 55%, and (iii) the two stage power model can predict dynamic power with less than 8% error on both ARM and Intel processors, which is an improvement over classic models.

Submitted 17 March, 2020; originally announced March 2020.

Comments: 10 pages; IEEE/ACM CCGrid 2020. arXiv admin note: text overlap with arXiv:1710.10325

arXiv:2002.05531 [pdf, other]

Modelling Fog Offloading Performance

Authors: Ayesha Abdul Majeed, Peter Kilpatrick, Ivor Spence, Blesson Varghese

Abstract: Fog computing has emerged as a computing paradigm aimed at addressing the issues of latency, bandwidth and privacy when mobile devices are communicating with remote cloud services. The concept is to offload compute services closer to the data. However, many challenges exist in the realisation of this approach. During offloading, (part of) the application underpinned by the services may be unavailable, which the user will experience as down time. This paper describes work aimed at building models to allow prediction of such down time based on metrics (operational data) of the underlying and surrounding infrastructure. Such prediction would be invaluable in the context of automated Fog offloading and adaptive decision making in Fog orchestration. Models that cater for four container-based stateless and stateful offload techniques, namely Save and Load, Export and Import, Push and Pull and Live Migration, are built using four (linear and non-linear) regression techniques. Experimental results comprising over 42 million data points from multiple lab-based Fog infrastructure are presented. The results highlight that reasonably accurate predictions (measured by the coefficient of determination for regression models, mean absolute percentage error, and mean absolute error) may be obtained when considering 25 metrics relevant to the infrastructure.

Submitted 12 February, 2020; originally announced February 2020.

Comments: arXiv admin note: substantial text overlap with arXiv:1909.04945

arXiv:2001.09228 [pdf, other]

Context-aware Distribution of Fog Applications Using Deep Reinforcement Learning

Authors: Nan Wang, Blesson Varghese

Abstract: Fog computing is an emerging paradigm that aims to meet the increasing computation demands arising from the billions of devices connected to the Internet. Offloading services of an application from the Cloud to the edge of the network can improve the overall Quality-of-Service (QoS) of the application since it can process data closer to user devices. Diverse Fog nodes ranging from Wi-Fi routers to mini-clouds with varying resource capabilities make it challenging to determine which services of an application need to be offloaded. In this paper, a context-aware mechanism for distributing applications across the Cloud and the Fog is proposed. The mechanism dynamically generates (re)deployment plans for the application to maximise the performance efficiency of the application by taking the QoS and running costs into account. The mechanism relies on deep Q-networks to generate a distribution plan without prior knowledge of the available resources on the Fog node, the network condition and the application. The feasibility of the proposed context-aware distribution mechanism is demonstrated on two use-cases, namely a face detection application and a location-based mobile game. The benefits are increased utility of dynamic distribution in both use cases, when compared to a static distribution approach used in existing research.

Submitted 24 January, 2020; originally announced January 2020.

arXiv:2001.09070 [pdf, other]

Priority-based Fair Scheduling in Edge Computing

Authors: Arkadiusz Madej, Nan Wang, Nikolaos Athanasopoulos, Rajiv Ranjan, Blesson Varghese

Abstract: Scheduling is important in Edge computing. In contrast to the Cloud, Edge resources are hardware limited and cannot support workload-driven infrastructure scaling. Hence, resource allocation and scheduling for the Edge requires a fresh perspective. Existing Edge scheduling research assumes availability of all needed resources whenever a job request is made. This paper challenges that assumption, since not all job requests from a Cloud server can be scheduled on an Edge node. Thus, guaranteeing fairness among the clients (Cloud servers offloading jobs) while accounting for priorities of the jobs becomes a critical task. This paper presents four scheduling techniques: a naive first come first serve strategy and three proposed strategies, namely client fair, priority fair, and a hybrid that accounts for the fairness of both clients and job priorities. An evaluation on a target platform under three different scenarios, namely equal, random, and Gaussian job distributions is presented. The experimental studies highlight the low overheads and the distribution of scheduled jobs on the Edge node when compared to the naive strategy. The results confirm the superior performance of the hybrid strategy and showcase the feasibility of fair schedulers for Edge computing.

Submitted 24 January, 2020; originally announced January 2020.

Comments: 10 pages; accepted to IEEE Int. Conf. on Fog and Edge Computing (ICFEC), 2020

arXiv:1909.04945 [pdf, other]

Performance Estimation of Container-Based Cloud-to-Fog Offloading

Authors: Ayesha Abdul Majeed, Peter Kilpatrick, Ivor Spence, Blesson Varghese

Abstract: Fog computing offloads latency critical application services running on the Cloud in close proximity to end-user devices onto resources located at the edge of the network. The research in this paper is motivated towards characterising and estimating the time taken to offload a service using containers, which is investigated in the context of the 'Save and Load' container migration technique. To this end, the research addresses questions such as whether fog offloading can be accurately modelled and which system and network related parameters influence offloading. These are addressed by exploring a catalogue of 21 different metrics both at the system and process levels that is used as input to four estimation techniques using collective model and individual models to predict the time taken for offloading. The study is pursued by collecting over 1.1 million data points and the preliminary results indicate that offloading can be modelled accurately.

Submitted 11 September, 2019; originally announced September 2019.

arXiv:1907.10890 [pdf, other]

DeFog: Fog Computing Benchmarks

Authors: Jonathan McChesney, Nan Wang, Ashish Tanwer, Eyal de Lara, Blesson Varghese

Abstract: Fog computing envisions that deploying services of an application across resources in the cloud and those located at the edge of the network may improve the overall performance of the application when compared to running the application on the cloud. However, there are currently no benchmarks that can directly compare the performance of the application across the cloud-only, edge-only and cloud-edge deployment platforms to obtain any insight on performance improvement. This paper proposes DeFog, a first Fog benchmarking suite to: (i) alleviate the burden of Fog benchmarking by using a standard methodology, and (ii) facilitate the understanding of the target platform by collecting a catalogue of relevant metrics for a set of benchmarks. The current portfolio of DeFog benchmarks comprises six relevant applications conducive to using the edge. Experimental studies are carried out on multiple target platforms to demonstrate the use of DeFog for collecting metrics related to application latencies (communication and computation), for understanding the impact of stress and concurrent users on application latencies, and for understanding the performance of deploying different combinations of services of an application across the cloud and edge. DeFog is available for public download (https://github.com/qub-blesson/DeFog).

Submitted 25 July, 2019; originally announced July 2019.

Comments: Accepted to the ACM/IEEE Symposium on Edge Computing, 2019, Washington DC, USA

arXiv:1902.03656 [pdf, other]

Cloud Futurology

Authors: Blesson Varghese, Philipp Leitner, Suprio Ray, Kyle Chard, Adam Barker, Yehia Elkhatib, Herry Herry, Cheol-Ho Hong, Jeremy Singer, Fung Po Tso, Eiko Yoneki, Mohamed-Faten Zhani

Abstract: The Cloud has become integral to most Internet-based applications and user gadgets. This article provides a brief history of the Cloud and presents a researcher's view of the prospects for innovating at the infrastructure, middleware, and application and delivery levels of the already crowded Cloud computing stack.

Submitted 10 February, 2019; originally announced February 2019.

Comments: Accepted to IEEE Computer, 2019

arXiv:1812.01344 [pdf]

doi 10.1109/MCC.2018.064181115

Realizing Edge Marketplaces: Challenges and Opportunities

Authors: Blesson Varghese, Massimo Villari, Omer Rana, Philip James, Tejal Shal, Maria Fazio, Rajiv Ranjan

Abstract: The edge of the network has the potential to host services for supporting a variety of user applications, ranging in complexity from data preprocessing, image and video rendering, and interactive gaming, to embedded systems in autonomous cars and built environments. However, the computational and data resources over which such services are hosted, and the actors that interact with these services, have an intermittent availability and access profile, introducing significant risk for user applications that must rely on them. This article investigates the development of an edge marketplace, which is able to support multiple providers for offering services at the network edge, and to enable demand supply for influencing the operation of such a marketplace. Resilience, cost, and quality of service and experience will subsequently enable such a marketplace to adapt its services over time. This article also describes how distributed-ledger technologies (such as blockchains) provide a promising approach to support the operation of such a marketplace and regulate its behavior (such as the GDPR in Europe) and operation. Two application scenarios provide context for the discussion of how such a marketplace would function and be utilized in practice.

Submitted 4 December, 2018; originally announced December 2018.

Comments: Published in IEEE Cloud Computing, Volume 5, Issue 6, 2018, pp. 9-20

Journal ref: B. Varghese et al., "Realizing Edge Marketplaces: Challenges and Opportunities," in IEEE Cloud Computing, vol. 5, no. 6, pp. 9-20, Nov./Dec. 2018

arXiv:1810.06046 [pdf, other]

Accelerator Virtualization in Fog Computing: Moving From the Cloud to the Edge

Authors: Blesson Varghese, Carlos Reano, Federico Silla

Abstract: Hardware accelerators are available on the Cloud for enhanced analytics. Next generation Clouds aim to bring enhanced analytics using accelerators closer to user devices at the edge of the network for improving Quality-of-Service by minimizing end-to-end latencies and response times. The collective computing model that utilizes resources at the Cloud-Edge continuum in a multi-tier hierarchy comprising the Cloud, the Edge and user devices is referred to as Fog computing. This article identifies challenges and opportunities in making accelerators accessible at the Edge. A holistic view of the Fog architecture is key to pursuing meaningful research in this area.

Submitted 14 October, 2018; originally announced October 2018.

Comments: IEEE Cloud Computing magazine

arXiv:1810.04608 [pdf, other]

DYVERSE: DYnamic VERtical Scaling in Multi-tenant Edge Environments

Authors: Nan Wang, Michail Matthaiou, Dimitrios S. Nikolopoulos, Blesson Varghese

Abstract: Multi-tenancy in resource-constrained environments is a key challenge in Edge computing. In this paper, we develop 'DYVERSE: DYnamic VERtical Scaling in Edge' environments, which is the first light-weight and dynamic vertical scaling mechanism for managing resources allocated to applications for facilitating multi-tenancy in Edge environments. To enable dynamic vertical scaling, one static and three dynamic priority management approaches that are workload-aware, community-aware and system-aware, respectively, are proposed. This research advocates that dynamic vertical scaling and priority management approaches reduce Service Level Objective (SLO) violation rates. An online-game and a face detection workload in a Cloud-Edge test-bed are used to validate the research. The merit of DYVERSE is that there is only a sub-second overhead per Edge server when 32 Edge servers are deployed on a single Edge node. When compared to executing applications on the Edge servers without dynamic vertical scaling, static priorities and dynamic priorities reduce SLO violation rates of requests by up to 4% and 12% for the online game, respectively, and in both cases 6% for the face detection workload. Moreover, for both workloads, the system-aware dynamic vertical scaling method effectively reduces the latency of non-violated requests, when compared to other methods.

Submitted 21 February, 2020; v1 submitted 19 September, 2018; originally announced October 2018.

arXiv:1810.00305 [pdf, other]

doi 10.1145/3326066

Resource Management in Fog/Edge Computing: A Survey

Authors: Cheol-Ho Hong, Blesson Varghese

Abstract: Contrary to using distant and centralized cloud data center resources, employing decentralized resources at the edge of a network for processing data closer to user devices, such as smartphones and tablets, is an upcoming computing paradigm, referred to as fog/edge computing. Fog/edge resources are typically resource-constrained, heterogeneous, and dynamic compared to the cloud, thereby making resource management an important challenge that needs to be addressed. This article reviews publications as early as 1991, with 85% of the publications between 2013-2018, to identify and classify the architectures, infrastructure, and underlying algorithms for managing resources in fog/edge computing.

Submitted 29 September, 2018; originally announced October 2018.

Comments: 22 pages

Journal ref: ACM Computing Surveys (CSUR) 52.5 (2019) 1-37

arXiv:1803.05255 [pdf]

Addressing the Challenges in Federating Edge Resources

Authors: Cihat Baktir, Cagatay Sonmez, Cem Ersoy, Atay Ozgovde, Blesson Varghese

Abstract: This book chapter considers how Edge deployments can be brought to bear in a global context by federating them across multiple geographic regions to create a global Edge-based fabric that decentralizes data center computation. This is currently impractical, not only because of technical challenges, but is also shrouded by social, legal and geopolitical issues. In this chapter, we discuss two key challenges - networking and management in federating Edge deployments. Additionally, we consider resource and modeling challenges that will need to be addressed for a federated Edge.

Submitted 14 March, 2018; originally announced March 2018.

Comments: Book Chapter accepted to the Fog and Edge Computing: Principles and Paradigms; Editors Buyya, Srirama

arXiv:1712.04495 [pdf, other]

Intra-node Memory Safe GPU Co-Scheduling

Authors: Carlos Reano, Federico Silla, Dimitrios S. Nikolopoulos, Blesson Varghese

Abstract: GPUs in High-Performance Computing systems remain under-utilised due to the unavailability of schedulers that can safely schedule multiple applications to share the same GPU. The research reported in this paper is motivated to improve the utilisation of GPUs by proposing a framework, which we refer to as schedGPU, to facilitate intra-node GPU co-scheduling such that a GPU can be safely shared among multiple applications by taking memory constraints into account. Two approaches, namely a client-server and a shared memory approach, are explored. However, the shared memory approach is more suitable due to lower overheads when compared to the former approach. Four policies are proposed in schedGPU to handle applications that are waiting to access the GPU, two of which account for priorities. The feasibility of schedGPU is validated on three real-world applications. The key observation is that a performance gain is achieved. For single applications, a gain of over 10 times, as measured by GPU utilisation and GPU memory utilisation, is obtained. For workloads comprising multiple applications, a speed-up of up to 5x in the total execution time is noted. Moreover, the average GPU utilisation and average GPU memory utilisation are increased by 5 and 12 times, respectively.

Submitted 12 December, 2017; originally announced December 2017.

Comments: Accepted on 12 Dec 2017, IEEE Transactions on Parallel and Distributed Systems

arXiv:1711.09138 [pdf, other]

Plug and Play Bench: Simplifying Big Data Benchmarking Using Containers

Authors: Sheriffo Ceesay, Adam Barker, Blesson Varghese

Abstract: The recent boom of big data, coupled with the challenges of its processing and storage gave rise to the development of distributed data processing and storage paradigms like MapReduce, Spark, and NoSQL databases. With the advent of cloud computing, processing and storing such massive datasets on clusters of machines is now feasible with ease. However, there are limited tools and approaches, which users can rely on to gauge and comprehend the performance of their big data applications deployed locally on clusters, or in the cloud. Researchers have started exploring this area by providing benchmarking suites suitable for big data applications. However, many of these tools are fragmented, complex to deploy and manage, and do not provide transparency with respect to the monetary cost of benchmarking an application. In this paper, we present Plug And Play Bench, an infrastructure aware abstraction built to integrate and simplify the deployment of big data benchmarking tools on clusters of machines. PAPB automates the tedious process of installing, configuring and executing common big data benchmark workloads by containerising the tools and settings based on the underlying cluster deployment framework. Our proof of concept implementation utilises HiBench as the benchmark suite, HDP as the cluster deployment framework and Azure as the cloud platform. The paper further illustrates the inclusion of cost metrics based on the underlying Microsoft Azure cloud platform.

Submitted 29 November, 2017; v1 submitted 24 November, 2017; originally announced November 2017.

Comments: 8 pages, Published as a workshop paper in 2017 IEEE International Conference on Big Data in Boston Dec 11 - 14

arXiv:1711.09123 [pdf, other]

A Manifesto for Future Generation Cloud Computing: Research Directions for the Next Decade

Authors: Rajkumar Buyya, Satish Narayana Srirama, Giuliano Casale, Rodrigo Calheiros, Yogesh Simmhan, Blesson Varghese, Erol Gelenbe, Bahman Javadi, Luis Miguel Vaquero, Marco A. S. Netto, Adel Nadjaran Toosi, Maria Alejandra Rodriguez, Ignacio M. Llorente, Sabrina De Capitani di Vimercati, Pierangela Samarati, Dejan Milojicic, Carlos Varela, Rami Bahsoon, Marcos Dias de Assuncao, Omer Rana, Wanlei Zhou, Hai Jin, Wolfgang Gentzsch, Albert Y. Zomaya, Haiying Shen

Abstract: The Cloud computing paradigm has revolutionised the computer science horizon during the past decade and has enabled the emergence of computing as the fifth utility. It has captured significant attention of academia, industries, and government bodies. Now, it has emerged as the backbone of modern economy by offering subscription-based services anytime, anywhere following a pay-as-you-go model. This has instigated (1) shorter establishment times for start-ups, (2) creation of scalable global enterprise applications, (3) better cost-to-value associativity for scientific and high performance computing applications, and (4) different invocation/execution models for pervasive and ubiquitous applications. The recent technological developments and paradigms such as serverless computing, software-defined networking, Internet of Things, and processing at network edge are creating new opportunities for Cloud computing. However, they are also posing several new challenges and creating the need for new approaches and research strategies, as well as the re-evaluation of the models that were developed to address issues such as scalability, elasticity, reliability, security, sustainability, and application models. The proposed manifesto addresses them by identifying the major open challenges in Cloud computing, emerging trends, and impact areas. It then offers research directions for the next decade, thus helping in the realisation of Future Generation Cloud Computing.

Submitted 24 August, 2018; v1 submitted 24 November, 2017; originally announced November 2017.

Comments: 51 pages, 3 figures

arXiv:1711.08973 [pdf, other]

A Survey and Taxonomy of Resource Optimisation for Executing Bag-of-Task Applications on Public Clouds

Authors: Long Thai, Blesson Varghese, Adam Barker

Abstract: Cloud computing has been widely adopted due to the flexibility in resource provisioning and on-demand pricing models. Entire clusters of Virtual Machines (VMs) can be dynamically provisioned to meet the computational demands of users. However, from a user's perspective, it is still challenging to utilise cloud resources efficiently. This is because an overwhelmingly wide variety of resource types with different prices and significant performance variations are available. This paper presents a survey and taxonomy of existing research in optimising the execution of Bag-of-Task applications on cloud resources. A BoT application consists of multiple independent tasks, each of which can be executed by a VM in any order; these applications are widely used by both the scientific communities and commercial organisations. The objectives of this survey are as follows: (i) to provide the reader with a concise understanding of existing research on optimising the execution of BoT applications on the cloud, (ii) to define a taxonomy that categorises current frameworks to compare and contrast them, and (iii) to present current trends and future research directions in the area.

Submitted 24 November, 2017; originally announced November 2017.

Comments: Accepted to Future Generation Computer Systems, 23 November 2017

arXiv:1710.10325 [pdf, ps, other]

Power Modelling for Heterogeneous Cloud-Edge Data Centers

Authors: Kai Chen, Blesson Varghese, Peter Kilpatrick, Dimitrios S. Nikolopoulos

Abstract: Existing power modelling research focuses not on the method used for developing models but rather on the model itself. This paper aims to develop a method for deploying power models on emerging processors that will be used, for example, in cloud-edge data centers. Our research first develops a hardware counter selection method that appropriately selects counters most correlated to power on ARM and Intel processors. Then, we propose a two stage power model that works across multiple architectures. The key results are: (i) the automated hardware performance counter selection method achieves comparable selection to the manual selection methods reported in literature, and (ii) the two stage power model can predict dynamic power more accurately on both ARM and Intel processors when compared to classic power models.

Submitted 27 October, 2017; originally announced October 2017.

Comments: 10 pages, 10 figures, conference

arXiv:1710.10090 [pdf, other]

Edge-as-a-Service: Towards Distributed Cloud Architectures

Authors: Blesson Varghese, Nan Wang, Jianyu Li, Dimitrios S. Nikolopoulos

Abstract: We present an Edge-as-a-Service (EaaS) platform for realising distributed cloud architectures and integrating the edge of the network in the computing ecosystem. The EaaS platform is underpinned by (i) a lightweight discovery protocol that identifies edge nodes and makes them publicly accessible in a computing environment, and (ii) a scalable resource provisioning mechanism for offloading workloads from the cloud on to the edge for servicing multiple user requests. We validate the feasibility of EaaS on an online game use-case to highlight the improvement in the QoS of the application hosted on our cloud-edge platform. On this platform we demonstrate (i) low overheads of less than 6%, (ii) reduced data traffic to the cloud by up to 95% and (iii) minimised application latency between 40%-60%.

Submitted 27 October, 2017; originally announced October 2017.

Comments: 10 pages; presented at the EdgeComp Symposium 2017; will appear in Proceedings of the International Conference on Parallel Computing, 2017

arXiv:1709.04061 [pdf, other]

ENORM: A Framework For Edge NOde Resource Management

Authors: Nan Wang, Blesson Varghese, Michail Matthaiou, Dimitrios S. Nikolopoulos

Abstract: Current computing techniques using the cloud as a centralised server will become untenable as billions of devices get connected to the Internet. This raises the need for fog computing, which leverages computing at the edge of the network on nodes, such as routers, base stations and switches, along with the cloud. However, to realise fog computing the challenge of managing edge nodes will need to be addressed. This paper is motivated to address the resource management challenge. We develop the first framework to manage edge nodes, namely the Edge NOde Resource Management (ENORM) framework. Mechanisms for provisioning and auto-scaling edge node resources are proposed. The feasibility of the framework is demonstrated on a PokeMon Go-like online game use-case. The benefits of using ENORM are observed by reduced application latency between 20% - 80% and reduced data transfer and communication frequency between the edge node and the cloud by up to 95%. These results highlight the potential of fog computing for improving the quality of service and experience.

Submitted 12 September, 2017; originally announced September 2017.

Comments: 14 pages; accepted to IEEE Transactions on Services Computing on 12 September 2017

arXiv:1707.07452 [pdf, other]

Next Generation Cloud Computing: New Trends and Research Directions

Authors: Blesson Varghese, Rajkumar Buyya

Abstract: The landscape of cloud computing has significantly changed over the last decade. Not only have more providers and service offerings crowded the space, but also cloud infrastructure that was traditionally limited to single provider data centers is now evolving. In this paper, we firstly discuss the changing cloud infrastructure and consider the use of infrastructure from multiple providers and the benefit of decentralising computing away from data centers. These trends have resulted in the need for a variety of new computing architectures that will be offered by future cloud infrastructure. These architectures are anticipated to impact areas, such as connecting people and devices, data-intensive computing, the service space and self-learning systems. Finally, we lay out a roadmap of challenges that will need to be addressed for realising the potential of next generation cloud systems.

Submitted 8 September, 2017; v1 submitted 24 July, 2017; originally announced July 2017.

Comments: Accepted to Future Generation Computer Systems, 07 September 2017

arXiv:1701.05451 [pdf, other]

Feasibility of Fog Computing

Authors: Blesson Varghese, Nan Wang, Dimitrios S. Nikolopoulos, Rajkumar Buyya

Abstract: As billions of devices get connected to the Internet, it will not be sustainable to use the cloud as a centralised server. The way forward is to decentralise computations away from the cloud towards the edge of the network closer to the user. This reduces the latency of communication between a user device and the cloud, and is the premise of 'fog computing' defined in this paper. The aim of this paper is to highlight the feasibility and the benefits in improving the Quality-of-Service and Experience by using fog computing. For an online game use-case, we found that the average response time for a user is improved by 20% when using the edge of the network in comparison to using a cloud-only model. It was also observed that the volume of traffic between the edge and the cloud server is reduced by over 90% for the use-case. The preliminary results highlight the potential of fog computing in achieving a sustainable computing model and the benefits of integrating the edge of the network into the computing ecosystem.

Submitted 19 January, 2017; originally announced January 2017.

Comments: 8 pages

arXiv:1609.01967 [pdf, other]

Challenges and Opportunities in Edge Computing

Authors: Blesson Varghese, Nan Wang, Sakil Barbhuiya, Peter Kilpatrick, Dimitrios S. Nikolopoulos

Abstract: Many cloud-based applications employ a data centre as a central server to process data that is generated by edge devices, such as smartphones, tablets and wearables. This model places ever increasing demands on communication and computational infrastructure with inevitable adverse effect on Quality-of-Service and Experience. The concept of Edge Computing is predicated on moving some of this computational load towards the edge of the network to harness computational capabilities that are currently untapped in edge nodes, such as base stations, routers and switches. This position paper considers the challenges and opportunities that arise out of this new direction in the computing landscape.

Submitted 7 September, 2016; originally announced September 2016.

Comments: 6 pages, accepted to IEEE SmartCloud 2016

arXiv:1609.00536 [pdf, other]

A Machine Learning Analysis of Twitter Sentiment to the Sandy Hook Shootings

Authors: Nan Wang, Blesson Varghese, Peter D. Donnelly

Abstract: Gun related violence is a complex issue and accounts for a large proportion of violent incidents. In the research reported in this paper, we set out to investigate the pro-gun and anti-gun sentiments expressed on a social media platform, namely Twitter, in response to the 2012 Sandy Hook Elementary School shooting in Connecticut, USA. Machine learning techniques are applied to classify a data corpus of over 700,000 tweets. The sentiments are captured using a public sentiment score that considers the volume of tweets as well as population. A web-based interactive tool is developed to visualise the sentiments and is available at http://www.gunsontwitter.com. The key findings from this research are: (i) There are elevated rates of both pro-gun and anti-gun sentiments on the day of the shooting. Surprisingly, the pro-gun sentiment remains high for a number of days following the event but the anti-gun sentiment quickly falls to pre-event levels. (ii) There is a different public response from each state, with the highest pro-gun sentiment not coming from those with highest gun ownership levels but rather from California, Texas and New York.

Submitted 2 September, 2016; originally announced September 2016.

Comments: 10 pages, accepted to IEEE eScience 2016, Baltimore, USA

arXiv:1608.00406 [pdf, other]

doi 10.1109/TCC.2016.2603476

Cloud Benchmarking For Maximising Performance of Scientific Applications

Authors: Blesson Varghese, Ozgur Akgun, Ian Miguel, Long Thai, Adam Barker

Abstract: How can applications be deployed on the cloud to achieve maximum performance? This question is challenging to address with the availability of a wide variety of cloud Virtual Machines (VMs) with different performance capabilities. The research reported in this paper addresses the above question by proposing a six step benchmarking methodology in which a user provides a set of weights that indicate how important memory, local communication, computation and storage related operations are to an application. The user can either provide a set of four abstract weights or eight fine grain weights based on the knowledge of the application. The weights along with benchmarking data collected from the cloud are used to generate a set of two rankings - one based only on the performance of the VMs and the other takes both performance and costs into account. The rankings are validated on three case study applications using two validation techniques. The case studies on a set of experimental VMs highlight that maximum performance can be achieved by the three top ranked VMs and maximum performance in a cost-effective manner is achieved by at least one of the top three ranked VMs produced by the methodology.

Submitted 1 August, 2016; originally announced August 2016.

Comments: 14 pages, accepted to the IEEE Transactions on Cloud Computing on 31 July 2016

Showing 1–50 of 79 results for author: Varghese, B