-
einspace: Searching for Neural Architectures from Fundamental Operations
Authors:
Linus Ericsson,
Miguel Espinosa,
Chenhongyi Yang,
Antreas Antoniou,
Amos Storkey,
Shay B. Cohen,
Steven McDonagh,
Elliot J. Crowley
Abstract:
Neural architecture search (NAS) finds high performing networks for a given task. Yet the results of NAS are fairly prosaic; they did not e.g. create a shift from convolutional structures to transformers. This is not least because the search spaces in NAS often aren't diverse enough to include such transformations a priori. Instead, for NAS to provide greater potential for fundamental design shifts, we need a novel expressive search space design which is built from more fundamental operations. To this end, we introduce einspace, a search space based on a parameterised probabilistic context-free grammar. Our space is versatile, supporting architectures of various sizes and complexities, while also containing diverse network operations which allow it to model convolutions, attention components and more. It contains many existing competitive architectures, and provides flexibility for discovering new ones. Using this search space, we perform experiments to find novel architectures as well as improvements on existing ones on the diverse Unseen NAS datasets. We show that competitive architectures can be obtained by searching from scratch, and we consistently find large improvements when initialising the search with strong baselines. We believe that this work is an important advancement towards a transformative NAS paradigm where search space expressivity and strategic search initialisation play key roles.
Submitted 31 May, 2024;
originally announced May 2024.
-
Adversarial Augmentation Training Makes Action Recognition Models More Robust to Realistic Video Distribution Shifts
Authors:
Kiyoon Kim,
Shreyank N Gowda,
Panagiotis Eustratiadis,
Antreas Antoniou,
Robert B Fisher
Abstract:
Despite recent advances in video action recognition achieving strong performance on existing benchmarks, these models often lack robustness when faced with natural distribution shifts between training and test data. We propose two novel evaluation methods to assess model resilience to such distribution disparity. One method uses two different datasets collected from different sources and uses one for training and validation, and the other for testing. More precisely, we created dataset splits of HMDB-51 or UCF-101 for training, and Kinetics-400 for testing, using the subset of the classes that overlap between the train and test datasets. The other proposed method extracts the feature mean of each class from the target evaluation dataset's training data (i.e. class prototype) and estimates the test video prediction from the cosine similarity score between each sample and the prototype of each target class. This procedure does not alter model weights using the target dataset and it does not require aligning overlapping classes of two different datasets, and is thus a very efficient method to test the model robustness to distribution shifts without prior knowledge of the target distribution. We address the robustness problem by adversarial augmentation training - generating augmented views of videos that are "hard" for the classification model by applying gradient ascent on the augmentation parameters - as well as by "curriculum" scheduling of the strength of the video augmentations. We experimentally demonstrate the superior performance of the proposed adversarial augmentation approach over baselines across three state-of-the-art action recognition models - TSM, Video Swin Transformer, and Uniformer. The presented work provides critical insight into model robustness to distribution shifts and presents effective techniques to enhance video action recognition performance in a real-world deployment.
Submitted 21 January, 2024;
originally announced January 2024.
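The prototype-based evaluation described in the abstract above (per-class feature means, then cosine similarity between a test sample and each prototype) can be sketched roughly as follows. This is a minimal illustration, not the authors' code: the function names, the toy three-dimensional features and the class names are all assumptions, and a real setup would take the features from a frozen video backbone.

```python
import math


def mean_vector(vectors):
    """Element-wise mean of equally sized feature vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine_similarity(a, b):
    """Cosine similarity between two vectors (0.0 if either has zero norm)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_prototypes(features_by_class):
    """Class prototype = mean feature of that class in the target training data."""
    return {label: mean_vector(feats) for label, feats in features_by_class.items()}


def predict(feature, prototypes):
    """Pick the class whose prototype is most cosine-similar to the feature."""
    return max(prototypes, key=lambda label: cosine_similarity(feature, prototypes[label]))


# Toy features standing in for backbone outputs (hypothetical values).
train_features = {
    "climb": [[0.9, 0.1, 0.0], [0.8, 0.2, 0.1]],
    "swim": [[0.1, 0.9, 0.2], [0.0, 0.8, 0.3]],
}
prototypes = build_prototypes(train_features)
prediction = predict([0.7, 0.3, 0.1], prototypes)
```

No model weights are updated in this sketch; only the prototypes depend on the target dataset, which matches the property the abstract highlights.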
-
Theories Without Models: Uncontrolled Idealizations in Particle Physics
Authors:
Antonis Antoniou,
Karim P. Y. Thébault
Abstract:
The perturbative treatment of realistic quantum field theories, such as quantum electrodynamics, requires the use of mathematical idealizations in the approximation series for scattering amplitudes. Such mathematical idealizations are necessary to derive empirically relevant models from the theory. Mathematical idealizations can be either controlled or uncontrolled, depending on whether current scientific knowledge can explain whether the effects of the idealization are negligible or not. Drawing upon negative formal results in asymptotic analysis (failure of Borel summability) and renormalization group theory (failure of asymptotic safety), we argue that the mathematical idealizations applied in perturbative quantum electrodynamics should be understood as uncontrolled. This, in turn, leads to the problematic conclusion that such theories do not have theoretical models in the standard understanding of this term. The existence of unquestionably empirically successful theories without theoretical models has significant implications both for our understanding of the theory-model relationship in physics and the concept of empirical adequacy.
Submitted 13 December, 2023;
originally announced December 2023.
-
Is Scaling Learned Optimizers Worth It? Evaluating The Value of VeLO's 4000 TPU Months
Authors:
Fady Rezk,
Antreas Antoniou,
Henry Gouk,
Timothy Hospedales
Abstract:
We analyze VeLO (versatile learned optimizer), the largest scale attempt to train a general purpose "foundational" optimizer to date. VeLO was trained on thousands of machine learning tasks using over 4000 TPU months with the goal of producing an optimizer capable of generalizing to new problems while being hyperparameter free, and outperforming industry standards such as Adam. We independently evaluate VeLO on the MLCommons optimizer benchmark suite. We find that, contrary to initial claims: (1) VeLO has a critical hyperparameter that needs problem-specific tuning, (2) VeLO does not necessarily outperform competitors in quality of solution found, and (3) VeLO is not faster than competing optimizers at reducing the training loss. These observations call into question VeLO's generality and the value of the investment in training it.
Submitted 7 March, 2024; v1 submitted 27 October, 2023;
originally announced October 2023.
-
Development of a Deep Learning Method to Identify Acute Ischemic Stroke Lesions on Brain CT
Authors:
Alessandro Fontanella,
Wenwen Li,
Grant Mair,
Antreas Antoniou,
Eleanor Platt,
Paul Armitage,
Emanuele Trucco,
Joanna Wardlaw,
Amos Storkey
Abstract:
Computed Tomography (CT) is commonly used to image acute ischemic stroke (AIS) patients, but its interpretation by radiologists is time-consuming and subject to inter-observer variability. Deep learning (DL) techniques can provide automated CT brain scan assessment, but usually require annotated images. Aiming to develop a DL method for AIS using labelled but not annotated CT brain scans from patients with AIS, we designed a convolutional neural network-based DL algorithm using routinely-collected CT brain scans from the Third International Stroke Trial (IST-3), which were not acquired using strict research protocols. The DL model aimed to detect AIS lesions and classify the side of the brain affected. We explored the impact of AIS lesion features, background brain appearances, and timing on DL performance. From 5772 unique CT scans of 2347 AIS patients (median age 82), 54% had visible AIS lesions according to expert labelling. Our best-performing DL method achieved 72% accuracy for lesion presence and side. Lesions that were larger (80% accuracy) or multiple (87% accuracy for two lesions, 100% for three or more), were better detected. Follow-up scans had 76% accuracy, while baseline scans 67% accuracy. Chronic brain conditions reduced accuracy, particularly non-stroke lesions and old stroke lesions (32% and 31% error rates respectively). DL methods can be designed for AIS lesion detection on CT using the vast quantities of routinely-collected CT brain scan data. Ultimately, this should lead to more robust and widely-applicable methods.
Submitted 29 September, 2023;
originally announced September 2023.
-
Challenges of building medical image datasets for development of deep learning software in stroke
Authors:
Alessandro Fontanella,
Wenwen Li,
Grant Mair,
Antreas Antoniou,
Eleanor Platt,
Chloe Martin,
Paul Armitage,
Emanuele Trucco,
Joanna Wardlaw,
Amos Storkey
Abstract:
Despite the large amount of brain CT data generated in clinical practice, the availability of CT datasets for deep learning (DL) research is currently limited. Furthermore, the data can be insufficiently or improperly prepared for machine learning and thus lead to spurious and irreproducible analyses. This lack of access to comprehensive and diverse datasets poses a significant challenge for the development of DL algorithms. In this work, we propose a complete semi-automatic pipeline to address the challenges of preparing a clinical brain CT dataset for DL analysis and describe the process of standardising this heterogeneous dataset. Challenges include handling image sets with different orientations (axial, sagittal, coronal), different image types (to view soft tissues or bones) and dimensions, and removing redundant background. The final pipeline was able to process 5,868/10,659 (55%) CT image datasets. Reasons for rejection include non-axial data (n=1,920), bone reformats (n=687), separated skull base/vault images (n=1,226), and registration failures (n=465). Further format adjustments, including image cropping, resizing and scaling are also needed for DL processing. Of the axial scans that were not localisers, bone reformats or split brains, 5,868/6,333 (93%) were accepted, while the remaining 465 failed the registration process. Appropriate preparation of medical imaging datasets for DL is a costly and time-intensive process.
Submitted 26 September, 2023;
originally announced September 2023.
-
Few-shot Class-Incremental Semantic Segmentation via Pseudo-Labeling and Knowledge Distillation
Authors:
Chengjia Jiang,
Tao Wang,
Sien Li,
Jinyang Wang,
Shirui Wang,
Antonios Antoniou
Abstract:
We address the problem of learning new classes for semantic segmentation models from few examples, which is challenging because of the following two reasons. Firstly, it is difficult to learn from limited novel data to capture the underlying class distribution. Secondly, it is challenging to retain knowledge for existing classes and to avoid catastrophic forgetting. For learning from limited data, we propose a pseudo-labeling strategy to augment the few-shot training annotations in order to learn novel classes more effectively. Given only one or a few images labeled with the novel classes and a much larger set of unlabeled images, we transfer the knowledge from labeled images to unlabeled images with a coarse-to-fine pseudo-labeling approach in two steps. Specifically, we first match each labeled image to its nearest neighbors in the unlabeled image set at the scene level, in order to obtain images with a similar scene layout. This is followed by obtaining pseudo-labels within this neighborhood by applying classifiers learned on the few-shot annotations. In addition, we use knowledge distillation on both labeled and unlabeled data to retain knowledge on existing classes. We integrate the above steps into a single convolutional neural network with a unified learning objective. Extensive experiments on the Cityscapes and KITTI datasets validate the efficacy of the proposed approach in the self-driving domain. Code is available from https://github.com/ChasonJiang/FSCILSS.
Submitted 5 August, 2023;
originally announced August 2023.
-
ACAT: Adversarial Counterfactual Attention for Classification and Detection in Medical Imaging
Authors:
Alessandro Fontanella,
Antreas Antoniou,
Wenwen Li,
Joanna Wardlaw,
Grant Mair,
Emanuele Trucco,
Amos Storkey
Abstract:
In some medical imaging tasks and other settings where only small parts of the image are informative for the classification task, traditional CNNs can sometimes struggle to generalise. Manually annotated Regions of Interest (ROI) are sometimes used to isolate the most informative parts of the image. However, these are expensive to collect and may vary significantly across annotators. To overcome these issues, we propose a framework that employs saliency maps to obtain soft spatial attention masks that modulate the image features at different scales. We refer to our method as Adversarial Counterfactual Attention (ACAT). ACAT increases the baseline classification accuracy of lesions in brain CT scans from 71.39% to 72.55% and of COVID-19 related findings in lung CT scans from 67.71% to 70.84% and exceeds the performance of competing methods. We investigate the best way to generate the saliency maps employed in our architecture and propose a way to obtain them from adversarially generated counterfactual images. They are able to isolate the area of interest in brain and lung CT scans without using any manual annotations. In the task of localising the lesion location out of 6 possible regions, they obtain a score of 65.05% on brain CT scans, improving the score of 61.29% obtained with the best competing method.
Submitted 11 August, 2023; v1 submitted 27 March, 2023;
originally announced March 2023.
-
Contrastive Meta-Learning for Partially Observable Few-Shot Learning
Authors:
Adam Jelley,
Amos Storkey,
Antreas Antoniou,
Sam Devlin
Abstract:
Many contrastive and meta-learning approaches learn representations by identifying common features in multiple views. However, the formalism for these approaches generally assumes features to be shared across views to be captured coherently. We consider the problem of learning a unified representation from partial observations, where useful features may be present in only some of the views. We approach this through a probabilistic formalism enabling views to map to representations with different levels of uncertainty in different components; these views can then be integrated with one another through marginalisation over that uncertainty. Our approach, Partial Observation Experts Modelling (POEM), then enables us to meta-learn consistent representations from partial observations. We evaluate our approach on an adaptation of a comprehensive few-shot learning benchmark, Meta-Dataset, and demonstrate the benefits of POEM over other meta-learning methods at representation learning from partial observations. We further demonstrate the utility of POEM by meta-learning to represent an environment from partial views observed by an agent exploring the environment.
Submitted 30 January, 2023;
originally announced January 2023.
-
Assessing the risk of re-identification arising from an attack on anonymised data
Authors:
Anna Antoniou,
Giacomo Dossena,
Julia MacMillan,
Steven Hamblin,
David Clifton,
Paula Petrone
Abstract:
Objective: The use of routinely-acquired medical data for research purposes requires the protection of patient confidentiality via data anonymisation. The objective of this work is to calculate the risk of re-identification arising from a malicious attack on an anonymised dataset, as described below. Methods: We first present an analytical means of estimating the probability of re-identification of a single patient in a k-anonymised dataset of Electronic Health Record (EHR) data. Second, we generalize this solution to obtain the probability of multiple patients being re-identified. We provide synthetic validation via Monte Carlo simulations to illustrate the accuracy of the estimates obtained. Results: The proposed analytical framework for risk estimation provides re-identification probabilities that are in agreement with those provided by simulation in a number of scenarios. Our work is limited by conservative assumptions which inflate the re-identification probability. Discussion: Our estimates show that the re-identification probability increases with the proportion of the dataset maliciously obtained and that it has an inverse relationship with the equivalence class size. Our recursive approach extends the applicability domain to the general case of a multi-patient re-identification attack in an arbitrary k-anonymisation scheme. Conclusion: We prescribe a systematic way to parametrize the k-anonymisation process based on a pre-determined re-identification probability. We observed that the benefits of a reduced re-identification risk that come with increasing k-size may not be worth the reduction in data granularity when one is considering benchmarking the re-identification probability on the size of the portion of the dataset maliciously obtained by the adversary.
Submitted 31 March, 2022;
originally announced March 2022.
-
"What Artists Want": Elicitation of Artist Requirements to Feed the Design on a New Collaboration Platform for Creative Work
Authors:
Angeliki Antoniou,
Ioanna Lykourentzou,
Antonios Liapis,
Dimitra Nikolou,
Marily Konstantinopoulou
Abstract:
Aiming at designing a decentralized platform to support grassroot initiatives for self-organized creative work, the present work solicited feedback from a group of visual artists regarding their work processes and concerns. The paper presents the qualitative methodology followed for collecting requirements from the target audience of the envisioned software solution. The data gathered from the focus group is analyzed and we conclude with a set of important requirements that the future platform needs to fulfill.
Submitted 6 October, 2021;
originally announced October 2021.
-
Original Research By Young Twinkle Students (ORBYTS): Ephemeris Refinement of Transiting Exoplanets II
Authors:
Billy Edwards,
Lara Anisman,
Quentin Changeat,
Mario Morvan,
Sam Wright,
Kai Hou Yip,
Amiira Abdullahi,
Jesmin Ali,
Clarry Amofa,
Antony Antoniou,
Shahad Arzouni,
Noeka Bradley,
Dayanara Campana,
Nandini Chavda,
Jessy Creswell,
Neliman Gazieva,
Emily Gudgeon-Sidelnikova,
Pratap Guha,
Ella Hayden,
Mohammed Huda,
Hana Hussein,
Ayub Ibrahim,
Chika Ike,
Salma Jama,
Bhavya Joshi
, et al. (38 additional authors not shown)
Abstract:
We report follow-up observations of four transiting exoplanets, TRES-2b, HAT-P-22b, HAT-P-36b and XO-2b, as part of the Original Research By Young Twinkle Students (ORBYTS) programme. These observations were taken using the Las Cumbres Observatory Global Telescope Network's (LCOGT) robotic 0.4 m telescopes and were analysed using the HOlomon Photometric Software (HOPS). Such observations are key for ensuring accurate transit times for upcoming telescopes, such as the James Webb Space Telescope (JWST), Twinkle and Ariel, which may seek to characterise the atmospheres of these planets. The data have been uploaded to ExoClock and a significant portion of this work has been completed by secondary school students in London.
Submitted 14 July, 2020;
originally announced July 2020.
-
Defining Benchmarks for Continual Few-Shot Learning
Authors:
Antreas Antoniou,
Massimiliano Patacchiola,
Mateusz Ochal,
Amos Storkey
Abstract:
Both few-shot and continual learning have seen substantial progress in recent years due to the introduction of proper benchmarks. That being said, the field has yet to frame a suite of benchmarks for the highly desirable setting of continual few-shot learning, where the learner is presented with a number of few-shot tasks, one after the other, and then asked to perform well on a validation set stemming from all previously seen tasks. Continual few-shot learning has a small computational footprint and is thus an excellent setting for efficient investigation and experimentation. In this paper we first define a theoretical framework for continual few-shot learning, taking into account recent literature, then we propose a range of flexible benchmarks that unify the evaluation criteria and allow exploring the problem from multiple perspectives. As part of the benchmark, we introduce a compact variant of ImageNet, called SlimageNet64, which retains all original 1000 classes but only contains 200 instances of each one (a total of 200K data-points) downscaled to 64 x 64 pixels. We provide baselines for the proposed benchmarks using a number of popular few-shot learning algorithms, thereby exposing previously unknown strengths and weaknesses of those algorithms in continual and data-limited settings.
Submitted 15 April, 2020;
originally announced April 2020.
-
Meta-Learning in Neural Networks: A Survey
Authors:
Timothy Hospedales,
Antreas Antoniou,
Paul Micaelli,
Amos Storkey
Abstract:
The field of meta-learning, or learning-to-learn, has seen a dramatic rise in interest in recent years. Contrary to conventional approaches to AI where tasks are solved from scratch using a fixed learning algorithm, meta-learning aims to improve the learning algorithm itself, given the experience of multiple learning episodes. This paradigm provides an opportunity to tackle many conventional challenges of deep learning, including data and computation bottlenecks, as well as generalization. This survey describes the contemporary meta-learning landscape. We first discuss definitions of meta-learning and position it with respect to related fields, such as transfer learning and hyperparameter optimization. We then propose a new taxonomy that provides a more comprehensive breakdown of the space of meta-learning methods today. We survey promising applications and successes of meta-learning such as few-shot learning and reinforcement learning. Finally, we discuss outstanding challenges and promising areas for future research.
Submitted 7 November, 2020; v1 submitted 11 April, 2020;
originally announced April 2020.
-
On Atomic Density of Numerical Semigroup Algebras
Authors:
A. A. Antoniou,
R. A. C. Edmonds,
B. Kubik,
C. O'Neill,
S. Talbott
Abstract:
A numerical semigroup $S$ is a cofinite, additively-closed subset of the nonnegative integers that contains $0$. In this paper, we initiate the study of atomic density, an asymptotic measure of the proportion of irreducible elements in a given ring or semigroup, for semigroup algebras. It is known that the atomic density of the polynomial ring $\mathbb{F}_q[x]$ is zero for any finite field $\mathbb{F}_q$; we prove that the numerical semigroup algebra $\mathbb{F}_q[S]$ also has atomic density zero for any numerical semigroup~$S$. We also examine the particular algebra $\mathbb{F}_2[x^2,x^3]$ in more detail, providing a bound on the rate of convergence of the atomic density as well as a counting formula for irreducible polynomials using Möbius inversion, comparable to the formula for irreducible polynomials over a finite field $\mathbb{F}_q$.
Submitted 6 March, 2021; v1 submitted 3 March, 2020;
originally announced March 2020.
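The classical counting formula that the abstract above compares against can be illustrated with a short sketch. The number of monic irreducible polynomials of degree n over $\mathbb{F}_q$ is $\frac{1}{n}\sum_{d \mid n} \mu(d)\, q^{n/d}$, so the fraction of monic degree-n polynomials that are irreducible behaves like $1/n$ and tends to zero. The code covers only this known $\mathbb{F}_q[x]$ case, not the semigroup algebra result, and the function names are illustrative; the paper's precise definition of atomic density may differ from the simple per-degree fraction used here.

```python
from fractions import Fraction


def mobius(n):
    """Moebius function mu(n), computed by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # n has a squared prime factor
            result = -result
        p += 1
    if n > 1:
        result = -result  # one remaining prime factor
    return result


def count_monic_irreducibles(q, n):
    """Monic irreducible polynomials of degree n over F_q (Moebius inversion)."""
    total = sum(mobius(d) * q ** (n // d) for d in range(1, n + 1) if n % d == 0)
    return total // n


def irreducible_fraction(q, n):
    """Fraction of the q**n monic degree-n polynomials that are irreducible."""
    return Fraction(count_monic_irreducibles(q, n), q ** n)


# The per-degree fraction over F_2 shrinks roughly like 1/n.
fractions_over_f2 = [irreducible_fraction(2, n) for n in (1, 2, 4, 8, 16)]
```

Using `Fraction` keeps the ratios exact, which makes the decay easy to inspect without floating-point noise.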
-
Learning to learn via Self-Critique
Authors:
Antreas Antoniou,
Amos Storkey
Abstract:
In few-shot learning, a machine learning system learns from a small set of labelled examples relating to a specific task, such that it can generalize to new examples of the same task. Given the limited availability of labelled examples in such tasks, we wish to make use of all the information we can. Usually a model learns task-specific information from a small training-set (support-set) to predict on an unlabelled validation set (target-set). The target-set contains additional task-specific information which is not utilized by existing few-shot learning methods. Making use of the target-set examples via transductive learning requires approaches beyond the current methods; at inference time, the target-set contains only unlabelled input data-points, and so discriminative learning cannot be used. In this paper, we propose a framework called Self-Critique and Adapt or SCA, which learns to learn a label-free loss function, parameterized as a neural network. A base-model learns on a support-set using existing methods (e.g. stochastic gradient descent combined with the cross-entropy loss), and then is updated for the incoming target-task using the learnt loss function. This label-free loss function is itself optimized such that the learnt model achieves higher generalization performance. Experiments demonstrate that SCA offers substantially reduced error-rates compared to baselines which only adapt on the support-set, and results in state of the art benchmark performance on Mini-ImageNet and Caltech-UCSD Birds 200.
Submitted 30 January, 2020; v1 submitted 24 May, 2019;
originally announced May 2019.
-
Assume, Augment and Learn: Unsupervised Few-Shot Meta-Learning via Random Labels and Data Augmentation
Authors:
Antreas Antoniou,
Amos Storkey
Abstract:
The field of few-shot learning has been laboriously explored in the supervised setting, where per-class labels are available. On the other hand, the unsupervised few-shot learning setting, where no labels of any kind are required, has seen little investigation. We propose a method, named Assume, Augment and Learn or AAL, for generating few-shot tasks using unlabeled data. We randomly label a random subset of images from an unlabeled dataset to generate a support set. Then by applying data augmentation on the support set's images, and reusing the support set's labels, we obtain a target set. The resulting few-shot tasks can be used to train any standard meta-learning framework. Once trained, such a model can be directly applied to small real-labeled datasets without any changes or fine-tuning required. In our experiments, the learned models achieve good generalization performance in a variety of established few-shot learning tasks on Omniglot and Mini-Imagenet.
Submitted 5 March, 2019; v1 submitted 26 February, 2019;
originally announced February 2019.
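The task-generation step described in the AAL abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names `make_aal_task` and `flip_augment`, the N-way K-shot layout, and the choice of a random horizontal flip as the augmentation are all hypothetical.

```python
import numpy as np


def make_aal_task(unlabeled, n_way, k_shot, augment, rng):
    """Build one few-shot task from unlabeled data (hypothetical helper)."""
    # Draw a random subset of distinct items from the unlabeled pool.
    idx = rng.choice(len(unlabeled), size=n_way * k_shot, replace=False)
    support_x = unlabeled[idx]
    # Because the items were drawn at random, giving the first k_shot items
    # label 0, the next k_shot label 1, and so on is a random labelling.
    support_y = np.repeat(np.arange(n_way), k_shot)
    # The target set is an augmented copy of the support set that reuses
    # the support set's labels.
    target_x = np.stack([augment(x, rng) for x in support_x])
    target_y = support_y.copy()
    return (support_x, support_y), (target_x, target_y)


def flip_augment(image, rng):
    """Assumed augmentation: horizontal flip with probability 0.5."""
    return image[:, ::-1] if rng.random() < 0.5 else image


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    pool = rng.random((50, 8, 8))  # stand-in for an unlabeled image dataset
    (sx, sy), (tx, ty) = make_aal_task(pool, n_way=5, k_shot=2,
                                       augment=flip_augment, rng=rng)
    print(sx.shape, tx.shape, sy.tolist())
```

Tasks produced this way have the same (support, target) shape as supervised few-shot tasks, which is why the abstract can say they plug into any standard meta-learning framework.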
-
Dilated DenseNets for Relational Reasoning
Authors:
Antreas Antoniou,
Agnieszka Słowik,
Elliot J. Crowley,
Amos Storkey
Abstract:
Despite their impressive performance in many tasks, deep neural networks often struggle at relational reasoning. This has recently been remedied with the introduction of a plug-in relational module that considers relations between pairs of objects. Unfortunately, this is combinatorially expensive. In this extended abstract, we show that a DenseNet incorporating dilated convolutions excels at relational reasoning on the Sort-of-CLEVR dataset, allowing us to forgo this relational module and its associated expense.
Submitted 1 November, 2018;
originally announced November 2018.
-
How to train your MAML
Authors:
Antreas Antoniou,
Harrison Edwards,
Amos Storkey
Abstract:
The field of few-shot learning has recently seen substantial advancements. Most of these advancements came from casting few-shot learning as a meta-learning problem. Model Agnostic Meta Learning or MAML is currently one of the best approaches for few-shot learning via meta-learning. MAML is simple, elegant and very powerful; however, it has a variety of issues, such as being very sensitive to neural network architectures, often leading to instability during training, requiring arduous hyperparameter searches to stabilize training and achieve high generalization, and being very computationally expensive at both training and inference times. In this paper, we propose various modifications to MAML that not only stabilize the system, but also substantially improve its generalization performance, convergence speed and computational overhead; we call the resulting method MAML++.
Submitted 5 March, 2019; v1 submitted 22 October, 2018;
originally announced October 2018.
-
CINIC-10 is not ImageNet or CIFAR-10
Authors:
Luke N. Darlow,
Elliot J. Crowley,
Antreas Antoniou,
Amos J. Storkey
Abstract:
In this brief technical report we introduce the CINIC-10 dataset as a plug-in extended alternative for CIFAR-10. It was compiled by combining CIFAR-10 with images selected and downsampled from the ImageNet database. We present the approach to compiling the dataset, illustrate the example images for different classes, give pixel distributions for each part of the repository, and give some standard benchmarks for well-known models. Details for download, usage, and compilation can be found in the associated GitHub repository.
Submitted 2 October, 2018;
originally announced October 2018.
-
On the Arithmetic of Power Monoids and Sumsets in Cyclic Groups
Authors:
Austin A. Antoniou,
Salvatore Tringali
Abstract:
Let $H$ be a multiplicatively written monoid with identity $1_H$ (in particular, a group). We denote by $\mathcal P_{\rm fin,\times}(H)$ the monoid obtained by endowing the collection of all finite subsets of $H$ containing a unit with the operation of setwise multiplication $(X,Y) \mapsto \{xy: x \in X, y \in Y\}$; and study fundamental features of the arithmetic of this and related structures, with a focus on the submonoid, $\mathcal P_{\text{fin},1}(H)$, of $\mathcal P_{\text{fin},\times}(H)$ consisting of all finite subsets $X$ of $H$ with $1_H \in X$.
Among others, we prove that $\mathcal{P}_{\text{fin},1}(H)$ is atomic (i.e., each non-unit is a product of irreducibles) iff $1_H \ne x^2 \ne x$ for every $x \in H \setminus \{1_H\}$. Then we obtain that $\mathcal{P}_{\text{fin},1}(H)$ is BF (i.e., it is atomic and every element has factorizations of bounded length) iff $H$ is torsion-free; and show how to transfer these conclusions to $\mathcal P_{\text{fin},\times}(H)$.
Next, we introduce "minimal factorizations" to account for the fact that monoids may have non-trivial idempotents, in which case standard definitions from Factorization Theory degenerate. Accordingly, we obtain conditions for $\mathcal P_{\text{fin},\times}(H)$ to be BmF (meaning that each non-unit has minimal factorizations of bounded length); and for $\mathcal{P}_{\text{fin},1}(H)$ to be BmF, HmF (i.e., a BmF-monoid where all the minimal factorizations of a given element have the same length), or minimally factorial (i.e., a BmF-monoid where each element has an essentially unique minimal factorization). Finally, we prove how to realize certain intervals as sets of minimal lengths in $\mathcal P_{\text{fin},1}(H)$.
Many proofs come down to considering sumset decompositions in cyclic groups, so giving rise to an intriguing interplay with Arithmetic Combinatorics.
Submitted 1 December, 2020; v1 submitted 29 April, 2018;
originally announced April 2018.
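The setwise operation $(X,Y) \mapsto \{xy: x \in X, y \in Y\}$ from the abstract above can be illustrated with a small sketch. It assumes the cyclic group $\mathbb{Z}/5\mathbb{Z}$ written additively, so the setwise product becomes a sumset and the identity is 0; the helper names `setwise_product` and `cyclic_add` are hypothetical and chosen only for this example.

```python
from itertools import product


def setwise_product(X, Y, op):
    """Return {op(x, y) : x in X, y in Y} as a frozenset."""
    return frozenset(op(x, y) for x, y in product(X, Y))


def cyclic_add(n):
    """Group operation of the cyclic group Z/nZ, written additively."""
    return lambda x, y: (x + y) % n


if __name__ == "__main__":
    add5 = cyclic_add(5)
    # Both sets contain the identity 0, so both lie in the submonoid
    # of finite subsets containing the identity.
    X = frozenset({0, 1})
    Y = frozenset({0, 2})
    print(sorted(setwise_product(X, Y, add5)))  # [0, 1, 2, 3]
```

Because both factors contain the identity, each of them is a subset of the product, which is one reason factorizations in this submonoid reduce to sumset decompositions.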
-
Data Augmentation Generative Adversarial Networks
Authors:
Antreas Antoniou,
Amos Storkey,
Harrison Edwards
Abstract:
Effective training of neural networks requires much data. In the low-data regime, parameters are underdetermined, and learnt networks generalise poorly. Data Augmentation alleviates this by using existing data more effectively. However standard data augmentation produces only limited plausible alternative data. Given there is potential to generate a much broader set of augmentations, we design and train a generative model to do data augmentation. The model, based on image conditional Generative Adversarial Networks, takes data from a source domain and learns to take any data item and generalise it to generate other within-class data items. As this generative process does not depend on the classes themselves, it can be applied to novel unseen classes of data. We show that a Data Augmentation Generative Adversarial Network (DAGAN) augments standard vanilla classifiers well. We also show a DAGAN can enhance few-shot learning systems such as Matching Networks. We demonstrate these approaches on Omniglot, on EMNIST having learnt the DAGAN on Omniglot, and VGG-Face data. In our experiments we can see over 13% increase in accuracy in the low-data regime experiments in Omniglot (from 69% to 82%), EMNIST (73.9% to 76%) and VGG-Face (4.5% to 12%); in Matching Networks for Omniglot we observe an increase of 0.5% (from 96.9% to 97.4%) and an increase of 1.8% in EMNIST (from 59.5% to 61.3%).
Submitted 21 March, 2018; v1 submitted 12 November, 2017;
originally announced November 2017.
-
Matching or Crashing? Personality-based Team Formation in Crowdsourcing Environments
Authors:
Ioanna Lykourentzou,
Angeliki Antoniou,
Yannick Naudet
Abstract:
"Does placing workers together based on their personality give better performance results in cooperative crowdsourcing settings, compared to non-personality based crowd team formation?" In this work we examine the impact of personality compatibility on the effectiveness of crowdsourced team work. Using a personality-based group dynamics approach, we examine two main types of personality combinations (matching and crashing) on two main types of tasks (collaborative and competitive). Our experimental results show that personality compatibility significantly affects the quality of the team's final outcome, the quality of interactions and the emotions experienced by the team members. The present study is the first to examine the effect of personality over team result in crowdsourcing settings, and it has practical implications for the better design of crowdsourced team work.
Submitted 26 January, 2015;
originally announced January 2015.
-
Study of the C IV ($λλ$1548.187 - 1550.772), Si IV ($λλ$1393.755 - 1402.770) and O IV ($λ$1401.156) Regions of the QSO J021327.25-001446.93
Authors:
Ch. Katsavrias,
E. Danezis,
A. Antoniou
Abstract:
Broad Absorption Line Regions (BALRs) are composed of a number of successive independent absorbing density layers. Using the GR model, we analyze the UV Si IV (λλ1393.755 - 1402.770), O IV (λ1401.156) and C IV (λλ1548.187 - 1550.772) resonance lines in the spectra of a certain QSO and discuss the results concerning its kinematic properties (rotational, radial and random velocities).
Submitted 13 May, 2014;
originally announced May 2014.
-
Improved Design Method for Nearly Linear-Phase IIR Filters Using Constrained Optimization
Authors:
R. C. Nongpiur,
D. J. Shpak,
A. Antoniou
Abstract:
A new optimization method for the design of nearly linear-phase IIR digital filters that satisfy prescribed specifications is proposed. The group-delay deviation is minimized under the constraint that the passband ripple and stopband attenuation are within the prescribed specifications and either a prescribed or an optimized group delay can be achieved. By representing the filter in terms of a cascade of second-order sections, a non-restrictive stability constraint characterized by a set of linear inequality constraints can be incorporated in the optimization algorithm. An additional feature of the method, which is very useful in certain applications, is that it provides the capability of constraining the maximum gain in transition bands to be below a prescribed level. Experimental results show that filters designed using the proposed method have much lower group-delay deviation for the same passband ripple and stopband attenuation when compared with corresponding filters designed with several state-of-the-art competing methods.
Submitted 13 February, 2014;
originally announced February 2014.
-
A new model for the structure of the DACs and SACs regions in the Oe and Be stellar atmospheres
Authors:
E. Danezis,
D. Nikolaidis,
E. Lyratzi,
L. Č. Popović,
M. S. Dimitrijević,
A. Antoniou,
E. Theodosiou
Abstract:
In this paper we present a new mathematical model for the density regions where a specific spectral line and its SACs/DACs are created in the Oe and Be stellar atmospheres. In the calculations of final spectral line function we consider that the main reasons of the line broadening are the rotation of the density regions creating the spectral line and its DACs/SACs, as well as the random motions of the ions. This line function is able to reproduce the spectral feature and it enables us to calculate some important physical parameters, such as the rotational, the radial and the random velocities, the Full Width at Half Maximum, the Gaussian deviation, the optical depth, the column density and the absorbed or emitted energy. Additionally, we can calculate the percentage of the contribution of the rotational velocity and the ions' random motions of the DACs/SACs regions to the line broadening. Finally, we present two tests and three short applications of the proposed model.
Submitted 10 May, 2007;
originally announced May 2007.
-
The complex structure of the Mg II λλ 2795.523, 2802.698 Å regions of 64 Be stars
Authors:
E. Lyratzi,
E. Danezis,
L. Č. Popović,
M. S. Dimitrijević,
D. Nikolaidis,
A. Antoniou
Abstract:
We study the presence of absorption components shifted to the violet or the red side of the main spectral line (satellite or discrete absorption components, i.e. SACs or DACs) in the Mg II resonance line regions of Be stars, together with their kinematical characteristics. Our objective is to check whether a common physical structure exists for the atmospheric regions creating SACs or DACs of the Mg II resonance lines. To do this, we perform a statistical study of the Mg II λλ 2795.523, 2802.698 Å lines in the spectra of 64 Be stars of all spectral subtypes and luminosity classes. We found that the absorbing atmospheric regions where the Mg II resonance lines originate may be formed of several independent density layers of matter which rotate with different velocities. We also attempt to separate SACs and DACs according to low or high radial velocity. Emission lines were detected only in the earliest and latest spectral subtypes.
Submitted 22 February, 2007;
originally announced February 2007.
-
Price Clustering and Discreteness: Is there Chaos behind the Noise?
Authors:
Antonios Antoniou,
Constantinos E. Vorlow
Abstract:
We investigate the "compass rose" (Crack, T.F. and Ledoit, O. (1996), Journal of Finance, 51(2), pg. 751-762) patterns revealed in phase portraits (delay plots) of stock returns. The structures observed in these diagrams have been attributed mainly to price clustering and discreteness. Using wavelet based denoising, we examine the noise-free versions of a set of FTSE100 stock returns time series. We reveal evidence of non-periodic cyclical dynamics. As a second stage we apply Surrogate Data Analysis on the original and denoised stock returns. Our results suggest that there is a strong nonlinear and possibly deterministic signature in the data generating processes of the stock returns sequences.
Submitted 18 July, 2004;
originally announced July 2004.