Search | arXiv e-print repository

arXiv:2405.19874 [pdf, other]

Is In-Context Learning Sufficient for Instruction Following in LLMs?

Authors: Hao Zhao, Maksym Andriushchenko, Francesco Croce, Nicolas Flammarion

Abstract: In-context learning (ICL) allows LLMs to learn from examples without changing their weights, which is a particularly promising capability for long-context LLMs that can potentially learn from many examples. Recently, Lin et al. (2024) proposed URIAL, a method using only three in-context examples to align base LLMs, achieving non-trivial instruction following performance. In this work, we show that… ▽ More In-context learning (ICL) allows LLMs to learn from examples without changing their weights, which is a particularly promising capability for long-context LLMs that can potentially learn from many examples. Recently, Lin et al. (2024) proposed URIAL, a method using only three in-context examples to align base LLMs, achieving non-trivial instruction following performance. In this work, we show that, while effective, ICL alignment with URIAL still underperforms compared to instruction fine-tuning on established benchmarks such as MT-Bench and AlpacaEval 2.0 (LC), especially with more capable base LMs. Unlike for tasks such as classification, translation, or summarization, adding more ICL demonstrations for long-context LLMs does not systematically improve instruction following performance. To address this limitation, we derive a greedy selection approach for ICL examples that noticeably improves performance, yet without bridging the gap to instruction fine-tuning. Finally, we provide a series of ablation studies to better understand the reasons behind the remaining gap, and we show how some aspects of ICL depart from the existing knowledge and are specific to the instruction tuning setting. Overall, our work advances the understanding of ICL as an alignment technique. We provide our code at https://github.com/tml-epfl/icl-alignment. △ Less

Submitted 30 May, 2024; originally announced May 2024.

Comments: Preprint. Code at https://github.com/tml-epfl/icl-alignment

arXiv:2404.14461 [pdf, other]

Competition Report: Finding Universal Jailbreak Backdoors in Aligned LLMs

Authors: Javier Rando, Francesco Croce, Kryštof Mitka, Stepan Shabalin, Maksym Andriushchenko, Nicolas Flammarion, Florian Tramèr

Abstract: Large language models are aligned to be safe, preventing users from generating harmful content like misinformation or instructions for illegal activities. However, previous work has shown that the alignment process is vulnerable to poisoning attacks. Adversaries can manipulate the safety training data to inject backdoors that act like a universal sudo command: adding the backdoor string to any pro… ▽ More Large language models are aligned to be safe, preventing users from generating harmful content like misinformation or instructions for illegal activities. However, previous work has shown that the alignment process is vulnerable to poisoning attacks. Adversaries can manipulate the safety training data to inject backdoors that act like a universal sudo command: adding the backdoor string to any prompt enables harmful responses from models that, otherwise, behave safely. Our competition, co-located at IEEE SaTML 2024, challenged participants to find universal backdoors in several large language models. This report summarizes the key findings and promising ideas for future research. △ Less

Submitted 6 June, 2024; v1 submitted 22 April, 2024; originally announced April 2024.

Comments: Competition Report

arXiv:2404.02151 [pdf, other]

Jailbreaking Leading Safety-Aligned LLMs with Simple Adaptive Attacks

Authors: Maksym Andriushchenko, Francesco Croce, Nicolas Flammarion

Abstract: We show that even the most recent safety-aligned LLMs are not robust to simple adaptive jailbreaking attacks. First, we demonstrate how to successfully leverage access to logprobs for jailbreaking: we initially design an adversarial prompt template (sometimes adapted to the target LLM), and then we apply random search on a suffix to maximize a target logprob (e.g., of the token ``Sure''), potentia… ▽ More We show that even the most recent safety-aligned LLMs are not robust to simple adaptive jailbreaking attacks. First, we demonstrate how to successfully leverage access to logprobs for jailbreaking: we initially design an adversarial prompt template (sometimes adapted to the target LLM), and then we apply random search on a suffix to maximize a target logprob (e.g., of the token ``Sure''), potentially with multiple restarts. In this way, we achieve nearly 100% attack success rate -- according to GPT-4 as a judge -- on Vicuna-13B, Mistral-7B, Phi-3-Mini, Nemotron-4-340B, Llama-2-Chat-7B/13B/70B, Llama-3-Instruct-8B, Gemma-7B, GPT-3.5, GPT-4, and R2D2 from HarmBench that was adversarially trained against the GCG attack. We also show how to jailbreak all Claude models -- that do not expose logprobs -- via either a transfer or prefilling attack with a 100% success rate. In addition, we show how to use random search on a restricted set of tokens for finding trojan strings in poisoned models -- a task that shares many similarities with jailbreaking -- which is the algorithm that brought us the first place in the SaTML'24 Trojan Detection Competition. The common theme behind these attacks is that adaptivity is crucial: different models are vulnerable to different prompting templates (e.g., R2D2 is very sensitive to in-context learning prompts), some models have unique vulnerabilities based on their APIs (e.g., prefilling for Claude), and in some settings, it is crucial to restrict the token search space based on prior knowledge (e.g., for trojan detection). For reproducibility purposes, we provide the code, logs, and jailbreak artifacts in the JailbreakBench format at https://github.com/tml-epfl/llm-adaptive-attacks. △ Less

Submitted 18 June, 2024; v1 submitted 2 April, 2024; originally announced April 2024.

Comments: Updates in the v2: more models (Llama3, Phi-3, Nemotron-4-340B), jailbreak artifacts for all attacks are available, evaluation of generalization to a different judge (Llama-3-70B and Llama Guard 2), more experiments (convergence plots over iterations, ablation on the suffix length for random search), improved exposition of the paper, examples of jailbroken generation

arXiv:2404.01318 [pdf, other]

JailbreakBench: An Open Robustness Benchmark for Jailbreaking Large Language Models

Authors: Patrick Chao, Edoardo Debenedetti, Alexander Robey, Maksym Andriushchenko, Francesco Croce, Vikash Sehwag, Edgar Dobriban, Nicolas Flammarion, George J. Pappas, Florian Tramer, Hamed Hassani, Eric Wong

Abstract: Jailbreak attacks cause large language models (LLMs) to generate harmful, unethical, or otherwise objectionable content. Evaluating these attacks presents a number of challenges, which the current collection of benchmarks and evaluation techniques do not adequately address. First, there is no clear standard of practice regarding jailbreaking evaluation. Second, existing works compute costs and suc… ▽ More Jailbreak attacks cause large language models (LLMs) to generate harmful, unethical, or otherwise objectionable content. Evaluating these attacks presents a number of challenges, which the current collection of benchmarks and evaluation techniques do not adequately address. First, there is no clear standard of practice regarding jailbreaking evaluation. Second, existing works compute costs and success rates in incomparable ways. And third, numerous works are not reproducible, as they withhold adversarial prompts, involve closed-source code, or rely on evolving proprietary APIs. To address these challenges, we introduce JailbreakBench, an open-sourced benchmark with the following components: (1) an evolving repository of state-of-the-art adversarial prompts, which we refer to as jailbreak artifacts; (2) a jailbreaking dataset comprising 100 behaviors -- both original and sourced from prior work (Zou et al., 2023; Mazeika et al., 2023, 2024) -- which align with OpenAI's usage policies; (3) a standardized evaluation framework at https://github.com/JailbreakBench/jailbreakbench that includes a clearly defined threat model, system prompts, chat templates, and scoring functions; and (4) a leaderboard at https://jailbreakbench.github.io/ that tracks the performance of attacks and defenses for various LLMs. We have carefully considered the potential ethical implications of releasing this benchmark, and believe that it will be a net positive for the community. △ Less

Submitted 16 July, 2024; v1 submitted 27 March, 2024; originally announced April 2024.

Comments: JailbreakBench v1.0: more attack artifacts, more test-time defenses, a more accurate jailbreak judge (Llama-3-70B with a custom prompt), a larger dataset of human preferences for selecting a jailbreak judge (300 examples), an over-refusal evaluation dataset (100 benign/borderline behaviors), a semantic refusal judge based on Llama-3-8B

arXiv:2402.12336 [pdf, other]

Robust CLIP: Unsupervised Adversarial Fine-Tuning of Vision Embeddings for Robust Large Vision-Language Models

Authors: Christian Schlarmann, Naman Deep Singh, Francesco Croce, Matthias Hein

Abstract: Multi-modal foundation models like OpenFlamingo, LLaVA, and GPT-4 are increasingly used for various real-world tasks. Prior work has shown that these models are highly vulnerable to adversarial attacks on the vision modality. These attacks can be leveraged to spread fake information or defraud users, and thus pose a significant risk, which makes the robustness of large multi-modal foundation model… ▽ More Multi-modal foundation models like OpenFlamingo, LLaVA, and GPT-4 are increasingly used for various real-world tasks. Prior work has shown that these models are highly vulnerable to adversarial attacks on the vision modality. These attacks can be leveraged to spread fake information or defraud users, and thus pose a significant risk, which makes the robustness of large multi-modal foundation models a pressing problem. The CLIP model, or one of its variants, is used as a frozen vision encoder in many large vision-language models (LVLMs), e.g. LLaVA and OpenFlamingo. We propose an unsupervised adversarial fine-tuning scheme to obtain a robust CLIP vision encoder, which yields robustness on all vision down-stream tasks (LVLMs, zero-shot classification) that rely on CLIP. In particular, we show that stealth-attacks on users of LVLMs by a malicious third party providing manipulated images are no longer possible once one replaces the original CLIP model with our robust one. No retraining or fine-tuning of the down-stream LVLMs is required. The code and robust models are available at https://github.com/chs20/RobustVLM △ Less

Submitted 5 June, 2024; v1 submitted 19 February, 2024; originally announced February 2024.

Comments: ICML 2024 Oral

arXiv:2402.04833 [pdf, other]

Long Is More for Alignment: A Simple but Tough-to-Beat Baseline for Instruction Fine-Tuning

Authors: Hao Zhao, Maksym Andriushchenko, Francesco Croce, Nicolas Flammarion

Abstract: There is a consensus that instruction fine-tuning of LLMs requires high-quality data, but what are they? LIMA (NeurIPS 2023) and AlpaGasus (ICLR 2024) are state-of-the-art methods for selecting such high-quality examples, either via manual curation or using GPT-3.5-Turbo as a quality scorer. We show that the extremely simple baseline of selecting the 1,000 instructions with longest responses -- th… ▽ More There is a consensus that instruction fine-tuning of LLMs requires high-quality data, but what are they? LIMA (NeurIPS 2023) and AlpaGasus (ICLR 2024) are state-of-the-art methods for selecting such high-quality examples, either via manual curation or using GPT-3.5-Turbo as a quality scorer. We show that the extremely simple baseline of selecting the 1,000 instructions with longest responses -- that intuitively contain more learnable information and are harder to overfit -- from standard datasets can consistently outperform these sophisticated methods according to GPT-4 and PaLM-2 as judges, while remaining competitive on the Open LLM benchmarks that test factual knowledge. We demonstrate this for several LLMs (Llama-2-7B, Llama-2-13B, Mistral-7B-v0.1) and datasets (Alpaca-52k, Evol-Instruct-70k). In addition, a lightweight refinement of such long instructions can further improve the abilities of the fine-tuned LLMs, and allows us to obtain competitive results on MT-Bench and the 2nd highest-ranked Llama-2-7B-based model on AlpacaEval 2.0, while training on only 1,000 examples and no extra preference data. We also conduct a thorough analysis of our models to ensure that their enhanced performance is not simply due to GPT-4's preference for longer responses. Overall, our findings suggest that fine-tuning on the longest responses should be the default baseline for any work on instruction fine-tuning. We provide our code at https://github.com/tml-epfl/long-is-more-for-alignment. △ Less

Submitted 4 June, 2024; v1 submitted 7 February, 2024; originally announced February 2024.

Comments: Accepted at ICML 2024. This camera-ready version adds MT-Bench evaluations, a human study, more thorough analysis of length bias. Code at https://github.com/tml-epfl/long-is-more-for-alignment

arXiv:2311.14450 [pdf, other]

Segment (Almost) Nothing: Prompt-Agnostic Adversarial Attacks on Segmentation Models

Authors: Francesco Croce, Matthias Hein

Abstract: General purpose segmentation models are able to generate (semantic) segmentation masks from a variety of prompts, including visual (points, boxed, etc.) and textual (object names) ones. In particular, input images are pre-processed by an image encoder to obtain embedding vectors which are later used for mask predictions. Existing adversarial attacks target the end-to-end tasks, i.e. aim at alterin… ▽ More General purpose segmentation models are able to generate (semantic) segmentation masks from a variety of prompts, including visual (points, boxed, etc.) and textual (object names) ones. In particular, input images are pre-processed by an image encoder to obtain embedding vectors which are later used for mask predictions. Existing adversarial attacks target the end-to-end tasks, i.e. aim at altering the segmentation mask predicted for a specific image-prompt pair. However, this requires running an individual attack for each new prompt for the same image. We propose instead to generate prompt-agnostic adversarial attacks by maximizing the $\ell_2$-distance, in the latent space, between the embedding of the original and perturbed images. Since the encoding process only depends on the image, distorted image representations will cause perturbations in the segmentation masks for a variety of prompts. We show that even imperceptible $\ell_\infty$-bounded perturbations of radius $ε=1/255$ are often sufficient to drastically modify the masks predicted with point, box and text prompts by recently proposed foundation models for segmentation. Moreover, we explore the possibility of creating universal, i.e. non image-specific, attacks which can be readily applied to any input without further computational cost. △ Less

Submitted 24 November, 2023; originally announced November 2023.

arXiv:2306.12941 [pdf, other]

Towards Reliable Evaluation and Fast Training of Robust Semantic Segmentation Models

Authors: Francesco Croce, Naman D Singh, Matthias Hein

Abstract: Adversarial robustness has been studied extensively in image classification, especially for the $\ell_\infty$-threat model, but significantly less so for related tasks such as object detection and semantic segmentation, where attacks turn out to be a much harder optimization problem than for image classification. We propose several problem-specific novel attacks minimizing different metrics in acc… ▽ More Adversarial robustness has been studied extensively in image classification, especially for the $\ell_\infty$-threat model, but significantly less so for related tasks such as object detection and semantic segmentation, where attacks turn out to be a much harder optimization problem than for image classification. We propose several problem-specific novel attacks minimizing different metrics in accuracy and mIoU. The ensemble of our attacks, SEA, shows that existing attacks severely overestimate the robustness of semantic segmentation models. Surprisingly, existing attempts of adversarial training for semantic segmentation models turn out to be weak or even completely non-robust. We investigate why previous adaptations of adversarial training to semantic segmentation failed and show how recently proposed robust ImageNet backbones can be used to obtain adversarially robust semantic segmentation models with up to six times less training time for PASCAL-VOC and the more challenging ADE20k. The associated code and robust models are available at https://github.com/nmndeep/robust-segmentation △ Less

Submitted 16 July, 2024; v1 submitted 22 June, 2023; originally announced June 2023.

Comments: ECCV 2024

arXiv:2303.01870 [pdf, other]

Revisiting Adversarial Training for ImageNet: Architectures, Training and Generalization across Threat Models

Authors: Naman D Singh, Francesco Croce, Matthias Hein

Abstract: While adversarial training has been extensively studied for ResNet architectures and low resolution datasets like CIFAR, much less is known for ImageNet. Given the recent debate about whether transformers are more robust than convnets, we revisit adversarial training on ImageNet comparing ViTs and ConvNeXts. Extensive experiments show that minor changes in architecture, most notably replacing Patc… ▽ More While adversarial training has been extensively studied for ResNet architectures and low resolution datasets like CIFAR, much less is known for ImageNet. Given the recent debate about whether transformers are more robust than convnets, we revisit adversarial training on ImageNet comparing ViTs and ConvNeXts. Extensive experiments show that minor changes in architecture, most notably replacing PatchStem with ConvStem, and training scheme have a significant impact on the achieved robustness. These changes not only increase robustness in the seen $\ell_\infty$-threat model, but even more so improve generalization to unseen $\ell_1/\ell_2$-attacks. Our modified ConvNeXt, ConvNeXt + ConvStem, yields the most robust $\ell_\infty$-models across different ranges of model parameters and FLOPs, while our ViT + ConvStem yields the best generalization to unseen threat models. △ Less

Submitted 28 October, 2023; v1 submitted 3 March, 2023; originally announced March 2023.

Comments: Accepted at NeurIPS 2023

arXiv:2302.10826 [pdf, ps, other]

ITERATED INSIDE OUT: a new exact algorithm for the transportation problem

Authors: Roberto Bargetto, Federico Della Croce, Rosario Scatamacchia

Abstract: We propose a novel exact algorithm for the transportation problem, one of the paradigmatic network optimization problems. The algorithm, denoted Iterated Inside Out, requires in input a basic feasible solution and is composed by two main phases that are iteratively repeated until an optimal basic feasible solution is reached. In the first "inside" phase, the algorithm progressively improves upon a… ▽ More We propose a novel exact algorithm for the transportation problem, one of the paradigmatic network optimization problems. The algorithm, denoted Iterated Inside Out, requires in input a basic feasible solution and is composed by two main phases that are iteratively repeated until an optimal basic feasible solution is reached. In the first "inside" phase, the algorithm progressively improves upon a given basic solution by increasing the value of several non-basic variables with negative reduced cost. This phase typically outputs a non-basic feasible solution interior to the constraints set polytope. The second "out" phase moves in the opposite direction by iteratively setting to zero several variables until a new improved basic feasible solution is reached. Extensive computational tests show that the proposed approach strongly outperforms all versions of network and linear programming algorithms available in the commercial solvers Cplex and Gurobi and other exact algorithms available in the literature. △ Less

Submitted 29 March, 2023; v1 submitted 21 February, 2023; originally announced February 2023.

arXiv:2302.10164 [pdf, other]

Seasoning Model Soups for Robustness to Adversarial and Natural Distribution Shifts

Authors: Francesco Croce, Sylvestre-Alvise Rebuffi, Evan Shelhamer, Sven Gowal

Abstract: Adversarial training is widely used to make classifiers robust to a specific threat or adversary, such as $\ell_p$-norm bounded perturbations of a given $p$-norm. However, existing methods for training classifiers robust to multiple threats require knowledge of all attacks during training and remain vulnerable to unseen distribution shifts. In this work, we describe how to obtain adversarially-rob… ▽ More Adversarial training is widely used to make classifiers robust to a specific threat or adversary, such as $\ell_p$-norm bounded perturbations of a given $p$-norm. However, existing methods for training classifiers robust to multiple threats require knowledge of all attacks during training and remain vulnerable to unseen distribution shifts. In this work, we describe how to obtain adversarially-robust model soups (i.e., linear combinations of parameters) that smoothly trade-off robustness to different $\ell_p$-norm bounded adversaries. We demonstrate that such soups allow us to control the type and level of robustness, and can achieve robustness to all threats without jointly training on all of them. In some cases, the resulting model soups are more robust to a given $\ell_p$-norm adversary than the constituent model specialized against that same adversary. Finally, we show that adversarially-robust model soups can be a viable tool to adapt to distribution shifts from a few examples. △ Less

Submitted 20 February, 2023; originally announced February 2023.

arXiv:2302.07011 [pdf, other]

A Modern Look at the Relationship between Sharpness and Generalization

Authors: Maksym Andriushchenko, Francesco Croce, Maximilian Müller, Matthias Hein, Nicolas Flammarion

Abstract: Sharpness of minima is a promising quantity that can correlate with generalization in deep networks and, when optimized during training, can improve generalization. However, standard sharpness is not invariant under reparametrizations of neural networks, and, to fix this, reparametrization-invariant sharpness definitions have been proposed, most prominently adaptive sharpness (Kwon et al., 2021).… ▽ More Sharpness of minima is a promising quantity that can correlate with generalization in deep networks and, when optimized during training, can improve generalization. However, standard sharpness is not invariant under reparametrizations of neural networks, and, to fix this, reparametrization-invariant sharpness definitions have been proposed, most prominently adaptive sharpness (Kwon et al., 2021). But does it really capture generalization in modern practical settings? We comprehensively explore this question in a detailed study of various definitions of adaptive sharpness in settings ranging from training from scratch on ImageNet and CIFAR-10 to fine-tuning CLIP on ImageNet and BERT on MNLI. We focus mostly on transformers for which little is known in terms of sharpness despite their widespread usage. Overall, we observe that sharpness does not correlate well with generalization but rather with some training parameters like the learning rate that can be positively or negatively correlated with generalization depending on the setup. Interestingly, in multiple cases, we observe a consistent negative correlation of sharpness with out-of-distribution error implying that sharper minima can generalize better. Finally, we illustrate on a simple model that the right sharpness measure is highly data-dependent, and that we do not understand well this aspect for realistic data distributions. The code of our experiments is available at https://github.com/tml-epfl/sharpness-vs-generalization. △ Less

Submitted 7 June, 2023; v1 submitted 14 February, 2023; originally announced February 2023.

Comments: The camera-ready version (accepted at ICML 2023)

arXiv:2210.11841 [pdf, other]

Diffusion Visual Counterfactual Explanations

Authors: Maximilian Augustin, Valentyn Boreiko, Francesco Croce, Matthias Hein

Abstract: Visual Counterfactual Explanations (VCEs) are an important tool to understand the decisions of an image classifier. They are 'small' but 'realistic' semantic changes of the image changing the classifier decision. Current approaches for the generation of VCEs are restricted to adversarially robust models and often contain non-realistic artefacts, or are limited to image classification problems with… ▽ More Visual Counterfactual Explanations (VCEs) are an important tool to understand the decisions of an image classifier. They are 'small' but 'realistic' semantic changes of the image changing the classifier decision. Current approaches for the generation of VCEs are restricted to adversarially robust models and often contain non-realistic artefacts, or are limited to image classification problems with few classes. In this paper, we overcome this by generating Diffusion Visual Counterfactual Explanations (DVCEs) for arbitrary ImageNet classifiers via a diffusion process. Two modifications to the diffusion process are key for our DVCEs: first, an adaptive parameterization, whose hyperparameters generalize across images and models, together with distance regularization and late start of the diffusion process, allow us to generate images with minimal semantic changes to the original ones but different classification. Second, our cone regularization via an adversarially robust model ensures that the diffusion process does not converge to trivial non-semantic changes, but instead produces realistic images of the target class which achieve high confidence by the classifier. △ Less

Submitted 21 October, 2022; originally announced October 2022.

Comments: NeurIPS 2022

arXiv:2210.04886 [pdf, other]

Revisiting adapters with adversarial training

Authors: Sylvestre-Alvise Rebuffi, Francesco Croce, Sven Gowal

Abstract: While adversarial training is generally used as a defense mechanism, recent works show that it can also act as a regularizer. By co-training a neural network on clean and adversarial inputs, it is possible to improve classification accuracy on the clean, non-adversarial inputs. We demonstrate that, contrary to previous findings, it is not necessary to separate batch statistics when co-training on… ▽ More While adversarial training is generally used as a defense mechanism, recent works show that it can also act as a regularizer. By co-training a neural network on clean and adversarial inputs, it is possible to improve classification accuracy on the clean, non-adversarial inputs. We demonstrate that, contrary to previous findings, it is not necessary to separate batch statistics when co-training on clean and adversarial inputs, and that it is sufficient to use adapters with few domain-specific parameters for each type of input. We establish that using the classification token of a Vision Transformer (ViT) as an adapter is enough to match the classification performance of dual normalization layers, while using significantly less additional parameters. First, we improve upon the top-1 accuracy of a non-adversarially trained ViT-B16 model by +1.12% on ImageNet (reaching 83.76% top-1 accuracy). Second, and more importantly, we show that training with adapters enables model soups through linear combinations of the clean and adversarial tokens. These model soups, which we call adversarial model soups, allow us to trade-off between clean and robust accuracy without sacrificing efficiency. Finally, we show that we can easily adapt the resulting models in the face of distribution shifts. Our ViT-B16 obtains top-1 accuracies on ImageNet variants that are on average +4.00% better than those obtained with Masked Autoencoders. △ Less

Submitted 10 October, 2022; originally announced October 2022.

arXiv:2209.06953 [pdf, other]

On the interplay of adversarial robustness and architecture components: patches, convolution and attention

Authors: Francesco Croce, Matthias Hein

Abstract: In recent years novel architecture components for image classification have been developed, starting with attention and patches used in transformers. While prior works have analyzed the influence of some aspects of architecture components on the robustness to adversarial attacks, in particular for vision transformers, the understanding of the main factors is still limited. We compare several (non)… ▽ More In recent years novel architecture components for image classification have been developed, starting with attention and patches used in transformers. While prior works have analyzed the influence of some aspects of architecture components on the robustness to adversarial attacks, in particular for vision transformers, the understanding of the main factors is still limited. We compare several (non)-robust classifiers with different architectures and study their properties, including the effect of adversarial training on the interpretability of the learnt features and robustness to unseen threat models. An ablation from ResNet to ConvNeXt reveals key architectural changes leading to almost $10\%$ higher $\ell_\infty$-robustness. △ Less

Submitted 14 September, 2022; originally announced September 2022.

Comments: Presented at the "New Frontiers in Adversarial Machine Learning" Workshop at ICML 2022

arXiv:2206.06182 [pdf, other]

AI-based Data Preparation and Data Analytics in Healthcare: The Case of Diabetes

Authors: Marianna Maranghi, Aris Anagnostopoulos, Irene Cannistraci, Ioannis Chatzigiannakis, Federico Croce, Giulia Di Teodoro, Michele Gentile, Giorgio Grani, Maurizio Lenzerini, Stefano Leonardi, Andrea Mastropietro, Laura Palagi, Massimiliano Pappa, Riccardo Rosati, Riccardo Valentini, Paola Velardi

Abstract: The Associazione Medici Diabetologi (AMD) collects and manages one of the largest worldwide-available collections of diabetic patient records, also known as the AMD database. This paper presents the initial results of an ongoing project whose focus is the application of Artificial Intelligence and Machine Learning techniques for conceptualizing, cleaning, and analyzing such an important and valuab… ▽ More The Associazione Medici Diabetologi (AMD) collects and manages one of the largest worldwide-available collections of diabetic patient records, also known as the AMD database. This paper presents the initial results of an ongoing project whose focus is the application of Artificial Intelligence and Machine Learning techniques for conceptualizing, cleaning, and analyzing such an important and valuable dataset, with the goal of providing predictive insights to better support diabetologists in their diagnostic and therapeutic choices. △ Less

Submitted 20 July, 2022; v1 submitted 13 June, 2022; originally announced June 2022.

Comments: The work has been presented at the conference Ital-IA 2022 (https://www.ital-ia2022.it/)

arXiv:2205.07972 [pdf, other]

doi 10.1007/978-3-031-16788-1_9

Sparse Visual Counterfactual Explanations in Image Space

Authors: Valentyn Boreiko, Maximilian Augustin, Francesco Croce, Philipp Berens, Matthias Hein

Abstract: Visual counterfactual explanations (VCEs) in image space are an important tool to understand decisions of image classifiers as they show under which changes of the image the decision of the classifier would change. Their generation in image space is challenging and requires robust models due to the problem of adversarial examples. Existing techniques to generate VCEs in image space suffer from spu… ▽ More Visual counterfactual explanations (VCEs) in image space are an important tool to understand decisions of image classifiers as they show under which changes of the image the decision of the classifier would change. Their generation in image space is challenging and requires robust models due to the problem of adversarial examples. Existing techniques to generate VCEs in image space suffer from spurious changes in the background. Our novel perturbation model for VCEs together with its efficient optimization via our novel Auto-Frank-Wolfe scheme yields sparse VCEs which lead to subtle changes specific for the target class. Moreover, we show that VCEs can be used to detect undesired behavior of ImageNet classifiers due to spurious features in the ImageNet dataset. △ Less

Submitted 29 September, 2022; v1 submitted 16 May, 2022; originally announced May 2022.

Journal ref: GCPR 2022

arXiv:2202.13711 [pdf, other]

Evaluating the Adversarial Robustness of Adaptive Test-time Defenses

Authors: Francesco Croce, Sven Gowal, Thomas Brunner, Evan Shelhamer, Matthias Hein, Taylan Cemgil

Abstract: Adaptive defenses, which optimize at test time, promise to improve adversarial robustness. We categorize such adaptive test-time defenses, explain their potential benefits and drawbacks, and evaluate a representative variety of the latest adaptive defenses for image classification. Unfortunately, none significantly improve upon static defenses when subjected to our careful case study evaluation. S… ▽ More Adaptive defenses, which optimize at test time, promise to improve adversarial robustness. We categorize such adaptive test-time defenses, explain their potential benefits and drawbacks, and evaluate a representative variety of the latest adaptive defenses for image classification. Unfortunately, none significantly improve upon static defenses when subjected to our careful case study evaluation. Some even weaken the underlying static model while simultaneously increasing inference computation. While these results are disappointing, we still believe that adaptive test-time defenses are a promising avenue of research and, as such, we provide recommendations for their thorough evaluation. We extend the checklist of Carlini et al. (2019) by providing concrete steps specific to adaptive defenses. △ Less

Submitted 13 July, 2022; v1 submitted 28 February, 2022; originally announced February 2022.

Comments: ICML'22

arXiv:2108.10021 [pdf, ps, other]

QDEF and Its Approximations in OBDM

Authors: Gianluca Cima, Federico Croce, Maurizio Lenzerini

Abstract: Given an input dataset (i.e., a set of tuples), query definability in Ontology-based Data Management (OBDM) amounts to find a query over the ontology whose certain answers coincide with the tuples in the given dataset. We refer to such a query as a characterization of the dataset with respect to the OBDM system. Our first contribution is to propose approximations of perfect characterizations in te… ▽ More Given an input dataset (i.e., a set of tuples), query definability in Ontology-based Data Management (OBDM) amounts to find a query over the ontology whose certain answers coincide with the tuples in the given dataset. We refer to such a query as a characterization of the dataset with respect to the OBDM system. Our first contribution is to propose approximations of perfect characterizations in terms of recall (complete characterizations) and precision (sound characterizations). A second contribution is to present a thorough complexity analysis of three computational problems, namely verification (check whether a given query is a perfect, or an approximated characterization of a given dataset), existence (check whether a perfect, or a best approximated characterization of a given dataset exists), and computation (compute a perfect, or best approximated characterization of a given dataset). △ Less

Submitted 23 August, 2021; originally announced August 2021.

Comments: A more compact version of this paper will be published at the proceedings of the 30th ACM International Conference on Information and Knowledge Management. The associated DOI is: https://doi.org/10.1145/3459637.34824661

arXiv:2105.12508 [pdf, other]

Adversarial Robustness against Multiple and Single $l_p$-Threat Models via Quick Fine-Tuning of Robust Classifiers

Authors: Francesco Croce, Matthias Hein

Abstract: A major drawback of adversarially robust models, in particular for large scale datasets like ImageNet, is the extremely long training time compared to standard ones. Moreover, models should be robust not only to one $l_p$-threat model but ideally to all of them. In this paper we propose Extreme norm Adversarial Training (E-AT) for multiple-norm robustness which is based on geometric properties of… ▽ More A major drawback of adversarially robust models, in particular for large scale datasets like ImageNet, is the extremely long training time compared to standard ones. Moreover, models should be robust not only to one $l_p$-threat model but ideally to all of them. In this paper we propose Extreme norm Adversarial Training (E-AT) for multiple-norm robustness which is based on geometric properties of $l_p$-balls. E-AT costs up to three times less than other adversarial training methods for multiple-norm robustness. Using E-AT we show that for ImageNet a single epoch and for CIFAR-10 three epochs are sufficient to turn any $l_p$-robust model into a multiple-norm robust model. In this way we get the first multiple-norm robust model for ImageNet and boost the state-of-the-art for multiple-norm robustness to more than $51\%$ on CIFAR-10. Finally, we study the general transfer via fine-tuning of adversarial robustness between different individual $l_p$-threat models and improve the previous SOTA $l_1$-robustness on both CIFAR-10 and ImageNet. Extensive experiments show that our scheme works across datasets and architectures including vision transformers. △ Less

Submitted 7 August, 2022; v1 submitted 26 May, 2021; originally announced May 2021.

Comments: ICML 2022

arXiv:2103.01208 [pdf, other]

Mind the box: $l_1$-APGD for sparse adversarial attacks on image classifiers

Authors: Francesco Croce, Matthias Hein

Abstract: We show that when taking into account also the image domain $[0,1]^d$, established $l_1$-projected gradient descent (PGD) attacks are suboptimal as they do not consider that the effective threat model is the intersection of the $l_1$-ball and $[0,1]^d$. We study the expected sparsity of the steepest descent step for this effective threat model and show that the exact projection onto this set is co… ▽ More We show that when taking into account also the image domain $[0,1]^d$, established $l_1$-projected gradient descent (PGD) attacks are suboptimal as they do not consider that the effective threat model is the intersection of the $l_1$-ball and $[0,1]^d$. We study the expected sparsity of the steepest descent step for this effective threat model and show that the exact projection onto this set is computationally feasible and yields better performance. Moreover, we propose an adaptive form of PGD which is highly effective even with a small budget of iterations. Our resulting $l_1$-APGD is a strong white-box attack showing that prior works overestimated their $l_1$-robustness. Using $l_1$-APGD for adversarial training we get a robust classifier with SOTA $l_1$-robustness. Finally, we combine $l_1$-APGD and an adaptation of the Square Attack to $l_1$ into $l_1$-AutoAttack, an ensemble of attacks which reliably assesses adversarial robustness for the threat model of $l_1$-ball intersected with $[0,1]^d$. △ Less

Submitted 24 November, 2023; v1 submitted 1 March, 2021; originally announced March 2021.

Comments: In ICML 2021. Fixed typos in Eq. (3) and Eq. (4)

arXiv:2010.09670 [pdf, other]

RobustBench: a standardized adversarial robustness benchmark

Authors: Francesco Croce, Maksym Andriushchenko, Vikash Sehwag, Edoardo Debenedetti, Nicolas Flammarion, Mung Chiang, Prateek Mittal, Matthias Hein

Abstract: As a research community, we are still lacking a systematic understanding of the progress on adversarial robustness which often makes it hard to identify the most promising ideas in training robust models. A key challenge in benchmarking robustness is that its evaluation is often error-prone leading to robustness overestimation. Our goal is to establish a standardized benchmark of adversarial robus… ▽ More As a research community, we are still lacking a systematic understanding of the progress on adversarial robustness which often makes it hard to identify the most promising ideas in training robust models. A key challenge in benchmarking robustness is that its evaluation is often error-prone leading to robustness overestimation. Our goal is to establish a standardized benchmark of adversarial robustness, which as accurately as possible reflects the robustness of the considered models within a reasonable computational budget. To this end, we start by considering the image classification task and introduce restrictions (possibly loosened in the future) on the allowed models. We evaluate adversarial robustness with AutoAttack, an ensemble of white- and black-box attacks, which was recently shown in a large-scale study to improve almost all robustness evaluations compared to the original publications. To prevent overadaptation of new defenses to AutoAttack, we welcome external evaluations based on adaptive attacks, especially where AutoAttack flags a potential overestimation of robustness. Our leaderboard, hosted at https://robustbench.github.io/, contains evaluations of 120+ models and aims at reflecting the current state of the art in image classification on a set of well-defined tasks in $\ell_\infty$- and $\ell_2$-threat models and on common corruptions, with possible extensions in the future. Additionally, we open-source the library https://github.com/RobustBench/robustbench that provides unified access to 80+ robust models to facilitate their downstream applications. Finally, based on the collected models, we analyze the impact of robustness on the performance on distribution shifts, calibration, out-of-distribution detection, fairness, privacy leakage, smoothness, and transferability. △ Less

Submitted 31 October, 2021; v1 submitted 19 October, 2020; originally announced October 2020.

Comments: The camera-ready version accepted at the NeurIPS'21 Datasets and Benchmarks Track: 120+ evaluations, 80+ models, 7 leaderboards (Linf, L2, common corruptions; CIFAR-10, CIFAR-100, ImageNet), significantly expanded analysis part (calibration, fairness, privacy leakage, smoothness, transferability)

arXiv:2006.12834 [pdf, other]

Sparse-RS: a versatile framework for query-efficient sparse black-box adversarial attacks

Authors: Francesco Croce, Maksym Andriushchenko, Naman D. Singh, Nicolas Flammarion, Matthias Hein

Abstract: We propose a versatile framework based on random search, Sparse-RS, for score-based sparse targeted and untargeted attacks in the black-box setting. Sparse-RS does not rely on substitute models and achieves state-of-the-art success rate and query efficiency for multiple sparse attack models: $l_0$-bounded perturbations, adversarial patches, and adversarial frames. The $l_0$-version of untargeted S… ▽ More We propose a versatile framework based on random search, Sparse-RS, for score-based sparse targeted and untargeted attacks in the black-box setting. Sparse-RS does not rely on substitute models and achieves state-of-the-art success rate and query efficiency for multiple sparse attack models: $l_0$-bounded perturbations, adversarial patches, and adversarial frames. The $l_0$-version of untargeted Sparse-RS outperforms all black-box and even all white-box attacks for different models on MNIST, CIFAR-10, and ImageNet. Moreover, our untargeted Sparse-RS achieves very high success rates even for the challenging settings of $20\times20$ adversarial patches and $2$-pixel wide adversarial frames for $224\times224$ images. Finally, we show that Sparse-RS can be applied to generate targeted universal adversarial patches where it significantly outperforms the existing approaches. The code of our framework is available at https://github.com/fra31/sparse-rs. △ Less

Submitted 7 February, 2022; v1 submitted 23 June, 2020; originally announced June 2020.

Comments: Accepted at AAAI 2022. This version contains considerably extended results in the L0 threat model

arXiv:2005.06225 [pdf, ps, other]

An improved solution approach for the Budget constrained Fuel Treatment Scheduling problem

Authors: Federico Della Croce, Marco Ghirardi, Rosario Scatamacchia

Abstract: This paper considers the budget constrained fuel treatment scheduling (BFTS) problem where, in the context of wildfire mitigation, the goal is to inhibit the potential of fire spread in a landscape by proper fuel treatment activities. Given a time horizon represented by consecutive unit periods, the landscape is divided into cells and represented as a grid graph where each cell has a fuel age that… ▽ More This paper considers the budget constrained fuel treatment scheduling (BFTS) problem where, in the context of wildfire mitigation, the goal is to inhibit the potential of fire spread in a landscape by proper fuel treatment activities. Given a time horizon represented by consecutive unit periods, the landscape is divided into cells and represented as a grid graph where each cell has a fuel age that increases over time and becomes old if no treatment is applied in the meantime: this induces a potential high fire risk whenever two contiguous cells are old. Cells fuel ages can be reset to zero under appropriate fuel treatments but there is a limited budget for treatment in each period. The problem calls for finding a suitable selection of cells to be treated so as to minimize the presence of old contiguous cells over the whole time horizon. We prove that problem BFTS is strongly NP-complete on paths and thus on grid graphs and show that no polynomial time approximation algorithm exists unless P = NP. We provide an enhanced integer linear programming formulation of the problem with respect to the relevant literature that shows up to be efficiently solved by an ILP solver on reasonably large size instances. Finally, we consider a harder periodic variant of the problem with the aim of finding a cyclic treatment plan with cycles of length T and propose a matheuristic approach capable of efficiently tackling those instances where an ILP solver applied to the ILP formulation runs into difficulties. △ Less

Submitted 13 May, 2020; originally announced May 2020.

arXiv:2003.12460 [pdf, ps, other]

An enhanced pinwheel algorithm for the bamboo garden trimming problem

Authors: Federico Della Croce

Abstract: In the Bamboo Garden Trimming Problem (BGT), there is a garden populated by n bamboos b(1), b(2), ... , b(n)$ with daily growth rates h(1) >= h(2) >= ... >= h(n). We assume that the initial heights of bamboos are zero. A gardener is in charge of the bamboos and trims them to height zero according to some schedule. The objective is to design a perpetual schedule of trimming so as to maintain the he… ▽ More In the Bamboo Garden Trimming Problem (BGT), there is a garden populated by n bamboos b(1), b(2), ... , b(n)$ with daily growth rates h(1) >= h(2) >= ... >= h(n). We assume that the initial heights of bamboos are zero. A gardener is in charge of the bamboos and trims them to height zero according to some schedule. The objective is to design a perpetual schedule of trimming so as to maintain the height of the bamboo garden as low as possible. We consider the so-called discrete BGT variant, where the gardener is allowed to trim only one bamboo at the end of each day. For discrete BGT, the current state-of-the-art approximation algorithm exploits the relationship between BGT and the classical Pinwheel scheduling problem and provides a solution that guarantees a 2-approximation ratio. We propose an alternative Pinwheel scheduling algorithm with approximation ratio converging to 12/7 when sum h(j) > > h(1). Also, we show that the approximation ratio of the proposed algorithm never exceeds 32000/16947 approximately 1.888. This is the first algorithm reaching a ratio strictly inferior to 19/10. △ Less

Submitted 3 June, 2020; v1 submitted 27 March, 2020; originally announced March 2020.

arXiv:2003.01690 [pdf, other]

Reliable evaluation of adversarial robustness with an ensemble of diverse parameter-free attacks

Authors: Francesco Croce, Matthias Hein

Abstract: The field of defense strategies against adversarial attacks has significantly grown over the last years, but progress is hampered as the evaluation of adversarial defenses is often insufficient and thus gives a wrong impression of robustness. Many promising defenses could be broken later on, making it difficult to identify the state-of-the-art. Frequent pitfalls in the evaluation are improper tuni… ▽ More The field of defense strategies against adversarial attacks has significantly grown over the last years, but progress is hampered as the evaluation of adversarial defenses is often insufficient and thus gives a wrong impression of robustness. Many promising defenses could be broken later on, making it difficult to identify the state-of-the-art. Frequent pitfalls in the evaluation are improper tuning of hyperparameters of the attacks, gradient obfuscation or masking. In this paper we first propose two extensions of the PGD-attack overcoming failures due to suboptimal step size and problems of the objective function. We then combine our novel attacks with two complementary existing ones to form a parameter-free, computationally affordable and user-independent ensemble of attacks to test adversarial robustness. We apply our ensemble to over 50 models from papers published at recent top machine learning and computer vision venues. In all except one of the cases we achieve lower robust test accuracy than reported in these papers, often by more than $10\%$, identifying several broken defenses. △ Less

Submitted 4 August, 2020; v1 submitted 3 March, 2020; originally announced March 2020.

Comments: In ICML 2020

arXiv:1912.00049 [pdf, other]

Square Attack: a query-efficient black-box adversarial attack via random search

Authors: Maksym Andriushchenko, Francesco Croce, Nicolas Flammarion, Matthias Hein

Abstract: We propose the Square Attack, a score-based black-box $l_2$- and $l_\infty$-adversarial attack that does not rely on local gradient information and thus is not affected by gradient masking. Square Attack is based on a randomized search scheme which selects localized square-shaped updates at random positions so that at each iteration the perturbation is situated approximately at the boundary of the… ▽ More We propose the Square Attack, a score-based black-box $l_2$- and $l_\infty$-adversarial attack that does not rely on local gradient information and thus is not affected by gradient masking. Square Attack is based on a randomized search scheme which selects localized square-shaped updates at random positions so that at each iteration the perturbation is situated approximately at the boundary of the feasible set. Our method is significantly more query efficient and achieves a higher success rate compared to the state-of-the-art methods, especially in the untargeted setting. In particular, on ImageNet we improve the average query efficiency in the untargeted setting for various deep networks by a factor of at least $1.8$ and up to $3$ compared to the recent state-of-the-art $l_\infty$-attack of Al-Dujaili & O'Reilly. Moreover, although our attack is black-box, it can also outperform gradient-based white-box attacks on the standard benchmarks achieving a new state-of-the-art in terms of the success rate. The code of our attack is available at https://github.com/max-andr/square-attack. △ Less

Submitted 29 July, 2020; v1 submitted 29 November, 2019; originally announced December 2019.

Comments: Accepted at ECCV 2020; added imperceptible perturbations, analysis of examples that require more queries, results on dilated CNNs

arXiv:1909.05040 [pdf, other]

Sparse and Imperceivable Adversarial Attacks

Authors: Francesco Croce, Matthias Hein

Abstract: Neural networks have been proven to be vulnerable to a variety of adversarial attacks. From a safety perspective, highly sparse adversarial attacks are particularly dangerous. On the other hand the pixelwise perturbations of sparse attacks are typically large and thus can be potentially detected. We propose a new black-box technique to craft adversarial examples aiming at minimizing $l_0$-distance… ▽ More Neural networks have been proven to be vulnerable to a variety of adversarial attacks. From a safety perspective, highly sparse adversarial attacks are particularly dangerous. On the other hand the pixelwise perturbations of sparse attacks are typically large and thus can be potentially detected. We propose a new black-box technique to craft adversarial examples aiming at minimizing $l_0$-distance to the original image. Extensive experiments show that our attack is better or competitive to the state of the art. Moreover, we can integrate additional bounds on the componentwise perturbation. Allowing pixels to change only in region of high variation and avoiding changes along axis-aligned edges makes our adversarial examples almost non-perceivable. Moreover, we adapt the Projected Gradient Descent attack to the $l_0$-norm integrating componentwise constraints. This allows us to do adversarial training to enhance the robustness of classifiers against sparse and imperceivable adversarial manipulations. △ Less

Submitted 11 September, 2019; originally announced September 2019.

Comments: Accepted to ICCV 2019

arXiv:1907.02044 [pdf, other]

Minimally distorted Adversarial Examples with a Fast Adaptive Boundary Attack

Authors: Francesco Croce, Matthias Hein

Abstract: The evaluation of robustness against adversarial manipulation of neural networks-based classifiers is mainly tested with empirical attacks as methods for the exact computation, even when available, do not scale to large networks. We propose in this paper a new white-box adversarial attack wrt the $l_p$-norms for $p \in \{1,2,\infty\}$ aiming at finding the minimal perturbation necessary to change… ▽ More The evaluation of robustness against adversarial manipulation of neural networks-based classifiers is mainly tested with empirical attacks as methods for the exact computation, even when available, do not scale to large networks. We propose in this paper a new white-box adversarial attack wrt the $l_p$-norms for $p \in \{1,2,\infty\}$ aiming at finding the minimal perturbation necessary to change the class of a given input. It has an intuitive geometric meaning, yields quickly high quality results, minimizes the size of the perturbation (so that it returns the robust accuracy at every threshold with a single run). It performs better or similar to state-of-the-art attacks which are partially specialized to one $l_p$-norm, and is robust to the phenomenon of gradient masking. △ Less

Submitted 20 July, 2020; v1 submitted 3 July, 2019; originally announced July 2019.

arXiv:1905.11213 [pdf, other]

Provable robustness against all adversarial $l_p$-perturbations for $p\geq 1$

Authors: Francesco Croce, Matthias Hein

Abstract: In recent years several adversarial attacks and defenses have been proposed. Often seemingly robust models turn out to be non-robust when more sophisticated attacks are used. One way out of this dilemma are provable robustness guarantees. While provably robust models for specific $l_p$-perturbation models have been developed, we show that they do not come with any guarantee against other $l_q$-per… ▽ More In recent years several adversarial attacks and defenses have been proposed. Often seemingly robust models turn out to be non-robust when more sophisticated attacks are used. One way out of this dilemma are provable robustness guarantees. While provably robust models for specific $l_p$-perturbation models have been developed, we show that they do not come with any guarantee against other $l_q$-perturbations. We propose a new regularization scheme, MMR-Universal, for ReLU networks which enforces robustness wrt $l_1$- and $l_\infty$-perturbations and show how that leads to the first provably robust models wrt any $l_p$-norm for $p\geq 1$. △ Less

Submitted 24 April, 2020; v1 submitted 27 May, 2019; originally announced May 2019.

arXiv:1903.11359 [pdf, other]

Scaling up the randomized gradient-free adversarial attack reveals overestimation of robustness using established attacks

Authors: Francesco Croce, Jonas Rauber, Matthias Hein

Abstract: Modern neural networks are highly non-robust against adversarial manipulation. A significant amount of work has been invested in techniques to compute lower bounds on robustness through formal guarantees and to build provably robust models. However, it is still difficult to get guarantees for larger networks or robustness against larger perturbations. Thus attack strategies are needed to provide t… ▽ More Modern neural networks are highly non-robust against adversarial manipulation. A significant amount of work has been invested in techniques to compute lower bounds on robustness through formal guarantees and to build provably robust models. However, it is still difficult to get guarantees for larger networks or robustness against larger perturbations. Thus attack strategies are needed to provide tight upper bounds on the actual robustness. We significantly improve the randomized gradient-free attack for ReLU networks [9], in particular by scaling it up to large networks. We show that our attack achieves similar or significantly smaller robust accuracy than state-of-the-art attacks like PGD or the one of Carlini and Wagner, thus revealing an overestimation of the robustness by these state-of-the-art methods. Our attack is not based on a gradient descent scheme and in this sense gradient-free, which makes it less sensitive to the choice of hyperparameters as no careful selection of the stepsize is required. △ Less

Submitted 25 September, 2019; v1 submitted 27 March, 2019; originally announced March 2019.

Comments: Accepted at International Journal of Computer Vision

arXiv:1811.11493 [pdf, other]

A randomized gradient-free attack on ReLU networks

Authors: Francesco Croce, Matthias Hein

Abstract: It has recently been shown that neural networks but also other classifiers are vulnerable to so called adversarial attacks e.g. in object recognition an almost non-perceivable change of the image changes the decision of the classifier. Relatively fast heuristics have been proposed to produce these adversarial inputs but the problem of finding the optimal adversarial input, that is with the minimal… ▽ More It has recently been shown that neural networks but also other classifiers are vulnerable to so called adversarial attacks e.g. in object recognition an almost non-perceivable change of the image changes the decision of the classifier. Relatively fast heuristics have been proposed to produce these adversarial inputs but the problem of finding the optimal adversarial input, that is with the minimal change of the input, is NP-hard. While methods based on mixed-integer optimization which find the optimal adversarial input have been developed, they do not scale to large networks. Currently, the attack scheme proposed by Carlini and Wagner is considered to produce the best adversarial inputs. In this paper we propose a new attack scheme for the class of ReLU networks based on a direct optimization on the resulting linear regions. In our experimental validation we improve in all except one experiment out of 18 over the Carlini-Wagner attack with a relative improvement of up to 9\%. As our approach is based on the geometrical structure of ReLU networks, it is less susceptible to defences targeting their functional properties. △ Less

Submitted 28 November, 2018; originally announced November 2018.

Comments: In GCPR 2018

arXiv:1811.02882 [pdf, ps, other]

doi 10.1016/j.cor.2010.10.017

Iterated local search and very large neighborhoods for the parallel-machines total tardiness problem

Authors: F Croce, Thierry Garaix, A. Grosso

Abstract: We present computational results with a heuristic algorithm for the parallel machines total weighted tardiness problem. The algorithm combines generalized pairwise interchange neighborhoods, dynasearch optimization and a new machine-based neighborhood whose size is non-polynomial in the number of machines. The computational results significantly improve over the current state of the art for this p… ▽ More We present computational results with a heuristic algorithm for the parallel machines total weighted tardiness problem. The algorithm combines generalized pairwise interchange neighborhoods, dynasearch optimization and a new machine-based neighborhood whose size is non-polynomial in the number of machines. The computational results significantly improve over the current state of the art for this problem. △ Less

Submitted 26 October, 2018; originally announced November 2018.

Journal ref: Computers \& Operations Research, 2012, 39 (6), pp.1213 - 1217

arXiv:1811.02822 [pdf, other]

A new exact approach for the Bilevel Knapsack with Interdiction Constraints

Authors: Federico Della Croce, Rosario Scatamacchia

Abstract: We consider the Bilevel Knapsack with Interdiction Constraints, an extension of the classic 0-1 knapsack problem formulated as a Stackelberg game with two agents, a leader and a follower, that choose items from a common set and hold their own private knapsacks. First, the leader selects some items to be interdicted for the follower while satisfying a capacity constraint. Then the follower packs a… ▽ More We consider the Bilevel Knapsack with Interdiction Constraints, an extension of the classic 0-1 knapsack problem formulated as a Stackelberg game with two agents, a leader and a follower, that choose items from a common set and hold their own private knapsacks. First, the leader selects some items to be interdicted for the follower while satisfying a capacity constraint. Then the follower packs a set of the remaining items according to his knapsack constraint in order to maximize the profits. The goal of the leader is to minimize the follower's profits. The presence of two decision levels makes this problem very difficult to solve in practice: the current state-of-the-art algorithms can solve to optimality instances with 50-55 items at most. We derive effective lower bounds and present a new exact approach that exploits the structure of the induced follower's problem. The approach successfully solves all benchmark instances within one second in the worst case and larger instances with up to 500 items within 60 seconds. △ Less

Submitted 12 November, 2018; v1 submitted 7 November, 2018; originally announced November 2018.

arXiv:1810.07481 [pdf, other]

Provable Robustness of ReLU networks via Maximization of Linear Regions

Authors: Francesco Croce, Maksym Andriushchenko, Matthias Hein

Abstract: It has been shown that neural network classifiers are not robust. This raises concerns about their usage in safety-critical systems. We propose in this paper a regularization scheme for ReLU networks which provably improves the robustness of the classifier by maximizing the linear regions of the classifier as well as the distance to the decision boundary. Our techniques allow even to find the mini… ▽ More It has been shown that neural network classifiers are not robust. This raises concerns about their usage in safety-critical systems. We propose in this paper a regularization scheme for ReLU networks which provably improves the robustness of the classifier by maximizing the linear regions of the classifier as well as the distance to the decision boundary. Our techniques allow even to find the minimal adversarial perturbation for a fraction of test points for large networks. In the experiments we show that our approach improves upon adversarial training both in terms of lower and upper bounds on the robustness and is comparable or better than the state-of-the-art in terms of test error and robustness. △ Less

Submitted 8 March, 2019; v1 submitted 17 October, 2018; originally announced October 2018.

Comments: In AISTATS 2019. Conference version with the following modifications: improved readability, comparison to Xiao et al (2018) added, section on visualizations extended

arXiv:1801.05489 [pdf, ps, other]

Longest Processing Time rule for identical parallel machines revisited

Authors: Federico Della Croce, Rosario Scatamacchia

Abstract: We consider the Pm || Cmax scheduling problem where the goal is to schedule n jobs on m identical parallel machines to minimize makespan. We revisit the famous Longest Processing Time (LPT) rule proposed by Graham in 1969. LPT requires to sort jobs in non-ascending order of processing times and then to assign one job at a time to the machine whose load is smallest so far. We provide new insights o… ▽ More We consider the Pm || Cmax scheduling problem where the goal is to schedule n jobs on m identical parallel machines to minimize makespan. We revisit the famous Longest Processing Time (LPT) rule proposed by Graham in 1969. LPT requires to sort jobs in non-ascending order of processing times and then to assign one job at a time to the machine whose load is smallest so far. We provide new insights on LPT and discuss the approximation ratio of a modification of LPT that improves Graham's bound from 4/3 - 1/(3m) to 4/3 - 1/(3(m-1)) for m >= 3 and from 7/6 to 9/8 for m = 2. We use Linear Programming (LP) to analyze the approximation ratio of our approach. This performance analysis can be seen as a valid alternative to formal proofs based on analytical derivation. Also, we derive from the proposed approach an O(n log n) heuristic. The heuristic splits the sorted jobset in tuples of m consecutive jobs (1,...,m; m+1,...,2m; etc.) and sorts the tuples in non-increasing order of the difference (slack) between largest job and smallest job in the tuple. Then, List Scheduling is applied to the set of sorted tuples. This approach strongly outperforms LPT on benchmark literature instances. △ Less

Submitted 16 January, 2018; originally announced January 2018.

arXiv:1801.04801 [pdf, ps, other]

Approximating the Incremental Knapsack Problem

Authors: Federico Della Croce, Ulrich Pferschy, Rosario Scatamacchia

Abstract: We consider the 0-1 Incremental Knapsack Problem (IKP) where the capacity grows over time periods and if an item is placed in the knapsack in a certain period, it cannot be removed afterwards. The contribution of a packed item in each time period depends on its profit as well as on a time factor which reflects the importance of the period in the objective function. The problem calls for maximizing… ▽ More We consider the 0-1 Incremental Knapsack Problem (IKP) where the capacity grows over time periods and if an item is placed in the knapsack in a certain period, it cannot be removed afterwards. The contribution of a packed item in each time period depends on its profit as well as on a time factor which reflects the importance of the period in the objective function. The problem calls for maximizing the weighted sum of the profits over the whole time horizon. In this work, we provide approximation results for IKP and its restricted variants. In some results, we rely on Linear Programming (LP) to derive approximation bounds and show how the proposed LP-based analysis can be seen as a valid alternative to more formal proof systems. We first manage to prove the tightness of some approximation ratios of a general purpose algorithm currently available in the literature and originally applied to a time-invariant version of the problem. We also devise a Polynomial Time Approximation Scheme (PTAS) when the input value indicating the number of periods is considered as a constant. Then, we add the mild and natural assumption that each item can be packed in the first time period. For this variant, we discuss different approximation algorithms suited for any number of time periods and for the special case with two periods. △ Less

Submitted 15 January, 2018; originally announced January 2018.

arXiv:1709.00252 [pdf, other]

MILP and Max-Clique based heuristics for the Eternity II puzzle

Authors: Fabio Salassa, Wim Vancroonenburg, Tony Wauters, Federico Della Croce, Greet Vanden Berghe

Abstract: The present paper considers a hybrid local search approach to the Eternity II puzzle and to unsigned, rectangular, edge matching puzzles in general. Both an original mixed-integer linear programming (MILP) formulation and a novel Max-Clique formulation are presented for this NP-hard problem. Although the presented formulations remain computationally intractable for medium and large sized instances… ▽ More The present paper considers a hybrid local search approach to the Eternity II puzzle and to unsigned, rectangular, edge matching puzzles in general. Both an original mixed-integer linear programming (MILP) formulation and a novel Max-Clique formulation are presented for this NP-hard problem. Although the presented formulations remain computationally intractable for medium and large sized instances, they can serve as the basis for developing heuristic decompositions and very large scale neighbourhoods. As a side product of the Max-Clique formulation, new hard-to-solve instances are published for the academic research community. Two reasonably well performing MILP-based constructive methods are presented and used for determining the initial solution of a multi-neighbourhood local search approach. Experimental results confirm that this local search can further improve the results obtained by the constructive heuristics and is quite competitive with the state of the art procedures. △ Less

Submitted 5 October, 2017; v1 submitted 1 September, 2017; originally announced September 2017.

arXiv:1707.02849 [pdf, other]

No-idle, no-wait: when shop scheduling meets dominoes, eulerian and hamiltonian paths

Authors: Jean-Charles Billaut, Federico Della Croce, Fabio Salassa, Vincent T'kindt

Abstract: In shop scheduling, several applications exist where it is required that some components perform consecutively. We refer to no-idle schedules if machines are required to operate with no inserted idle time and no-wait schedules if tasks cannot wait between the end of an operation and the start of the following one. We consider here no-idle/no-wait shop scheduling problems with makespan as performan… ▽ More In shop scheduling, several applications exist where it is required that some components perform consecutively. We refer to no-idle schedules if machines are required to operate with no inserted idle time and no-wait schedules if tasks cannot wait between the end of an operation and the start of the following one. We consider here no-idle/no-wait shop scheduling problems with makespan as performance measure and determine related complexity results. We first analyze the two-machine no-idle/no-wait flow shop problem and show that it is equivalent to a special version of the game of dominoes which is polynomially solvable by tackling an Eulerian path problem on a directed graph. We present for this problem an O(n) exact algorithm. As a byproduct we show that the Hamiltonian Path problem on a digraph G(V,A) with a special structure (where every pair of vertices i,j either has all successors in common or has no common successors) reduces to the two-machine no-idle/no-wait flow shop problem. Correspondingly, we provide a new polynomially solvable special case of the Hamiltonian Path problem. Then, we show that also the corresponding $m$-machine no-idle no-wait flow shop problem is polynomially solvable and provide an O(mn log n) exact algorithm. Finally we prove that the 2-machine no-idle/no-wait job shop problem and the 2-machine no-idle/no-wait open shop problem are NP-Hard in the strong sense. △ Less

Submitted 10 July, 2017; originally announced July 2017.

arXiv:1702.04211 [pdf, ps, other]

Dynamic programming algorithms, efficient solution of the LP-relaxation and approximation schemes for the Penalized Knapsack Problem

Authors: Federico Della Croce, Ulrich Pferschy, Rosario Scatamacchia

Abstract: We consider the 0-1 Penalized Knapsack Problem (PKP). Each item has a profit, a weight and a penalty and the goal is to maximize the sum of the profits minus the greatest penalty value of the items included in a solution. We propose an exact approach relying on a procedure which narrows the relevant range of penalties, on the identification of a core problem and on dynamic programming. The propose… ▽ More We consider the 0-1 Penalized Knapsack Problem (PKP). Each item has a profit, a weight and a penalty and the goal is to maximize the sum of the profits minus the greatest penalty value of the items included in a solution. We propose an exact approach relying on a procedure which narrows the relevant range of penalties, on the identification of a core problem and on dynamic programming. The proposed approach turns out to be very effective in solving hard instances of PKP and compares favorably both to commercial solver CPLEX 12.5 applied to the ILP formulation of the problem and to the best available exact algorithm in the literature. Then we present a general inapproximability result and investigate several relevant special cases which permit fully polynomial time approximation schemes (FPTASs). △ Less

Submitted 14 February, 2017; originally announced February 2017.

Showing 1–40 of 40 results for author: Croce, F