-
LLaMandement: Large Language Models for Summarization of French Legislative Proposals
Authors:
Joseph Gesnouin,
Yannis Tannier,
Christophe Gomes Da Silva,
Hatim Tapory,
Camille Brier,
Hugo Simon,
Raphael Rozenberg,
Hermann Woehrel,
Mehdi El Yakaabi,
Thomas Binder,
Guillaume Marie,
Emilie Caron,
Mathile Nogueira,
Thomas Fontas,
Laure Puydebois,
Marie Theophile,
Stephane Morandi,
Mael Petit,
David Creissac,
Pauline Ennouchy,
Elise Valetoux,
Celine Visade,
Severine Balloux,
Emmanuel Cortes,
Pierre-Etienne Devineau,
et al. (3 additional authors not shown)
Abstract:
This report introduces LLaMandement, a state-of-the-art Large Language Model, fine-tuned by the French government and designed to enhance the efficiency and efficacy of processing parliamentary sessions (including the production of bench memoranda and documents required for interministerial meetings) by generating neutral summaries of legislative proposals. Addressing the administrative challenges of manually processing a growing volume of legislative amendments, LLaMandement stands as a significant legal technological milestone, providing a solution that exceeds the scalability of traditional human efforts while matching the robustness of a specialized legal drafter. We release all our fine-tuned models and training data to the community.
Submitted 29 January, 2024;
originally announced January 2024.
-
Untrue.News: A New Search Engine For Fake Stories
Authors:
Vinicius Woloszyn,
Felipe Schaeffer,
Beliza Boniatti,
Eduardo Cortes,
Salar Mohtaj,
Sebastian Möller
Abstract:
In this paper, we demonstrate Untrue News, a new search engine for fake stories. Untrue News is easy to use and offers useful features such as: a) a multi-language option combining fake stories from different countries and languages around the same subject or person; b) a user privacy protector, avoiding the filter bubble by employing a bias-free ranking scheme; and c) a collaborative platform that fosters the development of new tools for fighting disinformation. Untrue News relies on Elasticsearch, a new scalable analytic search engine based on the Lucene library that provides near real-time results. We demonstrate two key scenarios: the first involves a politician and shows how the categories are displayed for different types of fake stories; the second involves a refugee and shows the multilingual tool. A prototype of Untrue News is accessible via http://untrue.news
Submitted 16 February, 2020;
originally announced February 2020.
-
A Simulation Based Dynamic Evaluation Framework for System-wide Algorithmic Fairness
Authors:
Efrén Cruz Cortés,
Debashis Ghosh
Abstract:
We propose the use of Agent Based Models (ABMs) inside a reinforcement learning framework in order to better understand the relationship between automated decision making tools, fairness-inspired statistical constraints, and the social phenomena giving rise to discrimination towards sensitive groups. There have been many instances of discrimination occurring due to the application of algorithmic tools by public and private institutions. Until recently, these practices have mostly gone unchecked. Given the large-scale transformation these new technologies elicit, a joint effort of social science and machine learning researchers is necessary. Much of the research so far has focused on determining statistical properties of such algorithms and the data they are trained on. We aim to complement that approach by studying the social dynamics in which these algorithms are implemented. We show how bias can be accumulated and reinforced through automated decision making, and the possibility of finding a fairness-inducing policy. We focus on the case of recidivism risk assessment by considering simplified models of arrest. We find that if we limit our attention to what is observed and manipulated by these algorithmic tools, we may judge some blatantly unfair practices to be fair, illustrating the advantage of analyzing this otherwise elusive property with a system-wide model. We expect that the introduction of agent-based simulation techniques will strengthen collaboration with social scientists, yield a better understanding of the social systems affected by technology, and hopefully lead to concrete policy proposals that can be presented to policymakers for a true systemic transformation.
Submitted 21 March, 2019;
originally announced March 2019.
-
Consistent Kernel Density Estimation with Non-Vanishing Bandwidth
Authors:
Efrén Cruz Cortés,
Clayton Scott
Abstract:
Consistency of the kernel density estimator requires that the kernel bandwidth tends to zero as the sample size grows. In this paper we investigate the question of whether consistency is possible when the bandwidth is fixed, if we consider a more general class of weighted KDEs. To answer this question in the affirmative, we introduce the fixed-bandwidth KDE (fbKDE), obtained by solving a quadratic program, and prove that it consistently estimates any continuous square-integrable density. We also establish rates of convergence for the fbKDE with radial kernels and the box kernel under appropriate smoothness assumptions. Furthermore, in an experimental study we demonstrate that the fbKDE compares favorably to the standard KDE and the previously proposed variable bandwidth KDE.
Submitted 29 May, 2017; v1 submitted 24 May, 2017;
originally announced May 2017.
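The weighted KDE family mentioned in this abstract can be illustrated with a minimal sketch. It assumes a one-dimensional Gaussian kernel and shows only the general form, a weighted sum of kernels at a fixed bandwidth; the quadratic program that selects the fbKDE weights is not given in the abstract and is not reproduced here. The function names `gaussian_kernel` and `weighted_kde` are hypothetical, and uniform weights are used so that the result is the standard KDE.

```python
import numpy as np

def gaussian_kernel(x, centers, bandwidth):
    """Gaussian kernel values k_h(x, x_i) for 1-D inputs (assumed kernel)."""
    z = (x[:, None] - centers[None, :]) / bandwidth
    return np.exp(-0.5 * z**2) / (bandwidth * np.sqrt(2.0 * np.pi))

def weighted_kde(x, data, weights, bandwidth):
    """Evaluate sum_i weights[i] * k_h(x, data[i]) at each point of x."""
    return gaussian_kernel(x, data, bandwidth) @ weights

rng = np.random.default_rng(0)
data = rng.normal(size=200)            # synthetic sample, for illustration only
grid = np.linspace(-4.0, 4.0, 401)

# Uniform weights 1/n recover the standard KDE; a weighted KDE would
# replace them with weights chosen by some other rule.
uniform = np.full(data.size, 1.0 / data.size)
density = weighted_kde(grid, data, uniform, bandwidth=0.5)
```

Each evaluation touches every sample point, which is the linear cost per query that motivates sparse approximations of kernel means.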
-
Sparse Approximation of a Kernel Mean
Authors:
E. Cruz Cortés,
C. Scott
Abstract:
Kernel means are frequently used to represent probability distributions in machine learning problems. In particular, the well known kernel density estimator and the kernel mean embedding both have the form of a kernel mean. Unfortunately, kernel means are faced with scalability issues. A single point evaluation of the kernel density estimator, for example, requires a computation time linear in the training sample size. To address this challenge, we present a method to efficiently construct a sparse approximation of a kernel mean. We do so by first establishing an incoherence-based bound on the approximation error, and then noticing that, for the case of radial kernels, the bound can be minimized by solving the $k$-center problem. The outcome is a linear time construction of a sparse kernel mean, which also lends itself naturally to an automatic sparsity selection scheme. We show the computational gains of our method by looking at three problems involving kernel means: Euclidean embedding of distributions, class proportion estimation, and clustering using the mean-shift algorithm.
Submitted 1 March, 2015;
originally announced March 2015.
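The link between the $k$-center problem and a sparse kernel mean can be sketched as follows. This is an illustration under stated assumptions, not the authors' algorithm: centers are chosen with the classical farthest-first traversal (a 2-approximation to $k$-center), a Gaussian radial kernel is assumed, and each center is weighted by the fraction of points nearest to it, which is a simple choice made here for illustration. All function names are hypothetical.

```python
import numpy as np

def greedy_k_center(points, k, first=0):
    """Farthest-first traversal: return indices of k approximate k-center centers."""
    centers = [first]
    dist = np.linalg.norm(points - points[first], axis=1)
    for _ in range(1, k):
        nxt = int(np.argmax(dist))       # point farthest from the current centers
        centers.append(nxt)
        dist = np.minimum(dist, np.linalg.norm(points - points[nxt], axis=1))
    return np.array(centers)

def sparse_kernel_mean(points, k, bandwidth):
    """Approximate (1/n) * sum_i k(x, x_i) by a weighted sum over k centers."""
    centers = points[greedy_k_center(points, k)]
    d = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
    nearest = np.argmin(d, axis=1)
    # Assumed weighting: share of the sample assigned to each center.
    weights = np.bincount(nearest, minlength=k) / len(points)

    def evaluate(x):
        sq = np.sum((x[:, None, :] - centers[None, :, :]) ** 2, axis=2)
        return np.exp(-sq / (2.0 * bandwidth**2)) @ weights

    return centers, weights, evaluate

rng = np.random.default_rng(0)
points = rng.normal(size=(500, 2))       # synthetic data, for illustration only
centers, weights, evaluate = sparse_kernel_mean(points, k=10, bandwidth=1.0)
values = evaluate(points[:5])            # each query now costs O(k) kernel evaluations
```

For a fixed `k`, both the traversal and the assignment step cost time linear in the sample size, and when `k` equals the number of distinct points the sketch reduces to the full kernel mean.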
-
White Paper: Radio y Redes Cognitivas
Authors:
Carles Anton Haro,
Luis Castedo Ribas,
Javier del Ser Lorente,
Armin Dekorsy,
Miguel Egido Cortes,
Xavier Gelabert,
Lorenza Giupponi,
Xavier Mestre,
Jose Monserrat,
Carlos Mosquera,
Miquel Soriano,
Liesbet van der Perre,
Jon Arambarri,
Juan Antonio Romo
Abstract:
Traditionally, two different policies to access the radio spectrum have coexisted: licensed regulation, whereby the rights to use specific spectral bands are granted exclusively to an individual operator; and unlicensed regulation, according to which certain spectral bands are declared open for free use by any operator or individual following specific rules. While these paradigms have allowed the wireless communications sector to blossom in the past, in recent years they have shown shortcomings and signs of exhaustion. For instance, it is quite usual to encounter fully overloaded mobile communication systems coexisting with unused contiguous spectral bands. This clearly calls for a more flexible and dynamic allocation of spectrum resources, which can only be achieved with the advent of so-called cognitive radios and networks. This white paper describes priority research activities and open challenges related to the different functionalities of cognitive radios and networks. First, we outline the main open problems related to the theoretical characterization of cognitive radios, spectrum sensing techniques, and the optimization of physical layer functionalities in these networks. Second, we describe the main research challenges that arise from a system point of view: MAC protocol optimization, traffic modelling, RRM strategies, routing paradigms, and security issues. Next, we point out other problems related to the practical hardware implementation of cognitive radios, with special emphasis on sensing capabilities, reconfigurability, and cognitive control and management. Finally, we succinctly report on a number of current activities related to the standardization of cognitive radio systems.
Submitted 24 February, 2015;
originally announced February 2015.