-
Exploring Large Language Models and Hierarchical Frameworks for Classification of Large Unstructured Legal Documents
Authors:
Nishchal Prasad,
Mohand Boughanem,
Taoufiq Dkaki
Abstract:
Legal judgment prediction suffers from the problem of long case documents exceeding tens of thousands of words, in general, and having a non-uniform structure. Predicting judgments from such documents becomes a challenging task, more so on documents with no structural annotation. We explore the classification of these large legal documents and their lack of structural information with a deep-learn…
▽ More
Legal judgment prediction suffers from the problem of long case documents exceeding tens of thousands of words, in general, and having a non-uniform structure. Predicting judgments from such documents becomes a challenging task, more so on documents with no structural annotation. We explore the classification of these large legal documents and their lack of structural information with a deep-learning-based hierarchical framework which we call MESc; "Multi-stage Encoder-based Supervised with-clustering"; for judgment prediction. Specifically, we divide a document into parts to extract their embeddings from the last four layers of a custom fine-tuned Large Language Model, and try to approximate their structure through unsupervised clustering. Which we use in another set of transformer encoder layers to learn the inter-chunk representations. We analyze the adaptability of Large Language Models (LLMs) with multi-billion parameters (GPT-Neo, and GPT-J) with the hierarchical framework of MESc and compare them with their standalone performance on legal texts. We also study their intra-domain(legal) transfer learning capability and the impact of combining embeddings from their last layers in MESc. We test these methods and their effectiveness with extensive experiments and ablation studies on legal documents from India, the European Union, and the United States with the ILDC dataset and a subset of the LexGLUE dataset. Our approach achieves a minimum total performance gain of approximately 2 points over previous state-of-the-art methods.
△ Less
Submitted 11 March, 2024;
originally announced March 2024.
-
Exploring Semi-supervised Hierarchical Stacked Encoder for Legal Judgement Prediction
Authors:
Nishchal Prasad,
Mohand Boughanem,
Taoufiq Dkaki
Abstract:
Predicting the judgment of a legal case from its unannotated case facts is a challenging task. The lengthy and non-uniform document structure poses an even greater challenge in extracting information for decision prediction. In this work, we explore and propose a two-level classification mechanism; both supervised and unsupervised; by using domain-specific pre-trained BERT to extract information f…
▽ More
Predicting the judgment of a legal case from its unannotated case facts is a challenging task. The lengthy and non-uniform document structure poses an even greater challenge in extracting information for decision prediction. In this work, we explore and propose a two-level classification mechanism; both supervised and unsupervised; by using domain-specific pre-trained BERT to extract information from long documents in terms of sentence embeddings further processing with transformer encoder layer and use unsupervised clustering to extract hidden labels from these embeddings to better predict a judgment of a legal case. We conduct several experiments with this mechanism and see higher performance gains than the previously proposed methods on the ILDC dataset. Our experimental results also show the importance of domain-specific pre-training of Transformer Encoders in legal information processing.
△ Less
Submitted 14 November, 2023;
originally announced November 2023.
-
A Hierarchical Neural Framework for Classification and its Explanation in Large Unstructured Legal Documents
Authors:
Nishchal Prasad,
Mohand Boughanem,
Taoufik Dkaki
Abstract:
Automatic legal judgment prediction and its explanation suffer from the problem of long case documents exceeding tens of thousands of words, in general, and having a non-uniform structure. Predicting judgments from such documents and extracting their explanation becomes a challenging task, more so on documents with no structural annotation. We define this problem as "scarce annotated legal documen…
▽ More
Automatic legal judgment prediction and its explanation suffer from the problem of long case documents exceeding tens of thousands of words, in general, and having a non-uniform structure. Predicting judgments from such documents and extracting their explanation becomes a challenging task, more so on documents with no structural annotation. We define this problem as "scarce annotated legal documents" and explore their lack of structural information and their long lengths with a deep-learning-based classification framework which we call MESc; "Multi-stage Encoder-based Supervised with-clustering"; for judgment prediction. We explore the adaptability of LLMs with multi-billion parameters (GPT-Neo, and GPT-J) to legal texts and their intra-domain(legal) transfer learning capacity. Alongside this, we compare their performance and adaptability with MESc and the impact of combining embeddings from their last layers. For such hierarchical models, we also propose an explanation extraction algorithm named ORSE; Occlusion sensitivity-based Relevant Sentence Extractor; based on the input-occlusion sensitivity of the model, to explain the predictions with the most relevant sentences from the document. We explore these methods and test their effectiveness with extensive experiments and ablation studies on legal documents from India, the European Union, and the United States with the ILDC dataset and a subset of the LexGLUE dataset. MESc achieves a minimum total performance gain of approximately 2 points over previous state-of-the-art proposed methods, while ORSE applied on MESc achieves a total average gain of 50% over the baseline explainability scores.
△ Less
Submitted 27 June, 2024; v1 submitted 19 September, 2023;
originally announced September 2023.
-
Ranking RDF Instances in Degree-decoupled RDF Graphs
Authors:
Elisa S. Menendez,
Marco A. Casanova,
Mohand Boughanem,
Luiz André P. Paes Leme
Abstract:
In the last decade, RDF emerged as a new kind of standardized data model, and a sizable body of knowledge from fields such as Information Retrieval was adapted to RDF graphs. One common task in graph databases is to define an importance score for nodes based on centrality measures, such as PageRank and HITS. The majority of the strategies highly depend on the degree of the node. However, in some R…
▽ More
In the last decade, RDF emerged as a new kind of standardized data model, and a sizable body of knowledge from fields such as Information Retrieval was adapted to RDF graphs. One common task in graph databases is to define an importance score for nodes based on centrality measures, such as PageRank and HITS. The majority of the strategies highly depend on the degree of the node. However, in some RDF graphs, called degree-decoupled RDF graphs, the notion of importance is not directly related to the node degree. Therefore, this work first proposes three novel node importance measures, named InfoRank I, II and III, for degree-decoupled RDF graphs. It then compares the proposed measures with traditional PageRank and other familiar centrality measures, using with an IMDb dataset.
△ Less
Submitted 5 September, 2018;
originally announced September 2018.