Showing 1–20 of 20 results for author: Shapira, O

Searching in archive cs.
  1. arXiv:2406.16086

    cs.CL

    SEAM: A Stochastic Benchmark for Multi-Document Tasks

    Authors: Gili Lior, Avi Caciularu, Arie Cattan, Shahar Levy, Ori Shapira, Gabriel Stanovsky

    Abstract: Various tasks, such as summarization, multi-hop question answering, or coreference resolution, are naturally phrased over collections of real-world documents. Such tasks present a unique set of challenges, revolving around the lack of coherent narrative structure across documents, which often leads to contradiction, omission, or repetition of information. Despite their real-world application and c…

    Submitted 23 June, 2024; originally announced June 2024.

  2. arXiv:2406.00842

    cs.CL

    The Power of Summary-Source Alignments

    Authors: Ori Ernst, Ori Shapira, Aviv Slobodkin, Sharon Adar, Mohit Bansal, Jacob Goldberger, Ran Levy, Ido Dagan

    Abstract: Multi-document summarization (MDS) is a challenging task, often decomposed to subtasks of salience and redundancy detection, followed by text generation. In this context, alignment of corresponding sentences between a reference summary and its source documents has been leveraged to generate training data for some of the component tasks. Yet, this enabling alignment step has usually been applied he…

    Submitted 2 June, 2024; originally announced June 2024.

    Comments: Accepted to ACL-Findings 2024

  3. arXiv:2403.15351

    cs.CL

    Multi-Review Fusion-in-Context

    Authors: Aviv Slobodkin, Ori Shapira, Ran Levy, Ido Dagan

    Abstract: Grounded text generation, encompassing tasks such as long-form question-answering and summarization, necessitates both content selection and content consolidation. Current end-to-end methods are difficult to control and interpret due to their opaqueness. Accordingly, recent works have proposed a modular approach, with separate components for each step. Specifically, we focus on the second subtask,…

    Submitted 31 March, 2024; v1 submitted 22 March, 2024; originally announced March 2024.

    Comments: NAACL 2024, findings

  4. arXiv:2312.04440

    cs.CL

    OpenAsp: A Benchmark for Multi-document Open Aspect-based Summarization

    Authors: Shmuel Amar, Liat Schiff, Ori Ernst, Asi Shefer, Ori Shapira, Ido Dagan

    Abstract: The performance of automatic summarization models has improved dramatically in recent years. Yet, there is still a gap in meeting specific information needs of users in real-world scenarios, particularly when a targeted summary is sought, such as in the useful aspect-based summarization setting targeted in this paper. Previous datasets and studies for this setting have predominantly concentrated o…

    Submitted 7 December, 2023; originally announced December 2023.

    Comments: EMNLP 2023

  5. arXiv:2308.08363

    cs.CL

    SummHelper: Collaborative Human-Computer Summarization

    Authors: Aviv Slobodkin, Niv Nachum, Shmuel Amar, Ori Shapira, Ido Dagan

    Abstract: Current approaches for text summarization are predominantly automatic, with rather limited space for human intervention and control over the process. In this paper, we introduce SummHelper, a 2-phase summarization assistant designed to foster human-machine collaboration. The initial phase involves content selection, where the system recommends potential content, allowing users to accept, modify, o…

    Submitted 16 October, 2023; v1 submitted 16 August, 2023; originally announced August 2023.

    Comments: EMNLP 2023 System Demonstrations

  6. arXiv:2212.05150

    cs.LG

    Improving Precancerous Case Characterization via Transformer-based Ensemble Learning

    Authors: Yizhen Zhong, Jiajie Xiao, Thomas Vetterli, Mahan Matin, Ellen Loo, Jimmy Lin, Richard Bourgon, Ofer Shapira

    Abstract: The application of natural language processing (NLP) to cancer pathology reports has been focused on detecting cancer cases, largely ignoring precancerous cases. Improving the characterization of precancerous adenomas assists in developing diagnostic tests for early cancer detection and prevention, especially for colorectal cancer (CRC). Here we developed transformer-based deep neural network NLP…

    Submitted 9 December, 2022; originally announced December 2022.

  7. arXiv:2202.06726

    cs.HC cs.GR

    Experimental Augmented Reality User Experience

    Authors: Josef Spjut, Fengyuan Zhu, Xiaolei Huang, Yichen Shou, Ben Boudaoud, Omer Shapira, Morgan McGuire

    Abstract: Augmented Reality (AR) is an emerging field ripe for experimentation, especially when it comes to developing the kinds of applications and experiences that will drive mass adoption of the technology. While we aren't aware of any current consumer product that realizes a wearable, wide Field of View (FoV), AR Head Mounted Display (HMD), such devices will certainly come. In order for these sophisticat…

    Submitted 10 February, 2022; originally announced February 2022.

    Comments: 2 pages, 3 figures, work originally completed in 2019

  8. arXiv:2112.08770

    cs.CL cs.LG

    Proposition-Level Clustering for Multi-Document Summarization

    Authors: Ori Ernst, Avi Caciularu, Ori Shapira, Ramakanth Pasunuru, Mohit Bansal, Jacob Goldberger, Ido Dagan

    Abstract: Text clustering methods were traditionally incorporated into multi-document summarization (MDS) as a means for coping with considerable information repetition. Particularly, clusters were leveraged to indicate information saliency as well as to avoid redundancy. Such prior methods focused on clustering sentences, even though closely related sentences usually also contain non-aligned parts. In this…

    Submitted 19 May, 2022; v1 submitted 16 December, 2021; originally announced December 2021.

    Comments: NAACL 2022

  9. arXiv:2112.05129

    cs.RO

    Assistive Tele-op: Leveraging Transformers to Collect Robotic Task Demonstrations

    Authors: Henry M. Clever, Ankur Handa, Hammad Mazhar, Kevin Parker, Omer Shapira, Qian Wan, Yashraj Narang, Iretiayo Akinola, Maya Cakmak, Dieter Fox

    Abstract: Sharing autonomy between robots and human operators could facilitate data collection of robotic task demonstrations to continuously improve learned models. Yet, the means to communicate intent and reason about the future are disparate between humans and robots. We present Assistive Tele-op, a virtual reality (VR) system for collecting robot task demonstrations that displays an autonomous trajector…

    Submitted 9 December, 2021; originally announced December 2021.

    Comments: 9 pages, 4 figures, 1 table. NeurIPS 2021 Workshop on Robot Learning: Self-Supervised and Lifelong Learning, Virtual

  10. arXiv:2110.01073

    cs.CL

    Multi-Document Keyphrase Extraction: Dataset, Baselines and Review

    Authors: Ori Shapira, Ramakanth Pasunuru, Ido Dagan, Yael Amsterdamer

    Abstract: Keyphrase extraction has been extensively researched within the single-document setting, with an abundance of methods, datasets and applications. In contrast, multi-document keyphrase extraction has been infrequently studied, despite its utility for describing sets of documents, and its use in summarization. Moreover, no prior dataset exists for multi-document keyphrase extraction, hindering the p…

    Submitted 1 July, 2022; v1 submitted 3 October, 2021; originally announced October 2021.

  11. arXiv:2109.11621

    cs.CL

    iFacetSum: Coreference-based Interactive Faceted Summarization for Multi-Document Exploration

    Authors: Eran Hirsch, Alon Eirew, Ori Shapira, Avi Caciularu, Arie Cattan, Ori Ernst, Ramakanth Pasunuru, Hadar Ronen, Mohit Bansal, Ido Dagan

    Abstract: We introduce iFacetSum, a web application for exploring topical document sets. iFacetSum integrates interactive summarization together with faceted search, by providing a novel faceted navigation scheme that yields abstractive summaries for the user's selections. This approach offers both a comprehensive overview as well as concise details regarding subtopics of choice. Fine-grained facets are aut…

    Submitted 23 September, 2021; originally announced September 2021.

    Comments: Proceedings of EMNLP 2021, System Demonstrations. 7 pages and an appendix

  12. arXiv:2009.08380

    cs.CL

    Evaluating Interactive Summarization: an Expansion-Based Framework

    Authors: Ori Shapira, Ramakanth Pasunuru, Hadar Ronen, Mohit Bansal, Yael Amsterdamer, Ido Dagan

    Abstract: Allowing users to interact with multi-document summarizers is a promising direction towards improving and customizing summary results. Different ideas for interactive summarization have been proposed in previous work but these solutions are highly divergent and incomparable. In this paper, we develop an end-to-end evaluation framework for expansion-based interactive summarization, which considers…

    Submitted 17 September, 2020; originally announced September 2020.

  13. arXiv:2009.00590

    cs.CL

    Summary-Source Proposition-level Alignment: Task, Datasets and Supervised Baseline

    Authors: Ori Ernst, Ori Shapira, Ramakanth Pasunuru, Michael Lepioshkin, Jacob Goldberger, Mohit Bansal, Ido Dagan

    Abstract: Aligning sentences in a reference summary with their counterparts in source documents was shown as a useful auxiliary summarization task, notably for generating training data for salience detection. Despite its assessed utility, the alignment step was mostly approached with heuristic unsupervised methods, typically ROUGE-based, and was never independently optimized or evaluated. In this paper, we…

    Submitted 22 September, 2021; v1 submitted 1 September, 2020; originally announced September 2020.

    Comments: CoNLL 2021

  14. arXiv:2007.11348

    cs.CL

    Massive Multi-Document Summarization of Product Reviews with Weak Supervision

    Authors: Ori Shapira, Ran Levy

    Abstract: Product reviews summarization is a type of Multi-Document Summarization (MDS) task in which the summarized document sets are often far larger than in traditional MDS (up to tens of thousands of reviews). We highlight this difference and coin the term "Massive Multi-Document Summarization" (MMDS) to denote an MDS task that involves hundreds of documents or more. Prior work on product reviews summar…

    Submitted 22 July, 2020; originally announced July 2020.

  15. arXiv:1909.01214

    cs.CL

    Better Rewards Yield Better Summaries: Learning to Summarise Without References

    Authors: Florian Böhm, Yang Gao, Christian M. Meyer, Ori Shapira, Ido Dagan, Iryna Gurevych

    Abstract: Reinforcement Learning (RL) based document summarisation systems yield state-of-the-art performance in terms of ROUGE scores, because they directly use ROUGE as the rewards during training. However, summaries with high ROUGE scores often receive low human judgement. To find a better reward function that can guide RL to generate human-appealing summaries, we learn a reward function from human ratin…

    Submitted 3 September, 2019; originally announced September 2019.

    Comments: Accepted to EMNLP 2019

  16. arXiv:1904.05929

    cs.CL

    Crowdsourcing Lightweight Pyramids for Manual Summary Evaluation

    Authors: Ori Shapira, David Gabay, Yang Gao, Hadar Ronen, Ramakanth Pasunuru, Mohit Bansal, Yael Amsterdamer, Ido Dagan

    Abstract: Conducting a manual evaluation is considered an essential part of summary evaluation methodology. Traditionally, the Pyramid protocol, which exhaustively compares system summaries to references, has been perceived as very reliable, providing objective scores. Yet, due to the high cost of the Pyramid method and the required expertise, researchers resorted to cheaper and less thorough manual evaluat…

    Submitted 11 April, 2019; originally announced April 2019.

    Comments: 5 pages, 2 graphs, 1 table. Published in NAACL 2019

  17. arXiv:1810.10093

    cs.CV

    Structured Domain Randomization: Bridging the Reality Gap by Context-Aware Synthetic Data

    Authors: Aayush Prakash, Shaad Boochoon, Mark Brophy, David Acuna, Eric Cameracci, Gavriel State, Omer Shapira, Stan Birchfield

    Abstract: We present structured domain randomization (SDR), a variant of domain randomization (DR) that takes into account the structure and context of the scene. In contrast to DR, which places objects and distractors randomly according to a uniform probability distribution, SDR places objects and distractors randomly according to probability distributions that arise from the specific problem at hand. In t…

    Submitted 18 August, 2020; v1 submitted 23 October, 2018; originally announced October 2018.

    Comments: ICRA 2019; for video, see https://youtu.be/1WdjWJYx9AY

  18. arXiv:1410.8433

    cs.IT

    Binary Polarization Kernels from Code Decompositions

    Authors: Noam Presman, Ofer Shapira, Simon Litsyn, Tuvi Etzion, Alexander Vardy

    Abstract: In this paper, code decompositions (a.k.a. code nestings) are used to design binary polarization kernels. The proposed kernels are in general non-linear. They provide a better polarization exponent than the previously known kernels of the same dimensions. In particular, non-linear kernels of dimensions 14, 15, and 16 are constructed and are shown to have optimal asymptotic error-correction perform…

    Submitted 6 March, 2015; v1 submitted 30 October, 2014; originally announced October 2014.

    Comments: The paper was accepted for publication in the Transactions on Information Theory. It can be considered as an extended version of "Binary Polar Code Kernels from Code Decompositions" arXiv:1101.0764

  19. arXiv:1107.0478

    cs.IT

    Polar Codes with Mixed-Kernels

    Authors: Noam Presman, Ofer Shapira, Simon Litsyn

    Abstract: A generalization of the polar coding scheme called mixed-kernels is introduced. This generalization exploits several homogeneous kernels over alphabets of different sizes. An asymptotic analysis of the proposed scheme shows that its polarization properties are strongly related to the ones of the constituent kernels. Simulations of finite-length instances of the scheme indicate their advantages both…

    Submitted 24 March, 2015; v1 submitted 3 July, 2011; originally announced July 2011.

  20. arXiv:1101.0764

    cs.IT

    Binary Polar Code Kernels from Code Decompositions

    Authors: Noam Presman, Ofer Shapira, Simon Litsyn

    Abstract: Code decompositions (a.k.a. code nestings) are used to design good binary polar code kernels. The proposed kernels are in general non-linear and show a better rate of polarization under successive cancelation decoding than the ones suggested by Korada et al., for the same kernel dimensions. In particular, kernels of sizes 14, 15 and 16 are constructed and shown to provide polarization rates better…

    Submitted 3 July, 2011; v1 submitted 4 January, 2011; originally announced January 2011.