Showing 1–13 of 13 results for author: Berzak, Y

Searching in archive cs.
  1. arXiv:2105.06354  [pdf, other]

    cs.CL cs.HC

    Predicting Text Readability from Scrolling Interactions

    Authors: Sian Gooding, Yevgeni Berzak, Tony Mak, Matt Sharifi

    Abstract: Judging the readability of text has many important applications, for instance when performing text simplification or when sourcing reading material for language learners. In this paper, we present a 518 participant study which investigates how scrolling behaviour relates to the readability of a text. We make our dataset publicly available and show that (1) there are statistically significant diffe…

    Submitted 17 November, 2021; v1 submitted 13 May, 2021; originally announced May 2021.

    Journal ref: Proceedings of the 25th Conference on Computational Natural Language Learning 2021

  2. arXiv:2010.11032  [pdf, other]

    cs.CL

    Classifying Syntactic Errors in Learner Language

    Authors: Leshem Choshen, Dmitry Nikolaev, Yevgeni Berzak, Omri Abend

    Abstract: We present a method for classifying syntactic errors in learner language, namely errors whose correction alters the morphosyntactic structure of a sentence. The methodology builds on the established Universal Dependencies syntactic representation scheme, and provides complementary information to other error-classification systems. Unlike existing error classification methods, our method is app…

    Submitted 27 October, 2020; v1 submitted 21 October, 2020; originally announced October 2020.

    Comments: CoNLL 2020

  3. arXiv:2009.14780  [pdf, other]

    cs.CL

    Bridging Information-Seeking Human Gaze and Machine Reading Comprehension

    Authors: Jonathan Malmaud, Roger Levy, Yevgeni Berzak

    Abstract: In this work, we analyze how human gaze during reading comprehension is conditioned on the given reading comprehension question, and whether this signal can be beneficial for machine reading comprehension. To this end, we collect a new eye-tracking dataset with a large number of participants engaging in a multiple choice reading comprehension task. Our analysis of this data reveals increased fixat…

    Submitted 15 October, 2020; v1 submitted 30 September, 2020; originally announced September 2020.

    Comments: CoNLL 2020

  4. arXiv:2004.14797  [pdf, other]

    cs.CL

    STARC: Structured Annotations for Reading Comprehension

    Authors: Yevgeni Berzak, Jonathan Malmaud, Roger Levy

    Abstract: We present STARC (Structured Annotations for Reading Comprehension), a new annotation framework for assessing reading comprehension with multiple choice questions. Our framework introduces a principled structure for the answer choices and ties them to textual span annotations. The framework is implemented in OneStopQA, a new high-quality dataset for evaluation and analysis of reading comprehension…

    Submitted 30 April, 2020; originally announced April 2020.

    Comments: ACL 2020. OneStopQA dataset, STARC guidelines and human experiments data are available at https://github.com/berzak/onestop-qa

  5. arXiv:1807.00914  [pdf, other]

    cs.CL

    Modeling Language Variation and Universals: A Survey on Typological Linguistics for Natural Language Processing

    Authors: Edoardo Maria Ponti, Helen O'Horan, Yevgeni Berzak, Ivan Vulić, Roi Reichart, Thierry Poibeau, Ekaterina Shutova, Anna Korhonen

    Abstract: Linguistic typology aims to capture structural and semantic variation across the world's languages. A large-scale typology could provide excellent guidance for multilingual Natural Language Processing (NLP), particularly for languages that suffer from the lack of human labeled resources. We present an extensive literature survey on the use of typological information in the development of NLP techn…

    Submitted 26 October, 2020; v1 submitted 2 July, 2018; originally announced July 2018.

  6. arXiv:1804.07329  [pdf, other]

    cs.CL

    Assessing Language Proficiency from Eye Movements in Reading

    Authors: Yevgeni Berzak, Boris Katz, Roger Levy

    Abstract: We present a novel approach for determining learners' second language proficiency which utilizes behavioral traces of eye movements during reading. Our approach provides stand-alone eyetracking based English proficiency scores which reflect the extent to which the learner's gaze patterns in reading are similar to those of native English speakers. We show that our scores correlate strongly with sta…

    Submitted 23 April, 2018; v1 submitted 19 April, 2018; originally announced April 2018.

    Comments: NAACL 2018 (license change to CC BY)

  7. arXiv:1704.07398  [pdf, other]

    cs.CL

    Predicting Native Language from Gaze

    Authors: Yevgeni Berzak, Chie Nakamura, Suzanne Flynn, Boris Katz

    Abstract: A fundamental question in language learning concerns the role of a speaker's first language in second language acquisition. We present a novel methodology for studying this question: analysis of eye-movement patterns in second language reading of free-form text. Using this methodology, we demonstrate for the first time that the native language of English learners can be predicted from their gaze f…

    Submitted 2 May, 2017; v1 submitted 24 April, 2017; originally announced April 2017.

    Comments: ACL 2017

  8. arXiv:1610.03349  [pdf, ps, other]

    cs.CL

    Survey on the Use of Typological Information in Natural Language Processing

    Authors: Helen O'Horan, Yevgeni Berzak, Ivan Vulić, Roi Reichart, Anna Korhonen

    Abstract: In recent years linguistic typology, which classifies the world's languages according to their functional and structural properties, has been widely used to support multilingual NLP. While the growing importance of typological information in supporting multilingual tasks has been recognised, no systematic survey of existing typological resources and their use in NLP has been published. This paper…

    Submitted 11 October, 2016; originally announced October 2016.

    Journal ref: COLING 2016

  9. arXiv:1605.04481  [pdf, ps, other]

    cs.CL

    Anchoring and Agreement in Syntactic Annotations

    Authors: Yevgeni Berzak, Yan Huang, Andrei Barbu, Anna Korhonen, Boris Katz

    Abstract: We present a study on two key characteristics of human syntactic annotations: anchoring and agreement. Anchoring is a well known cognitive bias in human decision making, where judgments are drawn towards pre-existing values. We study the influence of anchoring on a standard approach to creation of syntactic resources where syntactic annotations are obtained via human editing of tagger and parser o…

    Submitted 21 September, 2016; v1 submitted 14 May, 2016; originally announced May 2016.

    Comments: EMNLP 2016

  10. arXiv:1605.04278  [pdf, ps, other]

    cs.CL

    Universal Dependencies for Learner English

    Authors: Yevgeni Berzak, Jessica Kenney, Carolyn Spadine, Jing Xian Wang, Lucia Lam, Keiko Sophie Mori, Sebastian Garza, Boris Katz

    Abstract: We introduce the Treebank of Learner English (TLE), the first publicly available syntactic treebank for English as a Second Language (ESL). The TLE provides manually annotated POS tags and Universal Dependency (UD) trees for 5,124 sentences from the Cambridge First Certificate in English (FCE) corpus. The UD annotations are tied to a pre-existing error annotation of the FCE, whereby full syntactic…

    Submitted 7 June, 2016; v1 submitted 13 May, 2016; originally announced May 2016.

    Comments: Updated parsing experiments to EWT v1.3, improved grammatical error marking, minor revisions. To appear in ACL 2016

  11. arXiv:1603.08079  [pdf, other]

    cs.CV cs.AI cs.CL

    Do You See What I Mean? Visual Resolution of Linguistic Ambiguities

    Authors: Yevgeni Berzak, Andrei Barbu, Daniel Harari, Boris Katz, Shimon Ullman

    Abstract: Understanding language goes hand in hand with the ability to integrate complex contextual information obtained via perception. In this work, we present a novel task for grounded language understanding: disambiguating a sentence given a visual scene which depicts one of the possible interpretations of that sentence. To this end, we introduce a new multimodal corpus containing ambiguous sentences, r…

    Submitted 26 March, 2016; originally announced March 2016.

    Comments: EMNLP 2015

    Journal ref: Conference on Empirical Methods in Natural Language Processing (EMNLP), 2015, pages 1477-1487

  12. arXiv:1603.07609  [pdf, other]

    cs.CL

    Contrastive Analysis with Predictive Power: Typology Driven Estimation of Grammatical Error Distributions in ESL

    Authors: Yevgeni Berzak, Roi Reichart, Boris Katz

    Abstract: This work examines the impact of cross-linguistic transfer on grammatical errors in English as Second Language (ESL) texts. Using a computational framework that formalizes the theory of Contrastive Analysis (CA), we demonstrate that language specific error distributions in ESL writing can be predicted from the typological properties of the native language and their relation to the typology of Engl…

    Submitted 24 March, 2016; originally announced March 2016.

    Comments: Published in CoNLL 2015

    Journal ref: Proceedings of the 19th Conference on Computational Language Learning, pages 94-102, Beijing, China, July 30-31, 2015

  13. arXiv:1404.6312  [pdf, other]

    cs.CL

    Reconstructing Native Language Typology from Foreign Language Usage

    Authors: Yevgeni Berzak, Roi Reichart, Boris Katz

    Abstract: Linguists and psychologists have long been studying cross-linguistic transfer, the influence of native language properties on linguistic performance in a foreign language. In this work we provide empirical evidence for this process in the form of a strong correlation between language similarities derived from structural features in English as Second Language (ESL) texts and equivalent similarities…

    Submitted 28 May, 2014; v1 submitted 25 April, 2014; originally announced April 2014.

    Comments: CoNLL 2014

    Journal ref: Proceedings of the Eighteenth Conference on Computational Language Learning, pages 21-29, Baltimore, Maryland USA, June 26-27 2014