Daniel van Strien

Glasgow, Scotland, United Kingdom

Education

  • City, University of London

    City University London

    My Master's dissertation examined the different approaches that research funders have taken to funding open access. Through this work I developed an extensive understanding of HEFCE and research funder mandates on open access, the role of publishing in career progression, and the wider research landscape.

Publications

  • Computer Vision for the Humanities: An Introduction to Deep Learning for Image Classification (Part 1)

    Programming Historian

    This is the first of a two-part lesson introducing deep learning based computer vision methods for humanities research. Using a dataset of historical newspaper advertisements and the fastai Python library, the lesson walks through the pipeline of training a computer vision model to perform image classification.

  • Assessing the Impact of OCR Quality on Downstream NLP Tasks

    Proceedings of the 12th International Conference on Agents and Artificial Intelligence - Volume 1: ARTIDIGH

    Abstract: A growing volume of heritage data is being digitized and made available as text via optical character recognition (OCR). Scholars and libraries are increasingly using OCR-generated text for retrieval and analysis. However, the process of creating text through OCR introduces varying degrees of error to the text. The impact of these errors on natural language processing (NLP) tasks has only been partially studied. We perform a series of extrinsic assessment tasks — sentence segmentation, named entity recognition, dependency parsing, information retrieval, topic modelling and neural language model fine-tuning — using popular, out-of-the-box tools in order to quantify the impact of OCR quality on these tasks. We find a consistent impact resulting from OCR errors on our downstream tasks with some tasks more irredeemably harmed by OCR errors. Based on these results, we offer some preliminary guidelines for working with text produced through OCR.

  • Resolving Places, Past and Present: Toponym Resolution in Historical British Newspapers Using Multiple Resources

    Proceedings of the 13th Workshop on Geographic Information Retrieval

    Newspapers and their metadata are richly geographical, not only in their distribution but also their content. Attending to these spatial features is a prerequisite in newspaper research. Following other projects to have geoparsed place names in newspapers, we describe our approach to linking historical geospatial information in text to real-world locations which 1) adopts an expansive definition of what counts as a locatable entity; 2) uses knowledge bases derived from contemporaneous sources; and 3) leverages contextual information to disambiguate hard-to-locate places. This method depends on combining historical and non-historical resources and the paper discusses the potential benefits for humanities research.

    Other authors
    • Mariona Coll Ardanuy
    • Katherine McDonough
    • Amrey Krause
    • Daniel C S Wilson
    • Kasra Hosseini
  • The Transition to Open Access: to What Extent Can Research Funders Influence the Market for APCs?

    Master's dissertation, City University London

    This dissertation assesses the extent to which research funders can influence the market for APCs. This question is tackled through the use of case studies, particularly the Wellcome Trust and the Austrian Science Fund (FWF). The dissertation assesses the secondary literature, policy documents and APC data. This is supported through interviews with the Wellcome Trust and FWF. In addressing the extent to which research funders influence the market for APCs a number of issues are explored. The role of policy in shaping open access is analysed. Chapter two outlines the motivations government and research funders have in supporting open access. In particular, this chapter assesses the recommendations of the Finch Report (2012) and the requirements for the next Research Excellence Framework (REF 2020) exercise. Following this, the market for Article Processing Charges (APCs) is explored in chapter three. This focuses on assessing the current size of the market and the average price of APCs. It also seeks to understand the extent to which the market for APCs fits the definition of a monopolistic market. Chapter four compares the current policies of the Wellcome Trust and FWF, outlining the extent to which their policies aim to influence the market and the mechanisms used. Price caps are identified as a major difference in policy and subsequently discussed in detail. The dissertation concluded that, although research funders influence the market for APCs, this influence is constrained by a number of factors: research funders may primarily intend their open access policies to ensure the research they fund is available open access while external factors limit the extent to which research funders can, independently, influence the market for APCs.

  • #CityMash Markdown introduction presentation/workshop

    Notes for an introductory Markdown session held at #CityMash (http://citymash.github.io/) on 13 June 2015.

