There is a newer version of the record available.

Published February 7, 2024 | Version v2
Dataset Open

Dead Sea Scrolls data collection (images, labels, prediction plots) for dating ancient manuscripts using radiocarbon and AI-based writing style analysis

  • 1. University of Groningen
  • 2. University of Southern Denmark
  • 3. University of Pisa
  • 4. KU Leuven

Description

The dataset is associated with the following article:
Title: Dating ancient manuscripts using radiocarbon and AI-based writing style analysis
Authors: Mladen Popović, Maruf A. Dhali, Lambert Schomaker, Johannes van der Plicht, Kaare Lund Rasmussen, Jacopo La Nasa, Ilaria Degano, Maria Perla Colombini, and Eibert Tigchelaar
(Under review)

This data set was collected for the ERC project:
The Hands that Wrote the Bible: Digital Palaeography and Scribal Culture of the Dead Sea Scrolls
PI: Mladen Popović
Grant agreement ID: 640497
Project website: https://cordis.europa.eu/project/id/640497

 

Copyright (c) University of Groningen, 2024. All rights reserved.
Disclaimer and copyright notice for all data contained in the *.tar.gz files:

1) permission is hereby granted to use the data for research purposes. It is not allowed to distribute this data for commercial purposes.

2) provider gives no express or implied warranty of any kind, and any implied warranties of merchantability and fitness for purpose are disclaimed.

3) provider shall not be liable for any direct, indirect, special, incidental, or consequential damages arising out of any use of this data.

4) the user should refer to the first public article mentioned above on this data set.

5) the recipient should refrain from proliferating the data set to third parties external to his/her local research group. Please refer interested researchers to this site to obtain their own copy.

 

Organization of the data:
(Updated on 07-Feb-2024: OxCal data for selected ranges have been added in a new directory, alongside the previously available original OxCal data. Enoch's prediction plots and test images have been reorganized for easier access.
Please use the files from this version and disregard the previous one: 10.5281/zenodo.8168210)

There are four *.tar.gz files:

C14-Oxcal-data.tar.gz contains one directory with radiocarbon data (OxCal [1] raw data) for all 30 manuscripts. Two additional directories contain name-corrected files for the original OxCal data and files with the selected significant ranges. Please refer to the original article for details about the OxCal data and the manuscripts. Of the 30 raw OxCal data files, 25 are used (after the selection of significant ranges) as training labels for Enoch, the date prediction model.

train-images-c14.tar.gz contains the clean and preprocessed (binarized, aligned, and arrangement-corrected) training images for the 25 radiocarbon-dated training manuscripts (including 4Q52; 64 images in total).

test-images-all.tar.gz contains the clean and preprocessed test images for 135 previously undated manuscripts. The images are organized in three directories: the first contains all 359 images for the 135 manuscripts, the second contains the 135 selected images, and the third contains 25 images that illustrate poor image quality.

Enoch-predictions.tar.gz contains the date prediction plots for each of the 135 test images. There are two directories inside the *.tar.gz file:

- prediction-plots-for-selected-135: Prediction plots with a data balancing threshold of 0.05. These plots were used in the expert palaeographers' evaluation of Enoch's style-based date predictions for the 135 previously undated manuscripts.

- extra-plots: contains four additional directories:
     - Enoch-predictions-c14wo4Q52-balanced05: Prediction plots with data balancing threshold of 0.05. 
     - Enoch-predictions-c14wo4Q52-balanced10: Prediction plots with data balancing threshold of 0.1.
     - Enoch-predictions-c14wo4Q52-unbalanced: Unbalanced raw predictions.
     - Enoch-predictions-c14wo4Q52-combined: Combined plots with all three prediction plots (unbalanced, 0.05, 0.1).
Please refer to the original article for more details.
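As a quick sanity check on the layout described above, the following Python sketch counts the regular files in each directory of a downloaded archive without extracting it. It assumes the four archives sit in the current working directory under the names given above. The internal directory names are not assumed: the function simply reports whatever directories it finds. The helper name files_per_directory is chosen for illustration and is not part of the published code.

```python
import tarfile
from collections import Counter
from pathlib import Path, PurePosixPath

# Archive names as listed in the description; their location on disk
# (the current working directory) is an assumption.
ARCHIVES = [
    "C14-Oxcal-data.tar.gz",
    "train-images-c14.tar.gz",
    "test-images-all.tar.gz",
    "Enoch-predictions.tar.gz",
]


def files_per_directory(archive_path):
    """Count regular files per containing directory inside a .tar.gz archive."""
    counts = Counter()
    with tarfile.open(archive_path, "r:gz") as tar:
        for member in tar:
            if member.isfile():
                # Tar member names always use forward slashes.
                counts[str(PurePosixPath(member.name).parent)] += 1
    return dict(counts)


if __name__ == "__main__":
    for name in ARCHIVES:
        if not Path(name).exists():
            print(f"{name}: not found, skipping")
            continue
        print(name)
        for directory, count in sorted(files_per_directory(name).items()):
            print(f"  {directory}: {count} files")
```

For test-images-all.tar.gz, for example, the reported counts can be compared with the 359, 135, and 25 images mentioned above, keeping in mind that a directory may also contain non-image files.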

The code to generate the plots is available here: https://doi.org/10.5281/zenodo.10629569

If you have any questions, please get in touch with us:
Mladen Popović <m.popovic(at)rug.nl>
Maruf A. Dhali <m.a.dhali(at)rug.nl>
Lambert Schomaker <l.r.b.schomaker(at)rug.nl>

 

References:
1. Bronk Ramsey, C. (2001). Development of the radiocarbon calibration program. Radiocarbon, 43(2A), 355-363.

Files (222.7 MB)

Checksum                               Size
md5:fd4c744ed6407d6beb4b025fbcfe8fca   28.2 kB
md5:4322e2aaee0693215973baf72b1af06c   136.0 MB
md5:440cbec9561675e465948a0140bddc4f   65.5 MB
md5:e41f6c5e7fe2315efee49a8d8f137cd9   21.2 MB
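
The MD5 checksums listed above can be used to confirm that a download completed without corruption. The Python sketch below is one possible way to do this. Because the listing does not show which checksum belongs to which file name, it only checks that each local *.tar.gz file matches one of the four published digests. The function names and the assumption that the archives are in the current directory are illustrative.

```python
import hashlib
from pathlib import Path

# MD5 digests copied from the file listing of this record.
PUBLISHED_MD5 = {
    "fd4c744ed6407d6beb4b025fbcfe8fca",
    "4322e2aaee0693215973baf72b1af06c",
    "440cbec9561675e465948a0140bddc4f",
    "e41f6c5e7fe2315efee49a8d8f137cd9",
}


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_downloads(directory="."):
    """Map each *.tar.gz file name to True if its MD5 is a published one."""
    return {
        path.name: md5_of(path) in PUBLISHED_MD5
        for path in sorted(Path(directory).glob("*.tar.gz"))
    }


if __name__ == "__main__":
    for name, ok in check_downloads().items():
        print(f"{name}: {'OK' if ok else 'MISMATCH'}")
```

A mismatch usually indicates an incomplete download or a file from the previous version of the record.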

Additional details

Funding

HandsandBible – The Hands that Wrote the Bible: Digital Palaeography and Scribal Culture of the Dead Sea Scrolls 640497
European Commission