The 2,496th Meeting of the Society

May 17, 2024 at 8:00 PM

Powell Auditorium at the Cosmos Club

The 93rd Joseph Henry Lecture

Unlost – Recovering the Text of Burnt and Carbonized Scrolls

Reading Herculaneum Papyri using X-ray CT and AI

Brent Seales

Professor of Computer Science
University of Kentucky

Sponsored by MWZB Law

About the Lecture

The Herculaneum papyrus scrolls, buried and carbonized by the eruption of Mount Vesuvius in 79 CE and then excavated in the 18th century, are original, classical texts from the shelves of the only known library to have survived from antiquity. The 250-year history of science and technology applied to the challenge of opening and then reading them has created a fragmentary, damaged window into their literary and philosophical secrets. In 1999, with more than 400 scrolls still unopened, methods for physical unwrapping were permanently halted. The intact scrolls present an enigmatic challenge: preserved by the fury of Vesuvius, yet still lost. Using a non-invasive approach, we have now shown how to recover their texts, rendering them “unlost”. The path we have forged uses high-energy physics, artificial intelligence, and the collective power of a global scientific community inspired by prizes, collaborative generosity, the common goal of shared accomplishment, and the glory of reading original classical texts for the first time in 2,000 years.

Selected Reading & Media References
https://www.scientificamerican.com/article/inside-the-ai-competition-that-decoded-an-ancient-scroll-and-changed/
https://time.com/6326563/vesuvius-challenge-herculaneum-papyri-ai/
https://www.nytimes.com/2023/10/20/podcasts/ai-black-box.html
https://www.engr.uky.edu/herculaneum
https://www.habelt.de/openaccess
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0215775

About the Speaker

W. Brent Seales is the Stanley and Karen Pigman Chair of Heritage Science and Professor of Computer Science at the University of Kentucky. He has also held research positions at INRIA Sophia-Antipolis, UNC Chapel Hill, Google (Paris), and the Getty Conservation Institute.

Brent is also the founder of EduceLab, a heritage science research lab at the University of Kentucky that promotes the development and application of machine learning and data science techniques for the digital restoration of damaged materials. The research program is funded by the National Science Foundation, the National Endowment for the Humanities, the Arts and Humanities Research Council of Great Britain, the Andrew W. Mellon Foundation, and Google.

Brent also is a co-founder of the Vesuvius Challenge, an international contest formed around the goal of the virtual unwrapping of Herculaneum scrolls. He continues to work with challenging, damaged material such as the Herculaneum Scrolls and the Dead Sea Scrolls. The effort has achieved notable successes in the scroll from En-Gedi (Leviticus), the Morgan MS M.910 (The Acts of the Apostles), and PHerc.Paris.3 and 4 (Philodemus / Epicureanism). The recovery of readable text from still-unopened material has been hailed worldwide as an astonishing achievement fueled by open scholarship, interdisciplinary collaboration, and extraordinary leadership generosity.

Brent earned a BS in Computer Science at the University of Southwestern Louisiana and an MS and PhD in Computer Science at the University of Wisconsin-Madison.

Social Media
Webpage(s): https://www2.cs.uky.edu/dri/ and https://educelab.engr.uky.edu/
LinkedIn Profile: https://www.linkedin.com/in/william-seales-7663334/

Minutes

On May 17, 2024, in the Powell Auditorium of the Cosmos Club in Washington, D.C., President Larry Millstein called the 2,496th meeting of the Society to order at 8:016 p.m. ET. He began by welcoming attendees, thanking sponsors for their support and announcing new members. Scott Mathews then read the minutes of the previous meeting which included the lecture by David Spergel, titled “Lessons from the Universe’s Baby Picture”. The minutes were approved as read, pending minor corrections.

President Millstein then introduced the speaker for the evening, Brent Seales, of the University of Kentucky. His lecture was titled “Unlost: Recovering the Text of Burnt and Carbonized Scrolls”.

The speaker began by presenting the inspiration for his work: an unreadable letter from a WWII veteran to his daughter. Seales and his team were able to digitally recover the text so that the letter was readable. After reading sections of the letter, Seales asked, “Is that not a reason to pull texts back from the brink?”

He then discussed the carbonized papyrus scrolls recovered from Herculaneum, a town near the volcano Vesuvius. He showed pictures of “Cornices” (trays of complete, wrapped scrolls which are largely untouched) and “Scorze” (fragments of scrolls resulting from attempts to unroll them). Seales estimated that the recovered carbonized scrolls represent about 8,000 pages of text.

The speaker then discussed some of the digital techniques used for recovering text from the Herculaneum scrolls: virtual flattening, virtual unwrapping, segmentation, texturing, and virtual re-assembly. He discussed early attempts at using x-ray tomography to recover text from intact scrolls, which were largely unsuccessful because of the low contrast of the inks.

Seales said that their first real success came with the restoration of the scroll from En-Gedi, an ancient Hebrew parchment found in Israel in 1970. His team was able to achieve sufficient ink contrast with x-ray tomography and recover sections of text without unwrapping the scroll. He indicated that this process allowed them to test and optimize many of their algorithms. He said that success with the En-Gedi scroll, along with the more advanced techniques developed as a result of this work, served as the motivation to begin text recovery efforts on the Herculaneum scrolls.

Seales then presented what he called the “carbon ink challenge”: the fact that the carbon-based inks used on the Herculaneum scrolls provide little or no contrast when measured with x-ray tomography. However, Seales showed that artificial intelligence and deep learning algorithms were able to recognize carbon ink text on test samples. Further work revealed that the algorithms were actually responding to shape changes of the surface, rather than x-ray contrast. Additionally, they found that carbon-based inks created significant changes in the surface morphology or roughness of the papyrus. They therefore concluded that they needed higher resolution imaging techniques, combined with artificial intelligence.

As a result, they began acquiring data using a synchrotron at the Diamond Light Source in Oxfordshire, England. With this instrument, they were able to achieve a volume resolution of less than 8 µm on a side (8 µm voxel). With this dataset, they were able to train a neural network and recover text from a Herculaneum scroll.

Seales then described the “Vesuvius Challenge”, an open competition to recover text from Herculaneum scrolls with a grand prize of $700,000. For this competition, Seales and his team publicly released all of their code, their best data, and tutorials explaining everything, and they paid workers to do the “brute force virtual unwrapping” of their data. The competitors developed algorithms and machine learning systems to extract text, as well as a vibrant online community to share ideas among the various teams. In December of 2023, a team of three graduate students succeeded in recovering significant sections of text from a Herculaneum scroll and were awarded the grand prize.

Seales concluded his talk by proclaiming the Herculaneum scrolls “unlost”, and returning to the letter from the soldier to his daughter, he reminded the audience of “the sacred bond between author and reader.”

The lecture was followed by a Question and Answer session:

A member asked about using AI to translate the recovered works. Seales indicated that papyrologists do not want to use AI for translations: “they want to do it themselves”.

A guest asked how to achieve better resolution in the scans, and at what point increased resolution stops yielding better information. Seales responded, “We don’t know.” He said that it is clear that at some point higher resolution begins to create problems with scan time, data size, and possible beam damage to the scroll. He said that as a result, they need to find the “sweet spot”, but that it is difficult to determine at the present time.

A member asked about the need to retrain the neural network on each scroll, due to changes in handwriting or changes in character forms over time. Seales replied that, initially, they will probably have to retrain for each new scroll, but that he hoped they could eventually create an “Uber Network”, trained on a wide variety of papyrus, that would be able to recover text from any new scrolls measured.

A viewer on the livestream asked about the hyper-parameters of the winning network: could Seales tell how many hidden layers were used, how wide the network was, and whether they used ReLU? Seales responded that he could not give such specific details. He indicated that many different approaches were tried, even within a single team, and that the hyper-parameters varied quite widely over the course of the competition, as well as from team to team.

After the question and answer period, President Millstein thanked the speaker and presented him with a PSW rosette, a signed copy of the announcement of his talk, and a signed copy of Volume 1 of the PSW Bulletin. He then announced speakers of upcoming lectures, made a number of housekeeping announcements, and invited guests to join the Society. He adjourned the 2,496th meeting of the Society at 10:04 p.m. ET.

Temperature in Washington, DC: 18.3° Celsius
Weather: Cloudy
Audience in the Powell auditorium: 82
Viewers on live stream: 32 …for a total of 114 live viewers
Views of the video in the first two weeks: 2155

Respectfully submitted, Scott Mathews: Recording Secretary

Highlights