Archaeological Data Making

Jan 30, 2025
An increasingly common topic in archaeological scholarship is the work necessary to turn field or legacy data into publishable data. Publishable data is not only documented in sufficient detail to be reused and evaluated, but also organized so that it can serve as evidence for archaeological arguments.

Over the last few years, we’ve stepped up efforts to organize and document data produced by the Princeton Cyprus Expedition at Polis. In the next week, we hope to submit a series of interrelated datasets to Open Context for publication. Each of these datasets represents an effort to collate and standardize data from various sources and to prepare it to concatenate with the other datasets from the project.

The goal of writing this up here is twofold. First, I need to write up some meta-/paradata for these tables before I send them to be published. Second, and more importantly, I want to articulate the complexity of even relatively simple and small datasets and give an idea of how much work it takes to organize archaeological data for publication.

Our current data organization consists of six tables, each of which serves a particular function and relates to the other tables (or files) in some established way.

Every table uses a standardized identifier for each excavation context. This identifier is a composite of the year, area, trench, level, and pass. For example, we identify an excavation context as [1988].[E.F1]:[R08].[3].[1].
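To make the composite identifier concrete, here is a minimal Python sketch that splits one into its five parts and puts it back together. It is inferred only from the single example above. It assumes the square brackets are written literally, and the names `ContextId`, `parse_context_id`, and `pass_no` are invented for illustration; they are not part of our tables.

```python
import re
from typing import NamedTuple

# Assumed layout, taken from the one example in the post:
# [year].[area]:[trench].[level].[pass], with the brackets written literally.
CONTEXT_RE = re.compile(
    r"^\[(?P<year>\d{4})\]\.\[(?P<area>[^\]]+)\]:\[(?P<trench>[^\]]+)\]"
    r"\.\[(?P<level>[^\]]+)\]\.\[(?P<pass_no>[^\]]+)\]$"
)


class ContextId(NamedTuple):
    """Hypothetical container for the five parts of a context identifier."""
    year: int
    area: str
    trench: str
    level: str
    pass_no: str  # "pass" is a reserved word in Python

    def __str__(self) -> str:
        # Rebuild the identifier in the same bracketed form it was read in.
        return (
            f"[{self.year}].[{self.area}]:[{self.trench}]"
            f".[{self.level}].[{self.pass_no}]"
        )


def parse_context_id(text: str) -> ContextId:
    """Split an identifier into its parts, or raise ValueError if it does not fit."""
    match = CONTEXT_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognized context identifier: {text!r}")
    parts = match.groupdict()
    return ContextId(
        int(parts["year"]),
        parts["area"],
        parts["trench"],
        parts["level"],
        parts["pass_no"],
    )


example = parse_context_id("[1988].[E.F1]:[R08].[3].[1]")
```

Because every table carries the same identifier, a check like this could be run on each table before joining them, so that malformed values surface early instead of silently failing to match.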

EF1_Summary_Table

This table adds phase and description to the excavated contexts. In other words, it creates a concordance between our excavated contexts (the contexts created during excavation) and our archaeological contexts, which reflect our understanding of the depositional history of the site. This serves as a kind of “master” table for all the other information associated with excavated contexts.

Also in this table is a record for whether we have studied the pottery from this level or not. In some cases, such as topsoil, winter wash, scarp cleaning, and units contaminated with modern trash, we did not break the unit into batches, but we did look at the ceramics and included summaries of what we saw.

EF1_Notebook_Images

This table connects the excavated contexts to our scans of notebook pages (well, to the file names associated with the scanned notebook pages). There are approximately 500 images of scanned notebook pages. We’ve noted all references to any archaeological context on each page (recognizing that some pages include multiple references to archaeological contexts). This will allow us to display all relevant notebook pages along with their archaeological context.
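Since one page can mention several contexts and one context can appear on several pages, the table is effectively a many-to-many list of pairs. A rough Python sketch of turning those pairs into a per-context list of pages might look like this. The file names, the second identifier, and the function name `pages_by_context` are all made up for illustration and do not come from the actual table.

```python
from collections import defaultdict

# Hypothetical rows: (scanned page file name, context identifier).
# These values are invented; the real table has roughly 500 images.
rows = [
    ("notebook_page_001.jpg", "[1988].[E.F1]:[R08].[3].[1]"),
    ("notebook_page_001.jpg", "[1988].[E.F1]:[R08].[3].[2]"),
    ("notebook_page_002.jpg", "[1988].[E.F1]:[R08].[3].[1]"),
]


def pages_by_context(pairs):
    """Group page file names under each context they mention, without repeats."""
    index = defaultdict(list)
    for image_file, context_id in pairs:
        if image_file not in index[context_id]:
            index[context_id].append(image_file)
    return dict(index)


index = pages_by_context(rows)
```

Looking up a context in `index` then gives every notebook page that refers to it, which is the display behavior described above.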

EF1_Registry

This table is among the most complicated to put together and it speaks to the challenges of connecting legacy data to newly developed datasets. The Polis Inventoried Finds Registry was the official database for registering inventoried finds. The database probably dates to the 1990s, if not earlier, and while it was migrated into contemporary formats (right now, it lives in Access), the data in the tables was pretty rough. While the project’s registrars maintained things like registry finds, they were less careful with other fields such as the format used to identify archaeological contexts. We had to standardize the archaeological contexts in this table to connect it to our other tables.

What makes this more significant is that our registered finds are already published on Open Context.

EF1_Catalogue

This table connects catalogue entries that we will publish in the first Polis volume with archaeological contexts. These are largely ceramic artifacts, but they also include some stone artifacts and a coin. Since many of these artifacts are in the EF1_Registry database, this catalogue has to connect to (and expand on) that table as well as to the EF1_Context_Pottery table.

EF1_Image_Concordance

This table is complicated. It has to connect the file names of the artifact illustrations to their archaeological context (EF1_Summary_Table), to the EF1_Catalogue, and to the EF1_Context_Pottery table.

EF1_Context_Pottery

This table is massive. It contains all the context pottery from EF1 as well as records for catalogued pottery. The context pottery from each archaeological context is grouped in batches of similar types of pottery (according to part of the vessel, fabric, color). Each batch gets a number that can be connected to its archaeological context to create a unique number for each batch of sherds. Catalogued artifacts get their own batches so that the EF1_Catalogue will connect with this table.
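The batch numbering only works if a context and a batch number together never repeat. As a hedged illustration, the following Python sketch pairs the two into a key and reports any pair that occurs more than once. The function names and sample rows are hypothetical, and the key is kept as a simple pair because the way the two values are written together in the real table is not assumed here.

```python
from collections import Counter


def batch_key(context_id: str, batch_number: int) -> tuple[str, int]:
    """Pair a context identifier with a batch number to form one key."""
    return (context_id.strip(), int(batch_number))


def duplicate_batch_keys(rows):
    """Return, in sorted order, every (context, batch) key that appears more than once."""
    counts = Counter(batch_key(context_id, batch) for context_id, batch in rows)
    return sorted(key for key, count in counts.items() if count > 1)


# Invented sample rows: batch 2 is entered twice for the same context.
sample = [
    ("[1988].[E.F1]:[R08].[3].[1]", 1),
    ("[1988].[E.F1]:[R08].[3].[1]", 2),
    ("[1988].[E.F1]:[R08].[3].[1]", 2),
]
duplicates = duplicate_batch_keys(sample)
```

An empty result would mean every batch can be addressed unambiguously from the catalogue and image tables.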
