2.1 Overview 

The general data workflow outlines how biomolecular and archaeological data are managed from their generation, through their various uses, to structured archival. Figure 1 below visualises the core progression. It shows the data infrastructure, in this case based on SharePoint worksheets and the ARHUT data management system, and emphasises the feedback loops in data management:

Figure 1. 

2.2 Sampling and Initial Documentation 

Sampling is based on the research design and must follow consistent documentation practices. Each sample is given a unique identifier following the local laboratory's sample-labelling principles, with metadata covering the object/artefact type, its collection number, excavation context, coordinates, sample collection date, sampler, and planned analysis method. Documentation begins in field notebooks or lab books and is later transcribed into cloud-based worksheets (e.g., SharePoint, Google Docs).
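The metadata fields listed above could be captured in a small record structure before transcription into a worksheet. The following is a minimal sketch only: the field names, the `make_sample_id` helper, and the `LAB-2024-0007` identifier pattern are illustrative assumptions, not the laboratory's actual labelling scheme.

```python
from dataclasses import dataclass, asdict
from datetime import date
from typing import List, Optional, Tuple


# All field names below are assumptions chosen to mirror the metadata
# listed in the text; they are not an official schema.
@dataclass(frozen=True)
class SampleRecord:
    sample_id: str
    object_type: str
    collection_number: str
    excavation_context: str
    coordinates: Optional[Tuple[float, float]]
    collection_date: date
    sampler: str
    planned_analysis: str


def make_sample_id(lab_code: str, year: int, running_number: int) -> str:
    """Build an identifier with a hypothetical LABCODE-YEAR-NNNN pattern."""
    return f"{lab_code}-{year}-{running_number:04d}"


def missing_fields(record: SampleRecord) -> List[str]:
    """Return the names of fields that are empty strings or None."""
    return [name for name, value in asdict(record).items() if value in ("", None)]
```

Calling `missing_fields` on a record before it is transcribed gives a quick list of the metadata that still needs to be filled in from the field notebook or lab book.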

References

Niven, K., Jakobsson, U. Databases and spreadsheets: A guide to good practice. https://zenodo.org/records/7740647

MINAS: (DNA) http://www.mixs-minas.org/ 

Stable isotopes: https://doi.org/10.1016/j.quaint.2022.02.027

Roberts P, Fernandes R, Craig OE, Larsen T, Lucquin A, Swift J, Zech J. Calling all archaeologists: guidelines for terminology, methodology, data handling, and reporting when undertaking and reviewing stable isotope applications in archaeology. Rapid Commun Mass Spectrom. 2018 Mar 15;32(5):361-372. doi: 10.1002/rcm.8044. PMID: 29235694; PMCID: PMC5838555. 

 

Reiter, Samantha S., Staniuk, Robert, Kolář, Jan, Bulatović, Jelena, Rose, Helene Agerskov, Ryabogina, Natalia E., Speciale, Claudia, Schjerven, Nicoline, Paulsson, Bettina Schulz, Lee, Victor Yan Kin, Canteri, Elisabetta, Revill, Alice, Dahlberg, Fredrik, Sabatini, Serena, Frei, Karin M., Racimo, Fernando, Ivanova-Bieg, Maria, Traylor, Wolfgang, Kate, Emily J., Derenne, Eve, Frank, Lea, Woodbridge, Jessie, Fyfe, Ralph, Shennan, Stephen, Kristiansen, Kristian, Thomas, Mark G. and Timpson, Adrian. "The BIAD Standards: Recommendations for Archaeological Data Publication and Insights From the Big Interdisciplinary Archaeological Database" Open Archaeology, vol. 10, no. 1, 2024, pp. 20240015. https://doi.org/10.1515/opar-2024-0015 

2.3 Data Acquisition and Initial Recording 

Instrument outputs, such as mass-spectrometry data (IRMS, GC-MS, LC-MS/MS), sequencing files, and microscopy images, are collected in raw formats that are often vendor-specific (e.g., RAW, FASTQ). They are preferably stored on the instrument's computer and copied into shared project folders, so that backup copies of the initial measurement files are secured. These raw data are then referenced and linked in combined worksheets (e.g., SharePoint or Google Sheets) that record initial metadata, sampling context, and lab-specific identifiers. These worksheets are used for early-stage review and validation.
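One way to secure backup copies of the initial measurement files is to copy each file and confirm the copy with a checksum. The sketch below assumes plain file-system folders; the function names and the returned dictionary layout are hypothetical and would need adapting to the actual project folder structure.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading it in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def back_up_raw_file(source: Path, backup_dir: Path) -> dict:
    """Copy a raw file into backup_dir and verify the copy by checksum.

    Existing backups are never overwritten, so the first copy of a
    measurement file stays untouched.
    """
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / source.name
    if target.exists():
        raise FileExistsError(f"Refusing to overwrite existing backup: {target}")
    shutil.copy2(source, target)  # copy2 also preserves file timestamps
    source_hash = sha256_of(source)
    if sha256_of(target) != source_hash:
        raise OSError(f"Checksum mismatch after copying {source} to {target}")
    # This dictionary could be pasted into the worksheet row that links
    # the raw file to its sample (column names are an assumption).
    return {"file": source.name, "sha256": source_hash, "backup_path": str(target)}
```

Recording the checksum next to the file reference in the worksheet makes it possible to confirm later that a raw file has not changed since acquisition.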

2.4 Data Structuring and Collaborative Editing 

Raw entries are transformed into structured research datasets by moving them to tabular data sheets (e.g. Excel), cleaning data, standardising terminology, and checking for consistency. This step includes: 

These structured datasets form the basis for computational analysis and are maintained within SharePoint for collaborative editing. Excel files can be edited “live” by several people at once, whereas other file types may exist as separate versions edited by different people. Access permissions are set to control changes and ensure data provenance.
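Cleaning, terminology standardisation, and consistency checks of the kind described in this section can be scripted. The sketch below is illustrative only: the column names `sample_id` and `material`, and the small `MATERIAL_TERMS` vocabulary, are assumptions rather than the project's actual controlled terms.

```python
import csv
import io

# Hypothetical controlled vocabulary: variant spelling -> preferred term.
MATERIAL_TERMS = {
    "bone": "bone",
    "bones": "bone",
    "tooth": "tooth",
    "teeth": "tooth",
    "dentine": "dentine",
    "dentin": "dentine",
}


def clean_rows(raw_csv_text: str):
    """Trim whitespace, standardise material terms, and flag problem rows.

    Returns (cleaned_rows, problems), where problems is a list of
    (line_number, message) tuples for rows that need manual attention.
    """
    cleaned, problems, seen_ids = [], [], set()
    reader = csv.DictReader(io.StringIO(raw_csv_text))
    for line_number, row in enumerate(reader, start=2):  # line 1 is the header
        row = {
            key.strip(): (value or "").strip()
            for key, value in row.items()
            if key is not None  # ignore surplus unnamed columns
        }
        sample_id = row.get("sample_id", "")
        term = row.get("material", "").lower()
        if not sample_id:
            problems.append((line_number, "missing sample_id"))
        elif sample_id in seen_ids:
            problems.append((line_number, f"duplicate sample_id {sample_id}"))
        elif term not in MATERIAL_TERMS:
            problems.append((line_number, f"unrecognised material term '{term}'"))
        else:
            seen_ids.add(sample_id)
            row["material"] = MATERIAL_TERMS[term]
            cleaned.append(row)
    return cleaned, problems
```

Rows that fail a check are reported rather than silently corrected, so that decisions requiring contextual knowledge stay with the person who knows the material.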

2.5 Data Analysis 

Once structured, datasets can be exported (typically as comma-separated values, CSV, files) and processed using computational tools tailored to specific research questions, analyses, and data types. This includes data interpretation and evaluation, statistical modelling, pattern recognition, and visualisation. Analyses are typically performed in environments such as R or Python, or in specialised software such as OxCal, IsoReader, mMass, or MaxQuant.

Analytical outputs must be reproducible and versioned, with all scripts and parameter settings documented and stored alongside the dataset, either in SharePoint or linked repositories (e.g., GitHub). 
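To document scripts and parameter settings alongside a dataset, each analysis run could write a small machine-readable record. This is a sketch under stated assumptions: the record keys, the `build_run_record` function, and the idea of saving the result as a JSON file next to the exported CSV are illustrative, not an established project convention.

```python
import hashlib
import json
import platform
from datetime import datetime, timezone


def build_run_record(dataset_bytes: bytes, script_name: str, parameters: dict) -> dict:
    """Describe one analysis run: which data, which script, which settings."""
    return {
        # The checksum ties the record to one exact version of the exported file.
        "dataset_sha256": hashlib.sha256(dataset_bytes).hexdigest(),
        "script": script_name,
        "parameters": parameters,
        "python_version": platform.python_version(),
        "recorded_at": datetime.now(timezone.utc).isoformat(),
    }


def to_json(record: dict) -> str:
    """Serialise the record with sorted keys so that versions diff cleanly."""
    return json.dumps(record, indent=2, sort_keys=True)
```

The JSON text returned by `to_json` can be stored in the same SharePoint folder or committed to the linked repository together with the script it describes.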

 

2.6 Data Validation and Feedback 

Structured datasets are subjected to both planned and unplanned quality checks. During a formal review, users verify data completeness, coherence, and consistency with the raw entries. Many problems, however, are noticed only while working with the data, and some require contextual knowledge, so a formal review cannot catch all of them. Feedback is communicated via SharePoint comments or tracked changes, or via ARHUT comments and tasks if the dataset has already been entered into the ARHUT system. Datasets may cycle through multiple revisions before finalisation. This feedback mechanism is essential for maintaining data quality and for correcting inconsistencies before deposition.
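The planned part of these checks, completeness and consistency with the raw entries, lends itself to a simple automated comparison. The sketch below assumes that raw entries and structured rows share a `sample_id` key; the function name and the message wording are hypothetical.

```python
from typing import Dict, Iterable, List


def validate_against_raw(
    raw_entries: Dict[str, dict],
    structured_rows: Iterable[dict],
    required_fields: Iterable[str],
) -> List[str]:
    """Return human-readable issues found when comparing the two sources."""
    issues = []
    required = list(required_fields)
    structured_by_id = {row["sample_id"]: row for row in structured_rows}

    # Completeness: every raw entry should appear in the structured dataset.
    for sample_id in raw_entries:
        if sample_id not in structured_by_id:
            issues.append(f"{sample_id}: in raw entries but missing from structured dataset")

    for sample_id, row in structured_by_id.items():
        # Consistency: structured rows should trace back to a raw entry.
        if sample_id not in raw_entries:
            issues.append(f"{sample_id}: not found in raw entries")
        # Completeness of individual rows.
        for field in required:
            if row.get(field) in ("", None):
                issues.append(f"{sample_id}: required field '{field}' is empty")
    return issues
```

The resulting list could be pasted into a SharePoint comment or an ARHUT task. It does not replace the unplanned checks, which depend on contextual knowledge that a script does not have.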

2.7 Curation and Archival in ARHUT 

Finalised datasets are transferred to the ARHUT data platform, where they are archived with:

These datasets become part of the long-term record and are linked to both internal systems (e.g., SharePoint, Archemy, Department of Archaeology) and external repositories (e.g., Zenodo, Dryad) and databases (e.g., BIAD).

2.8 Storage Platforms and File Formats 

Each phase of the workflow is supported by designated platforms: 

2.9 Dissemination 

Finalised and curated datasets archived in ARHUT are made available for dissemination. ARHUT's web interface (https://arh.ut.ee/) allows structured querying of, and access to, project-specific datasets, enriched with contextual metadata and persistent identifiers. PaleoMIX O.A.D. builds on the ARHUT infrastructure, offering public-facing access to selected datasets from PaleoMIX and related projects. This system enables transparent sharing of research outputs, supports interdisciplinary collaboration, and fosters broader reuse by both academic and public audiences.

