1. Rationale for Data Dissemination

1.1 Open Science and Reproducibility

Open science promotes transparency, accessibility, and collaboration in research. By sharing data openly, researchers enable reproducibility of results, foster innovation, and facilitate interdisciplinary studies. This approach aligns with the broader scientific community's move towards openness and accountability.

1.2 FAIR Principles

The FAIR principles provide a framework for effective data management:

Findable: Data should be easily located by humans and machines.
Accessible: Once found, data should be retrievable using standardised protocols.
Interoperable: Data should be compatible with other datasets and tools.
Reusable: Data should be well-described to allow replication and further use.

Adhering to these principles ensures long-term utility and integration of datasets across platforms.

2. Infrastructure for Data Publication

2.1 PaleoMIX Open Archaeological Database (O.A.D.)

The PaleoMIX O.A.D., part of the ARHUT data platform, is built on Directus, an open-source content management system. This setup offers:

Customisable Data Models: Tailored schemas to fit diverse archaeological data types.
API Integration: Automatic generation of REST and GraphQL APIs for seamless data access and integration.
User-Friendly Interface: Intuitive dashboards for data entry, management, and visualisation.

This infrastructure supports efficient data management and facilitates collaboration among researchers.
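As a rough illustration of the REST access described above, the sketch below builds a read query for the Directus items endpoint (`/items/<collection>`) using its `fields`, `filter`, and `limit` query parameters. The base URL `https://oad.example.org`, the collection name `radiocarbon_dates`, and the field names are placeholders chosen for this example, not the actual PaleoMIX O.A.D. configuration.

```python
from urllib.parse import urlencode


def build_items_url(base_url, collection, fields=None, filters=None, limit=None):
    """Build a Directus-style REST URL for reading items from a collection.

    `filters` maps a field name to an (operator, value) pair,
    e.g. {"material": ("_eq", "bone")}.
    """
    params = []
    if fields:
        # Directus expects a comma-separated list of field names.
        params.append(("fields", ",".join(fields)))
    for field, (operator, value) in (filters or {}).items():
        # Bracket syntax: filter[<field>][<operator>]=<value>
        params.append((f"filter[{field}][{operator}]", str(value)))
    if limit is not None:
        params.append(("limit", str(limit)))
    url = f"{base_url.rstrip('/')}/items/{collection}"
    return f"{url}?{urlencode(params)}" if params else url


# Hypothetical instance, collection, and field names.
example_url = build_items_url(
    "https://oad.example.org",
    "radiocarbon_dates",
    fields=["lab_code", "c14_age"],
    filters={"material": ("_eq", "bone")},
    limit=50,
)
```

The function only constructs the URL; sending it with any HTTP client should return a JSON body whose `data` key holds the matching items, provided the collection is readable with the credentials supplied.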

2.2 External Data Repositories

For broader dissemination and preservation, data can be deposited in established repositories:

Dryad: A curated resource for data underlying scientific publications, ensuring data citation and accessibility.
IsoArcH: A platform dedicated to bioarchaeological isotope data, promoting standardisation and sharing within the community.
Radiocarbon Databases: Resources such as the IntCal series of calibration curves provide the calibration datasets essential for radiocarbon dating analyses.

Utilising these repositories enhances data visibility and integration into global research efforts.

3. Data Upload and Publishing Protocols

3.1 Preparation of Data

Before uploading, ensure that the datasets:

Are Complete: Include all relevant raw data and metadata.
Follow Standard Formats: Use widely accepted file formats (e.g., CSV, JSON, TIFF).
Include Metadata: Provide detailed descriptions, including methodologies, instruments used, and data collection contexts.
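The checklist above can be partly automated before upload. The following is a minimal sketch, assuming a set of accepted file extensions and a list of required metadata keys; both are illustrative conventions invented for this example, not an official PaleoMIX requirement.

```python
from pathlib import Path

# Assumed conventions for this sketch only.
ACCEPTED_EXTENSIONS = {".csv", ".json", ".tif", ".tiff"}
REQUIRED_METADATA = ("description", "methodology", "instrument", "collection_context")


def check_dataset(file_names, metadata):
    """Return a list of human-readable problems; an empty list means ready."""
    problems = []
    if not file_names:
        problems.append("no data files listed")
    for name in file_names:
        suffix = Path(name).suffix.lower()
        if suffix not in ACCEPTED_EXTENSIONS:
            problems.append(f"{name}: unsupported format '{suffix}'")
    for key in REQUIRED_METADATA:
        value = metadata.get(key)
        # A key that is absent, non-text, or blank counts as missing.
        if not isinstance(value, str) or not value.strip():
            problems.append(f"metadata field '{key}' is missing or empty")
    return problems
```

Returning a list of problems rather than raising on the first one lets a contributor fix everything in a single pass.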

3.2 Upload Procedure

Access the Platform: Log into the PaleoMIX O.A.D. portal.
Select Appropriate Collection: Choose the relevant data category (e.g., Proteomics, Radiocarbon Dating).
Upload Files: Use the interface to upload data files and associated metadata.
Review and Validate: Ensure all information is accurate and complete.
Submit for Publication: Finalise the upload, making the data available to authorised users or the public, depending on access settings.
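The procedure above describes the web interface. For scripted submissions, the same record creation could in principle go through the Directus REST API, which creates an item via `POST /items/<collection>` and accepts a bearer token in the `Authorization` header. The sketch below prepares such a request without sending it; the base URL, collection name, token, and record fields are all hypothetical.

```python
import json
from urllib.request import Request


def build_create_item_request(base_url, collection, token, record):
    """Prepare (but do not send) a POST request that creates one item."""
    return Request(
        url=f"{base_url.rstrip('/')}/items/{collection}",
        data=json.dumps(record).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical values; replace with a real instance URL and access token.
request = build_create_item_request(
    "https://oad.example.org",
    "radiocarbon_dates",
    "YOUR_TOKEN",
    {"lab_code": "EXAMPLE-001", "material": "bone"},
)
```

Passing `request` to `urllib.request.urlopen` would submit it. Whether a new item is immediately visible depends on the access settings of the collection, so the review and validation step still applies.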

3.3 Licensing and Access

Assign an appropriate licence (e.g., CC BY 4.0) to each dataset so that usage rights are stated explicitly. Define access levels that balance openness against any necessary restrictions.
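One way to keep licence and access information together is to record both on the dataset itself. The sketch below is an assumption about how this might be modelled: the three access levels and the `can_view` rule are invented for illustration and do not describe the platform's actual permission system. The licence is stored as an SPDX-style identifier (`CC-BY-4.0` for CC BY 4.0).

```python
from dataclasses import dataclass

# Hypothetical access levels; the platform's real roles may differ.
ACCESS_LEVELS = ("public", "authorised", "restricted")


@dataclass(frozen=True)
class DatasetRights:
    title: str
    license_id: str    # SPDX-style identifier, e.g. "CC-BY-4.0"
    access_level: str  # one of ACCESS_LEVELS

    def __post_init__(self):
        if self.access_level not in ACCESS_LEVELS:
            raise ValueError(f"unknown access level: {self.access_level!r}")


def can_view(rights, user_is_authorised):
    """Public data is visible to everyone; authorised data needs an approved user."""
    if rights.access_level == "public":
        return True
    if rights.access_level == "authorised":
        return user_is_authorised
    return False  # "restricted" datasets are never served by this check
```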

4. Integration with Collaborative Initiatives

4.1 API Utilisation

The Directus-based infrastructure provides APIs designed for efficient data management and enhanced interoperability:

Automated Data Retrieval: Enables integration with various analytical tools and external platforms.
Dynamic Data Visualisation: Supports real-time updates and interactive data interfaces.
Interoperability: Allows datasets to be seamlessly integrated with other data sources for comprehensive analyses.

Utilising APIs significantly expands the utility and accessibility of PaleoMIX datasets.
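Automated retrieval usually means reading a collection page by page. A minimal sketch follows, assuming responses shaped like the Directus REST output (`{"data": [...]}`) and pagination through the `limit` and `offset` query parameters. The `fetch_page` callable is a hypothetical helper the caller supplies, so the loop itself stays independent of any particular HTTP client.

```python
def iter_items(fetch_page, page_size=100):
    """Yield every item from a paginated collection.

    `fetch_page(limit, offset)` must return the decoded JSON body of one
    response, shaped like {"data": [...]}.
    """
    offset = 0
    while True:
        batch = fetch_page(page_size, offset).get("data", [])
        yield from batch
        # A short (or empty) page means the collection is exhausted.
        if len(batch) < page_size:
            break
        offset += page_size
```

Because the function is a generator, downstream tools can start processing records before the whole collection has been downloaded.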

4.2 Data Usability for Artificial Intelligence

Increasing emphasis is being placed on structuring and managing data so that it can be used effectively by artificial intelligence (AI) models. One example is the COST Action MAIA (Managing Artificial Intelligence in Archaeology), which addresses the challenges of making archaeological datasets standardised and AI-compatible, and promotes advanced analytical techniques and collaborative research.

5. Requirements for Raw Data and Quantitative Summaries

5.1 Raw Data

Raw datasets should include:

Original Measurements: Unprocessed data directly from instruments or observations.
Calibration Information: Details on standards and procedures used.
Contextual Metadata: Information about the sample, location, and conditions of data collection.
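The three components listed above can travel together as a single record. The sketch below shows one possible shape, serialised to JSON; the class name, field names, and sample values are made up for illustration and do not reflect the platform's actual schema.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class RawRecord:
    sample_id: str
    measurements: dict  # unprocessed readings, exactly as exported
    calibration: dict   # standards and procedures used
    context: dict = field(default_factory=dict)  # sample, location, conditions


def to_json(record):
    """Serialise a record with stable key order so that diffs stay readable."""
    return json.dumps(asdict(record), sort_keys=True, indent=2)


# Invented example values.
record = RawRecord(
    sample_id="S-001",
    measurements={"reading_1": 12.5, "reading_2": 12.7},
    calibration={"standard": "example-standard", "procedure": "two-point"},
    context={"site": "Example Site", "collected": "2024-01-15"},
)
```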

5.2 Quantitative Summaries

Processed data should provide:

Statistical Analyses: Summaries such as means, medians, and standard deviations.
Visual Representations: Graphs, charts, or maps illustrating key findings.
Interpretative Commentary: Brief explanations of the results and their significance.
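The statistical summaries named above can be computed with Python's standard `statistics` module. This is a minimal sketch; the function name `summarise` and the choice of the sample (n - 1) standard deviation are assumptions made for this example.

```python
from statistics import mean, median, stdev


def summarise(values):
    """Return basic descriptive statistics for a list of measurements."""
    if len(values) < 2:
        raise ValueError("need at least two values for a sample standard deviation")
    return {
        "n": len(values),
        "mean": mean(values),
        "median": median(values),
        "stdev": stdev(values),  # sample standard deviation (n - 1 denominator)
    }
```

Reporting `n` alongside the other figures lets readers judge how much weight the mean and standard deviation can bear.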

Ensuring both raw and processed data are available supports transparency and facilitates further research.