
In lipid analysis, the workflow usually has three steps, each producing its own level of data: raw data -> interpreted data -> integrated data.

Data Level	Description	Output
Raw Data	Direct GC-FID/MS outputs: chromatograms, mass spectra, FID signals	MassHunter files (.d, .csv, .xls), chromatograms
Interpreted Data	Identified compounds, concentration estimates, and matching to standards	Excel tables with compound IDs, concentrations
Integrated Data	Data aligned with excavation/sample metadata, synthesised into a summary for research or publication	Master tables, figures, and interpretations

Each level of data should be archived together with the corresponding sample information (excavation, preparation and instrumental analysis) so that every result can be traced back to its sample.

Excavation and sample	Sample prep	Instrumental analysis
Site name	Lab ID	Instrument (GC-MS or GC-C-IRMS)
Collection number	Vial label	Method (e.g. AQUA-SIM)
Period	Storage shelf	Method acquisition
Sample type	Extraction method
	Preparation date
	Standards (Internal and methylation correction)
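
As an illustration, the fields in the table above could be kept as one record per sample, so that every data file can be traced back to its excavation, preparation and analysis details. This is only a minimal sketch: the class name `SampleRecord`, the field names and the helper `missing_fields` are hypothetical and not part of any lab software.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass
class SampleRecord:
    """One traceability record per sample (hypothetical structure)."""
    # Excavation and sample
    site_name: str
    collection_number: str
    period: str
    sample_type: str
    # Sample prep
    lab_id: str
    vial_label: str
    storage_shelf: str
    extraction_method: str
    preparation_date: str
    standards: str
    # Instrumental analysis
    instrument: str                            # e.g. "GC-MS" or "GC-C-IRMS"
    method: str                                # e.g. "AQUA-SIM"
    method_acquisition: Optional[str] = None   # assumed: name of the acquisition method file


def missing_fields(record: SampleRecord) -> list[str]:
    """Return the names of empty fields, so gaps in traceability are easy to spot."""
    return [name for name, value in asdict(record).items() if value in (None, "")]
```

Calling `missing_fields` on a record before archiving gives a quick check that no column of the table has been left blank.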

Data Acquisition and Export (GC-FID, GC-MS, GC-SIM, GC-C-IRMS)

GC-FID and GC-MS (scan and SIM) data are generated by MassHunter software.

For example, a sequence of measurements is shown below.

[Image: image-2025-6-25_15-55-17.png]

Each measurement generates a folder containing its raw data (a method acquisition metadata file and the data files). It is therefore important to keep the data for each sample in its original acquisition folder.

[Image: image-2025-6-25_15-55-25.png]
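
Because each measurement lives in its own folder, a short script can check a sequence directory before archiving. The sketch below assumes that acquisition folders end in `.d` (as listed under Raw Data above) and sit directly inside one sequence directory; the function names are hypothetical.

```python
from pathlib import Path


def list_acquisition_folders(sequence_dir: Path) -> list[Path]:
    """Return the per-measurement folders (assumed to end in '.d'), sorted by name."""
    return sorted(
        p for p in sequence_dir.iterdir()
        if p.is_dir() and p.suffix.lower() == ".d"
    )


def empty_acquisition_folders(sequence_dir: Path) -> list[Path]:
    """Return acquisition folders that contain no files at any depth."""
    return [
        folder for folder in list_acquisition_folders(sequence_dir)
        if not any(f.is_file() for f in folder.rglob("*"))
    ]
```

An empty folder in the result suggests that files were moved out of their original acquisition folder and should be restored before archiving.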

GC-C-IRMS data is generated by Qtegra.
Each sequence produces a single container file (.imexp) that contains all the measurement data. It is not possible to extract individual measurements from this file. The .imexp format can only be opened with Qtegra® software.

[Image: image-2025-6-25_15-55-36.png]

After initial checks and corrections in Qtegra®, the relevant data is exported to .xlsx format for calibration and for general accessibility.

[Image: image-2025-6-25_15-55-50.png]

It is also possible to export all chromatogram data points, so that graphs can be reconstructed in Excel (.xlsx) or other programs.

Data Analysis and Computational Tools (MassHunter, ChemStation, PCA, Clustering, Omnic)

GC-FID/MS raw data is interpreted in MassHunter for compound integration, identification and quantification. Samples that meet the overall lipid content threshold (potsherds: ≥ 5 µg/g; food crusts: ≥ 100 µg/g) are considered interpretable and are selected for further interpretation. We identify the compounds (peaks) in each chromatogram and summarise the key compound information (fatty acids, di- and hydroxy-fatty acids, alkanes, alcohols, sterols, etc.) in an Excel table (interpreted data). For acid-extracted samples, the concentrations of palmitic and stearic acid methyl esters should be quantified so that the samples can be diluted appropriately before GC-C-IRMS analysis.

[Image: image-2025-6-25_15-56-3.png]
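
The interpretability threshold described above can be written as a simple check. This is a minimal sketch: the threshold values come from the text, while the dictionary keys and the function name `is_interpretable` are assumptions made for illustration.

```python
# Overall lipid content thresholds from the text, in µg of lipid per g of sample.
LIPID_THRESHOLDS_UG_PER_G = {
    "potsherd": 5.0,
    "food crust": 100.0,
}


def is_interpretable(sample_type: str, lipid_ug_per_g: float) -> bool:
    """Return True if the sample meets the threshold for its sample type.

    Raises KeyError for sample types without a defined threshold.
    """
    threshold = LIPID_THRESHOLDS_UG_PER_G[sample_type.strip().lower()]
    return lipid_ug_per_g >= threshold
```

For example, `is_interpretable("potsherd", 4.9)` is false, while a potsherd at exactly 5 µg/g passes because the threshold is inclusive.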

The interpreted data is then integrated through statistics or modelling (integrated data).

Data reliability/limitation of method

Quantification is a problem for statistics. Each instrument (GC-MS and GC-C-IRMS) generates data on its own quantitative scale. Our GC-MS analysis is qualitative and semi-quantitative: the interpreted data is usually presence/absence (1 vs 0) or the ratio or percentage of two or more compounds. Other analyses, such as GC-C-IRMS, are quantitative and generate exact numerical values (for example, 25.68 ‰ for carbon isotopes). It is therefore hard to do statistics when qualitative GC-MS data is integrated with quantitative data from other methods.
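
To make the GC-MS encodings above concrete, the sketch below turns a table of peak areas into presence/absence values, relative percentages and a ratio. It is an illustration only: the compound labels and peak areas are invented, the function names are hypothetical, and treating any area above zero as "present" is an assumption (a real workflow may apply a detection limit).

```python
def presence_absence(peak_areas: dict[str, float]) -> dict[str, int]:
    """Encode each compound as 1 (present) or 0 (absent); assumes area > 0 means present."""
    return {compound: int(area > 0) for compound, area in peak_areas.items()}


def relative_percentages(peak_areas: dict[str, float], compounds: list[str]) -> dict[str, float]:
    """Express the selected compounds as percentages of their summed peak area."""
    total = sum(peak_areas.get(c, 0.0) for c in compounds)
    if total == 0:
        raise ValueError("none of the selected compounds was detected")
    return {c: 100.0 * peak_areas.get(c, 0.0) / total for c in compounds}


def ratio(peak_areas: dict[str, float], numerator: str, denominator: str) -> float:
    """Return the peak-area ratio of two compounds."""
    if peak_areas.get(denominator, 0.0) == 0:
        raise ValueError(f"{denominator} was not detected")
    return peak_areas[numerator] / peak_areas[denominator]


# Invented peak areas for one sample (arbitrary units).
example = {"C16:0": 300.0, "C18:0": 100.0, "cholesterol": 0.0}
print(presence_absence(example))                         # {'C16:0': 1, 'C18:0': 1, 'cholesterol': 0}
print(relative_percentages(example, ["C16:0", "C18:0"]))  # {'C16:0': 75.0, 'C18:0': 25.0}
print(ratio(example, "C16:0", "C18:0"))                  # 3.0
```

None of these encodings share a scale with isotope values in ‰, which is the integration difficulty described above.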