This platform implements a multi-stage Bayesian meta-model pipeline for the probabilistic estimation of biological sex from cranial morphological traits. It integrates population-aware conditional probability tables, observer-specific eye-tracking decision weights, logistic discriminant scoring, and a meta-logistic fusion model to produce calibrated posterior sex estimates.
Developed by Nicole Mantl at the Department of Security and Crime Science, University College London, as part of ongoing doctoral research into computational methods in forensic anthropology.
This tool is intended for forensic anthropologists, osteologists, and researchers working with skeletal assemblages who require probabilistic sex estimation grounded in Bayesian inference. It is designed to support both casework assessment and academic research into observer behaviour and trait-based classification.
Navigate to the Sex Estimation tab and select the Estimator subtab. Score each of the five cranial traits on a 1–5 scale (1 = absent/minimal, 5 = strongly protruding/maximal), complete the observer profile, and click Run Estimation. Results include Ground Truth and Behavioural pathway outputs, population posteriors, and the observer's predicted expertise group. The Pipeline subtab provides a full methodological narrative and annotated Bayesian network diagram.
Enter cranial trait scores (1–5) and observer background to compute population-posterior and meta-model sex estimation probabilities.
If the population affinity is known, select it to condition the CPTs directly. Leave the field blank to marginalise over all populations with uniform priors.
The model is a multi-layered inference system that integrates anthropological prior knowledge, empirical training data, observer characteristics, Bayesian population-aware updating, and discriminative modelling. The system produces sex estimates from cranial traits under two parallel parent variables: True Sex (biological ground truth) and Given Sex (behavioural labels assigned by observers). Both strategies propagate through the same computational architecture, ensuring directly comparable outputs.
The upper-left component of the model represents the observer-level variables, including degree, years of experience, professional status, experience with casework, experience with 3D models, and confidence. These variables are used to classify the observer into an expertise Group (e.g., novice, intermediate, expert) using a Bayesian classifier trained on background information from multiple datasets. The predicted Group determines the pattern of visual attention to cranial traits and therefore governs how the model weights the contribution of each trait in the likelihood-ratio computation.
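The text above does not say which Bayesian classifier assigns the expertise Group. As a minimal sketch, assuming a categorical naive Bayes model with Laplace smoothing, the classification step could look like the following. The feature names, category values, and training rows are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict

def fit_naive_bayes(records, groups):
    """Count feature values per group. records: list of dicts, groups: list of labels."""
    group_counts = Counter(groups)
    value_counts = defaultdict(Counter)  # (group, feature) -> counts of each value
    levels = defaultdict(set)            # feature -> set of values seen in training
    for rec, g in zip(records, groups):
        for feature, value in rec.items():
            value_counts[(g, feature)][value] += 1
            levels[feature].add(value)
    return group_counts, value_counts, levels

def predict_group(record, model, alpha=1.0):
    """Return P(group | record) under naive Bayes with Laplace smoothing (alpha)."""
    group_counts, value_counts, levels = model
    n = sum(group_counts.values())
    log_post = {}
    for g, n_g in group_counts.items():
        lp = math.log(n_g / n)  # log prior of the group
        for feature, value in record.items():
            num = value_counts[(g, feature)][value] + alpha
            den = n_g + alpha * len(levels[feature])
            lp += math.log(num / den)
        log_post[g] = lp
    # Normalise in a numerically stable way
    m = max(log_post.values())
    unnorm = {g: math.exp(lp - m) for g, lp in log_post.items()}
    z = sum(unnorm.values())
    return {g: u / z for g, u in unnorm.items()}

# Hypothetical training data: two background variables, two groups
records = [
    {"degree": "BSc", "casework": "no"},
    {"degree": "BSc", "casework": "no"},
    {"degree": "PhD", "casework": "yes"},
    {"degree": "PhD", "casework": "yes"},
]
groups = ["novice", "novice", "expert", "expert"]
model = fit_naive_bayes(records, groups)
posterior = predict_group({"degree": "PhD", "casework": "yes"}, model)
predicted_group = max(posterior, key=posterior.get)
```

The predicted Group would then select which set of eye-tracking decision weights is applied downstream.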
To incorporate how observers actually use cranial traits during decision-making, the model derives decision weights from eye-tracking (ET) data. These weights quantify trait salience based on visual behaviour within each Group. For each trait, two attention measures are extracted from fixation-level data: a fixation-count probability, capturing how often the trait is inspected, and a fixation-duration probability, capturing how long observers concentrate on that trait. These are combined multiplicatively to produce a joint attention weight W(t) = P_FixCount(t) × P_FixDur(t), then normalised across traits so that the weights sum to 1 and form a probability distribution over the five cranial traits. These normalised decision weights enter directly into the log-likelihood ratio (LLR) computation, ensuring that traits which are more visually influential for the observer's Group exert proportionally greater statistical influence in the final likelihood ratio.
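The weighting step described above can be sketched as follows. This is an illustration only: the trait labels and the fixation probabilities are hypothetical placeholders, not values from the training data.

```python
def decision_weights(p_fix_count, p_fix_dur):
    """Combine two attention measures per trait and normalise to sum to 1.

    p_fix_count, p_fix_dur: dicts mapping trait label -> probability.
    """
    joint = {t: p_fix_count[t] * p_fix_dur[t] for t in p_fix_count}  # W(t)
    total = sum(joint.values())
    if total <= 0:
        raise ValueError("joint attention weights must have a positive sum")
    return {t: w / total for t, w in joint.items()}  # normalised weights

# Hypothetical fixation probabilities for one expertise Group
p_count = {"trait_1": 0.3, "trait_2": 0.2, "trait_3": 0.2, "trait_4": 0.2, "trait_5": 0.1}
p_dur = {"trait_1": 0.4, "trait_2": 0.2, "trait_3": 0.2, "trait_4": 0.1, "trait_5": 0.1}
w_norm = decision_weights(p_count, p_dur)
```

Because the two measures are multiplied, a trait needs both frequent and sustained fixations to receive a large weight.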
The bottom-left region of the model describes two sources of trait distributions, both expressed as conditional probability tables (CPTs). Prior CPTs are derived from published anthropological datasets and literature, encoding population-specific distributions of cranial trait scores conditioned on sex. Training CPTs are derived from empirical datasets comprising a “True Sex” sample and a “Behavioural” sample, the latter reflecting observer-estimated sex labels. These CPTs provide complementary sources of information: prior anthropological knowledge and observed data from the training sample.
The model merges prior and empirical CPTs using Bayesian updating to produce posterior CPTs of the form P(score | sex, population). Depending on whether population information is known, the model either conditions on a specific population or marginalises over populations weighted by population priors. This allows the inference process to remain sensitive to ancestry effects while remaining robust when population is unknown.
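The exact updating rule is not given here. One common way to merge a prior table with observed counts is a Dirichlet-multinomial update, in which the prior CPT is converted to pseudo-counts and added to the empirical counts. The sketch below assumes that rule; the `prior_strength` parameter, the population labels, and all numbers are hypothetical. The marginalisation shown averages each table over populations, which is one possible reading of the description.

```python
def posterior_cpt(prior_probs, counts, prior_strength=10.0):
    """Dirichlet-multinomial update for one P(score | sex, population) column.

    prior_probs: prior probabilities for scores 1..5
    counts: observed training counts for scores 1..5
    prior_strength: assumed equivalent sample size of the prior (hypothetical)
    """
    pseudo = [prior_strength * p for p in prior_probs]
    post = [a + c for a, c in zip(pseudo, counts)]
    total = sum(post)
    return [x / total for x in post]

def marginalise(cpt_by_population, pop_priors=None):
    """Average P(score | sex, population) over populations.

    With pop_priors=None, uniform population priors are used.
    """
    pops = list(cpt_by_population)
    if pop_priors is None:
        pop_priors = {p: 1.0 / len(pops) for p in pops}
    n_scores = len(cpt_by_population[pops[0]])
    return [
        sum(pop_priors[p] * cpt_by_population[p][k] for p in pops)
        for k in range(n_scores)
    ]

# Hypothetical example: flat prior, ten training observations all scored 3
post_a = posterior_cpt([0.2] * 5, [0, 0, 10, 0, 0], prior_strength=10.0)
post_b = [0.2] * 5  # a second population with no training data
marginal = marginalise({"pop_A": post_a, "pop_B": post_b})
```

A larger `prior_strength` keeps the posterior closer to the published distributions; a smaller one lets the training sample dominate.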
The posterior CPTs serve as the probabilistic engine for computing the log-likelihood ratio: LLR = Σ_t W_norm(t) · log(P(score_t | Male) / P(score_t | Female)), where W_norm(t) are the Group-specific decision weights. Two forms are computed: a population-marginalised LLR, integrating over all possible populations, and a population-specific LLR when population is known. Positive LLR values indicate that the observed scores are more consistent with the male CPTs; negative values indicate female-consistent evidence.
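The weighted sum above can be written in a few lines. The sketch below uses the natural logarithm (the base is not stated in the text) and hypothetical CPT values and weights; scores are assumed to be integers from 1 to 5 that index into each table.

```python
import math

def weighted_llr(scores, weights, cpt_male, cpt_female):
    """Weighted log-likelihood ratio across traits.

    scores: trait -> integer score in 1..5
    weights: trait -> normalised decision weight
    cpt_male, cpt_female: trait -> list of P(score | sex) for scores 1..5
    """
    llr = 0.0
    for trait, score in scores.items():
        p_m = cpt_male[trait][score - 1]
        p_f = cpt_female[trait][score - 1]
        llr += weights[trait] * math.log(p_m / p_f)
    return llr

# Hypothetical tables shared by all five traits, with equal weights
traits = ["trait_1", "trait_2", "trait_3", "trait_4", "trait_5"]
cpt_m = {t: [0.1, 0.1, 0.2, 0.2, 0.4] for t in traits}
cpt_f = {t: [0.4, 0.2, 0.2, 0.1, 0.1] for t in traits}
w = {t: 0.2 for t in traits}

llr_high = weighted_llr({t: 5 for t in traits}, w, cpt_m, cpt_f)  # male-consistent
llr_low = weighted_llr({t: 1 for t in traits}, w, cpt_m, cpt_f)   # female-consistent
llr_mid = weighted_llr({t: 3 for t in traits}, w, cpt_m, cpt_f)   # uninformative
```

Posterior CPTs that include prior pseudo-counts never contain exact zeros, which keeps the logarithm finite; a table with zero entries would need smoothing before this step.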
In parallel with LLR, the model fits a logistic regression classifier to the empirical sample for each parent strategy. Trait values are z-scored (either per class or pooled), and the model yields a linear decision function: DFlogit = b_0 + Σ_t β_t z(x_t). This captures discriminative structure in the data independently of the CPT-based probabilistic assumptions.
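Evaluating the decision function for one cranium is a single weighted sum over z-scored traits. The sketch below assumes pooled z-scoring; the means, standard deviations, intercept, and coefficients are hypothetical stand-ins for values that would come from the fitted model.

```python
def df_logit(scores, means, sds, intercept, betas):
    """Linear decision function: intercept + sum of beta_t * z(x_t)."""
    total = intercept
    for trait, x in scores.items():
        z = (x - means[trait]) / sds[trait]  # pooled z-score of the trait value
        total += betas[trait] * z
    return total

# Hypothetical fitted parameters
traits = ["trait_1", "trait_2", "trait_3", "trait_4", "trait_5"]
means = {t: 3.0 for t in traits}
sds = {t: 1.0 for t in traits}
betas = {t: 0.5 for t in traits}
intercept = -0.2

df_average = df_logit({t: 3 for t in traits}, means, sds, intercept, betas)
df_robust = df_logit({t: 5 for t in traits}, means, sds, intercept, betas)
```

A cranium scored at the training mean on every trait returns the intercept alone, so the sign and size of the result reflect how far the scores depart from the average profile.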
Because LLR and DFlogit represent complementary sources of evidence — one generative and population-aware, the other discriminative and pattern-based — the system employs a meta-logistic model. This meta-model takes as input the z-scored LLR, the z-scored DFlogit, and the parent variable (True Sex / Given Sex), and outputs a unified meta-score and posterior probability of male. The meta-model learns how to optimally weight the two systems across both parent strategies.
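The fusion step can be illustrated as a logistic model with three inputs. In this sketch the coefficients are hypothetical, the inputs are assumed to be z-scored already, and the parent variable is coded as 1 for True Sex and 0 for Given Sex, which is an assumption about the encoding.

```python
import math

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def meta_posterior(z_llr, z_df, parent_is_true_sex, coefs):
    """Return (meta_score, P(male)) from the two evidence streams.

    coefs: dict with keys 'intercept', 'llr', 'df', 'parent' (hypothetical values).
    """
    parent = 1.0 if parent_is_true_sex else 0.0
    score = (
        coefs["intercept"]
        + coefs["llr"] * z_llr
        + coefs["df"] * z_df
        + coefs["parent"] * parent
    )
    return score, sigmoid(score)

# Hypothetical coefficients
coefs = {"intercept": 0.0, "llr": 1.2, "df": 0.8, "parent": 0.0}
score_neutral, p_neutral = meta_posterior(0.0, 0.0, True, coefs)
score_male, p_male = meta_posterior(1.0, 1.0, True, coefs)
```

The relative size of the `llr` and `df` coefficients expresses how much the fitted meta-model trusts the generative evidence compared with the discriminative evidence.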
Finally, using the posterior CPTs, the system computes P(population | sex, scores), yielding a population distribution consistent with the cranial traits under each sex class. This provides an additional layer of contextual information that can assist interpretation.
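By Bayes' rule, P(population | sex, scores) is proportional to P(population) multiplied by the likelihood of the scores under that population and sex. The sketch below assumes the traits are conditionally independent given sex and population, so the likelihood is a product over traits; that assumption, the population labels, and the numbers are illustrative only.

```python
def population_posterior(scores, cpts, sex, pop_priors):
    """Posterior over populations for a given sex class.

    scores: trait -> integer score in 1..5
    cpts: population -> sex -> trait -> list of P(score | sex, population)
    pop_priors: population -> prior probability
    """
    unnorm = {}
    for pop, prior in pop_priors.items():
        likelihood = prior
        for trait, score in scores.items():
            likelihood *= cpts[pop][sex][trait][score - 1]
        unnorm[pop] = likelihood
    total = sum(unnorm.values())
    return {pop: v / total for pop, v in unnorm.items()}

# Hypothetical single-trait example with uniform population priors
cpts = {
    "pop_A": {"male": {"trait_1": [0.1, 0.1, 0.2, 0.2, 0.4]}},
    "pop_B": {"male": {"trait_1": [0.3, 0.3, 0.2, 0.1, 0.1]}},
}
pop_post = population_posterior(
    {"trait_1": 5}, cpts, "male", {"pop_A": 0.5, "pop_B": 0.5}
)
```

With many traits the product of probabilities can become very small, in which case summing log-probabilities before normalising is the safer formulation.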
The overall model integrates anthropological priors, empirical trait distributions, observer background characteristics, eye-tracking–derived decision weights, Bayesian CPT updating, likelihood-ratio reasoning, logistic regression, and a meta-model fusion step. The decision-weight mechanism ensures that the model not only incorporates trait morphology but also reflects how observers actually distribute visual attention during sex assessment tasks.
If eye-tracking fixation data is available for this scoring session, enter it here to update the decision-weight model for the observer's group.