Geotechnical data QA/QC: how to reduce risk
QA/QC as risk governance, not as an administrative step
Geotechnical data QA/QC is often treated as an administrative step, a set of consistency checks to “clean spreadsheets” before feeding models. This framing is insufficient and, in complex operations, dangerous. Geotechnical data quality should be understood as a safety barrier and an explicit component of risk governance: it underpins decisions on stability, drainage, operational triggers, intervention prioritization, and ultimately, the protection of people. When an organization fails to control data quality, it loses the ability to distinguish real variability of the soil mass from measurement noise, transforms uncertainty into false precision, and creates a chain of decisions that may be technically indefensible in the face of audits, incidents, or regulatory challenges.
Where the real risk arises: data production and use
The origin of risk is rarely in the software or in a lack of analytical capacity. Generally, risk arises in the data or, more precisely, in how it is produced, contextualized, tracked, and authorized for use in critical decisions. Samples collected without a consistent chain of custody, test results accepted without acceptance criteria, incomplete metadata, database integration without versioning, instrumentation treated as a "number" and not as a physical system: each of these flaws creates quality leaks that, when combined, erode the reliability of engineering. This is why QA/QC, when well structured, is not a checklist. It is a layered architecture that combines quality assurance, quality control, traceability, change governance, and targeted auditing, with clear responsibilities and real power to block inappropriate data for critical uses.
Define criticality: data exists to protect decisions
The starting point is not defining generic validation rules. The starting point is making explicit what the data needs to protect: decisions. The company must establish data quality objectives aligned with use cases that vary in criticality and error tolerance. The same type of data may be adequate for trend analysis and inadequate for defining design parameters; it may be sufficient for initial characterization and insufficient to support an operational trigger. This distinction needs to be formalized in a data criticality model. A pragmatic scheme is to classify data into levels that reflect the risk associated with its use: a level for critical decisions and operational triggers, a level for design analysis and model review, a level for characterization and trend analysis, and an observational or informational level. In operational terms, this avoids the recurring error of promoting low-quality data to high-impact decisions for convenience, and forces the organization to create a formal promotion mechanism: for data to migrate to more critical levels, it must undergo additional reviews, expanded evidence, and independent validation.
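A criticality model of this kind can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed implementation: the level names, the `REQUIRED_EVIDENCE` table, and the `DataRecord`, `can_use`, and `promote` helpers are all hypothetical, and the evidence required at each level is an assumption that each organization would define for itself.

```python
from dataclasses import dataclass, field
from enum import IntEnum


class Criticality(IntEnum):
    """Hypothetical four-level scheme; a higher value means a higher-risk use."""
    INFORMATIONAL = 1      # observational or informational
    CHARACTERIZATION = 2   # characterization and trend analysis
    DESIGN = 3             # design (project) analysis and model review
    CRITICAL = 4           # critical decisions and operational triggers


# Assumed evidence needed to hold each level (illustrative names only).
REQUIRED_EVIDENCE = {
    Criticality.INFORMATIONAL: set(),
    Criticality.CHARACTERIZATION: {"metadata_complete"},
    Criticality.DESIGN: {"metadata_complete", "technical_review"},
    Criticality.CRITICAL: {"metadata_complete", "technical_review",
                           "independent_validation"},
}


@dataclass
class DataRecord:
    record_id: str
    level: Criticality = Criticality.INFORMATIONAL
    evidence: set = field(default_factory=set)


def can_use(record: DataRecord, use_level: Criticality) -> bool:
    """Data may only support uses at or below its approved level."""
    return record.level >= use_level


def promote(record: DataRecord, target: Criticality) -> DataRecord:
    """Raise a record's level only if the required evidence is present."""
    missing = REQUIRED_EVIDENCE[target] - record.evidence
    if missing:
        raise PermissionError(f"{record.record_id}: missing {sorted(missing)}")
    record.level = max(record.level, target)
    return record
```

The point of the design is that promotion is the only path to a higher level, and it fails loudly when evidence is missing, so convenience cannot silently override the classification.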
Overly consistent or "too good" results also warrant caution. In naturally heterogeneous materials, low dispersion may indicate sampling, preparation, or method problems, and not technical excellence. Mature QA/QC questions both noise and artificial perfection.
Minimal governance: roles, authority, and blocking power
Criticality only works if there is governance. This means defining roles and, above all, defining authority. A design that is sustainable over time separates data production, curation, and use in decision-making, connecting these links through traceability and validation rules. Production involves fieldwork, laboratory work, topography, and instrumentation. Curation organizes standards, metadata, automatic validations, databases, and versioning. Use in decision-making involves engineering and operation, with explicit responsibility for declaring quality and uncertainty when using evidence. This arrangement needs to include a QA/QC leadership function with formal authority to block data intended for critical decisions and design parameters when minimum criteria are not met. Without this blocking power, the process inevitably degrades under deadline pressure, and QA/QC becomes a symbolic ritual.
Layered architecture for QA/QC of critical data
Layer 1 – Planning and technical specification
With criticality and roles defined, implementation should operate through a simple and rigorous flow, from planning to auditing. The first layer is planning and technical specification before the data exists. Here, the organization defines the objective of the data, the method appropriate to the material, the frequency, the need for replicates, the tolerances, and the acceptance criteria. Stopping points are also defined, at which collection or processing does not proceed without verification. The logic is prevention: reducing the probability that the data will be inadequate from the start. In sensitive materials such as tailings and fines, this step is crucial; inadequate testing or preparation methods generate results that are coherent "on paper" but disconnected from field behavior, and the organization ends up calibrating decisions on parameters that do not represent operational reality.
Operational triggers based on low-quality data are false triggers. They increase the likelihood of false negatives, normalize real deviations, and create a sense of control that does not withstand field testing.
Layer 2 – Execution with traceability in collection
The second layer is execution with traceability at the time of collection. Here, the focus is on chain of custody, unambiguous identification, georeferencing, standardized registration, and context capture. Data without minimum metadata should be treated as orphan data and should not enter the official database. This is a tipping point: mature organizations do not try to "fix later" what was not recorded when it was possible. They block, quarantine, and trigger correction at the source. The same reasoning applies to instrumentation: installation, reference readings, calibration records, and integrity inspections are not attachments; they are part of the data.
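The orphan-data rule can be expressed as a gate at the entrance of the official database. The sketch below assumes a hypothetical minimum metadata list (`REQUIRED_METADATA`) and represents the database and the quarantine as plain lists; the field names are illustrative and would follow the site's own standard.

```python
REQUIRED_METADATA = (  # assumed minimum; adapt to the site's own standard
    "sample_id", "collected_at", "collected_by",
    "easting", "northing", "elevation", "custody_log",
)


def missing_metadata(record: dict) -> list:
    """Return the required fields that are absent or empty."""
    return [k for k in REQUIRED_METADATA
            if record.get(k) in (None, "", [])]


def admit(record: dict, database: list, quarantine: list) -> bool:
    """Admit complete records; quarantine orphans with the reason attached."""
    missing = missing_metadata(record)
    if missing:
        quarantine.append({"record": record, "missing": missing})
        return False
    database.append(record)
    return True
```

Recording which fields are missing alongside the quarantined record is what allows the correction to be triggered at the source rather than patched downstream.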
Layer 3 – Automatic validation and quarantine
The third layer is automatic validation and quarantine. Upon receiving data, the system should apply objective quality checks: unit consistency, plausibility range, duplicates, field completeness, temporal coherence, coordinates and depths, file integrity, and standardization. This should generate flags, not discussions. Data with critical flaws does not proceed through the flow; it remains quarantined until correction, retesting, or disposal. This layer reduces the human workload and prevents trivial errors from contaminating sophisticated analyses. It also prepares the ground for what really matters: technical validation.
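As an illustration of "flags, not discussions", the sketch below applies a handful of the checks listed above to a batch of readings. The record layout, the `PLAUSIBLE` ranges, and the flag names are assumptions made for the example; it also simplifies by treating any flag as a reason for quarantine and by assuming the batch comes from a single source in time order.

```python
from datetime import datetime

# Assumed plausibility ranges per (parameter, unit); values are illustrative.
PLAUSIBLE = {("pore_pressure", "kPa"): (-100.0, 2000.0),
             ("water_content", "%"): (0.0, 300.0)}
REQUIRED = ("id", "parameter", "unit", "value", "depth_m", "timestamp")


def validate(batch):
    """Return one list of flags per record, in batch order."""
    seen, last_time, results = set(), None, []
    for rec in batch:
        flags = []
        absent = [k for k in REQUIRED if rec.get(k) is None]
        if absent:  # completeness first: other checks need these fields
            results.append(["incomplete:" + ",".join(absent)])
            continue
        key = (rec["id"], rec["timestamp"])
        if key in seen:
            flags.append("duplicate")
        seen.add(key)
        limits = PLAUSIBLE.get((rec["parameter"], rec["unit"]))
        if limits is None:
            flags.append("unknown_parameter_or_unit")
        elif not limits[0] <= rec["value"] <= limits[1]:
            flags.append("out_of_range")
        if rec["depth_m"] < 0:
            flags.append("negative_depth")
        t = datetime.fromisoformat(rec["timestamp"])
        if last_time is not None and t < last_time:
            flags.append("timestamp_out_of_order")
        last_time = t if last_time is None else max(last_time, t)
        results.append(flags)
    return results


def route(batch):
    """Split a batch into accepted records and quarantined (record, flags) pairs."""
    accepted, quarantined = [], []
    for rec, flags in zip(batch, validate(batch)):
        if flags:
            quarantined.append((rec, flags))
        else:
            accepted.append(rec)
    return accepted, quarantined
```

A real system would distinguish critical flags from warnings; the structure stays the same, with the routing rule reading a severity attached to each flag.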
Layer 4 – Technical review and criticality classification
The fourth layer is a technical review by an engineer, guided by physical consistency, geological consistency, and the suitability of the method to the material. This review should test whether the result "makes sense" within the domain and historical context, and whether the dispersion is compatible with the expected variability. One of the most common blind spots is accepting "too good" results as a sign of excellence, when in practice they may indicate problems with preparation, procedure, or method bias. The technical review should also decide when to apply replicates and cross-checks, especially on data that will support parameters or triggers. Finally, the data is classified at the appropriate level and, if intended for critical levels, proceeds to formal approval.
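One simple screening aid for the "too good" blind spot is to compare the dispersion of replicate results with a band expected for the material. The sketch below uses the coefficient of variation; the function name `dispersion_flag` and the default band are assumptions for illustration only, and the actual band must come from site experience and the test method. It supports the engineer's review and does not replace it.

```python
from statistics import mean, stdev


def dispersion_flag(values, expected_cov=(0.10, 0.40)):
    """Compare the sample coefficient of variation with an expected band.

    expected_cov is an assumed (low, high) band for the material and test;
    the defaults here are placeholders, not recommended values.
    """
    if len(values) < 3:
        return "insufficient_replicates"
    m = mean(values)
    if m == 0:
        return "undefined_cov"
    cov = stdev(values) / abs(m)
    if cov < expected_cov[0]:
        return "suspiciously_uniform"   # possible preparation or method bias
    if cov > expected_cov[1]:
        return "excessive_scatter"      # possible noise or mixed domains
    return "ok"
```

Both tails produce a flag, which mirrors the idea that noise and artificial perfection deserve the same scrutiny.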
The minimum QA/QC workflow for critical data is simple and uncompromising: production as specified, automatic validation, quarantine when necessary, technical review, formal approval, and use restricted to the authorized criticality level. Any shortcut in this workflow increases systemic risk.
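The workflow can be read as a small state machine in which shortcuts are simply not valid transitions. The stage names and the `ALLOWED` table below are one hypothetical encoding of the sequence described, including the assumption that quarantined data must be resubmitted and revalidated rather than moved forward directly.

```python
from enum import Enum


class Stage(Enum):
    PRODUCED = "produced"
    VALIDATED = "validated"
    QUARANTINED = "quarantined"
    REVIEWED = "reviewed"
    APPROVED = "approved"
    DISCARDED = "discarded"


# Assumed transition table: anything not listed is a shortcut and is refused.
ALLOWED = {
    Stage.PRODUCED: {Stage.VALIDATED, Stage.QUARANTINED},
    Stage.QUARANTINED: {Stage.PRODUCED, Stage.DISCARDED},  # correct/retest or drop
    Stage.VALIDATED: {Stage.REVIEWED},
    Stage.REVIEWED: {Stage.APPROVED, Stage.QUARANTINED},
    Stage.APPROVED: set(),
    Stage.DISCARDED: set(),
}


def advance(current: Stage, target: Stage) -> Stage:
    """Return the new stage, or raise if the transition skips a step."""
    if target not in ALLOWED[current]:
        raise ValueError(f"shortcut refused: {current.value} -> {target.value}")
    return target
```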
Layer 5 – Single source of truth, versioning, and auditing
The fifth layer is publication in a single source of truth, with versioning and an audit trail. The organization needs to eliminate the environment of multiple live spreadsheets, parameters that change without recording, and uncontrolled "updated" topographic surfaces. For data, parameters, and models, there must be a minimum change control mechanism. Whenever a parameter changes, the system records what changed, why it changed, what evidence supported the change, who approved it, and what the impact on the decision and risk was. This discipline is what makes engineering reproducible and defensible, and reduces organizational exposure to rework, internal disagreements, and external disputes.
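A minimal change-control mechanism can be sketched as an append-only register that refuses any parameter change lacking the fields listed above. The class and field names (`ParameterRegister`, `ParameterChange`, `evidence_ref`, and so on) are hypothetical; a production system would persist the log in a database and tie approvals to authenticated users.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class ParameterChange:
    parameter: str
    old_value: float
    new_value: float
    reason: str          # why it changed
    evidence_ref: str    # dataset or report supporting the change
    approved_by: str     # who approved it
    impact: str          # effect on the decision and risk
    version: int
    recorded_at: str


class ParameterRegister:
    """Append-only register: values change only through record_change()."""

    def __init__(self, initial: dict):
        self._values = dict(initial)
        self._log = []

    def record_change(self, parameter, new_value, *, reason, evidence_ref,
                      approved_by, impact):
        for name, text in (("reason", reason), ("evidence_ref", evidence_ref),
                           ("approved_by", approved_by), ("impact", impact)):
            if not text.strip():
                raise ValueError(f"change rejected: '{name}' is empty")
        entry = ParameterChange(
            parameter, self._values[parameter], new_value, reason,
            evidence_ref, approved_by, impact, len(self._log) + 1,
            datetime.now(timezone.utc).isoformat())
        self._log.append(entry)
        self._values[parameter] = new_value
        return entry

    def value(self, parameter):
        return self._values[parameter]

    def history(self, parameter):
        return [e for e in self._log if e.parameter == parameter]
```

Because entries are frozen and only appended, the history of a parameter can be replayed at any time to reconstruct what the value was when a given decision was made.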
Instrumentation as a physical system, not as a number
Instrumentation requires specific treatment because its risk lies not only in incorrect values but also in incorrect interpretations. A robust QA/QC program for monitoring is not limited to range checks; it incorporates drift detection, flatline identification, rate-of-change limits, redundancy validation, and, most importantly, triangulation with operational and hydrological context. Piezometric readings need to be interpreted in relation to rainfall and operations, displacements need to be correlated with blasting and active work fronts, and drain flows need to be correlated with water levels and the cleaning condition of the drains. Furthermore, there is an operational rule that differentiates maturity from improvisation: data anomalies are not "closed" without field verification when the signal has the potential to impact critical decisions. The goal is to reduce false alarms without increasing false negatives; false negatives are a structural risk because they create a sense of normalcy in a system that is degrading.
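Three of the checks named above (flatline, rate of change, and drift against a redundant sensor) can be sketched as small functions over a series of readings. These are deliberately simple illustrations: the function names, window sizes, and thresholds are assumptions, and none of them substitutes for the contextual interpretation and field verification described in the text.

```python
from statistics import mean


def flatline(readings, window=6, tol=1e-6):
    """True if the last `window` readings vary by no more than `tol`."""
    tail = readings[-window:]
    return len(tail) == window and max(tail) - min(tail) <= tol


def rate_exceedances(times_h, readings, max_rate):
    """Indices where |change per hour| since the previous reading exceeds max_rate."""
    out = []
    for i in range(1, len(readings)):
        dt = times_h[i] - times_h[i - 1]
        if dt > 0 and abs(readings[i] - readings[i - 1]) / dt > max_rate:
            out.append(i)
    return out


def drift_vs_redundant(primary, redundant, window=5, max_shift=0.5):
    """True if the mean offset between paired sensors shifts by more than
    `max_shift` between the first and last `window` readings."""
    diff = [p - r for p, r in zip(primary, redundant)]
    if len(diff) < 2 * window:
        return False  # not enough history to judge
    return abs(mean(diff[-window:]) - mean(diff[:window])) > max_shift
```

Each function only raises a candidate anomaly; whether it reflects a sensor fault or a real change in the structure is decided by comparison with rainfall, operations, and inspection in the field.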
Measure, govern, and audit to sustain the system
For the system to be sustainable, it is necessary to measure and govern. Quality indicators are not vanity metrics; they are risk management mechanisms. A minimum dashboard should monitor metadata completeness, percentage of data in quarantine, validation time, rework rate, number of parameter changes and their reasons, percentage of instruments with up-to-date inspections, and number of anomalies left open beyond a defined deadline. These metrics feed into short and disciplined routines: weekly anomaly and quarantine screening, monthly review of quality and parameter changes, and quarterly targeted sampling audits, in which some critical items are traced back to their origin to test reproducibility and adherence to the procedure. Without this cycle, QA/QC becomes a set of good intentions that does not learn from its own deviations.
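Several of these indicators reduce to simple ratios and counts once the underlying records are structured. The sketch below computes four of them; the input shapes, key names, and the 14-day default deadline are assumptions made for the example.

```python
from datetime import date


def quality_indicators(records, instruments, anomalies, today, deadline_days=14):
    """Compute a few dashboard indicators as fractions or counts.

    Assumed input shapes (hypothetical):
      records:     dicts with 'metadata_complete' (bool) and 'status' (str)
      instruments: dicts with 'inspection_due' (date)
      anomalies:   dicts with 'opened' (date) and 'closed' (date or None)
    """
    def share(flags):
        flags = list(flags)
        return sum(flags) / len(flags) if flags else None

    overdue = [a for a in anomalies
               if a["closed"] is None
               and (today - a["opened"]).days > deadline_days]
    return {
        "metadata_completeness": share(r["metadata_complete"] for r in records),
        "quarantine_share": share(r["status"] == "quarantined" for r in records),
        "inspections_up_to_date": share(i["inspection_due"] >= today
                                        for i in instruments),
        "anomalies_overdue": len(overdue),
    }
```

Returning `None` for an empty input, instead of zero, keeps "no data" visibly distinct from "no problems" on the dashboard.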
Auditing as a mechanism for institutional learning
In the context of QA/QC, auditing is not an occasional formal event. It is a method of institutional learning. By periodically tracing critical data back to its source, the organization identifies where quality leaks are recurring and corrects the system, not just the case. This is the step that transforms QA/QC from reactive control to capacity building. The practical result is a reduction in operational surprises, a decrease in rework, an improvement in the quality of technical debate in committees, and an increase in the reliability of the triggers and models used in managing the rainy season and the stability of structures.
Maturity test: decision auditability
The most honest way to assess maturity is to test the auditability of a decision. Faced with a critical decision, can the team, in a few hours, reconstruct the chain of evidence with an identified dataset, controlled version, complete metadata, recorded validations, explicit uncertainty, and traceable approval? If the answer is no, the real risk is greater than it seems, because the organization is operating with weak evidence, even if it possesses a large volume of data. Volume without governance is just complexity, and complexity without traceability becomes vulnerability.
Pragmatic implementation and integrity under pressure
When implemented with discipline, QA/QC reduces risk through direct mechanisms: it anticipates relevant signals by differentiating noise from real change; it strengthens the reliability of parameters and models by tying them to traceable evidence; and it creates institutional learning by preventing the company from repeating the same mistakes with each new campaign. This does not eliminate uncertainty, but it prevents uncertainty from becoming a surprise. And, in operational geotechnics, this difference is often what separates mature risk management from a sequence of late reactions.
A pragmatic implementation plan can be executed in short cycles. In the first few weeks, the organization defines criticality levels, roles, and blocking authority, standardizes minimum metadata, establishes a single source of truth, and applies basic automated validations with quarantine. Following this, it institutes technical review and formal approval for critical data and parameters, creates a change log for parameters and models, and puts a minimum quality dashboard into operation. Finally, it runs the first targeted sampling audit and systematically corrects the largest identified leaks. This path is simple. The difficult part is maintaining integrity under pressure. This is precisely why QA/QC should be treated as risk governance, with rules, evidence, and consequences, and not as an optional data organization step.
Authors:
John Paul dos Santos
Bachelor in Mining Engineering (UFMG), Master in Civil Engineering and Management (University of Glasgow), Specialist in Geotechnical Engineering and Project Management.
Mining Engineer specializing in geotechnics and project management, internationally recognized for work on dams and geotechnical structures applied to mining.
Matheus Vicentini
Civil Engineer (Unilavras), Specialist in Geotechnical Engineering (PUC Minas).
Civil Engineer with experience in geotechnics applied to mining, including design projects, audits, and dam decommissioning works.