Data Auditing

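This is a list of ArchivesSpace metadata elements that are subject to periodic auditing. Each entry gives the data point to check, the data quality principle it supports, the record type involved, and a query or query description where one exists.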

| data_point | principle | type | query |
| --- | --- | --- | --- |
| Containers not associated with an archival object | Completeness | Top containers | https://github.com/YaleArchivesSpace/yams_data_auditing/blob/master/containers/unassociated_containers.sql |
| Digital objects not associated with an archival object | Completeness | Digital objects | https://github.com/YaleArchivesSpace/yams_data_auditing/blob/master/containers/unassociated_digital_objects.sql |
| Agents not linked to a record | Completeness | Agent person, agent corporate entity, agent family | https://github.com/YaleArchivesSpace/yams_data_auditing/blob/master/agents_and_subjects/agent_orphans.sql |
| Subjects not linked to a record | Completeness | Subject | https://github.com/YaleArchivesSpace/yams_data_auditing/blob/master/agents_and_subjects/subject_orphans.sql |
| Containers without barcodes | Completeness | Top containers | Total number of containers, total number with and without barcodes |
| Containers without locations | Completeness | Top containers | |
| Duplicate agent and subject records | Consistency | Agent person, agent family, agent corporate entity, subject | https://github.com/yalemssa/chit_archives_scripts/blob/master/authority_reconciliation/match_names.py |
| Malformed URLs | Accuracy | Agent person, agent family, agent corporate entity, subject, notes, file versions, external documents | |
| Dead links | Accuracy | Notes, file versions, external documents | |
| Malformed internal links | Accuracy | Notes (odd) | |
| Unstructured dates - expression but no begin/end | Consistency | Dates | |
| Dates with same beginning and end | Accuracy, consistency | Dates | |
| Improper date type usage (i.e. single date for inclusive date) | Accuracy | Dates | |
| Malformed structured dates | Accuracy | Dates | |
| Structured date with no expression | Consistency | Dates | |
| Duplicate/erroneous controlled values | Accuracy, consistency | Enumeration values | |
| Unused controlled values | Accuracy, consistency | Enumeration values | |
| Only active users have permissions | Accuracy | Users | |
| Permission review for departed employees | Accuracy (security?) | Users | |
| Check admin users | Accuracy | Users | |
| Spam users | Accuracy | Users | |
| Accessions unpublished | Accuracy, consistency | Accessions | |
| Preservica digital objects unpublished | Accuracy, consistency | Digital objects | |
| Unauthorized note labels | Consistency | Notes | |
| Boxes without types | Completeness | Top containers | https://github.com/YaleArchivesSpace/yams_data_auditing/blob/master/containers/boxes_missing_types.sql |
| DACS compliance | Completeness | Resources | https://github.com/YaleArchivesSpace/yams_data_auditing/blob/master/dacs_minimum.sql |
| Correct field usage | Accuracy | TBD, see sheet for possible fields to check | |
| Non-dates in date fields | Accuracy | Dates, agent person, agent family, agent corporate entity | |
| Qualifier field usage - only uniform titles | Accuracy, consistency | Agent person, agent family, agent corporate entity | |
| Correct data type usage within fields | Accuracy | TBD | |
| Leading and trailing spaces, periods | Consistency | TBD | |
| Extra parentheses in container summaries | Consistency | Extents | |
| Thumbnail logic | Completeness, accuracy, consistency | Digital objects, file version | |
| BDAWG BD description guideline compliance | Completeness, consistency | Archival objects | |
| Agent and subject best practice compliance | Completeness, consistency | Agent person, agent family, agent corporate entity | |
| BornDigital restriction usage | Completeness, consistency | Archival objects, notes | |
| Subject term usage | Accuracy, consistency | Subjects | |
| Malformed accession numbers (MSSA only?) | Accuracy, consistency | Accessions | |
| Check format of imported LCNAF records | Accuracy, consistency | Agent person, agent family, agent corporate entity | |
| Expired restrictions | Accuracy | Notes | |
| Student names unpublished (MSSA) | Accuracy | Archival objects | |
| Access note formatting for “open” materials | Accuracy, consistency | Notes | |
| Repeated note types in a single record | Consistency | Notes | |
| Missing bib IDs | Completeness | Resource, user defined | |
| Voyager/AS consistency | Completeness, consistency | Resource, agent person, agent family, agent corporate entity, subjects | |
| Archival objects linked to Preservica DOs without events (MSSA) | Completeness | Archival objects, events | |
| AV/BD archival objects without appropriate notes (MSSA) | Completeness | Archival objects, notes | |
| Materials present in DL have thumbnail links | Completeness | Archival objects, digital objects | |
| Text restrictions without machine-actionable restriction | Completeness | Archival objects, resources, notes | |
| Note type usage | General reporting | Notes | |
| Acquisitions reporting - new accessions, part of existing collection or new? | General reporting | Accessions | |
| Counts of records and subrecords by repo, collection | General reporting | Various | |
| Counts of enumeration value usage | General reporting | Enumeration values | |
| Published/unpublished records by collection | General reporting | Resources, archival objects | |
| Unpublished agents linked to published records | General reporting | Agents, resources | |
| Counts of records created by year, month, day, record type, user, etc. | General reporting | Various | |
| Records with no dates | General reporting | Archival objects, resources | |
| Records with no language data | General reporting | Archival objects, resources | |
| Digitization reports - digitized materials, not digitized AV, etc. | General reporting | Archival objects, digital objects, resources | |
| AV/BD reports | General reporting | Archival objects | |
| Preferred Citations (DACS requirements) | General reporting | | |
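
Several of the date checks listed above (unstructured dates, identical begin and end, improper date type usage, malformed structured dates, structured dates with no expression) have no query yet. The following is a minimal Python sketch of how those checks might be prototyped against exported date subrecords before a query is written. It is not taken from the linked repositories. The function names `is_well_formed` and `audit_date` are hypothetical, and the dictionary keys `expression`, `begin`, `end`, and `date_type` are assumed to mirror the ArchivesSpace date subrecord. The accepted formats (YYYY, YYYY-MM, YYYY-MM-DD) are likewise an assumption to confirm against local practice.

```python
import re
from datetime import date

# Assumption: structured dates are stored as YYYY, YYYY-MM, or YYYY-MM-DD.
ISO_PATTERN = re.compile(r"^\d{4}(-\d{2}(-\d{2})?)?$")


def is_well_formed(value):
    """Return True if value matches an accepted pattern and names a real date."""
    if not ISO_PATTERN.match(value):
        return False
    parts = [int(p) for p in value.split("-")]
    # Pad a missing month or day with 1 so the calendar check still runs.
    year, month, day = (parts + [1, 1])[:3]
    try:
        date(year, month, day)
    except ValueError:
        return False
    return True


def audit_date(record):
    """Return a list of audit flags for one date subrecord (a plain dict)."""
    expression = (record.get("expression") or "").strip()
    begin = (record.get("begin") or "").strip()
    end = (record.get("end") or "").strip()
    date_type = record.get("date_type")

    flags = []
    if expression and not begin and not end:
        flags.append("unstructured: expression but no begin/end")
    if (begin or end) and not expression:
        flags.append("structured date with no expression")
    if begin and end and begin == end:
        flags.append("same begin and end")
    # A "single" date that carries a different end date probably
    # should have been typed as inclusive.
    if date_type == "single" and end and end != begin:
        flags.append("single date type with a differing end date")
    for field, value in (("begin", begin), ("end", end)):
        if value and not is_well_formed(value):
            flags.append(f"malformed {field}: {value!r}")
    return flags


# Illustrative records only; these are not real repository data.
samples = [
    {"expression": "circa 1920s", "date_type": "inclusive"},
    {"begin": "1950", "end": "1950", "date_type": "inclusive"},
    {"expression": "1950-13", "begin": "1950-13", "date_type": "single"},
]
for sample in samples:
    print(audit_date(sample))
```

Returning a list of flags rather than a single pass/fail value lets one record be counted under more than one data point, which matches how the list separates these checks.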