aspace_generate_sort_order
A lightweight Python application to generate a custom sort order from ArchivesSpace.
What does this application do?
The application performs the following tasks on each row of a CSV report containing information about all archival objects within a given collection:
Retrieves the JSON record for the object
Extracts the ancestor array from the JSON and reverses it so that the collection-level record is first, followed by each lower level in turn
Retrieves JSON for each ancestor, excluding the collection-level record, as resource records do not have position values
Extracts the position value from each ancestor and pads it with leading zeros to five digits. The largest collection in ArchivesSpace has ~88000 records linked to it, so even if that collection were totally flat, no position value would be longer than 5 digits
Concatenates the position values into a single string, separated by dots. The sort order will end up looking something like this: 00001.00027.00005.00435
Adds the resulting sort order in a new column for that row of the input CSV file
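The ordering and padding steps above can be sketched roughly as follows. This is an illustration rather than the application's actual code: the function names are hypothetical, and the shape of the ancestor data (an `ancestors` list of objects with a `ref` URI, where the resource URI contains `/resources/`) is an assumption about the ArchivesSpace JSON.

```python
def ordered_ancestor_refs(record):
    """Return ancestor URIs from the top of the hierarchy down.

    Assumes (not confirmed by this README) that the record has an
    "ancestors" list ordered from nearest parent to collection, and that
    each entry carries a "ref" URI. The collection-level resource is
    skipped because resource records have no position value.
    """
    refs = [ancestor["ref"] for ancestor in reversed(record.get("ancestors", []))]
    return [ref for ref in refs if "/resources/" not in ref]


def build_sort_order(positions, width=5):
    """Zero-pad each position to `width` digits and join them with dots."""
    return ".".join(str(position).zfill(width) for position in positions)


# Hypothetical record, shaped like the assumption described above.
sample_record = {
    "ancestors": [
        {"ref": "/repositories/2/archival_objects/20"},
        {"ref": "/repositories/2/archival_objects/10"},
        {"ref": "/repositories/2/resources/1"},
    ]
}

print(ordered_ancestor_refs(sample_record))
print(build_sort_order([1, 27, 5, 435]))
```

Because every segment has the same width, the resulting strings sort correctly with a plain text comparison.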
Requirements
Running the executable file requires a Mac. You do not need to install Python or any other dependencies. You can just double-click on the file to run the script.
Running the Python script requires Python 3.8+ and the requests and rich third-party libraries.
Tutorial
Generating the input file
This application takes the custom All Archival Objects report (CSV format) as input. This report is generated within the ArchivesSpace staff interface. To run this report:
Click on the gear icon next to the repository name in the staff interface
Select Reports from the drop-down
Click on the All Archival Objects report
Enter the call number of the collection in the Call Number box
Select CSV from the Format drop-down menu
Click Start Job to start the report job
When the report finishes, click the Refresh Page button
Click the Download Report link to download the report
Configuration settings
The application comes with a config.json file, which allows the user to specify ArchivesSpace login information and the path to the input CSV file.
Sample config.json formatting:
{
"input_csv": "full/path/to/input_csv.csv",
"aspace_api_url": "https://archivesspace.library.yale.edu/api",
"aspace_username": "yourusername",
"aspace_password": "yourpassword"
}
If the configuration file is not completed, the application will prompt the user for each of these inputs.
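One way the fall-back-to-prompts behavior could work is sketched below. The function names and prompt wording are hypothetical; only the four configuration keys come from the sample file above.

```python
import json
from getpass import getpass
from pathlib import Path

# The four keys shown in the sample config.json.
REQUIRED_KEYS = ("input_csv", "aspace_api_url", "aspace_username", "aspace_password")


def load_config(path):
    """Read config.json; return an empty dict if it is missing or not valid JSON."""
    try:
        return json.loads(Path(path).read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return {}


def fill_missing(config, ask=input, ask_secret=getpass):
    """Prompt for any key that is absent or blank; leave completed values alone."""
    completed = dict(config)
    for key in REQUIRED_KEYS:
        if not completed.get(key):
            prompt = f"Please enter {key}: "
            # Use a hidden prompt for the password so it is not echoed.
            if key == "aspace_password":
                completed[key] = ask_secret(prompt)
            else:
                completed[key] = ask(prompt)
    return completed
```

Calling `fill_missing(load_config("config.json"))` would then return a complete set of settings, prompting only for the values that were left out.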
Running the application
Double-click on the executable file to run the script. If the configuration file is complete, the script will begin immediately. If not, the user will be prompted to enter the input file path and ArchivesSpace login data.
Depending on the number of archival objects associated with the collection, the script could take a while to run (test runs completed approximately 675 records per minute). A progress bar will appear which includes the number of records processed and the estimated time remaining.
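For a rough idea of run time before starting, the test-run rate quoted above can be turned into a quick estimate. This is a back-of-the-envelope sketch with a hypothetical function name; actual throughput will depend on network and server conditions.

```python
def estimate_minutes(record_count, records_per_minute=675):
    """Estimate run time in minutes from the approximate rate seen in test runs."""
    return record_count / records_per_minute


# A collection of about 88000 records would take a little over two hours.
print(round(estimate_minutes(88000)))
```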
Output
The application outputs a CSV file and stores it in the same directory as the input file. The filename is the same as the input file's, with _output appended before the extension, e.g. full/path/to/input_csv_output.csv
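The naming rule can be expressed in a few lines with pathlib. This is a minimal sketch and the function name is hypothetical:

```python
from pathlib import Path


def output_path_for(input_csv):
    """Return the input path with _output inserted before the file extension."""
    source = Path(input_csv)
    return source.with_name(f"{source.stem}_output{source.suffix}")


print(output_path_for("full/path/to/input_csv.csv").as_posix())
```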
The output CSV file includes a new first column, sort_order, which stores the sort order that is generated during the process. The values in this column can be sorted in spreadsheet software, a script, or another application as needed.
NOTE: If sorting in spreadsheet software, it is important to open the file so that the sort_order column is treated as plain text. Many spreadsheet applications default to General format, which can drop the leading zeros from the sort order values. If that happens, the values will no longer sort correctly.
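When sorting with a script instead of a spreadsheet, the same concern is avoided by keeping the values as strings. The sketch below uses Python's standard csv module, which reads every field as text, so the leading zeros are preserved. The function name is hypothetical and the file paths are placeholders.

```python
import csv


def sort_rows_by_sort_order(csv_path, sorted_path):
    """Sort the output CSV by its sort_order column and write the result."""
    with open(csv_path, newline="", encoding="utf-8") as source:
        reader = csv.DictReader(source)
        fieldnames = reader.fieldnames
        # Values stay as strings; the fixed-width, zero-padded segments
        # mean that plain string comparison gives the intended order.
        rows = sorted(reader, key=lambda row: row["sort_order"])

    with open(sorted_path, "w", newline="", encoding="utf-8") as target:
        writer = csv.DictWriter(target, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

A parent value such as 00001.00001 sorts ahead of its children (00001.00001.00003) because a shorter string that is a prefix of a longer one compares as smaller.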
If using Excel, follow these steps to ensure that the output file is opened in plain text:
Open a blank workbook in Excel
Select Data > Get External Data > Import Text File...
In Step 1 of the Text Wizard, select the Delimited radio button and press Next >
In Step 2, check the Comma box in the Delimiters menu
In Step 3, click on the first column to highlight it
Select the Text radio button in the Column data format menu
Click Finish
Select the Existing sheet radio button from the next menu and click OK
The spreadsheet will populate with the formatted data. To sort, click the Data > Sort button.
In the sort menu, check the My list has headers button and select sort_order from the Column drop-down. Click OK.
You may receive a Sort Warning asking whether you should "Sort anything that looks like a number, as a number" or "Sort numbers and numbers stored as text separately". Select "Sort numbers and numbers stored as text separately" and click OK