Please note – The CAIRSS blog has relocated to http://cairss.caul.edu.au/blog
A short time ago CAIRSS was approached by a Repository Manager from within the CAIRSS community to assist with exporting the contents of their repository to a spreadsheet. It was made apparent that accomplishing this task would greatly assist with Institutional Repository management tasks and most importantly ERA related work.
There are many ways that data can be extracted, moved and converted. The wisest choice is to use tools that are interoperable. An example of this would be choosing OAI-PMH to extract data rather than attempting to communicate with an individual Repositories data storage device or database etc.
Our CAIRSS Technical Officer Tim McCallum has completed a solution to address this task in the form of a Java Web Application. FoREveR – Flexible Repository Export Reporter.
After testing different methods of converting the data including XSLT and Python some research was done revealing some excellent JSON libraries written in Java. The final choice was Java given the fact that the JSON libraries could meet the requirements for this application and that OAI-PMH, The Fascinator and SOLR were all already written in Java.
The JSON data returned is the result of an HTTP request (can be set to fetch all by default). This data is converted to Java Maps and Java ArrayLists for further processing. The application loops through every record that has been returned and creates a Java Set (unique list/master list). This Set is then displayed in the users browser. This is a last minute chance to select or deselect metadata before the final report is written. It is sometimes the case that a metadata field containing a large amount of content is best left out, as this can make the spreadsheet unmanageable from an end users perspective.
Once approved the application creates an HTML file with all data saved to a table. The table includes table headings, table rows, table data cells and unordered lists for repeating information. This file can be opened in Microsoft Excel and Open Office spreadsheet applications or viewed in a browser.
As this software is in the very early stages of its life cycle reports can be created by CAIRSS and emailed out to you. Please contact CAIRSS Central if you think that your institution can benefit from the use of this tool.
The source code is available at http://cairss.caul.edu.au/trac/browser/code/FoREveR for your interest, however it has not been extensively tested. All feedback is welcome. CAIRSS will endeavor to improve and enhance the software to meet your needs.