RAM FAQ for DPC Members
On this page we provide answers to the questions that DPC Members have asked us about using DPC RAM. If you have any other questions you think we should add to this list, please let us know.
What support can I get for completing a RAM self-assessment?
All DPC members are eligible for advice and support on their annual RAM assessment from the DPC. We can talk through your assessment with you, discuss your target levels and answer any questions you might have. Do contact us if you would like to arrange some immediate support, or join one of our member-only RAM events (e.g. RAM Jam or RAM-bulance surgery sessions).
When should I complete and share a RAM assessment?
The DPC encourages Members to complete and share a RAM assessment on an annual basis. You will receive communications about this in April each year so that the DPC can collate assessments in early June. If you would prefer to carry out an annual RAM assessment to your own internal timetable that is absolutely fine too. Do feel free to use RAM whenever suits you and share it with us at any time of year - we will always be happy to hear from you.
How do I share a RAM assessment with the DPC?
We would like to make the process of sharing your assessment with us as simple as possible. Simply email Jenny Mitcham (jenny.mitcham@dpconline.org) a copy of your RAM worksheet and we will take it from there.
Is it worth sharing our RAM results if nothing has changed?
Yes, please do let us know if nothing has changed when you carry out your RAM assessment. This is still useful information for us and we can roll over your results from last year to include in our analysis.
What are the benefits of carrying out a RAM assessment every year?
Like any maturity model or assessment framework, DPC RAM will have most impact if carried out on a regular basis to check in on progress, refine goals and inform forward plans. Even if targets set within RAM focus on a longer time period, checking in on where you are every year can be helpful and shouldn’t be too onerous a task to complete.
What are the benefits of sharing our RAM assessment with the DPC?
Sharing your RAM assessment with the DPC has benefits both for your organization and for the DPC community as a whole.
- For you: If DPC staff have access to your RAM assessment it will enable better and more appropriate support to be provided when you contact us for advice and support - it gives us a quick and easy way to understand your current digital preservation capabilities and a good overview of the challenges you are facing.
- For the DPC community: Combining the RAM assessments of all members provides useful summary statistics which can be used not only for benchmarking by the community but for tailoring DPC activities going forward to target those areas where more support is needed.
How will the DPC use our RAM assessment?
The DPC will be able to use your RAM self-assessment to better understand your organization and its digital preservation practices, along with your current challenges and areas where support might be most needed. We can use this baseline of information to inform our interactions with you and provide better advice and support. Each year we also collate RAM information that is shared with us to get a broad understanding of where the membership sits as a whole. This information is used on an annual basis to inform our work planning and build our prospectus for the year ahead.
What benchmarking information can I have access to?
Summary information from RAM self-assessments can be found on the benchmarking with RAM page. Note that access to this information is a member benefit, so you will need to log into the DPC website in order to view the page. Please do not share this information outside of the DPC membership.
DPC Members are also entitled to request more specific benchmarking information if they would like to do so. Perhaps you are interested in benchmarking against summary data that is specific to a particular geographic location (for example the UK) or sector (for example higher education). Do contact us with your requirements and we will see if we can help. Note that our ability to service requests such as these is dependent on the availability of an adequate dataset and is typically only available to those members who have submitted information themselves. In order to protect the identity of specific organizations we will not distribute benchmarking data unless a large enough sample of RAM responses is available.
We don’t want others to see our assessment - will it remain confidential?
The DPC are committed to ensuring that individual Member self-assessments are not made available to others and that benchmarking information is made available in summary form only and fully anonymised. When we share benchmarking information, we ensure that the individual organizations cannot be identified within the results.
We are very happy for you to share your own results and compare them with peers and we know many members find this to be a useful process. Though we may set up opportunities to share experiences between members we would never share your RAM results with another organization without your explicit permission.
Will you ever share RAM information outside of the DPC membership?
Access to benchmarking information is only available to members, however, broad observations on RAM results (for example, “this section is typically one of the highest scoring among DPC Members”) may be shared with the wider digital preservation community in the form of blogs or conference papers. More detailed statistics or information about which organizations have engaged with the process will not be shared.
What else can I do to help the community move forward with DPC RAM?
We are always keen to read blogs, articles or conference papers that describe member experiences with DPC RAM. It is particularly helpful to read how an organization has made progress towards their target levels and what tools, techniques and resources helped them do so. If you have a story to tell and would like to write a blog for us we’d love to hear from you.
Do you have advice to help us move towards our RAM targets?
We have produced a Level up with DPC RAM resource that provides tips, resources and case studies intended to provide inspiration and help to move forward with RAM.
Every year (typically November/December), the DPC holds a members-only 'RAM Jam' event which is a forum for DPC Members to share their experiences of using and moving forward with RAM. This can be a great way of picking up tips and ideas from others within the community. Keep an eye out for this opportunity on our events programme.
Another opportunity to discuss DPC RAM with the DPC is our RAM-bulance surgery sessions which occur in April and May every year. Members can book a drop-in session and talk with DPC staff in confidence about any aspect of their RAM self-assessment.
The DPC-DISCUSSION mailing list is a useful forum for asking questions of the whole DPC community. Do use this channel as appropriate.
Further resources and case studies
Case studies
Here are some examples of how DPC RAM has been used by members of the community to help track their progress in digital preservation. If you have an example of DPC RAM in action that you would like to share, please contact us:
- Assessing where we are with digital preservation (2021) - a blog post from Fabiana Barticioti, Digital Assets Manager at LSE Library.
- From 'starting digital preservation' to 'business as usual' (2021) - a blog post from Anna McNally, Senior Archivist at the University of Westminster.
- The Postal Museum’s Case Study of the DPC Rapid Assessment Model (2021) - a blog post from Helen Dafter, Archivist at the Postal Museum.
- 5 tips to rock the RAM (2021) - a blog post from Kim Harsley, Archivist at the NatWest Group.
- Unprecedented times (2021) - a blog post from Hania Smerecka, Archivist at Lloyds Banking Group.
- Are we winning? Other measurables for digital preservation (2021) - a paper presented at the 2021 iPRES conference by Tim Evans of the Archaeology Data Service.
- DPC RAM: Levelling up (2022) - a blog post from Silvia Gallotti, Archivist at the LSE Library.
Further reading
Other articles and papers about DPC RAM are listed below.
- Going for Gold or Good Enough? Observations on three years of benchmarking with DPC RAM (2022) - an article by Jenny Mitcham and Paul Wheatley published in the proceedings of the iPRES conference in 2022.
How was DPC RAM developed?
The model is primarily based on Adrian Brown's Digital Preservation Maturity Model (published in Practical Digital Preservation: a how-to guide for organizations of any size, 2013).
DPC RAM was developed with the following guiding principles in mind. It aimed to be:
- Applicable for organizations of any size and in any sector
- Applicable for all content of long-term value
- Preservation strategy and solution agnostic
- Based on existing good practice
- Simple to understand and quick to apply
The first version of DPC RAM was developed, tested and refined with input from DPC Members and Supporters including those who make up our Research and Practice Sub-Committee. Particular thanks go to Adrian Brown for his support throughout the process. Work on the DPC RAM was carried out in conjunction with the Nuclear Decommissioning Authority as part of a two year collaborative digital preservation project ‘Reliable, Robust and Resilient Digital Infrastructure for Nuclear Decommissioning‘.
The first version of DPC RAM was launched at the iPRES conference in Amsterdam in September 2019 in the Lightning Talks session.
Version 2 of DPC RAM was released in March 2021. Revisions to the model were made in response to community feedback and evolving good practice. Particular thanks go to Hervé L'Hours and Simon Wilson for their detailed feedback and the DPC's Research and Practice Sub-Committee and Adrian Brown for reviewing the proposed changes. A summary of some of the changes made can be found in the following blog post: DPC RAM (version 2) - what has changed and why?
Preserving records from an EDRMS: a case study
Hugh Campbell, Public Record Office of Northern Ireland (PRONI)
The Northern Ireland Civil Service (NICS) selected TRIM as the software platform for its corporate Electronic Document and Records Management (EDRM) system following a procurement exercise in the early 2000s. TRIM has subsequently evolved through a number of manifestations and is now (Micro Focus) Content Manager. The NICS currently uses CM 9.4.
A number of Public Record Office of Northern Ireland (PRONI) staff were involved in the initial procurement project and PRONI was one of three lead implementers of the system. This proved to be very beneficial as we had a member of staff who was interested in records management and was an obvious selection for the role of system administrator for the PRONI implementation. This afforded us a great opportunity to learn about the product particularly as we had someone with higher privileges than regular users.
We were very aware that, although Retention and Disposal hadn’t been implemented at that point, PRONI would receive records from the corporate EDRM system at some point in the future. The two obvious areas for research and investigation therefore were:
- Metadata; and
- Export
We spent some time researching metadata standards before a very simple realisation dawned on us – the only metadata we could get was what was in the system. It didn’t matter what was being recommended, if it wasn’t in the source system then we weren’t going to get it. This led to a more focussed look at the actual metadata within the EDRM system. We did this by:
- Going through all the screens and recording the metadata; and
- Using the out of the box export and examining the output.
The next stage involved lots of meetings and discussion as we examined each piece of metadata and tried to make an objective decision as to the value of keeping it (in a digital repository for ever). In making our decision, we took into account what we understood the metadata to mean and considered how useful (or confusing) it may be for future generations. For example, one item within the EDRM system which generated considerable discussion was ‘Creator’. On the surface, ‘Creator’ sounds like an important piece of metadata to retain. Investigation, however, revealed that ‘Creator’ did not necessarily guarantee a meaningful association with a digital object. It simply recorded who saved the record into the EDRM. In the case of senior civil servants, who may be creating a substantial percentage of the content in which future generations may be interested, records were often being saved by secretaries or personal assistants (who had no other association with the record). In this case, we decided that it could be a very confusing piece of metadata and so we decided not to take it. The various date fields stored in the EDRM system also generated considerable discussion. It should be noted, however, that not every piece of metadata warranted the same level of consideration, particularly those items that were obviously required.
One of the benefits of the EDRM system was expected to be the reduction in duplication which would arise from the use of ‘links’. Undoubtedly this has been the case, particularly when a ‘link’ rather than an attachment is emailed to multiple recipients. These ‘links’, however, were also the subject of considerable discussion in an attempt to reach a decision on what we would do with them. The final decision here was to generate a text ‘stub’ based on the content of the link.
After lengthy research and discussion, we eventually settled on the metadata fields we would take from the EDRM system – this is shown below.
The EDRM system was supported by a Managed Service Provider when we were developing the means to export records and metadata. We worked with the Managed Service Provider to specify and develop an export that:
- Copied each container selected for transfer out into a Windows folder on the file system; and
- Created a metadata csv file within each folder, with one row of metadata for every object within the folder (see the sketch below).
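For illustration, a minimal sketch (not a PRONI tool) of how an export in this layout might be checked on receipt is shown below. It assumes one folder per container, each containing a metadata file named metadata.csv with one row per object; the file name and export path are assumptions for the example.

```python
import csv
from pathlib import Path

def check_export(export_root: str) -> None:
    """Check that each container folder has a metadata CSV with one row per object."""
    for folder in sorted(p for p in Path(export_root).iterdir() if p.is_dir()):
        csv_path = folder / "metadata.csv"   # assumed name of the per-folder metadata file
        if not csv_path.exists():
            print(f"{folder.name}: metadata file missing")
            continue
        with csv_path.open(newline="", encoding="utf-8") as f:
            rows = list(csv.DictReader(f))
        # Count the exported objects in the folder, excluding the metadata file itself.
        objects = [p for p in folder.iterdir() if p.is_file() and p.name != csv_path.name]
        if len(rows) != len(objects):
            print(f"{folder.name}: {len(rows)} metadata rows but {len(objects)} objects")

check_export(r"D:\transfers\accession_2021_01")   # hypothetical export location
```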
We also used this metadata layout as a standard template for the metadata associated with all digital records transferring to PRONI. As part of our processing, we will supplement this with more metadata, for example the metadata generated by DROID, and we will populate the ‘PRONI use’ fields.
Like most great plans, however, it has not all been plain sailing. We have sought to tweak the metadata slightly over the last few years and we know that there will be occasions when we will have to develop some scripts to manipulate metadata before it is presented to our digital preservation system for processing. To date, two Public Inquiries have transferred over 51,000 records from the EDRM system to PRONI - proof that the process works.
To find out more about PRONI, please visit our website and follow us on Facebook and Twitter.
PRONI - EDRM system metadata template
FIELD NAME | DESCRIPTION
SysInfo | Name of originating System
SysVersion | Originating System version
LocalSysName | Local System Name
DataExportDate | Date exported from EDRM System
ClassificationTitle | The titles, separated by space | (pipe) space, of the classification levels excluding container holding records
ContainerTitle | The title of the container or folder containing records
ContainedRecords | The number of original digital objects in a container
ContainerRecordType | The container record type description
ContainerId | The ID of the EDRM container level classification
ContainerLongId | The full ID of the EDRM container level classification
ContainerLevel | The level of the container within the classification
ContainerNotes | From the Notes tab of the container
OriginalFolderPath | Path of interim location of data files on the export server prior to transfer to PRONI
RelativeFolderPath | This is the relative data path following structure defined by PRONI (Accession Number\"data"\transfer identifier\ContainerID\)
DateClosed | Date that the container was closed
DPID | FOR PRONI USE (Digital Preservation Unique Identifier)
RecordType | Name of the Record Type
Description | The original textual description of the record
Filename | The filename and extension of the digital object
RecordNumber | The unique identifier within an EDRM System
RecordLongID | The unique identifier within an EDRM System
Notes | From the object's 'Notes' tab record metadata
Language | Language of the intellectual content of the resource
DateCreated | Date of creation of the digital object
DateModified | The date on which the digital object was last modified
Author | Person who composed the digital object
FileSize | Exact size of the object in bytes
RelatedRecord | Details of related objects
RelationshipDetails | Description of relationship eg attachment to email or document embedded within another document
AccessDecision | Determines whether or not the Access decision permits the digital object to be viewed by the public
RecordAccessExemptions | If record is Closed for FOI/DPA/or other reasons
ClosureReason | Free text field describing reasons for decisions to close
NextAction | Next Action for record
NextActionDate | The date on which the next action on the record will occur
OriginalFilename | If the filename is more than 200 characters, the filename should be recorded here prior to being truncated - see Filename
BusinessArea | Business area to which the record relates
InformationAssetOwner | Information Asset Owner as determined by the business
Reviewer | Name of person who reviewed file
DateReviewed | Date file was reviewed
DepartmentalInformationManager | Name of Departmental Information Manager approving decision
DateApproved | Date approved by Departmental Information Manager
RightsStatus | This will be either Crown Copyright (Government Records) or other details agreed at submission with depositor
RightsCustodian | The person identified as having management powers over the digital object with regards to access
RightsNotes | Free text field containing additional information on the copyright/licensing of the digital object
AccessCopyRequired | Is an access copy required for access systems
Comments | Free text field containing any comments relating to entries on the file format registry
PCPRef | FOR PRONI USE
MD5Checksum | MD5 checksum if EDRMS stores checksum
UserDefined2 |
UserDefined3 |
UserDefined4 |
UserDefined5 |
EOSM | End of standard metadata
AdditionalMetadata |
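As a simple illustration of how this template might be used, the sketch below checks that a transfer's metadata CSV contains the expected column headings before any further processing. This is not a PRONI tool; the file path is an assumption and only a handful of the template fields are listed for brevity.

```python
import csv

# A selection of column headings from the PRONI EDRM metadata template above;
# in practice the full list of template fields would be included here.
EXPECTED_FIELDS = [
    "SysInfo", "SysVersion", "DataExportDate", "ContainerTitle",
    "RecordNumber", "Filename", "DateCreated", "DateModified",
    "MD5Checksum", "EOSM",
]

def missing_columns(csv_path: str) -> list:
    """Return any expected fields that are absent from the CSV header row."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        header = next(csv.reader(f))
    return [field for field in EXPECTED_FIELDS if field not in header]

missing = missing_columns("transfer/metadata.csv")   # hypothetical transfer file
if missing:
    print("Missing columns:", ", ".join(missing))
```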
Determining risk: a case study
Nicola Steele, Grosvenor Estates
Background
The Grosvenor Estate is an international, diverse, privately owned company. The Grosvenor Estate encompasses all the activities of the Grosvenor family; each of its parts has a distinct focus but shares the same values and a common purpose of delivering lasting commercial and social benefit.
The Family Office portion of the estate manages the Grosvenor family’s rural estates in the United Kingdom and Spain, their philanthropic activities through the Westminster Foundation, Realty Insurances, and other specialist functions largely focused on heritage and conservation.
The collection, dating as far back as the 12th century, documents the history of the Grosvenor Family as well as that of the Eaton Estate and other rural estates and businesses owned by trusts of the Grosvenor Estate on behalf of the Grosvenor Family. The archives collection is held primarily for the benefit of the Grosvenor Family, internal departments and the Trustees of the Grosvenor Estate.
The EDRMS Preservation Task Force
Back in early 2020, I volunteered to join a new taskforce initiative from the DPC on Electronic Document and Records Management System (EDRMS) preservation. Although I was not part of the implementation of the current EDRMS (SharePoint) in use in our organization, I was keen to learn, and be part of the learning process, about how safe data and records are in an EDRMS environment. As the Assistant Archivist in our organisation, working largely on digital preservation, I was especially interested as to the possibility of records remaining in an EDRMS long term. Huge amounts of data are held within EDRM systems, and some will be identified as having long term value and therefore be flagged for preservation. This raised many questions about safe transfer of records from an EDRMS to a digital archive, how and to what extent processes can be automated and what metadata can and should be captured. More specifically for this case study, a subgroup addressed the issue of how safe it is to leave data within an EDRMS long term and what features and functionality (or policies) should be in place to provide assurance that the records are safe in the EDRMS for a period of time. It was determined that, rather than re-inventing the wheel, use could be made of existing risk assessment models to aid us in gauging how safe EDRMS environments are.
Testing the tools
I offered to trial the National Archives UK's DiAGRAM tool against the digital collections held in SharePoint, our organization’s EDRMS. To do this, certain questions from the National Digital Stewardship Alliance (NDSA) Levels of Preservation and the Digital Preservation Coalition Rapid Assessment Model (DPC RAM) tools needed to be answered and used to populate some answers in the DiAGRAM tool. I very quickly decided to complete both former tools in their entirety, as opposed to just a few questions from each, because I felt they could offer more of a rounded understanding and assessment of the EDRMS environment. That decision was helped by the fact that both tools are not terribly time consuming to complete and are easy to use.
Once I had completed the NDSA and DPC RAM tools, I then started with the DiAGRAM tool. A set of questions needed to be answered before the tool could be used in earnest, so I created a Word document to record the questions and answers required. I did attempt to be as accurate in my answers as possible, but some were a little out of my knowledge range and so I had to accept that the results may be slightly inaccurate, but by no means far from correct!
Firstly, I created a model as a baseline for the assessment. This in itself proved very useful. The input into the tool was straightforward and the end result, specifically the visualisation of the results, is excellent. I have often commented that this sort of visualisation is what senior managers in our organisation will find the most informative and useful, even as a way of showing our current capabilities, without even thinking of what we could do to improve.
The next step was to create a scenario in the model, where answers to some questions are changed, to try and improve the results and to illustrate how actions taken can have a big impact. This is a great way to ask permission from senior management for activities to take place, or to justify why you have decided to take certain actions on your digital collections, and the results they have. Since I wanted this to be as close to a real-world activity as possible, I chose to alter my answers in the Information Management section, to show what position we would like to be at and what I hope we would be able to achieve with some work and collaboration. These altered answers achieved an astonishing increase in our results (from 3% to 30% for renderability). This would allow me to create an action plan and roadmap to present to the appropriate management to show current downfalls, what we would ultimately like to achieve (and why) and how we could achieve that. In terms of risk assessment, it seemed clear that actions taken around preservation metadata would improve our confidence in the intellectual control we have over, and safety of, our digital assets as a first step. Future risks and steps could be articulated, but for this exercise, I chose to concentrate on one area I believe we can tackle and make progress in effectively.
It is perhaps worth noting that I found the NDSA and DPC RAM tools more applicable to digital preservation environments, whereas the DiAGRAM tool can be easily applied to any environment holding digital assets. In this instance, the environment was our EDRMS, SharePoint. Therefore, when using the former tools mentioned, it is worth remembering that the EDRMS is not necessarily functioning at this point with preservation activities in mind. It is functioning to fulfil a current business requirement and so some of the questions should be approached with this in mind.
Conclusion
In conclusion, I found all three models useful for gaining an understanding of the capabilities and functions of our EDRMS environment, but the DiAGRAM model stood out the most for me. When I presented my findings to our subgroup, I was asked if I would choose to keep records long term in an EDRMS, or transfer them over to a digital preservation environment, having completed this exercise and had time to digest the results of all of the models used. My answer depends on current circumstances. If we did not have a digital preservation system in place (which we do), then I would push fairly quickly for some changes around metadata, specifically preservation metadata, to be made within the EDRMS environment. But, since we do have a digital preservation system in place, my answer for our situation was that I would have records moved to this environment once they had served their business and/or legal functions. Some organisations will not have the luxury of having use of any type of digital preservation environment, so we cannot dismiss the idea that their EDRMS covers all of their digital assets and potentially their preservation actions.
Having completed this exercise, I believe using a tool such as DiAGRAM, or a compilation of tools as I did, is a very useful (and probably could be considered essential) project for any organisation dealing with digital material deemed worthy of long-term preservation. Whether as part of a business case to enhance the current EDRMS setup, to procure or develop a digital preservation system, or to form part of a risk or disaster register for example. The potential uses are numerous and could be hugely beneficial to any organisation.
National Archives of Australia EDRMS Sentencing and Transfer Project: a case study
James Doig, National Archives of Australia
Corporate EDRMS
The National Archives of Australia (NAA) has had a corporate EDRMS since 2000. The product initially purchased was TOWER Software’s TRIM Captura, which NAA upgraded to TRIM Context in 2006. NAA has regularly upgraded the EDRMS and we have now deployed Micro Focus Content Manager 9.4 in all State and Territory offices. The EDRMS technology used by NAA has remained the same - TOWER software was acquired by Hewlett-Packard in 2008, which sold its software division to Micro Focus of the UK in 2016.
The EDRMS is integrated with Outlook and the common desktop record-creating applications, so that emails and documents can be checked into TRIM from the applications themselves, or by drag-and-drop. Other government agencies have customised more complex integrations, for example SharePoint-EDRMS.
Project background and outcomes
In 2012, NAA commenced a project to sentence all records in TRIM Context created between 1 January 1998 and 31 December 2008, export the “Retain as National Archives (RNA)” component from the EDRMS, and ingest the RNA component into the digital preservation system. Although the project was completed almost ten years ago, and noting that processes and system functionality have improved in the meantime, there are still some useful lessons learned, particularly regarding preservation issues.
The project team comprised 4 people including three sentencing officers, and indeed the focus was on sentencing, a uniquely Australian term for the process of applying disposal decisions to records using the legal authorisation – a Records Authority (RA) for functions specific to an agency, or a General Records Authority (GRA) for general administrative functions. The project took about 8 months to complete, and sentencing, which was effectively a manual process, took about 5 months. About 34,000 TRIM files or containers were sentenced comprising about a million records. Records were sentenced at TRIM file/container level, unless there was a good risk-based reason to go into the file and look at actual records. The proportion of records sentenced “Retain as National Archives (RNA)” was about 10%, quite a high proportion compared with physical records (generally 3-5% for permanent retention), and about 3,000 files, comprising close to 100,000 records, were ingested into the digital preservation system.
More detailed project statistics are as follows:
- 31,693 TRIM files/containers were sentenced (comprising over a million records)
- 2% were approved for destruction in 2012
- 80% were identified for destruction in future years
- 2% were placed on hold
- 10% were transferred as Retain as National Archives (RNA)
- 6% were identified for destruction using the Normal Administrative Practice (NAP) Policy (empty, redundant or practice files or files with no documents attached)
Sentencing
EDRMSs are defined by their compliance with international recordkeeping standards such as ISO 15489 and ISO 16175. Therefore, EDRMS products must have appraisal and disposal functionality built into them. In this case, following sentencing, the appropriate disposal class (a unique number that links to a disposal action in a RA or a GRA) and the disposal action (RNA, Destroy, NAP) were entered into the EDRMS and a User Stamp applied, which automatically applied the name of the sentencer and a timestamp.
The time-consuming, manual approach to sentencing was identified as a significant pain point and subsequent work has focused on the feasibility of using AI and machine learning technology to automate disposal decisions; that is, to develop an accurate and scalable way to decide the value of government digital information and data in order to determine whether it should be retained or destroyed.
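The NAA's feasibility work is not described in technical detail here, but as a rough illustration of the general idea, the sketch below trains a toy retain/destroy classifier on record titles using scikit-learn. The titles, labels and features are invented for the example; a real project would need much richer features, far more training data and careful evaluation against appraisal decisions made by staff.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical training data: record/container titles already sentenced by staff,
# labelled with the disposal action applied ("RNA" = Retain as National Archives).
titles = [
    "Ministerial briefing - digital continuity policy",
    "Stationery order forms 2006",
    "Agency annual report drafts 2007",
    "Staff car parking permits",
]
labels = ["RNA", "Destroy", "RNA", "Destroy"]

# A simple bag-of-words classifier over titles; this only illustrates the shape
# of the approach, not an operationally accurate appraisal model.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), LogisticRegression())
model.fit(titles, labels)

print(model.predict(["Policy advice on decommissioning program"]))
```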
Destruction concurrence
The process of obtaining business owner approval to destroy records is known as destruction concurrence (or just concurrence). Concurrence was automated as a digital workflow in the EDRMS, which created efficiencies, though at times it was difficult to identify the business owner due to organisational change over time. More importantly, staff were still using records whose destruction due date had passed, so in many cases records were retained in the system and not destroyed.
Review of records to confirm RNA status and quality check metadata
Key record metadata (record number, title, disposal class, security level, date created, date closed) were exported into a spreadsheet and reviewed to confirm RNA status and to quality check and correct errors in record titles (including expanding acronyms). In addition, a unique item number was applied to each record, a requirement of NAA’s archival management system. When the review was complete, the revised metadata file was imported back into the EDRMS using the TRIM import/export application called TRIMDataPort.
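As an illustration of what such a review step can look like once the metadata is out of the EDRMS, here is a minimal sketch. It is not the NAA's actual process; the file names, the 'Title' column and the acronym list are assumptions, and the item number format is invented.

```python
import csv

# Hypothetical acronym expansions applied when quality-checking record titles.
ACRONYMS = {
    "MOU": "Memorandum of Understanding",
    "FOI": "Freedom of Information",
}

def expand_acronyms(title: str) -> str:
    for short, full in ACRONYMS.items():
        title = title.replace(short, full)
    return title

with open("rna_export.csv", newline="", encoding="utf-8") as src, \
     open("rna_reviewed.csv", "w", newline="", encoding="utf-8") as dst:
    reader = csv.DictReader(src)
    fieldnames = reader.fieldnames + ["ItemNumber"]
    writer = csv.DictWriter(dst, fieldnames=fieldnames)
    writer.writeheader()
    for i, row in enumerate(reader, start=1):
        row["Title"] = expand_acronyms(row["Title"])
        row["ItemNumber"] = f"ITEM-{i:06d}"   # unique item number required by the archival management system
        writer.writerow(row)
```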
Export RNA records and metadata
Using TRIMDataPort, records identified as RNA were exported out of the EDRMS into a directory location. The record export process does not retain the physical aggregations of records represented in the EDRMS (e.g. Containers: in the NAA example “Files” and “File Boxes”), for example via a directory/folder structure. Rather, these aggregations are represented in recordkeeping metadata via the record number, and so could be reconstructed in the archival management system through item relationships such as Item/Sub Item or Aggregate Item/Constituent Item.
Also, at the digital file level, files were given TRIM database identifiers (e.g. rec_1387634.DOCX), rather than the record title given by the record creator. Since this exercise, the functionality now exists to choose the record title, record URI, database ID, or a combination of the three.
TRIMDataPort exported recordkeeping metadata to delimited form (e.g. CSV). While TRIMDataPort can export metadata in XML, NAA’s archival management system requires metadata to be imported in a delimited format. In addition, the archival management system can import and manage only a subset of the full suite of recordkeeping metadata, and decisions needed to be made about what, if any, additional metadata needed to be retained (for example, is it necessary to retain Movement History and TRIM audit metadata?) and how to manage the additional metadata (for example, this metadata could be managed as a Control Series in the archival control system, or managed in the digital preservation system).
Dates are always critical for archival control. The NAA’s archival control system requires Date Created, Date Last Updated (particularly important for us as it determines when a record becomes publicly available), and Date Registered (what the Australian Series System calls Accumulation Date).
Ingest into the digital preservation system
A Submission Information Package was created using an in-house developed SIP creator, which included the generation of checksums for each digital object. The SIP was successfully ingested into the NAA’s bespoke digital preservation system. Note that NAA has recently procured Preservica as the replacement digital preservation system, and different tools and processes are in development.
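The SIP creator itself was developed in-house and is not reproduced here, but the checksum-generation step it included can be illustrated with a short sketch. The staging directory, manifest name and choice of SHA-256 are assumptions for the example.

```python
import hashlib
from pathlib import Path

def sha256(path: Path) -> str:
    """Stream a file through SHA-256 and return the hex digest."""
    h = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

# Write a simple manifest of checksums for every file destined for the SIP.
sip_root = Path("sip_staging")          # hypothetical staging directory
with open("manifest-sha256.txt", "w", encoding="utf-8") as manifest:
    for obj in sorted(p for p in sip_root.rglob("*") if p.is_file()):
        manifest.write(f"{sha256(obj)}  {obj.relative_to(sip_root)}\n")
```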
Lessons Learned
There were many useful lessons learned from this project, and these have fed into improved processes and better use of Content Manager functionality. Those listed here relate directly to digital preservation and ongoing access issues:
- Emails with stub attachments linking to a record in the EDRMS are problematic when exported from TRIM. These stub attachments are usually made when sending a record reference from the EDRMS, but they can also be made via Outlook. These links fail when records are exported from TRIM, and automatically generated metadata about the linked record (usually the record number and record title) might not have been retained in the body of the email, therefore the record is incomplete. Even if the record number was retained, this doesn’t mean that this was the version of the record actually emailed as it could well have been edited after the email was sent.
- A possible solution to this problem would be to capture version number, not record number, in the body of the email. However, NAA transferred finalised records, not TRIM versions. Unless versions were captured as separate records in the EDRMS, versions would not be captured.
- A key lesson learned was the need to do a detailed analysis of formats prior to ingest into the digital preservation system. An EDRMS is not fussy about what formats you can check into it, and we’ve found there are lots of complex formats in TRIM that we could have identified up front as needing, for example, better documentation, such as dozens of legacy Access databases and AutoCAD files. Some complex formats, for example aggregate email formats such as PST and MBOX, could usefully be de-aggregated and described prior to ingest. The preferred approach to format analysis would be to use a format identification tool such as DROID following record export from the EDRMS. There may also be EDRMS functionality to run a report, for example by file format extension, though this isn’t a failsafe method of identifying format.
- A good example of problems resulting from not undertaking a thorough analysis of formats is the issues encountered with a couple of email formats. A feature of earlier versions of TRIM is that MS Outlook emails, when checked into TRIM, were saved as TRIM Outlook Saved Message Format with the extension VMBX. A similar format is MS Windows Outlook Express email, which has an extension MBX. These formats are plain text files; any attachments are base64 encoded in the body of the file. While VMBX and MBX files can be rendered perfectly in the TRIM viewer, when exported from TRIM the base64 encoding will need to be decoded for access (a minimal decoding sketch follows this list). We have about 40,000 of these files in the digital preservation system and we’ve made a PRONOM submission for them so that they can be identified by PRONOM-based format identification tools. Later versions of TRIM, or what is now called Micro Focus Content Manager, have a built-in Mail Conversion Format tool that can migrate these formats to EML.
- As described above in the section on destruction concurrence, TRIM can also automate authorisation and approval workflows, for example authorising expenditure. These digital workflows are retained as metadata, which will need to be retained if the authorisations/approvals are part of the RNA record.
- The MS Windows character limit on file names (260 characters) caused problems, but once the problem was identified it was possible to script a solution.
- Finally, archival control of records from EDRMSs is not a trivial exercise. Good archival management depends on a number of factors that can be hard to control. First is the quality of recordkeeping metadata. Archives reuse recordkeeping metadata for archival control, so the quality of metadata is critical, and this can vary dramatically within and between government agencies, particularly record titles. Second is the difference between EDRMS metadata capability and the metadata capability of the archival control system. Decisions need to be made about what recordkeeping metadata is retained and where it is stored. Third is the sophistication of the archival control system to properly manage record relationships and representations. In other words, what do you do if the data model of your archival control system can’t deal with the complex web of record relationships that we see in EDRMSs? This issue is not just about effectively documenting relationships within and between records, but also relationships within and between other entities, and relationships/integrations with other software applications. The solutions – replacing the archival control system, or introducing a new data model – are significant, long-term projects. This issue resulted in a large project to develop a new data model, the Archival Control Model, for government records, and a revised metadata schema.
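To give a feel for the decoding problem described in the email formats lesson above, here is a minimal sketch. It is not a TRIM or NAA tool, and the internal layout of VMBX/MBX files is not documented here, so it simply assumes that each attachment appears as a contiguous run of base64-encoded lines in the plain text file and writes each such run out as a separate binary file. In practice more work would be needed to recover attachment names and content types.

```python
import base64
import binascii
import re
from pathlib import Path

# Lines consisting only of base64 alphabet characters (with optional '=' padding).
B64_LINE = re.compile(r"^[A-Za-z0-9+/]+={0,2}$")

def extract_attachments(msg_path: Path, out_dir: Path, min_lines: int = 5) -> int:
    """Decode contiguous runs of base64 lines found in a plain text message file."""
    out_dir.mkdir(parents=True, exist_ok=True)
    runs, current = [], []
    for line in msg_path.read_text(errors="replace").splitlines():
        stripped = line.strip()
        if len(stripped) >= 4 and B64_LINE.match(stripped):
            current.append(stripped)
            continue
        if len(current) >= min_lines:      # long runs are treated as encoded attachments
            runs.append("".join(current))
        current = []
    if len(current) >= min_lines:
        runs.append("".join(current))

    count = 0
    for i, blob in enumerate(runs, start=1):
        try:
            data = base64.b64decode(blob)
        except binascii.Error:             # skip runs that are not valid base64 after all
            continue
        (out_dir / f"{msg_path.stem}_attachment_{i}.bin").write_bytes(data)
        count += 1
    return count

print(extract_attachments(Path("rec_1387634.VMBX"), Path("decoded")))   # hypothetical file
```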
Conclusion
The key learning of the project was the need to fully analyse and understand the EDRMS prior to transfer and ingest. EDRMSs provide a range of options for configuration. Options may include differing system interfaces (web, simplified and full featured versions), methods of integration with other software applications, presentation of search results, and export options/functionality. Some system settings may affect the operation of other, seemingly unrelated, aspects of the system. Reasons for choosing certain options, views and settings should be documented and understood. Similarly, the EDRMS does not operate in isolation. Policies and guidelines governing use of the EDRMS should be understood as part of the system analysis process and also captured in the transfer process, for example there may be rules governing titling, capturing record versions, email attachment record references and so on.
Transfer of records from an EDRMS into a Digital Preservation system: a case study
Elvis Valdes Ramirez, UN International Residual Mechanism for Criminal Tribunals (IRMCT)
Background
The International Residual Mechanism for Criminal Tribunals (“the Mechanism”) is the successor of the International Criminal Tribunal for the former Yugoslavia (“ICTY”) and the International Criminal Tribunal for Rwanda (“ICTR”), which have, over the last two decades, accumulated large quantities of digital records. The Mechanism is mandated, under Article 27 of its statute, with managing, including preserving and providing access to, the archives of the ICTR, the ICTY, and the Mechanism itself. The digital component of the archives is estimated at approximately 3 petabytes and is composed of all types of born digital and digitized material in a variety of formats coming from network shared drives, business systems, Electronic Documents and Records Management Systems (“EDRMS”), email systems, websites and a selection of bespoke systems which were developed in-house. The Mechanism implemented an EDRMS, currently HP Records Manager, which is in use for the management of and access to records, and a Digital Preservation System (DPS) (Preservica) for the preservation of digital material. The Mechanism’s implementations of the EDRMS and the DPS adhere to policies and guidelines established by the United Nations Archives and Records Management Section (ARMS) and follow international good practice and standards.
The challenge
The main challenge was to find a solution that would facilitate appropriate packaging and structuring of metadata records and their related objects (files) after they are exported out of the EDRMS and before they are ingested into the DPS. This was all to be done in a manner that conforms to United Nations policies and international standards for good practice.
The following list highlights some of the agreed prerequisites, constraints and assumptions for the solution:
- Records exported out of the EDRMS consist of metadata and objects (files).
- There are no similar technical implementations by other organizations for transferring records from an EDRMS to a DPS which could be reused.
- A proper assessment of the export functionalities and capabilities provided by the Mechanism’s EDRMS is done.
- Metadata of each record exported out of the EDRMS must be packaged and formatted in XML, using a metadata standard approved by the Mechanism. A proprietary metadata schema must be built to accommodate information that cannot be mapped using existing metadata standards.
- Ingested records must have a pre-defined minimum set of metadata for digital preservation.
- No resources (technical or human) are available to develop a programming interface using the APIs and SDKs that are available in both systems.
- The DPS technical capabilities for creation of Submission Information Packages (SIP) and ingest of records are properly assessed and well understood.
- A solid mechanism of integrity checks and access control must be implemented during the process.
- The solution must be implemented with existing resources and endorsed and approved by management.
Case study
In order to address the challenge, the Digital Archivists of the organization clearly articulated the business requirements. Initial assessments were made to either wrap the selected metadata within METS files or create BagIt files and then ingest those as SIPs into the DPS. After testing, this approach was discarded in favour of a tool which came with the DPS for creation of SIPs that include descriptive metadata. Other metadata (structural, administrative, preservation and technical) are added during creation of Archival Information Packages (AIP). A decision was subsequently taken to develop an application to automate the packaging and structuring of metadata and associated objects for ingest. The application must take as input records exported out of the EDRMS and save descriptive metadata files and related objects (files) in the predefined structure which is required for creation of SIPs by the SIP Creator tool which came with the DPS.
Main steps in the application’s workflow
1. The application is launched.
2. A user enters the following parameters (illustrated in the sketch after these steps):
- Location of exported file from the EDRMS
- A character delimiter, if a delimited separated value file is uploaded
- The style sheet to use (the style sheets are based on a metadata schema, e.g. MODS)
- Additional information such as: prefix for output file names, output file’s extension, etc.
- Output folder where the files and their metadata are going to be saved
3. A user starts the process.
4. The application packages the records (metadata) exported out of the EDRMS in the selected metadata standard format, together with their related objects (files), and creates a predefined structure in the selected output location; this output is then used as input by the SIP creator tool of the DPS.
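The parameters entered at step 2 might be captured in a small configuration object along the lines of the sketch below; the field names and example values are assumptions rather than the Mechanism's actual implementation.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Optional

@dataclass
class PackagingJob:
    """Illustrative parameter set mirroring the inputs listed above (names are assumptions)."""
    export_file: Path            # location of the file exported from the EDRMS
    delimiter: Optional[str]     # character delimiter, if a delimited file is supplied
    stylesheet: Path             # XSLT mapping exported columns to a schema such as MODS
    output_prefix: str           # prefix for output file names
    output_extension: str        # extension for output metadata files, e.g. ".xml"
    output_folder: Path          # where objects and their metadata are saved

job = PackagingJob(
    export_file=Path("export/records.csv"),
    delimiter=",",
    stylesheet=Path("stylesheets/mods.xslt"),
    output_prefix="IRMCT",
    output_extension=".xml",
    output_folder=Path("sip_input"),
)
```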
The following were the key requirements/specifications of the application (a minimal sketch of the core packaging step follows the list):
- Metadata exported out of the EDRMS and used as input by the application must be in a delimited separated values file (comma, tab, etc.) or XML format.
- Columns in the files exported out of the EDRMS must contain metadata information, and optionally some configuration information used by the application.
- Metadata created for each record must be based on an XML schema (international metadata standard or bespoke schema).
- Separate style sheets (XSLT) must be created for each metadata standard used in the application, mapping columns in the delimited separated values file exported from the EDRMS against the respective schema elements.
- Style sheets used to create metadata files must be routinely validated against the corresponding schemas by the application.
- Mapping of metadata columns in the delimited separated values file exported from the EDRMS against schema elements must be validated by the application.
- Checksum calculation must be carried out on objects (files) when they are moved or copied to the output location during the process, using one of the existing algorithms (MD5, SHA-1, etc.).
- The application must save the objects (files) and their related descriptive metadata files in a predefined hierarchical structure.
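To make the requirements above more concrete, the following sketch shows one way the core packaging step could work. It is illustrative only: the Mechanism's application uses XSLT style sheets, whereas this sketch builds a simplified MODS-like record directly, and the file paths and column names (RecordNumber, Filename, Title, MD5Checksum) are assumptions for the example.

```python
import csv
import hashlib
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path

EXPORT_CSV = Path("export/records.csv")      # hypothetical EDRMS export
SOURCE_DIR = Path("export/objects")          # exported objects (files)
OUTPUT_DIR = Path("sip_input")               # structure read by the DPS SIP creator tool

def md5(path: Path) -> str:
    h = hashlib.md5()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest()

with EXPORT_CSV.open(newline="", encoding="utf-8") as f:
    for row in csv.DictReader(f, delimiter=","):
        record_dir = OUTPUT_DIR / row["RecordNumber"]        # one folder per record
        record_dir.mkdir(parents=True, exist_ok=True)

        # Copy the object and verify its fixity against the EDRMS-supplied checksum, if present.
        src = SOURCE_DIR / row["Filename"]
        dst = record_dir / row["Filename"]
        shutil.copy2(src, dst)
        if row.get("MD5Checksum") and md5(dst) != row["MD5Checksum"].lower():
            raise ValueError(f"Checksum mismatch for {row['RecordNumber']}")

        # Write a simplified MODS-like descriptive metadata file alongside the object.
        mods = ET.Element("mods", xmlns="http://www.loc.gov/mods/v3")
        ET.SubElement(ET.SubElement(mods, "titleInfo"), "title").text = row["Title"]
        ET.SubElement(mods, "identifier", type="local").text = row["RecordNumber"]
        ET.ElementTree(mods).write(record_dir / "metadata.xml",
                                   encoding="utf-8", xml_declaration=True)
```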
Which metadata to preserve
The table below lists some of the metadata fields that you may wish to capture from an EDRMS or other record keeping system when migrating records of long term value to a preservation system.
Recognising that different organisations may require different fields depending on their context and the anticipated future users and use cases of the records, a set of metadata fields is listed with some description and notes, and a list of reasons why each might be important to capture in particular contexts.
It should be noted that not all record keeping systems will capture and store all of the metadata fields described below. Many of the fields may be commonly found in an EDRMS, but perhaps not in other, less controlled systems in which records are stored.
Decisions on which metadata to capture will need to factor in the following considerations:
- Does the record keeping system store this information?
- Can this information be extracted from the record keeping system?
- Can this information be stored within the digital archive?
Record level metadata
You may wish to capture the following metadata at record level:
Metadata field |
Definition |
Notes |
Why you might need this |
File name |
The file name of the record as stored in the record keeping system |
Note that the system may allow duplicate file names or may allow file names to include special characters that may cause problems once the files are exported into a file system (e.g. \/:?*”<>|). If this is the case, files may be renamed on export and it is important to ensure that the metadata includes details of the original file name of the object as stored in the system. |
You should consider capturing this information in the following circumstances:
|
File format |
The file format of the record as defined in the system |
An EDRMS or other system may record the file format of each record. This may not be as thorough or accurate as the file format identification that you would wish to carry out within a digital archive (for example it may state the file is a PDF but not which version). It seems likely that file format identification would be carried out outside of the system, either as a pre-ingest step or as a part of the ingest process as records move into the digital archive. |
You should consider capturing this information in the following circumstances:
|
Previous file format or file extension |
The previous file format or extension of a record |
In certain circumstances, a record keeping system may change the format of a file on capture or upload. An example that has been noted is the conversion of emails to a format specific to the EDRMS in which they are stored. If a conversion such as this has occurred, there may be evidence of this within the metadata. |
You should consider capturing this information in the following circumstances:
|
MIME type |
The MIME type of the record as defined in the system |
The system may store information about the MIME type of each record, but also is typically captured as part of pre-ingest or ingest routines within a digital archive. |
You should consider capturing this information in the following circumstances:
|
File size |
The size of the record (in KB/MB as appropriate) |
The record keeping system may store information about the file size of each record, but this information is also typically captured as part of pre-ingest or ingest routines within a digital archive. |
You should consider capturing this information in the following circumstances:
|
Number of files |
The number of files that make up a single record within the system. For example this may apply to the contents of ZIP file, emails with attachments or number of messages within a PST file. |
This metric will only apply to certain records. Note that metadata about number of files in total within a transfer or export is discussed under ‘Transfer level metadata’. |
You should consider capturing this information in the following circumstances:
|
Digital object specific dimensions |
This metadata would be specific to particular types of digital object and could include:
|
The system may contain metadata relating to the dimensions of digital objects and this will be specific to the types of records contained within it. |
You should consider capturing this information in the following circumstances:
|
Language |
Language of the digital object. |
|
You should consider capturing this information in the following circumstances:
|
Character encoding |
Character encoding of the digital object. For example ASCII, Unicode, UTF-8. |
|
You should consider capturing this information in the following circumstances:
|
Unique identifier (system generated) |
The unique reference of a record within originating system (typically assigned automatically) |
Note that there may be more than one version of this identifier that can be captured. The identifier may reflect the function, context or structure of the record and how it was used. |
You should consider capturing this information in the following circumstances:
|
Agency assigned identifier |
Catalogue or local identifier of the record within the system (typically assigned by a human operator) |
May reflect the function/context of the object and how it was used. |
You should consider capturing this information in the following circumstances:
|
Previous identifier |
A previous identifier allocated to a record |
A previous identifier metadata field may be of value where records have previously been migrated from another system. The previous identifier field may be particularly important If relationships between documents are defined using these identifiers. |
You should consider capturing this information in the following circumstances:
|
Title |
Title or short description of the record |
Sometimes records may not have meaningful titles assigned, or a set of records will share a very generic title. In some cases a short description field may be present instead of a title. |
You should consider capturing this information in the following circumstances:
|
Description |
More detailed description of the digital object |
|
You should consider capturing this information in the following circumstances:
|
Export date |
Date record was exported from the system |
This date does not exist within the system but may be included as part of an export or transfer process. Can help demonstrate provenance. May also help with disaster recovery. |
You should consider capturing this information in the following circumstances:
|
Creation date |
Date record was originally created |
Note that this date may still be attached to the files as system info once the record is extracted, but system dates are vulnerable to change so extracting this date as metadata is a sensible precaution. Note that this date may reflect the date a record was originally uploaded to the system rather than the original creation date. |
You should consider capturing this information in the following circumstances:
|
Last modified date |
Date record was last modified |
Note that this date may still be attached to the files as system info once the record is extracted, but system dates are vulnerable to change so extracting this date as metadata is a sensible precaution. The system may be configured to capture a full audit trail, including dates of all edits to a record. Consider what level of detail is required for the digital archive. |
You should consider capturing this information in the following circumstances:
|
Date folder was closed |
The date an folder was closed may act as a trigger date for export to digital archive. |
Depending on local practices, this action may be manually applied or automatically generated. |
You should consider capturing this information in the following circumstances:
|
Review date |
If a record is closed to the public this is the date it needs to be reviewed to see If it can be opened (unless a date open is already recorded - see below). |
Note that this may be more broadly categorised as date of next action (where other proposed actions relating to a record are recorded) |
You should consider capturing this information in the following circumstances:
|
Date open to public |
The date a record can be (or was) opened for public access. |
|
You should consider capturing this information in the following circumstances:
|
Date that the file became a record |
The date that a file is marked as a record. |
This may be a feature of some EDRMS and will depend on local practices. Depending how the field is used in practice, it may not be particularly meaningful. For example sometimes a file may be marked as a record years after the record was created and/or last edited. |
You should consider capturing this information in the following circumstances:
|
Disposal date |
Date that record can be disposed of. |
This field will not be applicable to all organisations and implementations, but in some cases records transferred to archive will need to be disposed of at a later date. |
You should consider capturing this information in the following circumstances:
|
Creator |
Individual or group primarily responsible for creating the record |
There may be more than one - depending on context you may want to record more granular roles. Note that there may be issues relating how to this is configured within the system (for example just as an identifier, which would need additional information to interpret). Important to ensure you get the details you need. Note also that there may be inaccuracies within the metadata. Systems and local practices will vary, but ensure you understand how it was generated. Is it added manually, extracted from the embedded metadata of a document or does the system generate it based on who uploaded the record (which may be different to who created the document)? |
You should consider capturing this information in the following circumstances:
|
Creating organization |
Details of organization responsible for creating record |
As above there may be issues relating how to this is configured in the system (for example just as an identifier, which would need additional information to interpret). It is important to ensure you get the details you need. Note also that there may be inaccuracies within the metadata. Systems and local practices will vary, but ensure you understand how it was generated. Is it added manually or does the system generate it based on who uploaded the record (which may be different to who created the document)? |
You should consider capturing this information in the following circumstances:
|
Edited by |
Information about who has edited the record since creation |
Record keeping systems may capture a full audit trail for a record, including details of any edits made. You may want to capture this information alongside edit dates (described above under ‘last modified date’) |
You should consider capturing this information in the following circumstances:
|
Classification code |
Classification code |
Also relevant is the record identifier (discussed earlier) |
You should consider capturing this information in the following circumstances:
|
Classification |
Human-readable description of the above code. |
The classification code may consist of a series of acronyms which are hard for a user to interpret. The system may also store a more human-readable description of this code. |
You should consider capturing this information in the following circumstances:
|
Permissions |
Who has rights to read/copy/edit a record within the system |
This may be applicable in some circumstances, depending on the context, and can cover a variety of permissions (for example rights to read, copy or edit a record for particular users or groups). |
You should consider capturing this information in the following circumstances:
|
IPR and holder |
Intellectual property rights relating to the record (including copyright) and the rights holder. |
|
You should consider capturing this information in the following circumstances:
|
Checksum |
Checksum for the record |
Alongside the checksum itself, it may also be helpful to extract details of the date the checksum was generated and the algorithm used (a simple sketch of generating this information is shown below). Note that if a batch of records has been imported into an EDRMS or other system, they may have come with checksums. It would also be useful to capture information about these previous checksums if they are present. |
You should consider capturing this information in the following circumstances:
|
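If checksums need to be generated at the point of export, this can be done with standard tooling. The following is a minimal sketch, assuming the records have already been exported to local files; the file path is hypothetical and SHA-256 is used purely as an example algorithm.

```python
# Sketch: generate a checksum for an exported record file and capture the
# algorithm used and the date of generation alongside it.
import hashlib
from datetime import datetime, timezone

def checksum_metadata(path, algorithm="sha256", chunk_size=65536):
    """Return checksum, algorithm and generation date for one file."""
    digest = hashlib.new(algorithm)
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return {
        "checksum": digest.hexdigest(),
        "algorithm": algorithm,
        "generated": datetime.now(timezone.utc).isoformat(),
    }

print(checksum_metadata("export/records/report-001.pdf"))  # hypothetical path
```

Recording the algorithm and date alongside the value means the checksum can still be verified and interpreted after transfer, even if a different algorithm is used later in the preservation workflow.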
Versioning |
The version of the record |
Multiple versions of any one record may exist within the system |
You should consider capturing this information in the following circumstances:
|
Location within folder structure or hierarchy |
Records within an EDRMS or other record keeping system may be placed in a particular structure/hierarchy or ‘tagged’. |
Where a record sits within a structure can give valuable context, and the record may not make sense once it is moved out of this structure. The location of the record within the structure should therefore be captured in some way; this may or may not be through the metadata export (see the sketch below). |
You should consider capturing this information in the following circumstances:
|
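Where the location is not included in the metadata export itself, it can often be recovered from the folder structure of the exported files. The following is a minimal sketch, assuming the records have been exported to a local folder tree; the 'export' directory name is hypothetical.

```python
# Sketch: record the relative path of each exported file so its position in
# the original folder hierarchy is not lost after transfer.
import os

export_root = "export"  # hypothetical export directory
locations = {}

for dirpath, dirnames, filenames in os.walk(export_root):
    for name in filenames:
        full_path = os.path.join(dirpath, name)
        # The path relative to the export root preserves the hierarchy
        locations[name] = os.path.relpath(full_path, export_root)

for filename, rel_path in sorted(locations.items()):
    print(f"{filename} -> {rel_path}")
```

Note that keying on the filename, as above, assumes filenames are unique across the export; in practice a record identifier would be a safer key.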
Relationships with other records |
Relationships with other records (not apparent through the folder structure or hierarchy) |
Relationships with other records may be expressed in ways other than the record hierarchy or folder structure within the system. For example, an email record may contain an attachment, or multiple files may form a single logical record (for example a GIS layer or a website). |
You should consider capturing this information in the following circumstances:
|
Other descriptive metadata |
Other descriptive metadata that exists within the system |
Local practices will dictate what additional descriptive metadata is contained within any record keeping system, and this will typically be used to help current users locate and interpret the records. |
You should consider capturing this information in the following circumstances:
|
Transfer level metadata
You may wish to capture the following metadata for the batch of records as a whole (rather than at record level):
Metadata field |
Definition |
Notes |
Why you might need this |
Total number and total size of files/records |
The number and size of files and/or records extracted from the system |
Totals for records and files may be different (for example, one record may consist of multiple files), so two different figures may need to be captured here (a simple sketch of gathering file totals is shown below). |
You should consider capturing this information in the following circumstances:
|
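File totals can usually be gathered directly from the exported files. The following is a minimal sketch, assuming the records have been exported to a local folder tree; the 'export' directory name is hypothetical, and record counts (as opposed to file counts) would still need to come from the system's own metadata.

```python
# Sketch: count the exported files and total their size in bytes, for
# inclusion in transfer-level metadata.
import os

export_root = "export"  # hypothetical export directory
file_count = 0
total_bytes = 0

for dirpath, dirnames, filenames in os.walk(export_root):
    for name in filenames:
        file_count += 1
        total_bytes += os.path.getsize(os.path.join(dirpath, name))

print(f"Total files: {file_count}")
print(f"Total size: {total_bytes} bytes ({total_bytes / 1_000_000:.1f} MB)")
```

Capturing both figures at the point of export gives a simple check that everything expected has arrived when the records are ingested into the digital archive.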
System details |
Details of the system that the records are being transferred from (for example name and version) |
This information may need to be captured manually and incorporated into the metadata for each record. Additional documentation may also be required (see below) |
You should consider capturing this information in the following circumstances:
|
Additional documentation
Some organizations will also wish to capture a full set of system documentation relating to the record keeping system and how it was configured and used. This may include a data dictionary, records management policies and procedures, a user manual, and documentation relating to the configuration or set-up of the system. This documentation will provide an additional level of detail about the system and context for the records that are being preserved.
How this resource was created
In early 2020 the DPC established the EDRMS Preservation Task Force. The task force was set up in response to a request to investigate this topic emerging from a digital preservation project with the Nuclear Decommissioning Authority (NDA).
The DPC invited Members to express an interest in joining the task force for a set period of 6 months with the aim of bringing together multiple stakeholders on the issue of EDRMS preservation to identify and elicit good practice. It was intended that not just the NDA but the whole of the DPC Membership would be able to benefit from this knowledge exchange.
The task force intended to:
-
Articulate the challenge/s of preserving records from an EDRMS
-
Share experiences of tackling these issues and learn from each other
-
Highlight other useful case studies or examples of good practice
-
Gather together existing sources of guidance
-
Highlight gaps in current guidance
-
Make recommendations for concrete DPC outputs or events to help address the challenge (for example: briefing day, technology watch report, guidance notes, case studies, webinars, blog posts)
As this initial six-month period came to an end, task force members agreed to continue to meet in order to carry out some agreed actions - the creation of online guidance on EDRMS preservation (this toolkit!) and a briefing day on the topic. At this point a call for new task force members went out to DPC Members to gather further volunteers to engage with this programme of work.
The text for this resource was created by a series of subgroups of the task force and through an online booksprint event which was held in January 2021.
Some of the booksprint team
Our briefing day event ‘Unbroken records: A briefing day on Digital Preservation and EDRMS’ was held on 20th May 2021 and involved a great line-up of presentations from both members of the task force and other invited speakers. Many of these talks are linked from relevant points within this online resource.
A big thank you to members of the EDRMS Preservation Task Force for sharing their challenges, knowledge and experience on this topic and their hard work and good humour throughout.
-
Kyle Browness - Library and Archives Canada
-
Hugh Campbell - PRONI
-
Kevin De Vorsey - NARA
-
James Doig - National Archives of Australia
-
Tim Gollins - National Records of Scotland
-
James Lappin - University of Loughborough
-
Rachel MacGregor - Warwick University
-
Jenny Mitcham - Digital Preservation Coalition
-
Bob Radford - Nuclear Decommissioning Authority
-
Kristen Schuster - King's College London
-
Caylin Smith - University of Cambridge
-
Sara Somerville - University of Glasgow
-
Nicola Steele - Grosvenor Estate
-
Zsuzsanna Tozser Milam - European Central Bank
-
Elvis Valdes Ramirez - United Nations International Residual Mechanism for Criminal Tribunals
-
Lorna Williams - Bank of England
-
Emma Yan - University of Glasgow
-
Paul Young - The National Archives UK
The EDRMS Preservation Task Force was established by the DPC as a result of a digital preservation project with the Nuclear Decommissioning Authority and our thanks go to them for supporting this work.
Further resources & case studies
Listed below are a number of resources that relate to the topic of EDRMS preservation. This list was collated during the course of the work of the EDRMS Preservation Task Force and it was noted by Task Force members that there wasn’t a huge quantity of existing guidance available on this subject. Please contact us if you know of other useful resources that should be referenced here.
General guidance
-
Unbroken records: A briefing day on Digital Preservation and EDRMS (2021)
-
Migrating information between records management systems, The National Archives, UK (2017)
-
Project checklist for records management systems migration, The National Archives, UK (2017)
-
Testing for continuity checklist, The National Archives UK (2017)
-
Good Migrations: A Checklist for Moving from One Digital Preservation Stack to Another, NDSA (2020)
-
CITS ERMS (Electronic Records Management Systems), DILCIS Board
-
ERMS-Archival transfer service - a presentation at the DLM Forum, Angela Dappert (2015)
Relevant guidance from task force members
Case studies demonstrating how organizations have tackled the challenge of records preservation.
We would like to provide further case studies demonstrating how different organizations have tackled this challenge. Please contact us if you have a relevant case study you would like to share.
Template for building a Business Case
This section provides guidance on the content that will be useful to include in your business case, but it will likely need to be adapted to the structure used in your organization’s template.