Custom Online Databases
Data collected, presented and disseminated in custom online databases that is not stored elsewhere, particularly data at risk when it is locked in the database because no export or harvest options are available. |
||
Digital Species: Databases, Research Outputs, Web |
Trend in 2024: No Change |
Consensus Decision |
Added to List: 2023 |
New Entry |
|
Imminence of Action Action is recommended within three years, detailed assessment within one year |
Significance of Loss The loss of tools, data or services within this group would impact on different people and sectors. |
Effort to Preserve | Inevitability It would require a small effort to preserve materials in this group, requiring the application of proven tools and techniques. |
Examples Custom databases created for project websites, for example in research or citizen science. |
||
‘Critically Endangered’ in the Presence of Aggravating Conditions Lack of export options; lack of system maintenance; expired domain; lack of technical knowledge and skills; limited or dysfunctional data management planning; web capture challenges that mean the content is unlikely to be picked up by automatic crawlers; uncertainty over IPR or the presence of orphaned works. |
||
‘Vulnerable’ in the Presence of Good Practice Backup and documentation; preservation capability in designated repository; use of open formats and open source or other licensing that enables preservation; enabled export options; robust data management planning; documented and managed professionally. |
||
2023 Review This was a new Bit List entry nominated and approved by the 2023 Council to draw attention to the particular challenges of preservation for custom online databases. This entry focuses on distinct risks relating to online databases that cannot be captured with traditional web archiving tools. While there are challenges to preserving databases both off- and online, the entry was nominated in the context of projects that set up a custom online database to record, present, and disseminate collected data, where this data is not stored elsewhere (e.g. in a long-term digital archive) and is often locked in the database because no export or harvest options are available. Identified areas of risk for these online databases include: maintenance of the system is not ensured after the end of a project, and online databases disappear because of security issues or because the domain expires; not all data is open and, after the end of a project, no one is responsible for granting access; the data is not stored elsewhere (e.g. in a trusted repository); the data is locked in and cannot be exported (e.g. as CSV) for further re-use. The nomination also highlighted a gap in the Bit List for databases more broadly. The 2023 Council agreed that a new higher-level Databases digital species group should be created to address this gap, inviting nominations for other database-related entries to be considered for the next major revision of the Bit List. |
||
2024 Interim Review These risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend). |
||
Additional Comments Preservation is highly dependent on the software used but, regardless of the software, once a project has reached its end the database starts to become vulnerable. Often, these online databases are of interest to a sub-discipline-specific group of people, e.g. archaeologists specialized in cuneiform tablets. The material itself is nonetheless often invaluable to this group because of the great effort invested in compiling it. Databases for citizen science provide another example, where the direct upload of information into the database makes it distinctive. Emulation can be used to preserve these databases. For example, Yale University is preserving databases, especially SQL databases for websites, using EaaSI. There are technical challenges, but the databases can be preserved; the issues found are often around access to data and workforce development of the technical skills needed to undertake preservation actions. There is a risk, however, that some of the databases cannot be exposed to the web because they would not survive there for long, and/or cannot be made available as they were intended to be used. See also:
|
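One of the risks named in this entry is data that is locked in because it cannot be exported (e.g. as CSV). As a rough illustration only, the sketch below shows what a minimal export option might look like, assuming the project database is a SQLite file. That is an assumption made for the example: many project websites use MySQL, PostgreSQL or a bespoke system, where the connection and catalogue queries would differ. The function name `export_tables_to_csv` is hypothetical and is not part of any tool mentioned above.

```python
import csv
import sqlite3
from pathlib import Path


def export_tables_to_csv(conn: sqlite3.Connection, out_dir: Path) -> dict:
    """Write every user table to its own CSV file (hypothetical helper).

    Assumes a SQLite database. Returns a mapping of table name to the
    number of data rows written.
    """
    out_dir.mkdir(parents=True, exist_ok=True)

    # sqlite_master lists all schema objects; skip SQLite's internal tables.
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    tables = [name for name in names if not name.startswith("sqlite_")]

    counts = {}
    for table in tables:
        # Quote the identifier so unusual table names do not break the query.
        quoted = '"' + table.replace('"', '""') + '"'
        cursor = conn.execute(f"SELECT * FROM {quoted}")
        header = [column[0] for column in cursor.description]

        path = out_dir / f"{table}.csv"
        with open(path, "w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            writer.writerow(header)  # column names as the first row
            written = 0
            for record in cursor:
                writer.writerow(record)
                written += 1
        counts[table] = written
    return counts
```

A flat CSV export of this kind keeps the values but not the relationships, constraints or application behaviour, so it would complement rather than replace documentation, deposit in a repository, or emulation of the original system.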