Custom Online Databases

Endangered

Data collected, presented and disseminated in custom online databases that is not stored elsewhere, particularly data that is at risk because it is locked in the database with no export or harvest options available.

Digital Species: Databases, Research Outputs, Web

Trend in 2024:

No change

Consensus Decision

Added to List: 2023

New Entry

Imminence of Action

Action is recommended within three years, detailed assessment within one year

Significance of Loss

The loss of tools, data or services within this group would impact on different people and sectors.

Effort to Preserve | Inevitability

It would require a small effort to preserve materials in this group, requiring the application of proven tools and techniques.

Examples

Custom databases created for project websites for research or citizen science.

‘Critically Endangered’ in the Presence of Aggravating Conditions

Lack of export options; lack of system maintenance; expired domain; lack of technical knowledge and skills; limited or dysfunctional data management planning; web capture challenges that mean the database is unlikely to be picked up by automatic crawlers; uncertainty over IPR or the presence of orphaned works.

‘Vulnerable’ in the Presence of Good Practice

Backup and documentation; preservation capability in designated repository; use of open formats and open source or other licensing that enables preservation; enabled export options; robust data management planning; documented and managed professionally.

2023 Review

This was a new Bit List entry nominated and approved by the 2023 Council to draw attention to the particular challenges of preserving custom online databases. The entry focuses on the distinct risks for online databases that cannot be captured with traditional web archiving tools. While there are challenges to preserving databases both offline and online, the entry was nominated in the context of projects that set up a custom online database to record, present, and disseminate collected data, where this data is not stored elsewhere (e.g. in a long-term digital archive) and is often locked in the database because no export or harvest options are available. Identified areas of risk for these online databases include: maintenance of the system is not ensured after the end of a project, and online databases disappear because of security issues or because the domain expires; not all data is open and, after the end of a project, no one is responsible for granting access; the data is not stored elsewhere (e.g. in a trusted repository); and the data is locked in and cannot be exported (e.g. as CSV) for further re-use.
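As an illustration of what an export option to CSV could look like, the following is a minimal sketch only. It assumes the project database is (or can be dumped to) an SQLite file, which this entry does not state; the function name export_tables_to_csv and the output layout of one CSV file per table are hypothetical choices made for the example.

```python
import csv
import sqlite3
from pathlib import Path


def export_tables_to_csv(conn: sqlite3.Connection, out_dir: str) -> list[str]:
    """Write every user table of an SQLite database to its own CSV file.

    Hypothetical helper for illustration; real project databases may use
    other engines and need their own export route.
    """
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)

    # sqlite_master lists schema objects; skip SQLite's internal tables.
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
            "ORDER BY name"
        )
    ]

    written = []
    for table in tables:
        # Quote the identifier so unusual table names do not break the query.
        quoted = '"' + table.replace('"', '""') + '"'
        cursor = conn.execute(f"SELECT * FROM {quoted}")
        path = target / f"{table}.csv"
        with open(path, "w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            # First row holds the column names taken from the query result.
            writer.writerow([col[0] for col in cursor.description])
            writer.writerows(cursor)
        written.append(str(path))
    return written
```

A call such as export_tables_to_csv(sqlite3.connect("project.db"), "export") would, under these assumptions, produce one open-format file per table that can be deposited in a repository alongside documentation. The CSV files alone do not capture relationships between tables, so the schema should be documented separately.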

The nomination of the entry also highlighted a gap in the Bit List for databases more broadly. The 2023 Council agreed that a new higher-level Databases digital species group should be created to address this gap, inviting nominations for other database-related entries to be considered for the next major revision of the Bit List.

2024 Interim Review

These risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend).

Additional Comments

Preservation is highly dependent on the software used but, regardless of the software, a database starts to become vulnerable once the project has reached its end.

Often, the online databases are of interest to a small group of people within a specific sub-discipline, e.g. archaeologists specialising in cuneiform tablets. But the material itself is often invaluable for this group because of the great effort invested in compiling it.

Databases for citizen science provide another example, where the database is distinctive because information is uploaded directly into it.

Emulation can be used to preserve these databases. For example, Yale University is preserving databases, especially SQL databases for websites, using EaaSI. There are technical challenges, but the databases can be preserved; the issues found are often around access to data and workforce development of the technical skills needed to undertake preservation actions. There is a risk, however, that some of the databases cannot be exposed to the web as they have no survival time, and/or that they cannot be made available as they were intended to be used.
