Maritime Archaeological Archives
A collection of digital records from maritime archaeological work, including photographs, maps and plans, field notebooks, post-excavation finds analysis and other analytical records. |
||
Group: Museum Data |
Trend: New Entry |
Consensus Decision |
Added to List: New Entry |
Last update: New Entry |
Previous category: New Entry |
Imminence of Action Action is recommended within 12 months, detailed assessment is now a priority |
Significance of Loss The loss of tools, data or services within this group would impact on people and sectors around the world. |
Effort to Preserve It would require a major effort to prevent losses in this group, such as the development of new preservation tools or techniques. |
Examples Records of excavations in marine environments which may fall outside the jurisdiction of terrestrial heritage services. |
||
‘Practically Extinct’ in the Presence of Aggravating Conditions Poor documentation; lack of preservation mandate; dependence on proprietary and non-standard data types |
||
‘Endangered’ in the Presence of Good Practice Preservation planning from the outset; subject specialist repository; user community |
||
2019 Review This is a new entry taken from the open submission process in 2019. It is grouped with Museum data sets as archaeological archives typically make their way to museums, but it is also closely aligned to research data. |
||
Additional Jury Comments There are trusted custodians of this data, such as ADS, DANS or the British Museum, as well as in oceanographic research agencies, but it is perhaps hard to integrate good practice at an international scale. The real challenge, therefore, is in identifying and sustaining a custodian, as other bodies have experience with this data. The proliferation of innovative data recording technologies also implies likely problems of format dependence and documentation. |
Data Posted to Defunct or Little-used Social Media Platforms
Older or less widely used social media platforms to which content has been uploaded but for which no guarantees have been made about the long term. |
||
Digital Species: Social Media |
Trend in 2023: No Change |
Consensus Decision |
Added to List: 2019 |
Trend in 2024: No Change |
Previously: Critically Endangered |
Imminence of Action Immediate action necessary. Where detected should be stabilized and reported as a matter of urgency. |
Significance of Loss The loss of tools, data or services within this group would impact on many people and sectors. |
Effort to Preserve | Inevitability Loss seems inevitable. Loss has already occurred or is expected to occur before tools or techniques develop. |
Examples Bebo, MySpace, Google Buzz, Parler and others. |
||
‘Practically Extinct’ in the Presence of Aggravating Conditions Closure of platform; lack of offline equivalent; lack of export functionality; no preservation undertaking from service provider; unstable business plan from service provider; uncertainty over IPR or the presence of orphaned works. |
||
‘Endangered’ in the Presence of Good Practice Offline Replication; clear notice periods and alerts; committed ongoing maintenance of service. |
||
2023 Review This entry was nominated in 2017 and added to the Bit List following the 2019 assessment to highlight the distinct threats faced by users who uploaded content to defunct or little-used social media platforms, and by those attempting to preserve that content. Because these services are older, the need to act is more urgent than for others. Often, their significance is only recognized once they are lost. The 2021 Jury noted a trend towards greater risk due to the existing risks of defunct or little-used platforms, and recognized the need to develop tools or techniques that could be applied to other platforms that may follow the same path. The 2022 Taskforce agreed these risks remain on the same basis as before (no change to the trend). The 2023 Council agreed with the Critically Endangered classification. They noted an increase in imminence and effort to preserve, recognizing that while the need for major efforts to prevent or reduce losses continues on the same basis as before, it is now much more likely that material has already been lost and that losses will continue before tools or techniques have been developed. Therefore, immediate action is necessary. They also recommended that the next major review for the Bit List consider merging this entry with the ‘Consumer Social Media Free at the Point of Use’ entry to provide examples of loss prompted by aggravating conditions. |
||
2024 Interim Review The 2024 Council agreed these risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend). They recommend combining this entry with ‘Consumer Social Media Free at the Point of Use’ by illustrating how social media is Practically Extinct in the presence of aggravating conditions, such as sudden loss or loss with little notice, citing examples such as GeoCities and MySpace. They also recommend that the 2025 review address recommendations about scoping or combining with ‘Consumer Social Media Free at the Point of Use’ and moving ‘Defunct’ social media into that entry as examples of extinct platforms. In addition, ‘Lesser Known’ platforms might be re-scoped as ‘Local or Small Community’ platforms, such as BBSs or blogs maintained by a community, or national platforms like Skyblog (which has been successfully archived by the Bibliothèque nationale de France). These smaller platforms may be at a lower risk of extinction due to fewer political or corporate entanglements and an invested user community that maintains them and can support archiving. |
||
Additional Comments The risk to this content depends on the specific service or platform. Content on older platforms (Bebo, MySpace) is at higher risk of loss than content on current platforms, and is likely already lost. However, social media was not used to the same extent (nor as widely by government, corporations, research institutions, etc.) in the early 2000s and 2010s when these platforms were popular, which reduces the impact slightly. When looking at the digital preservation landscape and where we need to apply effort as well as resources, defunct early social media spaces are not high on the list; but when considering how contemporary social media channels could become defunct, it becomes a different conversation because of how intrinsically tied they are to political discourse and influencing political opinion. It is to be hoped that some of these have been archived via traditional web archiving, so that remnants of these sites can be found in bits and pieces in various web archives, but it may be too late to save some of the content. If some of this is still available, there may be hope in trying to preserve it, but it may be difficult if the platforms are not willing to share data or work with preservationists. ArchiveTeam has stepped in here too. There is undoubtedly a story here that could be used as a call to arms to raise awareness about the preservation of current social media platforms too. |
Digital Archives of Community Groups
Digital materials including ephemera, correspondence and campaign materials created as a by-product of small scale or ad-hoc community action groups. |
||
Digital Species: Community Archives |
Trend in 2023: No Change |
Consensus Decision |
Added to List: 2019 |
Trend in 2024: No Change |
Previously: Critically Endangered |
Imminence of Action Action is recommended within three years, detailed assessment within one year. |
Significance of Loss The loss of tools, data or services within this group would impact on people and sectors around the world. |
Effort to Preserve | Inevitability It would require a major effort to prevent or reduce losses in this group, possibly requiring the development of new preservation tools or techniques. |
Examples Archives of smaller and ad-hoc political and campaigning organizations; environmental protests; sports clubs; smaller religious groups; amateur music or drama; fan groups. |
||
‘Practically Extinct’ in the Presence of Aggravating Conditions Poor documentation; lack of replication; lack of continuity funding; lack of residual mechanism; dependence on a small number of volunteers; lack of preservation mandate; lack of preservation thinking at the outset; conflation of backup with preservation; conflation of access and preservation; inaccessible to web archiving; dependence on social media providers; distrust of ‘official’ agencies; uncertainty over IPR or the presence of orphaned works. |
||
‘Endangered’ in the Presence of Good Practice Residual archive with residual funding able to receive and support collections; active user community; intellectual property managed to enable preservation. |
||
2023 Review The Jury created this entry in 2019 as a subset of ‘Community Archives and Community-Generated Content’, which was split into two entries to provide greater specificity in recommendations for approaching the preservation of digital materials created as a by-product of small scale or ad-hoc community action groups (versus digital materials generated for the significant purpose of a community initiative). The 2020 Jury identified a trend towards greater risk because community groups such as sports clubs, religious communities, arts and political groups, which often rely on volunteer effort, were unable to meet for extended periods in 2020. Moreover, the local community centres, clubs or places of worship on which they depend had closed, in some cases for good. This trend continued in 2021; the 2021 Jury commented that much of the content in community archives is easily preservable, but resources are not directed towards it, basic digital preservation practices are not well embedded amongst the general population, and selective approaches are needed to get a handle on the situation and to find the resources to do the work. The 2023 Council agreed with the classification of Critically Endangered, with the overall risks remaining on the same basis as before (‘No change’ to the 2023 trend). However, they also noted an increase in the significance of loss because community heritage tends to be part of wider conversations within the international landscape. |
||
2024 Interim Review The 2024 Council agreed these risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend). |
||
Additional Comments Typically, born-digital material is more at risk as community groups may not know about the risk of loss. Many are unaware of digital preservation terminology. It is the ad-hoc nature of these groups and projects which is of great concern. There is a significant need to raise awareness and provide a ‘home’, but also to do so with sufficient sensitivity to ensure community groups remain in control of their own material. Communities in rural and remote areas may lack access to services such as broadband connectivity, a well-reported issue often referred to as the ‘digital divide’. Inadequate internet connectivity diminishes the capacity of these communities to access digital preservation solutions, such as cloud storage for digital assets. This is especially relevant to personal photos and videos on mobile phones, as possession of a mobile phone does not necessarily mean the user has adequate internet connectivity to upload videos to web-based platforms. AI could potentially be used to provide easy access to simple, succinct explanations of the principles of digital preservation and archiving solutions, which would give these communities a wider understanding of the work being done and empower them to carry out basic digital preservation themselves. |
Digital Evidence and Records of Investigation Prior to Court
Digital materials assessed by police and other authorities in the course of investigation and retained as evidence of due process such as case files and correspondence, including materials not submitted to court. |
||
Digital Species: Legal Data |
Trend in 2023: No Change |
Consensus Decision |
Added to List: 2019 |
Trend in 2024: No Change |
Previously: Critically Endangered |
Imminence of Action Action is recommended within twelve months, detailed assessment is a priority. |
Significance of Loss The loss of tools, data or services within this group would impact on people and sectors around the world. |
Effort to Preserve | Inevitability It would require a major effort to prevent losses in this group, such as the development of new preservation tools or techniques. |
Examples CCTV; email; 3D scanning; social media interactions; police records; court records; text messages. |
||
‘Practically Extinct’ in the Presence of Aggravating Conditions Poor chain of custody; fragile or obsolete media; dependence on proprietary formats or products; lack or loss of documentation; inaccessible to web harvesting technologies; lack of version control; lack of integrity checks or integrity records; uncertainty over IPR or the presence of orphaned works. |
||
‘Endangered’ in the Presence of Good Practice Meticulous transfer and disclosure processes. |
||
2023 Review This entry was added in 2019, derived from an entry made in 2017 for ‘Digital Legal Records and Evidence’, which the Jury split into four more discrete entries. This category includes material gathered prior to court that may form part of an investigation or gathering of evidence but which is not formally submitted as evidence. It recognizes that police and other investigating authorities are not limited in the types of evidence that they need to administer, but that this creates an almost unbounded set of preservation requirements to ensure authenticity and admissibility. A 2021 risk was identified based on examples that call into question whether legal bodies have the skills and capabilities to preserve these materials should they need them, for instance if a case is reopened. The 2022 Taskforce found no significant trend towards greater or reduced risk. The 2023 Council agreed with the Critically Endangered classification, with the overall risks remaining on the same basis as before (‘No change’ to trend). |
||
2024 Interim Review These risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend). |
||
Additional Comments In the realm of international organizations, more and more of these investigative missions are being set up. They are collecting huge volumes of data, and the same issues around chain of custody and integrity records or checks continue to be aggravating, especially with respect to authenticity and admissibility. Given the potentially huge volumes of data, and the drive to keep costs low, it is debatable whether there will be sustained funding for preservation. Case files and correspondence are one thing: retention of these should be clear but may differ widely between jurisdictions and levels of government. If retention is not long-term or permanent, the risk of loss may not be so critical. Retention of ‘unused’ or ‘potential’ evidence is likely a different matter altogether. Is it even a record? Certainly, it is not a record of the court. Should it be returned to the suspect or accused? Are their rights being considered here, not just in terms of preservation but also simply disposition? There may be legal and ethical issues around this that need to be fleshed out in conjunction with assessing its preservation risk. Police forces tend only to have the resources to maintain forensic capability with relatively recent technology; for older technology, institutions and specialist companies are the only sources of expertise. This has an impact on cold cases. There have been many examples of convictions being overturned when previously unused evidence was brought to light. Therefore, the retention and preservation of unused evidence can have immense value. |
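As a minimal illustration of the integrity records mentioned above, the sketch below records a SHA-256 checksum for each file in a folder and later re-checks the files against that record. It is an illustrative example under stated assumptions, not a description of any agency's practice: the function names (`build_manifest`, `verify_manifest`) and the folder layout are hypothetical, and a real chain of custody would also need to record who handled the material and when.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(folder: Path) -> dict[str, str]:
    """Map each file's path (relative to folder) to its checksum."""
    return {
        p.relative_to(folder).as_posix(): sha256_of(p)
        for p in sorted(folder.rglob("*"))
        if p.is_file()
    }


def verify_manifest(folder: Path, manifest: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or whose content changed."""
    problems = []
    for name, expected in manifest.items():
        target = folder / name
        if not target.is_file() or sha256_of(target) != expected:
            problems.append(name)
    return problems
```

An empty list from `verify_manifest` indicates that every recorded file is still present and unchanged; any other result names the files that need investigation.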
Evidence in Court
Digital materials presented in court as evidence or documents such as rulings and proceedings generated through legal proceedings |
||
Digital Species: Legal Data |
Trend in 2023: No Change |
Consensus Decision |
Added to List: 2017 |
Trend in 2024: No Change |
Previously: Critically Endangered |
Imminence of Action Action is recommended within three years, detailed assessment within one year. |
Significance of Loss The loss of tools, data or services within this group would impact on people and sectors around the world. |
Effort to Preserve | Inevitability It would require a major effort to address losses in this group, possibly requiring the development of new preservation tools or techniques. |
Examples Evidence submitted to courts of all kinds, including text messages, photography, CCTV, email, 3D and 2D scanning, scientific reports and analyses, documents and websites. |
||
‘Practically Extinct’ in the Presence of Aggravating Conditions Loss of context; loss of integrity; external dependencies; poor storage; lack of understanding; churn of staff; significant volume or diversity of data; poorly developed specifications; ill-informed records management; poorly developed transfer protocols; poorly developed migration or normalization; longstanding protocols or procedures that apply unsuitable paper processes to digital materials; uncertainty over IPR or the presence of orphaned works. |
||
‘Endangered’ in the Presence of Good Practice Well-managed data infrastructure; preservation enabled at ingest; carefully managed authenticity; use of persistent identifiers; finding aids; well-managed records management processes; recognition of preservation requirements at highest levels; strategic investment in digital preservation; preservation roadmap; participation in digital preservation community. |
||
2023 Review This entry is a subset of an entry made in 2019 titled ‘Proceedings and Evidence in Court’, which was itself created as a subset of an entry made in 2017 for ‘Digital Legal Records and Evidence’. The 2021 Jury split ‘Proceedings and Evidence in Court’ into two more discrete entries to highlight their distinct preservation challenges and risk profiles. This entry covers material that has been presented as evidence in court. It was given a Critically Endangered classification to highlight its higher risk profile and to emphasize that courts are not limited in the types of evidence that they can admit but that they have a responsibility to provide robust preservation that ensures the authenticity of their records and evidence. The 2022 Taskforce found no significant trend towards greater or reduced risk. The 2023 Council agreed with the Critically Endangered classification, with the overall risks remaining on the same basis as before (‘No change’ to trend). They emphasized that organizations holding these materials should have identified preservation actions established in their workplan (for digital evidence of investigation prior to court) to put into practice within the next three years. |
||
2024 Interim Review These risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend). |
||
Additional Comments Temporary courts are continuing to gradually close, and decisions about preservation and management of their archives are being made hurriedly and at the last minute. Some of the decisions are placing materials at high risk due to: materials being split across many locations, including to entities with no capacity or capability to preserve them; a seeming lack of understanding that preservation and management of the archives has no completion date; an unwillingness to invest in preservation, or a drive to keep costs low, which has negative implications for preservation; and hurried choices on preservation measures which do not allow for proper testing of approaches to safeguard authenticity and legal admissibility (e.g. extracting digital data from complex systems in formats that potentially cannot then be restored). Standard records management processes within designated agencies should be able to take care of the preservation of materials like this, but given that it is likely to involve complex types of data, such agencies may not be equipped to deliver preservation effectively. It is surprising that courts are not more prominent in the digital preservation community, where solutions now exist. Case Studies or Examples:
More concrete examples would be welcome. It is the evidentiary value of submissions to court that may be lost, and therefore the veracity of the decision could be questioned. Evidence submitted in digital form is at greater risk (e.g., a video file submitted on a CD in the 90s) than records of the proceedings themselves (e.g., transcripts). |
Legacy Research Web Collections
Research related collections of digital content on the web which are now outdated and/or no longer actively maintained. This can include software and published or unpublished source code. |
||
Digital Species: Web, Research Outputs |
Trend in 2023: No Change |
Consensus Decision |
Added to List: 2019 |
Trend in 2024: No Change |
Previously: Critically Endangered |
Imminence of Action Action is recommended within twelve months, detailed assessment is a priority. |
Significance of Loss The loss of tools, data or services within this group would impact on people and sectors around the world. |
Effort to Preserve | Inevitability Loss seems likely. By the time tools or techniques have been developed, the material will likely have been lost. |
Examples Academic and institutional websites from the first decade of the web containing details of research projects and interests as well as research data. |
||
‘Practically Extinct’ in the Presence of Aggravating Conditions Inaccessible to web archive; bespoke code; insufficient documentation; uncertainty over IPR or the presence of orphaned works. |
||
‘Endangered’ in the Presence of Good Practice Secured by web archive; documentation and rights information published alongside material. |
||
2023 Review This entry was added in 2019. While there are overlaps with the ‘Semi-Published Research Data’ and ‘Unpublished Research Data’ entries, it is a separate entry to distinguish between ‘current’ and ‘legacy’ collections with different risk profiles. In 2020, the fact that materials in legacy web collections were no longer actively maintained increased the risk classification to Critically Endangered. The 2021 Jury agreed with these distinctions, adding that loss has already occurred and that future loss can be prevented through approaches such as web archiving and code preservation. They identified a 2021 trend towards greater risk based on noted security issues posed by hosting legacy technology, software and services, which prompted the imminent disposal of content without adequate review or selection. The 2022 Taskforce agreed with this assessment, noting no change to the trend (it remained on the same basis as before). The 2023 Council agreed with the Critically Endangered classification, with risks remaining on the same basis as before (‘No change’ to trend), but also noted a greater inevitability of loss compared to previous reviews. Additionally, the Council recommended that a nomination received for an entry on unpublished digital indices and transcriptions in the DIMEV Open-Access Digital Edition of the Index of Middle English Verse would serve better as a valuable example for this entry than as a new, standalone entry. The 2023 Council additionally recommended that the next major review consider rescoping the entry, possibly splitting it into separate areas to assess different levels of risk relating to published and unpublished source code in legacy research web collections. |
||
2024 Interim Review These risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend). |
||
Additional Comments These collections are valuable but lose funding and care as institutions re-configure their tasks and individuals retreat from tasks due to retirement or (as volunteers) to old age. There is an endless number of legacy research web resources out there that people don’t know about. This is not necessarily a technical challenge but a resource challenge. The Internet Archive and other national web archiving bodies have copies of a lot of websites that would fit into this category, but by no means all. There is also a distinction between the software or code used to deliver the user experience and the data. Such code is secondary to the content. This issue can be intensified by legacy IT infrastructure in cases where much of the content is hosted there, as security concerns may lead to imminent disposal of content. In these scenarios, the imminence of action becomes more urgent given the security issues posed by hosting legacy technology, software and so on. |
Media Inside Paper Files
Media inside paper files have occurred in records since the 1980s and will continue to do so for many years. |
||
Digital Species: Portable Media |
Trend in 2023: No Change |
Consensus Decision |
Added to List: 2019 |
Trend in 2024: No Change |
Previously: Critically Endangered |
Imminence of Action Action is recommended within three years, detailed assessment within one year. |
Significance of Loss The loss of tools, data or services within this group would impact on people and sectors around the world. |
Effort to Preserve | Inevitability It would require a major effort to prevent or reduce losses in this group, possibly requiring the development of new preservation tools or techniques. |
Examples Digital media mixed with paper files in records offices and filing cabinets of almost every kind of enterprise. |
||
‘Practically Extinct’ in the Presence of Aggravating Conditions Unsustainable effort to assess; exotic or obsolete media; poor storage; lack of descriptive labelling; uncertainty over IPR or the presence of orphaned works. |
||
‘Endangered’ in the Presence of Good Practice Carefully labelled; managed programme of assessment and retrieval; robust media used. |
||
2023 Review This entry was added in 2019 to report the significant amounts of digital media being transferred to archives folded into traditional files. The 2019 Jury noted that it is relatively simple to preserve this material once identified using standard tools, but that it can be an ‘unknown unknown’ and that assessment can seem overwhelming; it may therefore overlap with other portable media risks but has a higher risk classification. The 2021 Jury agreed on a 2021 trend towards greater risk due to the increased time sensitivity and the need to conduct collection audits as soon as possible, in order to determine what is held and then work out a plan for opening carriers, assessing files, and extracting them if significant. The 2022 Taskforce agreed, with risks on the same basis as before (‘No change’ to trend). The 2023 Council agreed with the risk classification of Critically Endangered, with the overall risks remaining on the same basis as before (‘No change’ to trend). |
||
2024 Interim Review These risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend). |
||
Additional Comments This is highly dependent on who is looking after the portable formats. There are good examples in libraries, where disks stored at the back of books or in front of magazines can be processed at the point of acquisition. In archives, however, dealing with bit-level preservation of external media (often on legacy formats) is largely an unquantified problem, and so resource commitments will not be in place. So, there are methods and tools but simply no time committed and no proper assessment either. In other agencies, the issue will not have even been considered, and for them, it will be much harder over time, with some inevitable loss. |
Non-current Hard Disk Technologies
Materials saved to storage devices with a variety of underlying magnetic or solid-state technologies that are hardwired into a computer that is no longer under warranty or supported: typically, hard disks more than five years old. |
||
Digital Species: Integrated Storage |
Trend in 2023: No Change |
Consensus Decision |
Added to List: 2019 |
Trend in 2024: No Change |
Previously: Critically Endangered |
Imminence of Action Action is recommended within three years, detailed assessment within one year. |
Significance of Loss The loss of tools, data or services within this group would impact on people and sectors around the world. |
Effort to Preserve | Inevitability Loss seems inevitable: loss has already occurred or is expected to occur before tools or techniques develop. |
Examples Disks installed into computers or servers that are more than five years old, or out of warranty. |
||
‘Practically Extinct’ in the Presence of Aggravating Conditions Lack of replication; poor storage; non-standard connections or controllers; aggressive compression; encryption; uncertainty over IPR or the presence of orphaned works. |
||
‘Endangered’ in the Presence of Good Practice Maintenance schedule; renewable extendable warranty; best practice storage and operation; replication. |
||
2023 Review This entry was added in 2019 to ensure that the range of media storage is properly assessed and presented. The lifecycles of most consumer hard disk technology are relatively stable in comparison to portable devices because the disks are integrated into systems and therefore inherit the lifecycle and replacement of the entire system. This is less true at scale, however, where disks are used in storage arrays and refreshment is more loosely tied to the server architecture. Storage at scale also means the likelihood of encountering a disk failure increases, and this likelihood of failure led to the 2021 Jury’s noted trend towards greater risk. The entry was reviewed in 2022 with no noted change towards even greater or reduced risk. The 2023 Council agreed with the current Critically Endangered classification, with overall risks remaining on the same basis as before (‘No change’ to trend), while also noting a greater inevitability of loss from the discontinuation of support and development for these storage technologies when compared to the 2021 Jury review. |
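The effect of scale on disk failure can be illustrated with simple arithmetic. The sketch below assumes, purely for illustration, that each disk fails independently with the same annual probability; real failure rates vary with model, age and workload, and the 2% figure is a hypothetical input rather than a measured value. The function name is likewise an assumption made for this example.

```python
def prob_at_least_one_failure(disks: int, annual_failure_rate: float) -> float:
    """Probability that at least one of `disks` independent disks fails in a year."""
    if disks < 0 or not 0.0 <= annual_failure_rate <= 1.0:
        raise ValueError("disks must be >= 0 and the rate must lie in [0, 1]")
    # P(no disk fails) = (1 - rate) ** disks, so take the complement.
    return 1.0 - (1.0 - annual_failure_rate) ** disks


if __name__ == "__main__":
    # Hypothetical 2% annual failure rate per disk.
    for count in (1, 10, 100):
        print(count, round(prob_at_least_one_failure(count, 0.02), 3))
```

Under these assumed inputs, the chance of seeing at least one failure in a year rises from 2% for a single disk to roughly 18% for ten disks and roughly 87% for a hundred.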
||
2024 Interim Review The 2024 Council agreed these risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend). |
||
Additional Comments A lot of early PCIe flash devices (e.g. Fusion-io) used proprietary drivers before the NVMe standard was developed, and are now losing support. Intel has stopped development of Optane non-volatile RAM, some of which required specific CPU support to access, although that form was usually used for data caching rather than storage. Accessing drives with pre-SATA interfaces is increasingly difficult since interface cards and OS support can be hard to come by. The greater density of newer disks, as well as encryption and compression, means they can be more fragile than older, less dense disks with less sophisticated read/write technologies. The age of a disk is not the best or only indicator of its reliability. |
Unpublished Research Data from Government Researchers
Data sets and research outputs produced in the course of government research but never shared or made available outside of the initial research. In particular, the risk classification applies to research data under government embargo, restrictions due to sensitivities, classification issues, and/or materials suppressed for ideological reasons. |
||
Digital Species: Research Outputs |
Trend in 2023: No Change |
Consensus Decision |
Added to List: 2019 |
Trend in 2024: No Change |
Previously: Critically Endangered |
Imminence of Action Action is recommended within twelve months; detailed assessment is a priority. |
Significance of Loss The loss of tools, data or services within this group would impact on many people and sectors. |
Effort to Preserve | Inevitability Loss seems likely: by the time tools or techniques have been developed the material will likely have been lost. |
Examples Data sets or research outputs produced for agencies that have closed or have had funding withdrawn from research initiatives; research data from government agencies no longer active. |
||
‘Practically Extinct’ in the Presence of Aggravating Conditions Lack of access to archival services; sudden or unanticipated closure; loss of implicit knowledge from destabilized or demoralized staff; encryption; uncertainty over IPR or the presence of orphaned works. |
||
‘Endangered’ in the Presence of Good Practice Archival responsibility well developed; documentation; published through research channels. |
||
2023 Review This entry was added in 2019 under ‘Unpublished Research Data from US Government Researchers’. It has significant overlaps with other entries in the research outputs group but was set as a standalone entry to draw attention to two realities: 1. research outputs are not simply a matter for academic institutions, and government is, in fact, a major producer of research data; and 2. political instability and threats to the continuity of government services are a significant preservation risk. The 2019 Jury noted that while the entry specifically related to the US government context, this did not mean that other jurisdictions are immune from political instability, and commented that politically inconvenient research outputs face particular and immediate threats of which the digital preservation community should be cognizant. No 2020 trend towards increased or decreased risk was identified. The 2021 Jury agreed with concerns raised by the 2019 assessment but recommended that the broader applicability should be more explicit. The entry title and description were changed to include governments across national and international contexts. The 2021 Jury added that, with these changes, the risk profile will vary depending on the political system, the political change, and the measures in place to save and reuse data from disbanded research projects; in other words, there may be instances where the unpublished research data in one country falls under the Vulnerable classification. The 2022 Taskforce agreed with this 2021 assessment with no change to trend. The 2023 Council agreed with the Critically Endangered classification with risks on the same basis as before (‘No change’ to trend) and recommended inviting additional review from an expert in this area for the next review. A further recommendation was made to consider whether it should remain an individual entry or instead be incorporated as an example under the ‘Unpublished Research Data’ entry. |
||
2024 Interim Review These risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend). |
||
Additional Comments The US made the news on this issue under the last government, but it is probably an issue in other countries as well, so the category could be made more generic. One question to ask is whether the research data is considered to be of long-term value or ephemeral. |
Web Domains with no Legal Deposit
This entry concerns the preservation of websites and domains that fall outside the remit of legal deposit (or where no legal deposit mandate exists). Web archiving is able to capture large quantities of material with routine and standards-based tools, but there are significant intellectual property rights issues associated with website capture and republication. In many jurisdictions, but by no means all, those obstacles are overcome by regulations that enable a national library or other ‘legal deposit’ agency to copy and preserve content. Where no such permission exists, there is a significant risk of loss. |
||
Digital Species: Web |
Trend in 2023: No Change |
Consensus Decision |
Added to List: 2019 |
Trend in 2024: No Change |
Previously: Critically Endangered |
Imminence of Action Immediate action necessary. Where detected should be stabilized and reported as a matter of urgency. |
Significance of Loss The loss of tools, data or services within this group would impact on many people and sectors. |
Effort to Preserve | Inevitability Loss seems inevitable: loss has already occurred or is expected to occur before tools or techniques develop. |
Examples Domains registered without a country code; domains with a country code but weak or unenforceable legal deposit permission to harvest. |
||
‘Practically Extinct’ in the Presence of Aggravating Conditions Uncertainty over IPR or the presence of orphaned works; lack of legal deposit mandate or remit; rapid churn of websites; lack of access to Internet Archive harvest; contentious content; encryption; digital rights management; non-standard content management. |
||
‘Endangered’ in the Presence of Good Practice Permissive approach to legal deposit; legislation to support and/or manage associated risks. |
||
2023 Review This entry was added in 2019. It is characterized by regulatory barriers rather than technical ones, though the pace of change in web technologies, as well as the growth of web content, means that significant technical challenges still exist. The 2019 Jury noted that local conditions were also a significant factor; for example, websites often also fall under public records legislation or are important elements of corporate records, and so important parts of the web are harvested even when there is no explicit legal deposit legislation. The 2019 Jury particularly recognized the work of the Internet Archive to capture and preserve content. They noted significant gaps in web archiving and, in too many cases, identified regulation as the barrier. The 2021 Jury agreed with this description and classification but added that in some limited instances, pywb tools (as opposed to automated web crawlers like Heritrix) could effectively capture the look and feel of a platform interface, preserving legacy versions for users to interact with in the future. However, pywb tools are manual and, therefore, cannot address the scale of the issue. They also do not capture interfaces in a way that makes it possible to recreate them in the future, only to interact with a defined set of web pages. Because of this growing issue of scale, the 2021 trend was towards greater risk. The 2022 Taskforce agreed and noted no change to the trend. The 2023 Council agreed with the Critically Endangered classification. They also noted an increase in the imminence and inevitability of loss, recognizing that while the need for major efforts to prevent or reduce losses continues, it is much more likely that material has already been lost, and will continue to be lost, by the time tools or techniques have been developed. While the Council agreed the entry description should be updated to reflect these areas of discussion, overall risks remain on the same basis as before (‘No change’ to trend). |
||
2024 Interim Review These risks remain on the same basis as before, with no significant trend towards even greater or reduced risk (‘No change’ to trend). Council members also added that the presence of a clear IPR framework for preservation is enabling, whether it is through legislation (like legal deposit) or licensing. |
||
Additional Comments There is a significant risk not only of loss of the content but also of loss of access. Unless the Internet Archive is picking these sites up or permission regimes are in place, early instances of the web are gone forever, and losses will continue.
|