Semi-Published Research Data
Data sets produced in the course of research and shared between researchers, for example by posting to a website or portal, but without preservation capability or commitment. Typically the data remains in the hands of the researchers, who have the job of maintaining it. |
||
Digital Species: Research Outputs |
Trend in 2023: No Change |
Consensus Decision |
Added to List: 2019 |
Trend in 2024: No Change |
Previously: Endangered |
Imminence of Action: Action is recommended within three years, detailed assessment within one year. |
Significance of Loss: The loss of tools, data or services within this group would impact on people and sectors around the world. |
Effort to Preserve: It would require a major effort to prevent or reduce losses in this group, possibly requiring the development of new preservation tools or techniques. |
Examples: Departmental web servers; project wikis; GitHub repositories. |
||
‘Critically Endangered’ in the Presence of Aggravating Conditions: Originating researcher no longer active or has changed research focus; staff on temporary contracts; dependence on a single student or staff member; weak or fluid institutional commitment to subject matter; weak institutional commitment to data sharing; uncertainty over IPR or the presence of orphaned works; encryption; limited or dysfunctional data management planning; web capture challenges that make the data unlikely to be picked up by automatic crawlers. |
||
Vulnerable in the Presence of Good Practice: Data in preparation for transfer to specialist repository; robust data management planning; documented and managed professionally using data stewards. |
||
2023 Review: This 2019 entry was previously introduced in 2017 under ‘Research Data,’ though without explicit reference to semi-published research data. The 2019 Jury split the ‘Research Data’ entry into a range of contexts for research outputs, including this addition. The entry draws attention to ‘self-help’ data sharing, which is to be encouraged as a means to facilitate open science but should not be confused with long-term preservation. The 2021 Jury agreed with the Endangered classification, noting problems with the volume of data being produced but not being kept in a meaningful way. They noted that research data is complex and has specific requirements for documentation that may only be known to subject matter experts; however, data creators (e.g., researchers) are not necessarily well placed to sustain the data in the long term. There were also a few significant changes to the entry in the 2021 Bit List.
The 2022 Taskforce agreed on a trend towards reduced risk, based on material improvement over the last year. That improvement not only offered examples of good research data management and preservation practices but also suggested a significant shift towards a culture of change and collaboration across different research communities and stakeholders. Those mentioned included (but were not limited to) improvements and initiatives by the European Open Science Cloud (EOSC), Science Europe, Research Data Alliance (RDA), Digital Curation Centre (DCC) and related projects on the preservation of research data and outputs. The 2023 Council agreed with the Endangered classification and that risk remained on the same basis as before (‘No change’ to trend). |
||
2024 Interim Review: These risks remain on the same basis as before, with no significant trend towards either greater or reduced risk (‘No change’ to trend). |
||
Additional Comments: There is a positive trend of increased research data management activity and engagement by libraries and data centres, which should help to ensure that more research datasets are properly deposited in data repositories, rather than left in a ‘semi-published’ state. Minting Digital Object Identifiers for datasets deposited at specialist repositories, and offering them to researchers, will encourage data citation and increase the research impact of individual researchers, which has traditionally relied more on publishing papers than datasets.
|