Ndahambelela Hertha Iipinge is Associate Archivist at UNHCR, the UN Refugee Agency. She attended the IIPC Web Archiving Conference with support from the DPC Career Development Fund, which is funded by DPC Supporters.


Since August 2023, I have been working as an associate archivist in the digital preservation team at UNHCR, the UN Refugee Agency. I manage the UNHCR Web Archive, where my responsibilities include curating and capturing essential content. I work closely with UNHCR website owners, the social media team and the Records and Archives Section Web Archiving Working Group, notifying them about scheduled website crawls to ensure content selection and inclusion. The role is crucial in maintaining optimal performance and data integrity, and in ensuring that the process runs smoothly and without undue strain on servers.

I also work closely with UNHCR’s web archiving service provider to oversee the crawling of our websites and social media platforms, safeguarding the online legacy of UNHCR’s global efforts.

Why web archiving at UNHCR?

The Internet is one of the most important sources of public information in UNHCR, and its websites hold critical and remarkable content of operational and historical value for the organization and its partners in the near and long term. UNHCR’s Records and Archives Section has been capturing web content since 2015. It set up the Web Archive using ISO-compliant tools for web archiving, with metadata as the key to accessing the archived content.

Websites and social media sites do not endure over time. They are fragile and ephemeral in nature, owing to rapidly updated information, changes in business models, and rapidly changing software and hardware, and without intervention by archivists they would be lost. Social media companies have changed the way they work, and this has affected the way UNHCR captures social media content, including key accounts. For example, X (formerly Twitter) has changed the way its API interacts with other platforms. The change to X’s API requirements has affected the way UNHCR captures its social media accounts, and account owners are now required to connect their account to the archive via an activation email.

I appreciate the Digital Preservation Coalition (DPC) Career Development Fund grant, which afforded me the opportunity to attend and participate in the International Internet Preservation Consortium (IIPC) Web Archiving Conference (WAC) on 25-26 April 2024 in Paris, France. The Bibliothèque nationale de France (BnF) hosted the delegates at its breathtaking François-Mitterrand site.

 

Tips, tools, and case studies at the pre-conference workshop

The practical pre-conference workshop on 24 April was packed with insightful presentations for web archiving beginners, full of practical know-how. The first workshop, by Claire Newing, Ricardo Basilio, Lauren Baker and Kody Willis, was a Training of Trainers (ToT) session, with do-it-yourself training materials available on the IIPC website. The training materials were produced jointly by the IIPC Training Working Group and the DPC. They highlight the critical importance and value of web content to organisations, explain why web archiving matters, and support organisations that have started or are considering web archiving.

One of the many tips I learned at this conference is that there is valuable support available for organizations that have started or are planning to establish web archiving programmes. After 8 months of web archiving experience, I was excited to learn in a practical session how to capture a website using tools from Webrecorder (https://webrecorder.net/), which develops and supports open source web archiving tools such as ArchiveWeb.page (https://archiveweb.page/). The exercise was to record a web page and store it in your browser, create a new collection (a page you like), verify the capture by replaying your recording, export the content in the WACZ format, and upload WARC files.
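For readers curious about what such an export contains, here is a minimal sketch in Python. It assumes that a WACZ export is a ZIP package that keeps its WARC data under an archive/ folder, as the WACZ specification describes. The function names and the in-memory sample file are illustrative only and were not part of the workshop.

```python
import io
import zipfile


def list_warc_members(wacz_file):
    """Return the names of the WARC files stored inside a WACZ package."""
    with zipfile.ZipFile(wacz_file) as package:
        return sorted(
            name
            for name in package.namelist()
            # Assumption: WARC data sits under "archive/" in the package.
            if name.startswith("archive/")
            and name.endswith((".warc", ".warc.gz"))
        )


def build_sample_wacz():
    """Build a tiny in-memory stand-in for an exported WACZ file.

    The entries are empty placeholders, not a real capture.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr("datapackage.json", "{}")
        package.writestr("pages/pages.jsonl", "")
        package.writestr("archive/data.warc.gz", b"")
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # With a real export, pass the path to the .wacz file instead.
    print(list_warc_members(build_sample_wacz()))
```

Listing the contents this way is a quick check that an export actually holds WARC data before it is uploaded anywhere.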

Some takeaway points if you want to be a good trainer are: attitude (you need to be confident and open to your trainees’ priorities and questions); strategies (web archiving concepts and practical exercises: capture websites, store them as WARC, and play back by replaying the web archive); and tools and software (easy-to-use, step-by-step tools).

Melissa Wertheimer presented on a topic that affects most institutions, if not all: the selection and appraisal of collections, and what not to collect for a variety of reasons. I was introduced to constituent collection development criteria for the appraisal of collections. Four points stood out for me as to why organizations should appraise collections: to create long-term documentation; for transparency and accountability; to exercise responsible stewardship (of space, shelves and servers); and to take into consideration the availability of funds to maintain collections.

The workshop offered a practical exercise in selecting collections based on various valid reasons, from availability of resources (funds, staff) to informational value, intrinsic value, evidential value, authenticity, and archival nature. The last one, archival nature, probably resonates most strongly with me when deciding what content is and is not to be selected, because the work of UNHCR, the UN Refugee Agency (https://www.unhcr.org/), is driven by emergencies, and that requires rapid action by UNHCR archivists to capture the organisation’s key web content. This requires a strategy of refining your scope to ensure that key content of value is selected and captured. My view is that work context, content, the time sensitivity of content usage, and how fast you need to get the information to the user are at the core of these decisions.

 

IIPC Conference

The conference kicked off with opening remarks by Pierre Bellanger, Pauline Ferrari, Jérôme Thièvre and Sara Aubry discussing Skyblog, the French pioneer of digital social networks. They described Skyblog as a social platform that allowed young people to connect and have absolute freedom to engage with each other on the internet. The Skyblog blogs and websites are no longer available, but their content was captured and archived. The archived content survived because of the cooperation strategies that made the capture project work.

I was particularly keen to hear Ricardo Basilio’s talk on the techniques and tools used to build street art collections in Lisbon at Arquivo.pt, which students use to produce their applications and presentations.

For the first time I was introduced to the web archiving of advertising, and learned that selecting ad content requires a strategy for balancing captured content with ads against content with no ads. Other conference sessions were captivating too, like the one by Claire Newing, Patricia Falcao, and Sarah Haylett, who presented their collaborative work to capture, preserve, and provide access to Digital Artwork on Intermediate Art Websites, and showed that sharing information throughout the process of archiving content is key.

An interesting presentation by Valerie Schaefer showed how challenging archiving memes can be: the absence of metadata on memes makes it difficult to establish their context and to ensure the usability of the content captured. Further discussions covered the initial decision to start archiving a website and the challenges of delimiting a website. These challenges are all the greater when resources are limited and funding is earmarked for the core functions of an organization.

Perhaps one of the most interesting sessions at the conference was the drop-in talks, on the afternoon of the first day. Presenting my first ever 1-minute drop-in talk at a conference, I was thrown in at the deep end. I learned that while opening with a short, powerful YouTube video (https://www.youtube.com/shorts/kQRoakcyxvY) by JJ Bola, a poet, writer, educator, UNHCR supporter and former refugee, could make an impact in a presentation, it might not be a good idea in a 1-minute drop-in talk.

While I managed to get the attention of 3 people who came to speak with me after the session, it was a learning process and an introduction to the web archiving conference and its audience.

 

IIPC Mentorship Programme

Shortly before the conference, IIPC announced intake for the second mentorship programme that was to take place during the conference. I expressed interest in the programme and I was accepted to learn from the experts on web archiving, cataloguing and digital preservation workflows.

I paired up with Jeffrey van der Hoeven from the National Library of the Netherlands, to tap into his expertise on preservation, web archiving and leadership.

I appreciate the benefits of the mentorship programme, one of them being that I continue to benefit from Jeffrey’s expertise after the conference. Jeffrey has continued to share with me his experience in preservation, web archiving, storage, and leadership.

 

Takeaways for UNHCR Web Archiving

Ensuring the capture of rare, unique, and valuable web content from UNHCR's websites and social media platforms is vital for preserving the organization's digital legacy.

UNHCR's Web Archive serves as a source of public information that supports the organization's operations.

Working closely with teams within UNHCR to capture and preserve web content fosters a collaborative effort in maintaining and utilizing the Web Archive effectively.

UNHCR's web content benefits not only the organization but also the people we work with.

Embracing insights and lessons from events like the IIPC WAC enhances expertise and strategies for refining UNHCR's web archiving practices, ensuring the timely and effective preservation of critical digital assets.

Networking was an integral part of the conference too. It provided a platform to engage with the web archiving community and to discuss and share experiences, for example on 404 errors and on the processes to follow when engaging with web archiving service providers.

  

Finally

I think that my short presentation 😊 managed to capture the conference participants’ attention and, in turn, to highlight the work of UNHCR, the UN Refugee Agency, in capturing web content in the web archive:

 

 


Acknowledgements 

The Career Development Fund is sponsored by the DPC’s Supporters who recognize the benefit and seek to support a connected and trained digital preservation workforce. We gratefully acknowledge their financial support to this programme and ask applicants to acknowledge that support in any communications that result. At the time of writing, the Career Development Fund is supported by Arkivum, Artefactual Systems Inc., boxxe, Cerabyte, Evolved Binary, Ex Libris, Iron Mountain, Libnova, Max Communications, Preservica and Simon P Wilson. A full list of supporters is online here.

 

