For the last century, the U.K. has had what is known as a legal deposit law requiring a copy of every book, pamphlet, magazine and newspaper to be sent to the British Library, and allowing five other major libraries to also request copies. Now the rules are being updated: from Saturday, the same will apply to digital content, including blogs and other content published online.
The idea, much as it was with printed content, is to archive the U.K.’s cultural and intellectual output. The libraries — including the British Library, the national libraries of Scotland and Wales, Trinity College Library Dublin, the Bodleian Libraries and Cambridge University Library — will be allowed to scrape and store everything on the .uk domain, and to demand copies of ebooks, e-journals and even CD-ROMs published in the U.K.
Here’s an interesting snippet from the FAQs:
“Legal Deposit Libraries will copy U.K.-published material from the internet, including freely accessible material on the open web. They are also entitled to harvest copies of password-protected or paid-for material, but are putting alternative arrangements in place for any publisher who prefers to deliver such material to them instead.”
A British Library spokesman confirmed to me on Friday that this was a reference to paywalled content. However, because people will only be able to access the archive by physically visiting the libraries in question, and because there will be a seven-day lag between publication and archiving, harvesting that content shouldn’t cause publishers much of a problem.
The spokesman said social media output would also be included, “as long as it is U.K.-based and openly available on the web,” and confirmed that this includes identifiably U.K.-based individuals’ Twitter feeds, although “we’d need to select people because it’s a .com” — no Library of Congress-style catch-all approach, then.
“The main thing we’re trying to capture first time round is .uk domain websites,” the spokesman added, while also stressing that no non-public social media material would be scraped.
On the book publishing side, The Bookseller reported that priority will be given to ebook-only publishers. This is presumably because publishers that aren’t ebook-only are already submitting their books under the existing legal deposit scheme.
So why is this all happening? As my colleague Mathew Ingram pointed out last year, digital content can often be ephemeral and easily lost. That sentiment was echoed on Friday by British Library chief executive Roly Keating:
“Ten years ago, there was a very real danger of a black hole opening up and swallowing our digital heritage, with millions of web pages, e-publications and other non-print items falling through the cracks of a system that was devised primarily to capture ink and paper.
The regulations now coming into force make digital legal deposit a reality, and ensure that the Legal Deposit Libraries themselves are able to evolve — collecting, preserving and providing long-term access to the profusion of cultural and intellectual content appearing online or in other digital formats.”