Information World Review (IWR) Blog
A blog from www.iwr.co.uk


Living with Google Book Search digitisation

Richard Ovenden, Keeper of Special Collections (Associate Director), Bodleian Library, University of Oxford
Michael Keller, University Librarian, Stanford University

Sharing the platform, the pair talked about their experience of the Google digitisation project. Ovenden said: "Our involvement focuses on 19th-century printed materials. We have taken copyright very seriously and studiously avoided material that would cause us difficulty.
"The project has been on an industrial scale; the Google project is on a different planetary level to the JISC project. It has meant organising the move of hundreds of thousands of books from 40 locations and returning them in a matter of days.
"You can already see millions of our pages on the Google Book Search interface, and expectations are moving incredibly fast. We refer to the project as 'The Beast', and it has to be kept fed. It is already having a dramatic effect on scholars. We are learning how little we know about our collections, and it reminds us that we need to get back to our shelves. We have learnt that books are in much worse condition than we realised and have been used more than we realised. We are also discovering a lot of titles that have not been catalogued because people didn't think they would be used. We are about to start integrating this content into new services with text mining, marking up texts and sharing."

Keller described Stanford's involvement: "We had a lot more complications due to the legal wrangles. We have a copyright determinator. There are intricacies of the law which mean that a lot of content is in the public domain, but you wouldn't know that from reading the law.
"Through the Google project we have discovered 8,000 books that need conservation. Stanford's expectation is that this is an indexing project, which will then lead to results. Indexing and searching are highly valued by 85% of our readers and make a real difference. We will also be using the scans for preservation.
"We will now be indexing our works in new ways, and we will be indexing by ideas to create linkages that you would not expect. Citation linking is very valuable and very important. New kinds of searching, such as associated searching, will create a vector expression that can be used to compare a selected text to pre-computed expressions of other texts. We will be able to use the OCR texts as the test bed for new research using our books, to develop new search algorithms and to trace all manner of subjects. This seems to us to be a major benefit of our liberal interpretation of US copyright law.
"The indexing and presentation of information will lift all information boats everywhere, even with the Google paradox."
Ovenden added that it is good to see JISC investing in other 19th-century projects. It is important to remember that this is a search project; one visitor to the Bodleian described Google Book Search as another way of using the index.
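
For readers wondering what Keller's "vector expression" might look like in practice, here is a minimal sketch in Python. It is an illustration only, not a description of Stanford's or Google's method: it assumes the simplest possible representation (word counts) and cosine similarity as the comparison, and the function names and the two sample snippets are hypothetical stand-ins for OCR text.

```python
import math
import re
from collections import Counter


def text_vector(text):
    """Turn a text into a sparse term-frequency vector (word -> count)."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_similar(selected_text, precomputed):
    """Rank pre-computed text vectors by similarity to the selected text."""
    query = text_vector(selected_text)
    scores = [(title, cosine_similarity(query, vec))
              for title, vec in precomputed.items()]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


# Hypothetical OCR snippets standing in for scanned books (assumption).
corpus = {
    "railways": "the railway engine and the railway carriage",
    "botany": "the flower and the leaf of the plant",
}

# Vectors are computed once, ahead of time, for every text in the collection.
precomputed = {title: text_vector(text) for title, text in corpus.items()}

# A selected text is then compared against all of the stored vectors.
ranking = rank_similar("a history of the railway engine", precomputed)
```

Here `ranking` lists the stored texts from most to least similar to the selected passage. A real service would almost certainly weight terms and use a richer representation than raw word counts.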


Bloggers-in-chief

Daniel Griffin, IWR Deputy Editor
Daniel joined IWR in 2006 after a career as a publisher of guides, supplements and websites for magazine and event companies. His special interest is the evolving publishing and information industry online.

Peter Williams, IWR Editor
Peter is in his second spell on IWR. Over the last few years he has developed an interest in the fields of knowledge management and e-learning, writing and editing extensively on both topics.

© Incisive Media Ltd. 2008