Google's library plan 'a huge help'

So consider Oremus thrilled that Stanford has partnered with Internet search giant Google (GOOG) to digitize library books, which eventually will be searchable from dorm rooms.

"If it was possible to access things online without leaving my room, that would be a huge help," he says. "For research papers, there's lots of books you can only get at the library. It would be really cool to get them on your computer."

That is the plan, but it's expected to take five to 10 years for it to become a reality.

Google is picking up the estimated $150 million tab to have employees on-site at the Harvard, Stanford, Oxford and University of Michigan libraries, plus the New York Public Library, begin scanning books, page by page.

Google Library

As has been reported quite widely, Google has begun a massive digitization project with five libraries:
The total covered by existing agreements is said to be 15 million books. Each book is estimated to cost $10 to scan. Stanford's scanning unit is said to be able to do 100,000 pages a day. Oxford's scanning unit is said to be able to do 10,000 books per week. If all of them work at that speed, then by my math it will take a little over five years to scan them all. Similarly, the University of Michigan says the project will take six years.

How digitizing process works

1: Convert each page into a digital image by scanning or photographing it.
2: Clean and crop the image.
3: Run word-recognition software.
4: Store and post the book.

Contributing: Gregg Toppo
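
The scanning estimate quoted above (15 million books at $10 each, with Oxford's unit handling 10,000 books per week) can be checked with a short calculation. The sketch below is only a back-of-the-envelope check, not anything published by Google or the libraries. It assumes, as the "if all of them are that speed" remark seems to suggest, that all five libraries scan in parallel at Oxford's rate; the function names and the 52-week year are illustrative choices.

```python
# Back-of-the-envelope check of the cost and duration figures quoted in the text.
# ASSUMPTION (not stated in the source): all five libraries scan in parallel,
# each at Oxford's quoted rate of 10,000 books per week.

TOTAL_BOOKS = 15_000_000             # books covered by existing agreements
COST_PER_BOOK_USD = 10               # estimated scanning cost per book
BOOKS_PER_WEEK_PER_LIBRARY = 10_000  # Oxford's quoted scanning rate
LIBRARIES = 5                        # Harvard, Stanford, Michigan, Oxford, NYPL
WEEKS_PER_YEAR = 52


def total_cost_usd(books: int, cost_per_book: int) -> int:
    """Total scanning cost in US dollars."""
    return books * cost_per_book


def years_to_scan(books: int, books_per_week_per_library: int, libraries: int) -> float:
    """Years needed if every library scans at the same weekly rate."""
    weeks = books / (books_per_week_per_library * libraries)
    return weeks / WEEKS_PER_YEAR


if __name__ == "__main__":
    cost = total_cost_usd(TOTAL_BOOKS, COST_PER_BOOK_USD)
    years = years_to_scan(TOTAL_BOOKS, BOOKS_PER_WEEK_PER_LIBRARY, LIBRARIES)
    print(f"Estimated cost: ${cost:,}")
    print(f"Estimated duration: {years:.1f} years")
```

Under these assumptions the cost comes to $150,000,000, matching the reported tab, and the duration comes to about 5.8 years, which is consistent with "a little over five years" and close to the University of Michigan's six-year figure.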
Deals with Google to accelerate library digitization projects for Stanford, others

BY BARBARA PALMER

In December, Stanford announced that it is one of five libraries cooperating with Google Inc. in a project to make millions of books from their collections available electronically to readers worldwide without charge. Along with Harvard University, the University of Michigan, the University of Oxford and the New York Public Library, Stanford will loan books to Google to be added to an electronic repository that could become the world's largest digital library.

"This is a great leap forward," said Michael A. Keller, university librarian and publisher of the Stanford University Press and the HighWire Press, Stanford's online co-publishing service for scholarly journals.

For years, Stanford has been digitizing texts to make them more accessible and, as of January 2005, HighWire Press has helped publish more than 800,000 free full-text journal articles. But in the case of books, the university's efforts have been limited for technical and financial reasons, Keller said. "The Google arrangement catapults our effective digital output from the boutique scale to the truly industrial."

Google was founded in 1998 by Stanford doctoral students Larry Page and Sergey Brin. Google and Stanford have been talking with one another about the project since very early in the development of the idea, Keller said.

Both Stanford and Google are committed to respecting the rights of publishers and copyright holders of the books scanned, he said. Users will be able to browse the full text of works in the public domain. According to a Google press release, library books that are still in copyright will show up in Google search results, but users will see only bibliographic information and a few small text snippets unless permission is granted from publishers to show more.

The project's unveiling last month made national and world headlines and prompted speculation about the effect it might have on the future of libraries and publishing.
Keller talked recently about what the project will mean for Stanford.

When will the actual scanning of books begin?
Lots of logistical issues remain to be worked out: things like transport, selection, physical control, sorting, etc. Google staff are working with Stanford University Libraries to develop detailed initial plans for the project.

How many of Stanford's more than 7.5 million books will be digitized?
Stanford has great hope of digitizing all its books eventually, so that each one can be made as accessible and addressable as possible. That said, the process will take quite a few years and we really do not know how Google's grand ambitions will play out over time. For that reason, we left the question open as to how many Stanford books Google would handle. The agreement with Google neither calls out specific collections nor specifies a minimum or maximum number of books to be digitized. At this point, we're not really worried about digitizing the last book.

How will they be selected?
That is still being determined. We most likely will begin with a few hundred thousand books that were not converted from the Dewey Decimal cataloging system when the libraries began using the Library of Congress classification system. That is intrinsically an older collection, so more of the books are likely to be in the public domain. We also will factor in such things as current location and condition, as well as attempting to create as little disruption for our readers as possible.

How will the books be digitized?
Stanford will loan books from its library collections to Google, which will scan them at their Mountain View headquarters. Once digitized, the books will be returned to Stanford and re-shelved. We'll require that books be turned around fairly quickly. Google has promised not to damage the books, and we are taking them at their word. We are strongly committed to getting as much work done as possible, without disrupting the services we provide to our readers.

How do you expect this project to affect the library and its mission?
I have been committed to finding ways of digitizing information for years. The Google plan allows us to accelerate our digitization schemes by orders of magnitude. I intend that our eventual use on campus of the digitized book files will be a tremendous asset to the Stanford community. Some people seem to believe the effect will be to make the physical books redundant: that we can simply discard the books and convert our book stacks to offices and labs. I disagree strongly. In fact, I believe having books in digital form will actually increase the use of the physical books. The digital files will be great for searching and targeting material for study, but many of us prefer the hard copy original in hand for careful reading. So, in my opinion, it is not an either-or proposition; the book provides a valuable reading experience different from the valuable searching/scanning/excerpting work with the digital version. Now the downside of the Google plan from the library's operational point of view is the work at our end: selecting books, protecting fragile or damaged books before they go off for digitizing, re-sorting, etc. Physically handling hundreds of thousands or millions of books is labor-intensive by its nature.

When will Stanford materials appear in Google?
As of this writing, no timeline has been set.

Can faculty ask that certain books be digitized?
We already have a process for targeting books for digitization, and such needs should be communicated through the librarians of specific collections.

The Google library project reportedly was nicknamed Project Ocean. How do you respond to the criticism that the Internet is already a sea of information that's difficult to intelligently navigate?
There is obviously a huge amount of information on the web.
However, information is not quite a generic commodity: Having millions of pages available online is of no immediate value if the information you need is represented only in a book on a shelf to which you do not have access. Further, not all information is of equal validity, integrity, accuracy, legitimacy, etc. The Google book digitization project will unlock a very large amount of relatively high-quality information of known, traceable origin, with proper bibliographic references. And, of course, that information will be searchable through Google. So I would say its net effect is to improve the chances that web users can obtain legitimate representations of the information they seek, thus improving the value and maybe even decreasing the chaotic quality of the web. I also expect that the existing tools for extracting information will improve with the large-scale availability of full-text material. The tools that are emerging now will give us the ability to extract ideas from online content, rather than simply perform keyword searches.

Google library project named as one of ten most important emerging technologies for humanity by futurist Mike Adams

The Google library project -- an ambitious effort to digitize hundreds of thousands of texts from prestigious libraries -- has been named the single most important emerging technology for humanity by futurist Mike Adams in his free downloadable ebook, "The Ten Most Important Emerging Technologies For Humanity." In the downloadable book, available at TruthPublishing.com, Adams cites the Global Electronic Library as the #1 technology needed to uplift humanity due to its ability to enhance the accessibility of knowledge.

Google Checks Out Library Books

The Libraries of Harvard, Stanford, the University of Michigan, the University of Oxford, and The New York Public Library Join with Google to Digitally Scan Library Books and Make Them Searchable Online

MOUNTAIN VIEW, Calif. - December 14, 2004 - As part of its effort to make offline information searchable online, Google Inc. (NASDAQ: GOOG) today announced that it is working with the libraries of Harvard, Stanford, the University of Michigan, and the University of Oxford as well as The New York Public Library to digitally scan books from their collections so that users worldwide can search them in Google.

"Even before we started Google, we dreamed of making the incredible breadth of information that librarians so lovingly organize searchable online," said Larry Page, Google co-founder and president of Products. "Today we're pleased to announce this program to digitize the collections of these amazing libraries so that every Google user can search them instantly.

"Our work with libraries further enhances the existing Google Print program, which enables users to find matches within the full text of books, while publishers and authors monetize that information," Page added. "Google's mission is to organize the world's information, and we're excited to be working with libraries to help make this mission a reality."

Today's announcement is an expansion of the Google Print™ program, which assists publishers in making books and other offline information searchable online. Google is now working with libraries to digitally scan books from their collections, and over time will integrate this content into the Google index, to make it searchable for users worldwide.

"We believe passionately that such universal access to the world's printed treasures is mission-critical for today's great public university," said Mary Sue Coleman, President of the University of Michigan.

For publishers and authors, this expansion of the Google Print program will increase the visibility of in-print and out-of-print books, and generate book sales via "Buy this Book" links and advertising. For users, Google's library program will make it possible to search across library collections including out-of-print books and titles that weren't previously available anywhere but on a library shelf.

Users searching with Google will see links in their search results page when there are books relevant to their query. Clicking on a title delivers a Google Print page where users can browse the full text of public domain works and brief excerpts and/or bibliographic data of copyrighted material. Library content will be displayed in keeping with copyright law. For more information and examples, please visit http://print.google.com/googleprint/library.html.

Source: USA TODAY research

University of California-Berkeley professor John Battelle, who runs the influential Searchblog, says Google's library project has huge implications. "The idea that the world's knowledge, as held through books and libraries, is opening up to all via a Web browser cannot be understated," he says. "People will find books they never knew existed." He thinks it will take a year before university and public library books begin to show up in Google's index in a meaningful way.

In October, Google began working with publishers to make portions of their books searchable in the Google index. Several have signed up for the service, including John Wiley & Sons, Hyperion and Scholastic.

Rob Enderle, an independent analyst with The Enderle Group, predicts that Google's library program will motivate the publishing industry to get serious about having more searchable online books. "The object is to get the books read, and search engines will be where more readers will find out about books than at the bookstore," he says.

At the school level, the popularity of the Internet and easier access to information has made teachers concerned about a rise in student plagiarism. But Rutgers professor Donald McCabe says having complete books online could make plagiarism easier to detect: "It'll provide ... a greater possibility of being caught."

The idea of bringing the local library to the world, and making out-of-print books available, sounds great to author Avery Corman. His latest novel, A Perfect Divorce, was just released. But six prior books, including Oh, God! and Kramer vs. Kramer, have been out of print for years. When a book is no longer in print, "It's like it's disappeared into a black hole," he says. "The only place anybody can read them is at the library. If this helps get the book to more people, I'm all for it."
© World eBook Library, worldLibrary.net. 2003, All rights reserved world wide.