Dewey goes digital: as campus Internet technologies enter the next millennium, schools rush to move their libraries online. Is this finally the end of the card catalog? (Digital Libraries).

When author and renowned flugelhorn soloist Richard Sudhalter set out in 2000 to write a book about jazz legend Hoagy Carmichael, he hit the Internet for archived recordings and came upon an unlikely source: Indiana University.

Earlier that year, as part of a library sciences effort called "Variations," Indiana technologists had digitized thousands of Carmichael recordings from the 1920s and '30s and archived them online. Unbeknownst to Sudhalter, the archive comprised the largest digital music library online outside of Napster.

Sudhalter's discovery posed no problems at all. With proper provisions from university administrators, he was able to log on to the campus network in Bloomington, search the database electronically, and research his entire book online--all without leaving his office 1,000 miles away in Long Island, NY.

"Maybe we saw him once during the whole process," says Kristine Brancolini, who heads the music library and also serves as director of Indiana's digital library program as a whole. "With all of the material he needed there in front of him, our archive made his research easy."

Sudhalter is not alone; all over the country, researchers and university affiliates alike are beginning to understand the power and convenience of digital libraries. Once described as "fantasy" projects for library science professionals, digital libraries now are seen as critical to the long-term survival of data. Just like traditional libraries with physical shelves and tangible books, these electronic archives contain everything from text to data, audio, video, images, and--in the case of Indiana's archive--music. The difference, of course, lies in the medium: Everything in a digital library is, well, digital, meaning university affiliates can access data anytime from just about any place with a sophisticated computer and a secure Internet connection.

While this technology isn't ready to replace the card catalog just yet, experts agree that the applications for content in these libraries are unbounded. Already, a handful of colleges and universities have digitized bodies of rare information that would have remained off-limits without the technology to copy it electronically. At other schools, researchers are exploring ways to build archives of digitized three-dimensional content--data previously experienced only by human contact or laboratory work. Still, the move to digitize is not without obstacles. For every advance, it seems, researchers must recover from another pratfall, necessitating steady development and, naturally, a cartload of patience.

"Despite all the strides we've made, in many regards, digital libraries are still in their infancy," says Steve Griffin, program director of the Digital Libraries Initiative at the National Science Foundation (NSF; "We know how we'd like them to work eventually, but between here and there, the libraries themselves still have quite a bit of life to live."


The digital library movement has been gaining momentum gradually for years, ever since the first effort launched in the late 1980s. That was when the NSF and the national Institute of Museum and Library Services (IMLS) funded six "test bed" sites for digital-archiving projects of varying sizes and scopes: the University of Michigan, Carnegie Mellon University, Stanford University, the University of California at Berkeley, the University of California at Santa Barbara, and the University of Illinois at Urbana-Champaign. Most of these early projects focused on the digitization of text; in many cases, the projects are still going strong.

In the early days, if a researcher could demonstrate a new and innovative way to digitize content, that researcher's project was considered a success almost regardless of the outcome. Over the years, however, as Optical Character Recognition (OCR) scanning technology empowered librarians to scan multiple documents at a time, the nature of what made a good and successful library digitization project changed dramatically. Technicians at large projects such as the University of California's California Digital Library project digitized hundreds of documents each day, and expectations soared virtually overnight.

That was only the beginning. As the fundamental technologies of digitization became even more sophisticated, digital library projects were no longer considered successful unless they succeeded in effectively delivering digitized content to a targeted community of users. Digital content became nothing without context, and according to Tim Cole, mathematics librarian and associate professor of Library Administration at the University of Illinois at Urbana-Champaign, this shift in focus facilitated the integration of digitized information resources with more traditional collections, leading directly to the development of sophisticated stand-alone digital archives.

"The pioneering institutions were forced to redefine success almost every step of the way," Cole says, noting that perhaps their biggest accomplishment was collaborating to supply formative data for the NSF-funded national science, mathematics, engineering, and technology education digital library (NSDL; "In the process, the definition of digital library evolved as well."


Today, most digital library efforts still focus on perfecting an old standard: converting character-based documents to electronic files that can be accessed over the Internet. The University of Michigan, for example, boasts the largest library of digitized text in the American academic world. Michigan puts such a priority on digitization that the University Library has established a nine-person division entirely devoted to digital conversion services. John Price Wilkin, the library's associate director, says this group digitizes between 3,000 and 5,000 microfilm documents a year, and estimates that more than 3,000 volumes of English-language documents from the 18th century have been digitized as part of a project entitled "Early English Books Online."

"What we've established is a factory," explains Price Wilkin. "Our efforts are focused on getting as much as we possibly can into digital form."

Researchers at U Illinois at Urbana-Champaign have taken the same blanket-coverage approach, focusing their efforts on digitizing second-generation mathematics resources from Wolfram Research Inc., makers of Mathematica engineering software. Through a partnership that has granted the university unrestricted access to a smorgasbord of Wolfram content, UIUC researchers have enhanced volumes of data for educational purposes, and have transcribed all of it into electronic supersets available online. As a result, project coordinator Tom Habing says the school has compiled one of the broadest and most diverse engineering and mathematics databases anywhere in the world.

At Texas A&M University in College Station, TX, digital library efforts are a bit more niche. There, at the behest of Dr. Eduardo Urbina, renowned Cervantes scholar and professor of modern and classical languages, university technologists have spent years digitizing Don Quixote in several versions, all from 1605 and 1615 first editions considered extremely rare. Because these manuscripts are so old, modern OCR programs cannot read them, and Texas A&M programmers have had to transcribe the texts by hand. The result? The Cervantes Digital Library--an archive that project coordinator Rick Furuta deems one of the most extensive Cervantes collections anywhere in the world.

"The only better collection you'll find is in the National Library of Madrid," Furuta boasts. "But when you consider the number of students in this country who'll have a legitimate opportunity to study there, this archive makes a lot of sense."

What's more, he adds, digitized versions of each manuscript have helped save the original microfilm copies from overuse. Preservation is an equally important component of digital libraries at a number of other schools. At DeMontfort University in Leicester, England, for instance, digitization has enabled students to access rare copies of Chaucer's The Canterbury Tales; at the University of Kentucky in Lexington, the Digital Athenaeum project has provided users with the opportunity to peruse damaged (and therefore quarantined) portions of the Cottonian Collection of Greek manuscripts at the British Museum in London.

Digital library efforts can even help preserve institutional documents that "die" as soon as they are printed. At the Massachusetts Institute of Technology in Cambridge, technologists have developed an electronic repository of dissertations, theses, and other primary research published by university affiliates. The library, called D-Space, was designed to catalog these documents and incorporate them into the school's pantheon of digital data, for posterity. "As soon as something ends up in paper, it becomes obsolete unless you're here to find it," explains Mackenzie Smith, associate director for technology at MIT Libraries. "We're just trying to reverse that trend."


All of these character-based digitization projects are worthwhile, but library science experts say that perhaps the biggest benefit of a digital library is the ability to create digital versions of other objects as well. At some schools, this has translated into efforts to digitize art and art history collections; at others, it has spawned an entirely new breed of data that allows for three-dimensional representations to become part of the knowledge base.

In Indiana, technologists at DePauw University have compiled more than 2,500 art history images, and digitized them for use by faculty and students in the DePauw Digital Image Database. With the help of an image-viewing software application from Luna Imaging Inc., university affiliates can access these images only online. The images range from ancient and medieval art to Renaissance and modern art, and include sculpture, paintings, and architecture. Though the database isn't extensive, according to Rick Provine, associate director of Libraries and Coordination of Technology, it suits the 2,300-student liberal arts institution just fine.

"We are not some huge school with tons of resources we can throw at this [digital library] stuff," he says. "Our scale is smaller, our resources are smaller, and we're forced to be more creative."

At Tufts University (MA), the stakes are slightly larger and the digital library is, too. There, Director of Digital Collections and Archives Greg Colati explains how researchers have digitized a collection of thousands of maps and images of 18th- and 19th-century London in the Edwin C. Bolles project. These efforts are part of a larger push to make digital copies of tens of thousands of images left to the school by Bolles, a former English and American History professor. The maps are interactive, meaning users can zero in on a particular portion of a map to access specific pictures of that neighborhood. In the end, says Colati, the digital collection will include nearly 10,000 items.

Efforts like these two focus on the digital archiving of two-dimensional images; at the University of Texas in Austin, researchers are focusing on digitizing three-dimensional images that represent a budding science called digital morphology. The library, colloquially known as DigiMorph, is funded with money from the NSF, and represents a dynamic archive of information on high-resolution CT scans of biological specimens from fossils to modern organisms. With the right programs and processing speeds, users can use their computers to "explore" digitized objects from the inside, achieving perspective they could match only in a laboratory dissection.

Timothy Rowe, director and curator of the vertebrate paleontology laboratory of the Texas Memorial Museum, describes the DigiMorph experience as "mind-boggling," and the same could be said for some of the NSF-funded digital library projects at Stanford University (CA). The first project, called Digital David, enables users to view archived three-dimensional digital images of Michelangelo's most famous sculpture, all scanned by researchers using similar CT-scan technology. The second effort, dubbed Digital Forma Urbis Romae, is a three-dimensional digital representation of ancient Rome, the result of scanning a marble map uncovered by Stanford archaeologists in the 1990s.

"What an optical desktop scanner has done for graphic arts, 3-D scanning can do for the plastic arts," says Marc Levoy, an associate professor of Computer Science, and the man who has coordinated the efforts. "The implications of how this kind of technology can impact art history and architecture research are off the charts."


Because applications like Digital Forma Urbis Romae require significant bandwidth and processor resources to be effective, these impacts could be years away. This, say experts such as Rowe and Levoy, does not matter much at all; digital libraries face an even bigger challenge in the months ahead: usability. Outside of the NSF's clearinghouse NSDL, few, if any, digital libraries enable users to search globally for what they need. Instead, the digital libraries of today essentially are a gaggle of intranets--self-contained databases with little or no exposure to the outside world. If you're a student at Texas A&M and you're looking for a digital archive of Chaucer, you're out of luck. Likewise, if you're a student at DeMontfort and you want to read Cervantes, your only option is the card catalog.

Solving this conundrum was the basis for the Digital Library Federation, a push to provide access to digital content across institutions and enable users from one school to access data at another. Operating under the administrative umbrella of the Council on Library and Information Resources, this effort draws upon resources from more than 30 partner organizations. It identifies standards and best practices for digital collections, coordinates research-and-development efforts at participating schools, and supports research for an infrastructure through which schools can share digitized data that already is online. Ultimately, Indiana's Brancolini envisions that users will be able to search for digitized information at portal sites similar to popular Internet search engines such as Google and Yahoo.

"Libraries are cognizant of the fact that we need to make digital resources more accessible to people in a way that they can use," says Brancolini, an active participant in the group. "We're just now starting to figure out how."

In one attempt to make this wealth of digital data more accessible, digital library activists have founded the Open Archives Initiative, an effort to standardize summary information for each and every item online. Technologists call these summaries "metadata," and liken them to digestible snippets of data about data. In reality, the snippets are synopses along the lines of dissertation abstracts. A handful of digital libraries utilize metadata currently, and about a dozen others plan to phase in the descriptors over the coming year. As researchers see it, searches on these Google-like portals would yield metadata that users could then use to determine which bit of data serves them best. Metadata, therefore, would just become another part of the process.
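
The portal idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Open Archives Initiative's actual protocol or schema: the MetadataRecord fields, the repository labels, the item identifiers, and the descriptions are all hypothetical, chosen to mirror the abstract-like summaries the article describes. The sketch shows one pool of harvested summaries from several schools being searched at once, with each hit pointing back to the repository that holds the full item.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MetadataRecord:
    """A hypothetical summary record; field names are assumptions."""
    repository: str   # which school's archive holds the item
    identifier: str   # made-up local identifier
    title: str
    creator: str
    description: str  # abstract-like synopsis


def search(records, query):
    """Return records whose summary text contains every query term."""
    terms = query.lower().split()
    hits = []
    for record in records:
        haystack = " ".join(
            (record.title, record.creator, record.description)
        ).lower()
        if all(term in haystack for term in terms):
            hits.append(record)
    return hits


# Illustrative records only; identifiers and descriptions are invented.
HARVESTED = [
    MetadataRecord("texas-am", "item-1", "Don Quixote (1605 edition)",
                   "Cervantes", "Hand-transcribed text of a first edition."),
    MetadataRecord("demontfort", "item-2", "The Canterbury Tales",
                   "Chaucer", "Digitized rare copy."),
    MetadataRecord("indiana", "item-3", "Carmichael recordings",
                   "Hoagy Carmichael", "Digitized audio from the 1920s and '30s."),
]

if __name__ == "__main__":
    for hit in search(HARVESTED, "chaucer"):
        print(hit.repository, hit.identifier, hit.title)
```

In this sketch the Texas A&M student looking for Chaucer would get a pointer to the DeMontfort record, because the search runs over the shared summaries rather than over any one school's database.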

At the University of Virginia, researchers are working on their own software alternative, designed with the Flexible, Extensible Digital Object Repository Architecture, known as FEDORA for short. Designed in collaboration with computer scientists at Cornell University (NY), with a $1 million grant from the Andrew W. Mellon Foundation, the software demonstrates how distributed digital library architecture can be deployed using Web-based technologies such as XML, and is designed to be a foundation upon which interoperable digital libraries can be built. A trial version of the software launched May 1, and according to Thornton Staples, Virginia's director of Digital Library Research and Development, early reports show the technology could revolutionize digital libraries forever.
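
To make the idea of an XML-described digital object concrete, here is a rough Python sketch. It is not FEDORA's format or API: the element names (digitalObject, title, datastream), the attribute names, and the example.org locations are all assumptions invented for illustration. It shows only the general pattern of bundling an item's description and pointers to its files into one XML document that another system could read back.

```python
import xml.etree.ElementTree as ET


def build_object(identifier, title, datastreams):
    """Serialize a hypothetical digital object to an XML string.

    datastreams is a list of (label, location) pairs; the schema
    used here is an assumption, not any repository's real format.
    """
    root = ET.Element("digitalObject", {"id": identifier})
    ET.SubElement(root, "title").text = title
    for label, location in datastreams:
        ET.SubElement(root, "datastream",
                      {"label": label, "location": location})
    return ET.tostring(root, encoding="unicode")


def read_datastreams(xml_text):
    """Recover the (label, location) pairs from the XML string."""
    root = ET.fromstring(xml_text)
    return [(ds.get("label"), ds.get("location"))
            for ds in root.findall("datastream")]


if __name__ == "__main__":
    # Placeholder locations; no real collection is referenced.
    xml_text = build_object(
        "object-1",
        "Map of London",
        [("image", "https://example.org/map.tif"),
         ("summary", "https://example.org/map.txt")],
    )
    print(xml_text)
    print(read_datastreams(xml_text))
```

Because the description travels as plain XML, a second institution needs only an XML parser, not the first institution's software, to discover what the object contains.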

"This isn't a digital library, but instead, a digital object repository management protocol," he explains. "We've had the technology behind this stuff for years; what's always been missing is the ability to manage it effectively."


No matter how FEDORA changes the landscape of digital libraries, Staples says there still are a number of challenges to the long-term survival of digitized data. Perhaps the most pervasive of these obstacles is U.S. copyright law, which prohibits independent parties from reproducing copyrighted material published in or after 1923. As a result, most researchers currently limit their digitizing efforts to material that exists in the public domain. Those library scientists who wish to digitize newer content can appeal to copyright holders themselves, or join RLG--the Research Libraries Group, a loose affiliation of institutions that seeks to license material from copyright holders, museums, and publishing companies for nominal fees.

There are other challenges, too. Digital libraries already contain terabytes of data, and as the amount of information in digital libraries grows, the projects will present even greater technical challenges in terms of scalability. The more information one stores in a digital library, the more resources one must set aside. Particularly in the areas of audio and video content, researchers are concerned about finding storage and delivery methods that won't create a digital divide and segregate schools based on budgetary resources. FEDORA helps solve this problem, but in an attempt to combat this fear more directly, researchers also have affiliated with the Rich Electronic Archive for Language Instruction Anywhere project, and are looking at ways to combine peer-to-peer technology with affordable hardware alternatives to bring smaller schools into the fold.

Then, of course, there's the issue of sustainability. As technologists in the 1980s learned firsthand, technology changes so quickly that files stored in one format today may be obsolete five or 10 years down the road. Virginia's Staples and Cole at U Illinois at Urbana-Champaign are among the researchers working furiously to address this pitfall, publishing near-annual reports on creating a framework of guidance for building digital collections that will last. At the University of Waikato in New Zealand, Ian Witten is taking a more proactive approach, developing a Linux-based software package called Greenstone, designed to perform in all of the current operating environments.

Whichever strategy individual researchers utilize to address these problems, those who deal with digital libraries every day say the only way to approach the future is together. Collaboration and interoperability, they say, are the names of the game, and practitioners now recognize the potential of digital collections to function as components and building blocks that can be reused by many different groups. Just as traditional librarians and researchers have viewed academic library print collections as shared resources, so too must managers of digital collections take a similar approach to digital collection development.

"The more things change in library science, the more they stay the same," says Michigan's Price Wilkin. "It's clear that the only way we'll solve digital libraries is through collaboration."

Matt Villano is a freelance writer based in Seattle and Moss Beach, CA.
COPYRIGHT 2003 Professional Media Group LLC
No portion of this article can be reproduced without the express written permission from the copyright holder.
Copyright 2003, Gale Group. All rights reserved. Gale Group is a Thomson Corporation Company.

Article Details
Author:Villano, Matt
Publication:University Business
Geographic Code:1U3IN
Date:Jun 1, 2003