ECDL 2001: Proceedings of the European Conference on Digital Libraries

Fullname:ECDL 2001: Research and Advanced Technology for Digital Libraries, 5th European Conference
Editors:Panos Constantopoulos; Ingeborg T. Sølvberg
Location:Darmstadt, Germany
Dates:2001-Sep-04 to 2001-Sep-09
Publisher:Springer Berlin Heidelberg
Series:Lecture Notes in Computer Science 2163
Standard No:DOI: 10.1007/3-540-44796-2; ISBN: 978-3-540-42537-3 (print), 978-3-540-44796-2 (online); hcibib: ECDL01
  1. User Modelling
  2. Digitisation, Interpretation, and Annotation of Documents
  3. Knowledge Management I
  4. Data and Metadata Models
  5. Integration in User Communities
  6. Information Retrieval and Filtering
  7. Knowledge Management II
  8. Multimedia Digital Libraries
  9. Multilinguality
  10. Panels

User Modelling

Evaluating Electronic Textbooks: A Methodology (pp. 1-12)
  Ruth Wilson; Monica Landoni
EBONI (Electronic Books ON-screen Interface) [1] builds on the premise, which emerged from the Visual Book [2] and WEB Book [3] projects, that appearance is important in the design of electronic textbooks. It offers an evaluation model, or general methodology, from which ebook usability experiments in a range of areas can be derived while remaining comparable at a basic level. The methodology sets out options for selecting material, participants, tasks and techniques, which vary in cost and level of sophistication. Results from each study will feed into a set of best practice guidelines for producing electronic textbooks on the Web, reflecting the requirements of students and academics throughout the UK.
Search Behavior in a Research-Oriented Digital Library (pp. 13-24)
  Malika Mahoui; Sally Jo Cunningham
This paper presents a transaction log analysis of ResearchIndex, a digital library for computer science researchers. ResearchIndex is an important information resource for members of this target group, and the collection sees significant use worldwide. Queries from over six months of usage were analyzed, to determine patterns in query construction and search session behavior. Where appropriate, these results are compared to earlier studies of search behavior in two other computing digital libraries.
A Combined Phrase and Thesaurus Browser for Large Document Collections (pp. 25-36)
  Gordon W. Paynter; Ian H. Witten
A browsing interface to a document collection can be constructed automatically by identifying the phrases that recur in the full text of the documents and structuring them into a hierarchy based on lexical inclusion. This provides a good way of allowing readers to browse comfortably through the phrases (all phrases) in a large document collection.
   A subject-oriented thesaurus provides a different kind of hierarchical structure, based on deep knowledge of the subject area. If all documents, or parts of documents, are tagged with thesaurus terms, this provides a very convenient way of browsing through a collection. Unfortunately, manual classification is expensive and infeasible for many practical document collections.
   This paper describes a browsing scheme that gives the best of both worlds by providing a phrase-oriented browser and a thesaurus browser within the same interface. Users can switch smoothly between the phrases in the collection, which give access to the actual documents, and the thesaurus entries, which suggest new relationships and new terms to seek.
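A hierarchy based on lexical inclusion, as mentioned in the abstract above, can be illustrated with a minimal sketch. It assumes the recurring phrases have already been extracted, and the function names `contains` and `build_hierarchy` are hypothetical; this is not the authors' implementation.

```python
def contains(longer, shorter):
    """True if the words of `shorter` occur contiguously inside `longer`."""
    a, b = longer.split(), shorter.split()
    if len(b) >= len(a):
        return False
    return any(a[i:i + len(b)] == b for i in range(len(a) - len(b) + 1))


def build_hierarchy(phrases):
    """Map each phrase to the longer phrases that directly extend it."""
    children = {p: [] for p in phrases}
    for short in phrases:
        for long in phrases:
            if not contains(long, short):
                continue
            # Skip indirect extensions that have an intermediate phrase.
            has_intermediate = any(
                contains(long, mid) and contains(mid, short) for mid in phrases
            )
            if not has_intermediate:
                children[short].append(long)
    return children


phrases = [
    "forest",
    "management",
    "forest management",
    "sustainable forest management",
]
hierarchy = build_hierarchy(phrases)
```

A reader browsing from "forest" would first see "forest management" and only then reach "sustainable forest management".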
Customizable Retrieval Functions Based on User Tasks in the Cultural Heritage Domain (pp. 37-48)
  Holger Brocks; Ulrich Thiel; Adelheit Stein; Andrea Dirsch-Weigand
The cultural heritage domain dealing with digital surrogates of rare and fragile historic artifacts is one of the most promising areas for establishing collaboratories, i.e. shared virtual working environments for groups of users. However, in order to be considered a useful tool, such a system must reflect and support the specific tasks which are typical for the domain. The system design presented here takes into account a variety of activities, e.g., source analysis, which are supported by a task-specific selection of appropriate retrieval functions, e.g., access to OCR results and annotations. The tasks are explicitly modeled, so the corresponding user interfaces can be generated automatically.
Keywords: Cultural Heritage; Collaboratory; Task-based Retrieval

Digitisation, Interpretation, and Annotation of Documents

Digital Watermark (pp. 49-58)
  Hamid Reza Mehrabi
Cultural institutions have an increasing need for protection of copyright on the internet. Digital images are easily downloaded and thus need to be protected from misuse. This project will develop a method for provision of protection.
   An invisible signal, known as a digital watermark, is a mark placed on a still image. In the Culture Net Denmark project at the Royal Library in Copenhagen, a method for copyright protection of digital colour images is analysed and implemented.
   The signature bits are embedded by modifying the blue channel in the image. A generator that produces random locations in the image, where the signature bits are embedded, is chosen to prevent any removal of the signature. Specifically, a secret key determines where the signature bits are embedded in the image.
   To extract the signature a user needs to know the secret key.
   Furthermore, a method is implemented to retrieve the signature without reference to the original image.
   The robustness of the method against possible attacks via geometrical transformation or filtering is analysed.
Document Classification and Interpretation through the Inference of Logic-Based Models (pp. 59-70)
  Giovanni Semeraro; Stefano Ferilli; Nicola Fanizzi; Floriana Esposito
We present a methodology for document processing that exploits logic-based machine learning techniques. Our claim is that information capture and indexing can profit from identifying the document class and the specific function of its individual layout components. Indeed, the application of incremental and multistrategy machine learning techniques, rather than the classic ones, allows for an efficient solution to the problem of information capture.
The Cervantes Project: Steps to a Customizable and Interlinked On-Line Electronic Variorum Edition Supporting Scholarship (pp. 71-82)
  Richard Furuta; Siddarth Kalasapur; Rajiv Kochumman; Eduardo Urbina; Ricardo Vivancos-Pérez
The Cervantes Project, housed under the auspices of the Center for the Study of Digital Libraries at Texas A&M University, aims to provide a comprehensive on-line research and reference site on the life and works of the author Miguel de Cervantes Saavedra (1547-1616). This activity is a joint collaboration among researchers in the Department of Computer Science and the Department of Modern and Classical Languages, Texas A&M University. This paper outlines the work being conducted by the project, focusing on the creation of an Electronic Variorum Edition of Cervantes' Don Quixote.
Fusion Approaches for Mappings between Heterogeneous Ontologies (pp. 83-94)
  Thomas Mandl; Christa Womser-Hacker
Ordering principles of digital libraries expressed in ontologies may be highly heterogeneous even within a domain and especially over different cultures. Automatic methods for mappings between different ontologies are necessary to ensure successful retrieval of information stored in virtual digital libraries. Research on text categorization has discussed learning methods to map between full text terms and thesaurus descriptors. This article reports some experiments on mapping between different ontologies and further shows that fusion methods which have been successfully applied to ad-hoc information retrieval can also be employed for text categorization.
Enhancing Digital Library Documents by A Posteriori Cross Linking Using XSLT (pp. 95-102)
  Michael G. Bauer; Günther Specht
In this paper we describe a way to enhance existing digital library documents by adding links without modifying the stored documents themselves. We show how to use a combination of XSLT and a host language to access a database with linking information and how to merge documents and links at run-time (a posteriori cross linking). Our approach is already used in the system OMNIS/2, which is an advanced meta system for existing digital library systems and enhances existing digital library systems or retrieval systems by additional storing and indexing of user-defined multimedia documents, automatic and personal linking concepts, annotations, filtering and personalization.
Using Copy-Detection and Text Comparison Algorithms for Cross-Referencing Multiple Editions of Literary Works (pp. 103-114)
  Arkady B. Zaslavsky; Alejandro Bia; Krisztián Monostori
This article describes joint research between Monash University and the University of Alicante, in which software originally meant for plagiarism and copy detection in academic works is successfully applied to the comparative analysis of different editions of literary works. The experiments were performed with Spanish texts from the Miguel de Cervantes digital library. The results have proved useful for literary and linguistic research, automating part of the tedious task of comparative text analysis. Other interesting uses were also identified.

Knowledge Management I

An Architecture for Automatic Reference Linking (pp. 115-126)
  Donna Bergmark; Carl Lagoze
Along with the explosive growth of the Web has come a great increase in on-line scholarly literature, which is often more current than what appears in printed publications. The increasing proportion of on-line scholarly literature makes it possible to implement functionality desirable to all researchers -- the ability to access cited documents immediately from the citing paper. Implementing this direct access is called "reference linking". The Cornell Digital Library Research Group employs value-added surrogates as a generalizable mechanism for providing reference-linking behavior in Web documents. This mechanism exposes reference linking data through a well-defined API, permitting the construction of reference linking services by external clients. We present two example reference linking applications buildable on this API. We also introduce a performance metric; currently we are (automatically) extracting reference linking information with more than 80% accuracy.
Disambiguating Geographic Names in a Historical Digital Library (pp. 127-136)
  David A. Smith; Gregory Crane
Geographic interfaces provide natural, scalable visualizations for many digital library collections, but the wide range of data in digital libraries presents some particular problems for identifying and disambiguating place names. We describe the toponym-disambiguation system in the Perseus digital library and evaluate its performance. Name categorization varies significantly among different types of documents, but toponym disambiguation performs at a high level of precision and recall with a gazetteer an order of magnitude larger than most other applications.
Greenstone: A Platform for Distributed Digital Library Applications (pp. 137-148)
  David Bainbridge; George Buchanan; John R. McPherson; Steve Jones; Abdelaziz Mahoui; Ian H. Witten
This paper examines the issues surrounding distributed Digital Library protocols. First, it reviews three prominent digital library protocols: Z39.50, SDLIP, and Dienst, plus Greenstone's own protocol. Then, we summarise the implementation in the Greenstone Digital Library of a number of different protocols for distributed digital libraries, and describe sample applications of the same: a digital library for children, a translator for Stanford's Simple Digital Library Interoperability Protocol, a Z39.50 client, and a bibliographic search tool. The paper concludes with a comparison of all four protocols, and a brief discussion of the impact of distributed protocols on the Greenstone system.
Keywords: Distributed protocol; Z39.50; CORBA; graphical user interface support
Linking Information with Distributed Objects (pp. 149-160)
  Trond Aalberg
Digital libraries can be viewed as managed and organized information spaces. In building and using these information spaces there is a need for technology to express, navigate and manage relationships. This paper presents the DL-LinkService, an object-oriented solution for structuring information in digital libraries where the relationships are implemented as distributed objects. The service is inspired by the CORBA Relationship Service, but our contribution is more flexible because the typing of relationships is independent of the implementation of the service. This paper describes the DL-LinkService and its possible use in three scenarios. A prototype is developed and our experience with the DL-LinkService is that this is a promising solution for expressing and navigating the structural information of digital libraries.

Data and Metadata Models

Metadata for Digital Preservation: A Review of Recent Developments (pp. 161-172)
  Michael Day
This paper is a review of recent developments relating to digital preservation metadata. It introduces the digital preservation problem and notes the importance of metadata for all proposed preservation strategies. The paper reviews some developments in the archives and records domain, describes the taxonomy of information object classes defined by the Reference Model for an Open Archival Information System (OAIS) and outlines some library-based projects.
MARIAN: Flexible Interoperability for Federated Digital Libraries (pp. 173-186)
  Marcos André Gonçalves; Robert K. France; Edward A. Fox
Federated digital libraries are composed of distributed, autonomous, and often heterogeneous information services but provide users with a transparent, integrated view of collected information. In this paper we discuss a federated system for the Networked Digital Library of Theses and Dissertations (NDLTD), an international consortium of universities, libraries, and other supporting institutions focused on electronic theses and dissertations (ETDs). Federation requires dealing flexibly with differences among systems, ontologies, and data formats while respecting information sources' autonomy. Our solution involves adapting the object-oriented digital library system MARIAN to serve as mediation middleware for the federated NDLTD collection. Components of the solution include: 1) the use and integration of several harvesting techniques; 2) an architecture based on object-oriented ontologies of search modules and metadata; 3) reconciliation of diversity within the harvested data joined to a single collection view for the user; and 4) an integrated framework for addressing such questions as data quality, flexible and efficient search, and scalability.
Digital Libraries: A Generic Classification and Evaluation Scheme (pp. 187-199)
  Norbert Fuhr; Preben Hansen; Michael Mabe; András Micsik; Ingeborg Sølvberg
Evaluation of digital libraries (DLs) is essential for further development in this area. Whereas previous approaches were restricted to certain facets of the problem, we argue that evaluation of DLs should be based on a broad view of the subject area. For this purpose, we develop a new description scheme using four major dimensions: data/collection, system/technology, users, and usage. For each of these dimensions, we describe the major attributes. Using this scheme, existing DL test beds can be characterised. For this purpose, we have performed a survey by means of a questionnaire, which is now continued by setting up a DL meta-library.
A Deposit for Digital Collections (pp. 200-212)
  Norman Noronha; João P. Campos; Daniel Gomes; Mário J. Silva; José Luis Borbinha
We present the architecture and requirements for a novel system for managing the deposit of specific genres of digital publications in a deposit library. The system adopts a simple model for online publications and supports both harvesting and delivery models of deposit. This paper describes that system, and presents an evaluation after a trial period with the harvesting functions.

Integration in User Communities

Digital Libraries in a Clinical Setting: Friend or Foe? (pp. 213-224)
  Anne Adams; Ann Blandford
Clinical requirements for quick accessibility to reputable, up-to-date information have increased the importance of web-accessible digital libraries for this user community. To understand the social and organisational impacts of ward-accessible digital libraries (DLs) for clinicians, we conducted a study of clinicians' perceptions of electronic information resources within a large London-based hospital. The results highlight that although these resources appear to be a relatively innocuous means of information provision (i.e. no sensitive data), social and organisational issues can impede effective technology deployment. Clinical social structures that produce information- and technology-hoarding behaviours can result from poor training, support and DL usability.
Interactive, Domain-Independent Identification and Summarization of Topically Related News Articles (pp. 225-238)
  Dragomir R. Radev; Sasha Blair-Goldensohn; Zhu Zhang; Revathi Sundara Raghavan
In this paper we present NewsInEssence, a fully deployed digital news system. A user selects a current news story of interest which is used as a seed article by NewsInEssence to find in real time other related stories from a large number of news sources. The output is a single document summary presenting the most salient information gleaned from the different sources. We discuss the algorithm used by NewsInEssence, module interoperability, and conclude the paper with a number of empirical analyses.
Digital Work Environment (DWE): Using Tasks to Organize Digital Resources (pp. 239-250)
  Narayanan Meyyappan; Suliman Al-Hawamdeh; Schubert Foo
DWE is aimed at providing a one-stop access point to local and remote digital library collections, traditional in-house libraries, and most importantly, to the vast array of information resources that exists in the academic community's local Intranet. Due to the vast amount of information available and the difficulty faced by students and staff in finding the relevant resources, there is a need for a better and more logical organization of these resources. DWE uses tasks as a means of directing students and staff to the relevant resources. Tasks generally play an important role in system and user interface design. Identifying the user's tasks enables the designer to construct user interfaces reflecting the tasks' properties, including efficient usage patterns, easy-to-use interaction sequences, and powerful assistance features. The resources in DWE are organized according to specific tasks performed by the research students and staff in the division of information studies. The tasks and resources were elicited based on the needs of faculty and students through interviews and focus groups.
Learning Spaces in Digital Libraries (pp. 251-262)
  Anita Coleman; Terence R. Smith; Olha A. Buchel; Richard E. Mayer
The Alexandria Digital Earth Prototype (ADEPT) Project is developing services to support the construction and use of "learning spaces", or personalized DLs of geospatially referenced information and services, with applications in science education. The project is focused on helping students attain deep understanding (concept development) and scientific reasoning skills (hypothesis development). In relation to its use of concepts and hypotheses for organizing and using collections and services in ways that support such student learning activities, we describe four Project activities focused on developing: (1) use scenarios to inform the ADEPT specification process; (2) both the concept of learning spaces (LSs) and instances of LSs; (3) clients for LSs; and (4) meta-information environments, including topic maps, that support LSs.
Ethnography, Evaluation, and Design as Integrated Strategies: A Case Study from WES (pp. 263-274)
  Michael Khoo
The Water in the Earth System (WES) collection is a collection of the Digital Library for Earth System Education. Because WES relies on its user community to generate metadata-resources, identifying robust user community features and potential user community problems is important. This paper describes (a) how ethnography is being used to study the WES community; (b) how technological frames theory and technology use mediation theory are being used to analyse this data; and (c) how research outcomes are being used to generate recommendations for supporting future WES development.
Dynamic Models of Expert Groups to Recommend Web Documents (pp. 275-286)
  DaeEun Kim; Sea Woo Kim
Most recent recommender systems recommend items or documents to a particular user based on that user's preferences, but they have difficulty deriving preferences for users who have not rated many documents. In this paper we use dynamic expert groups, which are formed automatically, to recommend domain-specific documents for unspecified users. The group members have dynamic authority weights depending on their performance in the ranking evaluations. Human evaluations of web pages are very effective in finding relevant information in a specific domain. In addition, we have tested several effectiveness measures on rank order to determine whether the current top-ranked lists recommended by experts are reliable. We show simulation results to check the feasibility of dynamic expert group models for recommender systems.

Information Retrieval and Filtering

Enhancing Information Retrieval in Federated Bibliographic Data Sources Using Author Network Based Stratagems (pp. 287-299)
  Peter Mutschke
Despite the fact that many Digital Libraries (DLs) are available on the Internet, users cannot effectively use them because of inadequate functionality, deficient visualization and insufficient integration of different DLs. As part of the DAFFODIL project we develop a user-oriented access system for DLs which overcomes these drawbacks. A major focus of the prototype concerns the implementation of search stratagems that exhaust the data structures stored in federated bibliographic DLs. The paper introduces stratagems taking into account information on the (social) status of scientific actors in author networks using network analysis methods. To make the propagation and analysis of actor networks more efficient an optimization strategy called main path analysis is employed.
Architecture for Event-Based Retrieval from Data Streams in Digital Libraries (pp. 300-311)
  Mohamed Kholief; Stewart N. T. Shen; Kurt Maly
Data streams are very important sources of information for both researchers and other users. Data streams might be video or audio streams or streams of sensor readings or satellite images. Using digital libraries for archival, preservation, administration, and access control for this type of information greatly enhances the utility of data streams. For this specific type of digital library, our proposed event-based retrieval provides an alternative yet very natural way of retrieving information. People tend to remember or search by a specific event that occurred in the stream better than by the time at which this event occurred. In this paper we present the analysis and design of a digital library system that contains data streams and supports event-based retrieval.
Keywords: digital libraries; event-based retrieval; data streams; architecture
The Effects of the Relevance-Based Superimposition Model in Cross-Language Information Retrieval (pp. 312-324)
  Teruhito Kanazawa; Akiko N. Aizawa; Atsuhiro Takasu; Jun Adachi
We propose a cross-language information retrieval method that is based on document feature modification and query translation using a dictionary extracted from comparable corpora. In this paper, we show the language-independent effectiveness of our document feature modification model for dealing with semantic ambiguity, and demonstrate the practicality of the proposed method for extracting multilingual keyword clusters from digital libraries. The results of our experiments with multilingual corpora indicate that our document feature modification model avoids the difficulties of language-/domain-dependent parameters.
An On-Line Document Clustering Method Based on Forgetting Factors (pp. 325-339)
  Yoshiharu Ishikawa; Yibing Chen; Hiroyuki Kitagawa
With the rapid development of on-line information services, information technologies for on-line information processing have been receiving much attention recently. Clustering plays important roles in various on-line applications such as extraction of useful information from news feeding services and selection of relevant documents from the incoming scientific articles in digital libraries. In on-line environments, users are generally more interested in newer documents than in older ones and have no interest in obsolete documents.
   Based on this observation, we propose an on-line document clustering method, F²ICM (Forgetting-Factor-based Incremental Clustering Method), that incorporates the notion of a forgetting factor to calculate document similarities. The idea is that every document gradually loses its weight (or memory) as time passes, according to this factor. Since F²ICM generates clusters using a document similarity measure based on the forgetting factor, newer documents have a greater effect on the resulting cluster structure than older ones. In this paper, we present the fundamental idea of the F²ICM method and describe its details, such as the similarity measure and the clustering algorithm. We also show an efficient incremental statistics maintenance method for F²ICM, which is indispensable for on-line dynamic environments.
Keywords: clustering; on-line information processing; incremental algorithms; forgetting factors
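The idea of a forgetting factor in the abstract above can be illustrated with a minimal sketch. It assumes exponential decay parameterised by a half-life and applies the weights to simple term statistics; the function names and the half-life parameterisation are assumptions for illustration, not the paper's exact definitions.

```python
def forgetting_weight(doc_time, now, half_life):
    """Weight of a document acquired at `doc_time`, evaluated at `now`."""
    decay = 0.5 ** (1.0 / half_life)  # forgetting factor per time unit
    return decay ** (now - doc_time)


def weighted_term_probabilities(docs, now, half_life):
    """Estimate term probabilities, discounting older documents.

    `docs` is a list of (timestamp, {term: count}) pairs.
    """
    totals = {}
    mass = 0.0
    for timestamp, counts in docs:
        w = forgetting_weight(timestamp, now, half_life)
        for term, count in counts.items():
            totals[term] = totals.get(term, 0.0) + w * count
            mass += w * count
    return {term: value / mass for term, value in totals.items()}


docs = [(0, {"old": 1}), (10, {"new": 1})]
probabilities = weighted_term_probabilities(docs, now=10, half_life=10)
```

With these inputs the older document is one half-life old, so its term contributes half as much as the term from the newest document.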

Knowledge Management II

Towards a Theory of Information Preservation (pp. 340-351)
  James Cheney; Carl Lagoze; Peter Botticelli
Digital preservation is a pressing challenge to the library community. In this paper, we describe the initial results of our efforts towards understanding digital (as well as traditional) preservation problems from first principles. Our approach is to use the language of mathematics to formalize the concepts that are relevant to preservation. Our theory of preservation spaces draws upon ideas from logic and programming language semantics to describe the relationship between concrete objects and their information contents. We also draw on game theory to show how objects change over time as a result of uncontrollable environment effects and directed preservation actions. In the second half of this paper, we show how to use the mathematics of universal algebra as a language for objects whose information content depends on many components. We use this language to describe both migration and emulation strategies for digital preservation.
C-Merge: A Tool for Policy-Based Merging of Resource Classifications (pp. 352-365)
  Florian Matthes; Claudia Niederée; Ulrike Steffens
In this paper we present an interactive tool for policy-based merging of resource-classifying networks (RCNs). We motivate our approach by identifying several merge scenarios within organizations and discuss their individual requirements on RCN merge support. The quality-controlled merging of RCNs integrates the contributions from different authors, fostering synergies and the achievement of common goals.
   The C-Merge tool design is based on a generalized view of the merge process and a simple but flexible model of RCNs. The tool is policy-driven and supports a variable degree of automation. Powerful options for user interaction and expressive change visualization enable substantial user support as well as effective quality control for the merge process.
Keywords: Categorization; Taxonomy; Merging; CSCW; Knowledge Management; Knowledge Visualization; Quality Control
Truth in the Digital Library: From Ontological to Hermeneutical Systems (pp. 366-377)
  Aurélien Bénel; Elöd Egyed-Zsigmond; Yannick Prié; Sylvie Calabretto; Alain Mille; Andréa Iacovella; Jean-Marie Pinon
This paper deals with the conceptual structures which describe document contents in a digital library. Indeed, the underlying question is about the truth of a description: obvious (ontological), by convention (normative) or based on interpretation (hermeneutical). In the first part, we examine the differences between these three points of view and choose the hermeneutical one. Then, in the second and third parts, we present two "assisted interpretation systems" (AIS) for digital libraries (audiovisual documents and scholarly publications). Both provide a dynamic annotation framework for readers' augmentations and social interactions. In the fourth part, a few synthetic guidelines are given to design such "assisted interpretation systems" in other digital libraries.
Keywords: Interpretation; collaboration; annotation; ontology; graphs; interactive information retrieval; assisted interpretation systems

Multimedia Digital Libraries

XSL-based Content Management for Multi-presentation Digital Museum Exhibitions (pp. 378-389)
  Jen-Shin Hong; Bai-Hsuen Chen; Jieh Hsiang
Similar to a conventional museum, a digital museum draws a set of objects from its collection of digital artifacts to produce exhibitions about a specific topic. Online exhibitions often consist of a variety of multimedia objects such as webpages, animation, and video clips. In a physical museum, the exhibition is confined by the physical limitation of the artifacts. That is, there can only be one exhibition using the same set of artifacts. Thus, if one wishes to design different exhibitions about the same topic for different user groups, one has to use different sets of artifacts, and exhibit them in different physical locations.
   A digital museum does not have such physical restrictions. One can design different exhibitions about the same topic for adults, children, experts, novices, high bandwidth users, and low bandwidth users, all using the same set of digital artifacts. A user can simply click and choose the specific style of exhibition that she wants to explore. The difficulty here is that it is time-consuming to produce illustrative and intriguing online exhibitions. One can spend hours designing webpages for just one exhibition alone, not to mention several. In this paper, we present the design of an XSL-based Multi-Presentation Content Management System (XMP-CMS). This framework is a novel approach for organizing digital collections, and for quickly selecting, integrating, and composing objects from the collection to produce exhibitions of different presentation styles, one for each user group. A prototype based on our framework has been implemented and successfully used in the production of the Lanyu Digital Museum. Using our method, the Lanyu Digital Museum online exhibition has several features: (1) It provides an easy way to compose artifacts extracted from the digital collection into exhibitions. (2) It provides an easy way to create different presentations of the same exhibition content that cater to users with different needs. (3) It provides easy-to-use film-editing capability to re-arrange an exhibition and to produce new exhibitions from existing ones.
Iterative Design and Evaluation of a Geographic Digital Library for University Students: A Case Study of the Alexandria Digital Earth Prototype (ADEPT) (pp. 390-401)
  Christine L. Borgman; Gregory H. Leazer; Anne J. Gilliland-Swetland; Rich Gazan
We report on the first two years of a five-year project to design and evaluate the Alexandria Digital Earth ProtoType (ADEPT), a digital library of geo-referenced information resources, for use in undergraduate education. To date, we have established design principles, observed classroom activities, gathered baseline data from instructors and students, and evaluated early prototypes. While students and instructors are generally enthusiastic about ADEPT, they have concerns about the effort required and the effectiveness of computer-based technologies in the classroom. Instructors vary widely in their use of instructional materials and technologies, teaching styles, and areas of expertise. Results of our work are being incorporated in an iterative cycle of design and evaluation. The paper concludes by presenting research and evaluation methods, design principles, and requirements for educational applications of digital libraries.
Automatically Analyzing and Organizing Music Archives BIBAKFull-Text 402-414
  Andreas Rauber; Markus Frühwirth
We are experiencing a tremendous increase in the amount of music being made available in digital form. With the creation of large multimedia collections, however, we need to devise ways to make those collections accessible to the users. While music repositories exist today, they mostly limit access to their content to query-based retrieval of their items based on textual meta-information, with some advanced systems supporting acoustic queries. What we would additionally like to have is a way to facilitate exploration of musical libraries. We thus need to automatically organize music according to its sound characteristics in such a way that we find similar pieces of music grouped together, allowing us to find, for example, a classical section or a hard-rock section in a music repository. In this paper we present an approach to obtain such an organization of music data based on an extension to our SOMLib digital library system for text documents. In particular, we employ the Self-Organizing Map to create a map of a musical archive, where pieces of music with similar sound characteristics are organized next to each other on the two-dimensional map display. Locating a piece of music on the map then leaves you with related music next to it, allowing intuitive exploration of a music archive.
Keywords: Multimedia; Music Library; Self-Organizing Map (SOM); Exploration of Information Spaces; User Interface; MP3
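The map-building idea in the abstract above can be illustrated with a minimal Self-Organizing Map sketch. It is not the SOMLib implementation: the function names, the learning-rate and radius schedules, and the three-dimensional "sound feature" vectors are assumptions chosen for illustration, and real systems would use feature vectors extracted from the audio itself.

```python
import math
import random

def best_matching_unit(grid, v):
    """Return (row, col) of the map unit whose weight vector is closest to v."""
    best, best_d = None, float("inf")
    for r, row in enumerate(grid):
        for c, w in enumerate(row):
            d = sum((a - b) ** 2 for a, b in zip(v, w))
            if d < best_d:
                best, best_d = (r, c), d
    return best

def train_som(vectors, rows, cols, epochs=200, seed=0):
    """Train a small SOM; returns a rows x cols grid of weight vectors."""
    rng = random.Random(seed)
    dim = len(vectors[0])
    grid = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
            for _ in range(rows)]
    max_radius = max(rows, cols) / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = 0.5 * (1.0 - frac)                       # decaying learning rate
        radius = max(max_radius * (1.0 - frac), 0.5)  # shrinking neighbourhood
        for v in rng.sample(vectors, len(vectors)):   # shuffled presentation
            br, bc = best_matching_unit(grid, v)
            for r in range(rows):
                for c in range(cols):
                    d2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-d2 / (2.0 * radius ** 2))  # Gaussian neighbourhood
                    w = grid[r][c]
                    for i in range(dim):
                        w[i] += lr * h * (v[i] - w[i])
    return grid

# Hypothetical, hand-made feature vectors standing in for real audio features.
tracks = {
    "classical_1": [0.90, 0.80, 0.10],
    "classical_2": [0.85, 0.90, 0.15],
    "hard_rock_1": [0.10, 0.20, 0.90],
    "hard_rock_2": [0.15, 0.10, 0.85],
}
som = train_som(list(tracks.values()), rows=3, cols=3)
for title, features in tracks.items():
    print(title, "->", best_matching_unit(som, features))
```

Each track is placed on the map unit that best matches its features, so tracks with similar vectors tend to land on the same or neighbouring units, which is what makes the map browsable.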
Building and Indexing a Distributed Multimedia Presentation Archive Using SMIL BIBAFull-Text 415-428
  Jane Hunter; Suzanne Little
This paper proposes an approach to the problem of generating metadata for composite mixed-media digital objects by appropriately combining and exploiting existing knowledge or metadata associated with the individual atomic components which comprise the composite object. Using a distributed collection of multimedia learning objects, we test this proposal by investigating mechanisms for capturing, indexing, searching and delivering digital online presentations using SMIL (Synchronized Multimedia Integration Language). A set of tools has been developed to automate and streamline the construction and fine-grained indexing of a distributed library of digital multimedia presentation objects by applying SMIL to lecture content from both the University of Queensland and Cornell University. Using temporal information which is captured automatically at the time of lecture delivery, the system can automatically synchronize the video of a lecture with the corresponding PowerPoint slides to generate a finely-indexed presentation at minimum cost and effort. This approach enables users to search and retrieve relevant streaming video segments of the lecture based on keyword or free text searches within the slide content. The underlying metadata schema, the metadata processing/generation tools, distributed archive, backend database and the search, browse and playback interfaces which comprise the system are also described in this paper. We believe that the relatively low cost and high speed of development of this apparently sophisticated multimedia archive with rich search capabilities provides evidence to support the validity of our initial proposal.
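The synchronization and search steps described above can be sketched as follows. This is an illustrative sketch only, not the authors' tools: the slide records, file names, and function names are hypothetical, and the generated markup uses only the basic SMIL elements `par`, `seq`, `video`, and `img` with `begin` and `dur` timing attributes.

```python
import xml.etree.ElementTree as ET

def slide_segments(slides, total_seconds):
    """Pair each slide with its (start, end) offsets in the lecture video.

    Assumes slides are sorted by 'start' (seconds from the start of the video).
    """
    starts = [s["start"] for s in slides]
    ends = starts[1:] + [total_seconds]
    return list(zip(slides, starts, ends))

def build_smil(video_src, slides, total_seconds):
    """Return a SMIL document that plays the video in parallel with the slides."""
    smil = ET.Element("smil")
    body = ET.SubElement(smil, "body")
    par = ET.SubElement(body, "par")           # children play in parallel
    ET.SubElement(par, "video", src=video_src)
    # The slide sequence starts when the first slide was shown.
    seq = ET.SubElement(par, "seq", begin=f"{slides[0]['start']}s")
    for slide, start, end in slide_segments(slides, total_seconds):
        ET.SubElement(seq, "img", src=slide["image"], dur=f"{end - start}s")
    return ET.tostring(smil, encoding="unicode")

def search_segments(slides, total_seconds, keyword):
    """Return (start, end) video offsets for slides whose text mentions keyword."""
    needle = keyword.lower()
    return [(start, end)
            for slide, start, end in slide_segments(slides, total_seconds)
            if needle in slide["text"].lower()]

# Hypothetical timing data, as might be captured during lecture delivery.
slides = [
    {"start": 0, "image": "slide1.png", "text": "Introduction to SMIL"},
    {"start": 90, "image": "slide2.png", "text": "Metadata schema"},
    {"start": 240, "image": "slide3.png", "text": "SMIL timing model"},
]
print(build_smil("lecture.rm", slides, total_seconds=400))
print(search_segments(slides, 400, "smil"))
```

Because each slide carries its own time span, a text match on slide content translates directly into a video segment that can be streamed, which is the fine-grained indexing the abstract refers to.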

Multilinguality


Digitization, Coded Character Sets, and Optical Character Recognition for Multi-script Information Resources: The Case of the Letopis' Zhurnal'nykh Statei BIBAFull-Text 429-437
  George Andrew Spencer
Multi-lingual information resources that consist of texts in more scripts than can be represented by a single 8-bit encoding scheme can currently be best represented by use of the Unicode multi-byte character-encoding scheme. However, the use of Unicode could lead to a decrease in the accuracy of Optical Character Recognition (OCR) software because of the similarity of glyphs between certain scripts. This decrease in OCR accuracy can dramatically increase the amount of time needed to proofread the resulting electronic texts. An Indiana University -- Digital Library Program project for digitizing a 20-year portion of the Letopis' Zhurnal'nykh Statei is presented as an example of a digital library project dealing with a multi-script information resource for which Unicode has been used.
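The glyph-similarity problem mentioned above can be made concrete with a short sketch. Latin "A" and Cyrillic "А" look alike but are distinct Unicode code points, so an OCR engine can pick the wrong one without any visible difference. The helper names below are hypothetical, and deriving the script from the first word of the Unicode character name is a rough heuristic assumed here for illustration, not a method described in the paper.

```python
import unicodedata

def script_of(ch):
    """Rough script label: the first word of the Unicode character name."""
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else "UNKNOWN"

def mixed_script_words(text):
    """Flag words whose letters come from more than one script."""
    flagged = []
    for word in text.split():
        scripts = {script_of(ch) for ch in word if ch.isalpha()}
        if len(scripts) > 1:
            flagged.append((word, sorted(scripts)))
    return flagged

latin_a = "A"          # U+0041 LATIN CAPITAL LETTER A
cyrillic_a = "\u0410"  # U+0410 CYRILLIC CAPITAL LETTER A
print(latin_a == cyrillic_a, script_of(latin_a), script_of(cyrillic_a))

# "L\u0435topis" contains a Cyrillic "е" (U+0435) inside a Latin word,
# the kind of error that glyph similarity can introduce.
print(mixed_script_words("L\u0435topis statei"))
```

A check of this kind could help direct proofreading effort to words that mix scripts, although it cannot catch a word that was recognized entirely in the wrong script.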
Document Clustering and Language Models for System-Mediated Information Access BIBAKFull-Text 438-449
  Gheorghe Muresan; David J. Harper
This paper presents the novel concept of system-mediated information access, i.e. system support for the user in clarifying and refining a vague information need and in generating a good formulation for it. The concept is based on two main assumptions: firstly, on document clustering's ability to reveal the topical, semantic structure of a domain of interest, represented by a specialized collection, and secondly, on the capacity of language models to convey content. Experimental results show that these assumptions are correct and that there is potential to significantly improve the retrieval performance by generating a better query through mediation.
Keywords: Mediated Access; Document Clustering; Topic Model
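One way to picture how a language model of a cluster can help formulate a query is the sketch below. It is an assumption-laden illustration rather than the authors' method: the clusters are taken as given, the models are simple unigram relative frequencies, and terms are ranked by their contribution to the Kullback-Leibler divergence between the cluster model and the collection model. The function names and toy documents are hypothetical.

```python
import math
from collections import Counter

def unigram_model(docs):
    """Relative term frequencies over a list of whitespace-tokenized documents."""
    counts = Counter(tok for d in docs for tok in d.lower().split())
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()}

def suggest_terms(cluster_docs, all_docs, k=3):
    """Rank cluster terms by p * log(p / q), with p the cluster model and
    q the collection model. Assumes cluster_docs is a subset of all_docs,
    so every cluster term also occurs in the collection model.
    """
    cluster = unigram_model(cluster_docs)
    collection = unigram_model(all_docs)
    scores = {t: p * math.log(p / collection[t]) for t, p in cluster.items()}
    # Sort by descending score, breaking ties alphabetically for stability.
    return sorted(scores, key=lambda t: (-scores[t], t))[:k]

# Toy collection; the second pair of documents is treated as one cluster.
all_docs = [
    "digital library search",
    "digital library metadata",
    "music map sound",
    "music sound archive",
]
music_cluster = all_docs[2:]
print(suggest_terms(music_cluster, all_docs, k=2))
```

Terms that are frequent in the cluster but comparatively rare in the whole collection score highest, so they are natural candidates for refining a vague query towards that cluster's topic.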
Research and Development of Digital Libraries in China: Major Issues and Trends BIBAFull-Text 450-457
  Guohui Li; Michael Bailou Huang
This paper presents an overview of the research and development of digital libraries in China, introduces three digital library prototypes currently being built, and analyses the problems facing digital library construction in China together with possible countermeasures.

Panels


What's Holding Up the Development of Georeferenced DLs? BIBAFull-Text 458
  Michael Freeston; Linda L. Hill
The implementation of georeferenced digital library technologies in the collection development efforts of various subject domains has been slow in advancing beyond the original geographic and map collections. The panel members will speak from the experience of georeferencing applications in specific subject domains and discuss the issues presented by georeferencing -- including aspects of cognition (understanding of the meaning and usefulness of geospatial searching, display, and evaluation), culture (established ways of doing things and identification of geospatial indexing solely with GIS), technologies (e.g., geospatial search functionality and representation of spatial location), and funding (magnitude and availability of funding needed to georeference objects and redesign systems). The session will provide a significant segment of time for discussion among the audience and panelists. The aim is to generate a good exchange of ideas and get closer to understanding the barriers to widespread integration of georeferencing in DL application domains.
Open Archive Initiative, Publishers and Scientific Societies: Future of Publishing -- Next Generation Publishing Models BIBAFull-Text 459
  Elisabeth Niggemann; Matthias Hemmje
This panel will look into the future of publishing as a process between authoring communities, such as scientific associations, and publishers; that is, between non-profit organizations on the one hand and commercial enterprises responsible for the production, marketing, sales, and distribution of publications on the other.
   Recently, triggered especially by the activities of the Open Archive Initiative and similar movements, the discussion between the different stakeholders in the scientific publishing process has become more intensive.
   It is generally questioned whether the traditional publishing models are still valid now that electronic publishing tools enable publishing from the desktop, and public distribution mechanisms like the web enable distribution at virtually no cost. These developments have produced a totally different scenario from the past.
   In this context the passing of intellectual property rights from authors to publishing companies is questioned in the same way as the validity of traditional business models, pricing policies, and access regulations. On the other hand, issues such as quality assurance and the cost of high quality production, distribution, and maintenance of intellectual collections cannot be neglected. The members of the panel are exemplary representatives of the different stakeholders in the scientific publishing process and will report on what has been achieved so far in the discussion and on questions that are still open. Their presentations will include their views on which agreements have to be reached in the future to organize and support the scientific publishing process in a fair way, while at the same time paving the way for organizational innovation in other areas of publication.
Digital Library Programs: Current Status and Future Plans BIBAFull-Text 460
  Erich J. Neuhold
For this panel, as for the conference itself, the term Digital Libraries has been broadened to include, besides library organisations, museums and archives as well as any other digital collections that are assumed to be of continuing interest for human users. In this way we distinguish them from other kinds of information collections that are of current interest but of little value for long-term availability and preservation.
   The members of the panel come from different countries and governmental organisations and will report on what has been achieved so far in the context of various governmental programs. Their presentations will include their views on what has to be done in the future for the Digital Library field, and what governmental programs may come forward to support research, development and use of these collective memories. Special emphasis can be expected on the cultural heritage aspects implied by these collections.