ECDL 2000: Proceedings of the European Conference on Digital Libraries

Fullname:ECDL 2000: Research and Advanced Technology for Digital Libraries: 4th European Conference
Editors:José Borbinha; Thomas Baker
Location:Lisbon, Portugal
Dates:2000-Sep-18 to 2000-Sep-20
Publisher:Springer Berlin Heidelberg
Series:Lecture Notes in Computer Science 1923
Standard No:DOI: 10.1007/3-540-45268-0; ISBN: 978-3-540-41023-2 (print), 978-3-540-45268-3 (online); hcibib: ECDL00
  1. Optical Recognition
  2. Information Retrieval
  3. Metadata
  4. Frameworks
  5. Multimedia
  6. Users
  7. Papers Complementing Invited Talks
  8. Multimedia
  9. Users in Digital Libraries
  10. Information Retrieval
  11. Internet Cataloguing
  12. Technical Collections
  13. Cases 1
  14. Cases 2
  15. Cases 3
  16. Special Workshop

Optical Recognition

Automatic Feature Extraction and Recognition for Digital Access of Books of the Renaissance 1-13
  Fernando Muge; Isabel Granado; M. Mengucci; Pedro Pina; Vitorino Ramos; N. Sirakov; João Rogério Caldas Pinto; A. Marcolino; Mário Ramalho; Pedro Vieira; A. Maia do Amaral
Antique printed books constitute a heritage that should be preserved and used. With novel digitising techniques it is now possible to have these books stored in digital format and accessible to a wider public. However, the problem of how to use them remains. DEBORA (Digital accEss to BOoks of the RenAissance) is a European project that aims to develop a system to interact with these books through world-wide networks. The main issue is to build a database accessible through client computers. This requires building accompanying metadata that characterise the different components of the books, such as illuminated letters, banners, figures and key words, in order to simplify and speed up remote access. To solve these problems, digital image analysis algorithms for filtering, segmentation, separation of text from non-text, line and word segmentation, and word recognition were developed. Some novel ideas are presented and illustrated through examples.
Content Based Indexing and Retrieval in a Digital Library of Arabic Scripts and Calligraphy 14-23
  Suliman Al-Hawamdeh; Gul N. Khan
Due to the cursive nature of Arabic scripts, automatic recognition of keywords using computers is very difficult. Content-based indexing using textual, graphical and visual information combined provides a more realistic and practical approach to the problem of indexing large collections of calligraphic material. Starting with low-level pattern recognition and feature extraction techniques, graphical representations of the calligraphic material can be captured to form the low-level indexing parameters. These parameters are then enhanced using textual and visual information provided by the users. Through visual feedback and visual interaction, recognized textual information can be used to enhance the indexing parameters and in turn improve the retrieval of the calligraphic material. In this paper, we report an implementation of the system and show how visual feedback and visual interaction help to improve the indexing parameters created using the low-level image feature extraction technologies.
Ancient Music Recovery for Digital Libraries 24-34
  João Rogério Caldas Pinto; Pedro Vieira; Mário Ramalho; M. Mengucci; Pedro Pina; Fernando Muge
This paper describes the current state of the "ROMA" (Reconhecimento Óptico de Música Antiga, or Ancient Music Optical Recognition) project, which consists of building an application for the recognition and restoration of ancient music manuscripts (from the 16th to the 18th century). Beyond making an inventory of the musical holdings of the Biblioteca Geral da Universidade de Coimbra, the project aims to develop algorithms for score restoration and musical symbol recognition in order to allow a suitable representation and restoration in digital format. Both objectives have an intrinsic research nature, one in the area of musicology and the other in digital libraries.
Probabilistic Automaton Model for Fuzzy English-Text Retrieval 35-44
  Manabu Ohta; Atsuhiro Takasu; Jun Adachi
Optical character reader (OCR) misrecognition is a serious problem when searching against OCR-scanned documents in databases such as digital libraries. This paper proposes fuzzy retrieval methods for English text that contains errors in the recognized text without correcting the errors manually. Costs are thereby reduced. The proposed methods generate multiple search terms for each input query term based on probabilistic automata reflecting both error-occurrence probabilities and character-connection probabilities. Experimental results of test-set retrieval indicate that one of the proposed methods improves the recall rate from 95.56% to 97.88% at the cost of a decrease in precision rate from 100.00% to 95.52% with 20 expanded search terms.

Information Retrieval

Associative and Spatial Relationships in Thesaurus-Based Retrieval 45-58
  Harith Alani; Christopher B. Jones; Douglas Tudhope
The OASIS (Ontologically Augmented Spatial Information System) project explores terminology systems for thematic and spatial access in digital library applications. A prototype implementation uses data from the Royal Commission on the Ancient and Historical Monuments of Scotland, together with the Getty AAT and TGN thesauri. This paper describes its integrated spatial and thematic schema and discusses novel approaches to the application of thesauri in spatial and thematic semantic distance measures. Semantic distance measures can underpin interactive and automatic query expansion techniques by ranking lists of candidate terms. We first illustrate how hierarchical spatial relationships can be used to provide more flexible retrieval for queries incorporating place names in applications employing online gazetteers and geographical thesauri. We then employ a set of experimental scenarios to investigate key issues affecting use of the associative (RT) thesaurus relationships in semantic distance measures. Previous work has noted the potential of RTs in thesaurus search aids but the problem of increased noise in result sets has been emphasised. Specialising RTs allows the possibility of dynamically linking RT type to query context. Results presented in this paper demonstrate the potential for filtering on the context of the RT link and on subtypes of RT relationships.
Experiments on the Use of Feature Selection and Negative Evidence in Automated Text Categorization 59-68
  Luigi Galavotti; Fabrizio Sebastiani; Maria Simi
We tackle two different problems of text categorization (TC), namely feature selection and classifier induction. Feature selection (FS) refers to the activity of selecting, from the set of r distinct features (i.e. words) occurring in the collection, the subset of r' ≪ r features that are most useful for compactly representing the meaning of the documents. We propose a novel FS technique, based on a simplified variant of the χ² statistic. Classifier induction refers instead to the problem of automatically building a text classifier by learning from a set of documents pre-classified under the categories of interest. We propose a novel variant, based on the exploitation of negative evidence, of the well-known k-NN method. We report the results of systematic experimentation of these two methods performed on the standard Reuters-21578 benchmark.
The Benefits of Displaying Additional Internal Document Information on Textual Database Search Result Lists 69-82
  Offer Drori
Most information systems that perform computerized searches of textual databases deal with the need to display a list of documents which fulfill the search criteria. The user must choose, from a list of documents, those documents which are relevant to his search query. Selection of the relevant documents is problematic, especially during searches of large databases which have a large number of documents fulfilling the search criteria. This article defines a new hierarchical tree which is made up of three levels of display of search results. In a series of previous studies (not yet published) which were carried out at the Hebrew University in Jerusalem, the influence of information (within the documents) displayed to the user was examined in the framework of a list of responses to questions regarding user satisfaction with the method and the quality of his choices. In the present study, in addition to the information displayed in the list, information on the contents (subject) of the document was also displayed. The study examined the influence of this additional information on search time, user satisfaction and ease of using the systems.
Interactive-Time Similarity Search for Large Image Collections Using Parallel VA-Files 83-92
  Roger Weber; Klemens Böhm; Hans-Jörg Schek
In digital libraries, nearest-neighbor search (NN-search) plays a key role for content-based retrieval over multimedia objects. However, performance of existing NN-search techniques is not satisfactory with large collections and with high-dimensional representations of the objects. To obtain response times that are interactive, we pursue the following approach: it uses a linear algorithm that works with approximations of the vectors and parallelizes it. In more detail, we parallelize NN-search based on the VA-File in a Network of Workstations (NOW). This approach reduces search time to a reasonable level for large collections. The best speedup we have observed is by almost 30 for a NOW with only three components with 900 MB of feature data. But this requires a number of design decisions, in particular when taking load dynamism and heterogeneity of components into account. Our contribution is to address these design issues.

Metadata

Dublin Core Metadata for Electronic Journals 93-102
  Ann Apps; Ross MacIntyre
This paper describes the design of an electronic journals application where the article header information is held as Dublin Core metadata. Current best practice in the use of Dublin Core for bibliographic data description is indicated where this differs from pragmatic decisions made when the application was designed. Using this working application as a case study to explore the specification of a metadata schema to describe bibliographic data indicates that the use of Dublin Core metadata is viable within the journals publishing sector, albeit with the addition of some local, domain-specific extensions.
Keywords: Dublin Core; metadata; bibliographic citation; electronic journals.
An Event-Aware Model for Metadata Interoperability 103-116
  Carl Lagoze; Jane Hunter; Dan Brickley
We describe the ABC modeling work of the Harmony Project. The ABC model provides a foundation for understanding interoperability of individual metadata modules -- as described in the Warwick Framework -- and for developing mechanisms to translate among them. Of particular interest in this model is an event, which facilitates understanding of the lifecycle of resources and the association of metadata descriptions with points in this lifecycle.
QUEST -- Querying Specialized Collections on the Web 117-126
  Martin Heß; Christian Mönch; Oswald Drobnik
Ensuring access to specialized web-collections in a fast evolving web environment requires flexible techniques for orientation and querying. The adoption of meta search techniques for web-collections is hindered by the enormous heterogeneity of the resources. In this paper we introduce QUEST -- a system for querying specialized collections on the web. One focus of QUEST is to unify search fields from different collections by relating the search concepts to each other in a concept-taxonomy. To identify the most relevant collections according to a user query, we propose an association-based strategy. Furthermore, the Frankfurt Core is introduced -- a metadata-scheme for describing web-collections as a whole. Its fields are filled automatically by a metadata-collector component. Finally a prototype of QUEST is presented, demonstrating the integration in an overall architecture.
Personal Data in a Large Digital Library 127
  José Manuel Barrueco Cruz; Markus J. R. Klink; Thomas Krichel
The RePEc Economics library offers the largest distributed source of freely downloadable scientific research reports in the world. RePEc also contains details about Economics institutions, publication outlets and people working in the field. All this data forms a large relational dataset. In this paper we describe HoPEc, a system that implements access control for the records of personal data within RePEc. The bulk of these records describe the authors of documents. These records are maintained by the authors themselves. We discuss the technical and social aspects of this system.

Frameworks

Implementing a Reliable Digital Object Archive 128-143
  Brian F. Cooper; Arturo Crespo; Hector Garcia-Molina
An Archival Repository reliably stores digital objects for long periods of time (decades or centuries). The archival nature of the system requires new techniques for storing, indexing, and replicating digital objects. In this paper we discuss the specialized indexing needs of a write-once archive. We also present a reliability algorithm for effectively replicating sets of related objects. We describe a data import utility for archival repositories. Finally, we discuss and evaluate a prototype repository we have built, the Stanford Archival Vault (SAV).
Policy-Carrying, Policy-Enforcing Digital Objects 144-157
  Sandra Payette; Carl Lagoze
We describe the motivation for moving policy enforcement for access control down to the digital object level. The reasons for this include handling of item-specific behaviors, adapting to evolution of digital objects, and permitting objects to move among repositories and portable devices. We then describe our experiments that integrate the Fedora architecture for digital objects and repositories and the PoET implementation of security automata to effect such object-centric policy enforcement.
INDIGO -- An Approach to Infrastructures for Digital Libraries 158-167
  Christian Mönch
In this paper INDIGO, an approach to infrastructures for digital libraries, is presented. It fulfills two crucial requirements for digital libraries: scalability and the ability to handle newly evolving document types. Based on a classification of digital library architectures, the main reasons for limited scalability and extensibility of digital libraries are identified. To overcome the identified problems, the concept of mobile structure knowledge, on which INDIGO is based, is developed. The architecture of INDIGO is outlined and examples for the application of the concept are given.
Scalable Digital Libraries Based on NCSTRL/Dienst 168-179
  Kurt Maly; Mohammad Zubair; Hesham Anan; Dun Tan; Yunchuan Zhang
NCSTRL (The Networked Computer Science Technical Report Library) is a successful digital library for scientific and technical information. It uses the Dienst protocol that was developed by the ARPA-funded CS-TR project. We encountered several problems while implementing NCSTRL-based large-scale libraries: UPS for Los Alamos and JDL for JTASC. The document collections for these libraries can range from several hundred thousand to a few million documents. First, we found that the native Dienst implementation does not scale beyond approximately 30,000 records. Second, we found that the implementation is tightly coupled to the Unix platform. Finally, for a large number of hits the NCSTRL search interface support is limited in terms of usability. To address these problems, we replaced the Dienst repository service implementation with an Oracle-based implementation using servlet technology. The Oracle database stores the index information (metadata) and is partitioned horizontally to speed searching through different archives. Furthermore, indexes were built in order to speed the search by different key items such as the author name, the title and the abstract. Our implementation significantly reduced the average wait time for a user for searches that resulted in a large number of hits. In addition, we get all the other benefits of using servlet technology such as efficiency and portability. In this paper, we present the performance results of the new implementation and compare it with that of the implementation of the Dienst protocol in NCSTRL.

Multimedia

OMNIS/2: A Multimedia Meta System for Existing Digital Libraries 180-189
  Günther Specht; Michael G. Bauer
Since today more and more complementary information is available in different electronic media there is an increasing demand for the integration of traditional digital library systems and multimedia systems. In this paper we present the OMNIS/2 system, which is an advanced meta system and enhances existing digital library systems or retrieval systems by additional storing and indexing of user-defined multimedia documents, automatic and personal linking concepts, annotations, filtering and personalization. The key concept of OMNIS/2 is that all of the above mentioned features are accomplished without changing the underlying documents. In our architecture existing digital library systems, which are established applications, serve as a document storage layer, while OMNIS/2 forms the multimedia storage layer, linking layer and personalization layer. This general approach ensures the integration and transparent combination of different digital library systems. Thus with OMNIS/2, even mere retrieval systems -- and nowadays most digital library systems are mere retrieval systems -- can be enriched to interactive multimedia DL-systems and are combined into one virtual personal digital library. OMNIS/2 is part of the Global Inventory Project of the G7 countries.
Modeling Archival Repositories for Digital Libraries 190-205
  Arturo Crespo; Hector Garcia-Molina
This paper studies the archival problem: how a digital library can preserve electronic documents over long periods of time. We analyze how an archival repository can fail and we present different strategies that help solve the problem. We introduce ArchSim, a simulation tool for evaluating an implementation of an archival repository system and comparing options such as different disk reliabilities, error detection and correction algorithms, preventive maintenance, etc. We use ArchSim to analyze a case study of an Archival Repository for technical reports.
Implementation and Analysis of Several Keyframe-Based Browsing Interfaces to Digital Video 206-218
  Hyowon Lee; Alan F. Smeaton; Catherine Berrut; Noel Murphy; Seán Marlow; Noel E. O'Connor
In this paper we present a variety of browsing interfaces for digital video information. The six interfaces are implemented on top of Físchlár, an operational recording, indexing, browsing and playback system for broadcast TV programmes. In developing the six browsing interfaces, we have been informed by the various dimensions which can be used to distinguish one interface from another. For this we include layeredness (the number of "layers" of abstraction which can be used in browsing a programme), the provision or omission of temporal information (varying from full timestamp information to nothing at all on time) and visualisation of spatial vs. temporal aspects of the video. After introducing and defining these dimensions we then locate some common browsing interfaces from the literature in this 3-dimensional "space" and then we locate our own six interfaces in this same space. We then present an outline of the interfaces and include some user feedback.
Functional and Intentional Limitations of Interactivity on Content Indexing Topics: Possible Uses of Automatic Classification and Contents Extraction Systems, in Order to Create Digital Libraries Databases 219-228
  Florent Pasquier
The creation of self-learning content is being revolutionised by the use of digital tools. Video production was previously based on analog tools. It is now moving towards digital ones: studios are equipped to work with digital data and digital materials under the pressure of international standards and components. At the same time, the capture and storage of pictures on videotape is evolving towards hybrid (digital/video) media, which will soon become entirely digital (computer hard disks and networks). The current creation process, which uses a kind of simulated interactivity, awaits new materials and synopses that will allow true and complete interactivity. This interactivity needs to be delivered in a fully digital multimedia way. The current evolution of digital systems on the market seems to strengthen the role of the creator and the designer. They might start their creation from existing, genuine videos, picking from video databases the ones that best suit their needs. But the problem is being able to locate the video wanted. Many automatic recognition systems (voice, pictures, videos...) are already functional and can help with this heavy task. But the question of how to use them in a relevant context, such as education, is still not resolved.

Users

Interaction Profiling in Digital Libraries through Learning Tools 229-238
  Giovanni Semeraro; Floriana Esposito; Nicola Fanizzi; Stefano Ferilli
We present improvements to a learning module, the Learning Server, to be exploited in a digital library system for supporting document management tasks as well as for providing a form of user interface adaptivity based on user classification. Indeed, our system is equipped with a web-based environment endowed with visual tools that are designed to improve the interaction of inexperienced users and to support experienced users in accomplishing their retrieval tasks effectively. By logging user interaction, the Learning Server is able to suggest the most suitable interaction tools for each user.
DEBORA: Developing an Interface to Support Collaboration in a Digital Library 239-248
  David M. Nichols; Duncan Pemberton; Salah Dalhoumi; Omar Larouk; Claire Belisle; Michael Twidale
Interfaces to library systems have largely failed to represent the inherently collaborative nature of information work. This paper describes how collaborative functionality is being implemented as part of the DEBORA project to provide access to digitised Renaissance documents. Work practices of users of Renaissance documents are described and the collaborative features of the client software are outlined. Functionalities discussed include annotation, the creation of virtual books and the inclusion of user-supplied metadata.
Children as Design Partners and Testers for a Children's Digital Library 249-258
  Yin Leng Theng; Norliza Mohd-Nasir; Harold W. Thimbleby; George Buchanan; Matt Jones; David Bainbridge; Noel Cassidy
Most of today's digital libraries (DLs) are not designed for children. To produce usable and useful DLs, designers need to ensure that good design features are incorporated, taking into consideration users' needs. We describe our experience working with children as design partners and testers in building a children's DL of stories and poems for 11-14 year olds, using a concrete example to demonstrate our design philosophy and research approach. The study provides insights on useful design features children's DLs should have, and their importance to children. The initial work we have done highlights issues and provides a basis for the building of usable and useful digital libraries for children.
Evaluating a User-Model Based Personalisation Architecture for Digital News Services 259-268
  Alberto Díaz Esteban; Pablo Gervás Gómez-Navarro; Antonio García Jiménez
An architecture that provides personalised filtering and dissemination of news items is presented. It is based on user profiles and it provides mechanisms that allow the user to control and tailor to his own needs the interaction between three different sources of relevance judgements: the existing newspaper categorisation by sections, basic information retrieval on user selected keywords, and an additional operation of automatic categorisation against an alternative hierarchy of categories. These three tiers cover some of the most promising access methods for digital libraries. The proposed architecture has been implemented and evaluation results are presented, covering user response, system efficiency, and user preferences regarding the set of methods made available to them.

Papers Complementing Invited Talks

Aging Links 269-279
  Claudia Niederée; Ulrike Steffens; Joachim W. Schmidt; Florian Matthes
Rooted in the principle of hypertext, linked information is almost ubiquitous due to the WWW and related services. Links between information objects are established for various reasons, aspects of which are encoded by a link type or expressed through link context, e.g., by the surrounding content. Such reasons may lose their validity through content evolution in the link target. Fine-grained solutions are required that enable the user to gain evolution awareness without being distracted from his main task. In this paper we present aging links as a non-intrusive mechanism for improving awareness in cooperative work with linked information networks. A link may age, affected by the evolution of the link target, leading to a gradual loss of its validity. The aging process is driven by evolution-indicating events and may be flexibly controlled by link type specific aging strategies. A customizable service, EvEnAge, based on standard technologies, prototypically implements the concept of aging links for XML documents.
Core Elements of Digital Gazetteers: Placenames, Categories, and Footprints 280-290
  Linda L. Hill
The core elements of a digital gazetteer are the placename itself, the type of place it labels, and a geographic footprint representing its location and possibly its extent. Such gazetteer data is an important component of indirect geographic referencing through placenames. Based on the gazetteer development work of the Alexandria Digital Library, this paper presents the nature of placenames, and the process of assigning categories to places based on the words in the placenames and other information, and discusses the nature of georeferencing places with geographic footprints.
The Application of an Event-Aware Metadata Model to an Online Oral History Archive 291-304
  Jane Hunter; Darren James
In this paper we test the ABC event-aware metadata model, developed within the Harmony project, by applying it to a complex multimedia oral history archive. Based on a metadata schema, generated using the ABC model, we developed indexing tools, a database and a search and browse Web interface, for an oral history collection consisting of audio tapes and posters generated from a series of interviews and photographs. The objective was to build a testbed to test and refine the ABC model and also to demonstrate that use of the model will ensure consistent, well-structured, unambiguous metadata descriptions for complex multimedia collections. Such descriptions will hopefully lead to improved fine-grained resource discovery, interoperability between different metadata schemes and explicit tracking of intellectual property rights.
From the Visual Book to the WEB Book: The Importance of Good Design 305-314
  Monica Landoni; Ruth Wilson; Forbes Gibb
This paper presents the results of two studies into electronic book production. The Visual Book study [1] explored the importance of the visual component of the book metaphor for the production of more effective electronic books, while the WEB Book study [2] took the findings of the Visual Book and applied them to the production of books for publication on the WWW. Both studies started from an assessment of which kinds of paper book are more suitable for conversion into electronic form, and both identified as target publications those which are meant to be used for reference rather than those which are read sequentially and usually in their entirety by users. This includes scientific publications and textbooks which have been chosen for the Visual Book and the WEB Book experiments. In this paper we discuss the results of the two studies and the way they could influence the design and production of more effective electronic books.

Multimedia

Topic Detection in Read Documents 315-318
  Rui Amaral; Isabel Trancoso
This paper addresses the problem of topic annotation in the speech retrieval domain. It describes an algorithm developed to perform automatic topic annotation of broadcast news (BN) speech corpora. The adopted approach is based on Hidden Markov Models (HMM) and topic language models, solving the topic segmentation and labelling tasks simultaneously. To overcome the lack of topic labelled material for training statistical models, a two-stage unsupervised clustering was developed. Both stages are based on the nearest-neighbour search method, using the Kullback-Leibler distance. On-going experiments to evaluate the system performance are also described.
Map Segmentation by Colour Cube Genetic K-Mean Clustering 319-323
  Vitorino Ramos; Fernando Muge
In this work, a method is described for evolving adaptive procedures for colour image segmentation. We formulate the segmentation problem as an optimisation problem and adopt the evolutionary strategy of Genetic Algorithms (GA) for the clustering of small regions in colour feature space. The present approach incorporates k-Means unsupervised clustering methods into GA, namely to guide the latter evolutionary algorithm in its search for the optimal or sub-optimal data partition, a task that, as we know, requires a non-trivial search because of its intrinsic NP-complete nature. To solve this task, the appropriate genetic coding is also discussed, since this is a key aspect in the implementation. Our purpose is to demonstrate the efficiency of GA for automatic and unsupervised texture segmentation. Some examples on colour maps are presented and overall results discussed.
Spoken Query Processing for Information Access in Digital Libraries 324-327
  Fabio Crestani
We briefly outline the ongoing research at Strathclyde University on the use of spoken query processing for information access in digital libraries.
A Metadata Model for Historical Documentary Films 328-331
  Giuseppe Amato; Donatella Castelli; Serena Pisani
This paper presents a metadata model for historical audio-video material able to describe information for supporting both the traditional archival functions and the advanced applications, like video summary, speech recognition, and automatic semantic content extraction.
Image Description and Retrieval Using MPEG-7 Shape Descriptors 332-335
  Carla Zibreira; Fernando Pereira
The increasing amount of digital audiovisual information and the need to efficiently and effectively describe and retrieve this information as well as the big technological developments in the related domains have been acknowledged by MPEG (Moving Pictures Experts Group) by initiating a new work item, formally called "Multimedia Content Description Interface" but better known as MPEG-7. This paper will introduce the MPEG-7 standard, with special emphasis on the adopted shape descriptors: Curvature Scale Space and Zernike Moments. Finally, the description and retrieval mechanism based on the MPEG-7 shape descriptors developed will be presented.
A Large Scale Component-Based Multi-media Digital Library System BIBAFull-Text 336-339
  Hiroshi Mukaiyama
The Next Generation Digital Library Project has developed a 3-tier client/server system using CORBA and agent technology, aiming at component-based digital library development. We devised new functions, such as intellectual property rights management and billing, among the digital library's functions and conducted a user evaluation test. This paper presents these technologies and the evaluation results.

Users in Digital Libraries

Personalised Delivery of News Articles from Multiple Sources BIBAFull-Text 340-343
  Gareth J. F. Jones; David J. Quested; Katherine E. Thomson
Traditional news media report a single set of articles on current news stories. Online news sources make multiple stories on the same topic available reflecting different perspectives on the same news event. Navigating between these news sources to find stories of interest can be time consuming and inefficient. These multiple stories can be combined into personalised news packages by selecting items on topics of interest to an individual user. The appropriate contents of these personalised news packages can be determined by a combination of information retrieval techniques and explicit user preferences. This paper describes systems exploring this approach to personalised news delivery.
Building a Digital Library of Web News BIBAFull-Text 344-347
  Nuno Maria; Mário J. Silva
We introduce a new information system for organization of a Digital Library of news articles found on the Web, with automatic topic classification. We present our strategies to deal with different update frequencies of news Web sites, the classification methodology, the data model for storing news articles, measurements on the data retrieved and finally results of classification of this type of information.
Automatically Detecting and Organizing Documents into Topic Hierarchies: A Neural Network Based Approach to Bookshelf Creation and Arrangement BIBAFull-Text 348-351
  Andreas Rauber; Michael Dittenbach; Dieter Merkl
With the increasing amount of information available in electronic document collections, methods for organizing these collections to allow topic-oriented browsing and orientation gain importance. The SOMLib Digital Library System provides such an organization based on the self-organizing map, a popular neural network model. In this paper, we present the GHSOM, which, based on the same concepts, allows an automatic hierarchical decomposition and organization of documents, which very intuitively reflects the organization typically found in (manually organized) conventional libraries. We present a case study based on a 3-month article collection from an Austrian daily newspaper.
Daffodil: Distributed Agents for User-Friendly Access of Digital Libraries BIBAFull-Text 352-355
  Norbert Gövert; Norbert Fuhr; Claus-Peter Klas
The Internet makes searching for literature in Digital Libraries (DLs) feasible. However, often a user has to contact several DLs to satisfy a given information need. This leads to usability problems due to the heterogeneity of the DLs. One aspect is that the information structures of the systems differ. In fact, relevant information may be spread across several DLs. The other aspect of heterogeneity is differing browsing and searching functionality, of course presented to the user through different user interfaces and query languages.
An Adaptive Systems Approach to the Implementation and Evaluation of Digital Library Recommendation Systems BIBAFull-Text 356-359
  Johan Bollen; Luis Mateus Rocha
The focus for information retrieval systems in digital libraries has shifted from passive repositories of information to recommendation systems that actively participate in retrieving useful information, and can furthermore learn from the retrieval behavior of users. We propose a novel evaluation methodology for such systems based on the concepts of shared knowledge structures, and system development reliability and validity.
Are End-Users Satisfied by Using Digital Libraries? BIBAFull-Text 360-363
  Mounir A. Khalil
There are many books and journal articles written and published about the Digital Library (also denoted as Electronic Library or Virtual Library) which present the definition, description, components, usefulness, etc. of the Digital Library -- but nothing has been mentioned or written about the satisfaction of users in accessing needed information. A questionnaire was developed to survey the behavior of end-users and measure their level of understanding of the meaning of the Digital Library. Below are the results of the global survey. Electronic copyright and licensing as well as their effects upon research and education are discussed.

Information Retrieval

CAP7: Searching and Browsing in Distributed Document Collections BIBAFull-Text 364-367
  Norbert Fuhr; Kai Großjohann; Stefan Kokkelink
This paper describes CAP7, a system for searching and browsing in distributed document (metadata) collections. The system architecture is similar to Harvest, comprising gatherer components and a retrieval engine; but instead of the limited SOIF data format, we use RDF and XML. The gatherer creates RDF metadata descriptions of collected resources. Before delivering the data to the retrieval engine, the RDF is transformed into valid XML. The query language supported by the retrieval engine is an extension of XQL with weighting and data types with vague predicates. The user interface provides for browsing as well as simple searching.
Representing Context-Dependent Information Using Multidimensional XML BIBAFull-Text 368-371
  Yannis Stavrakas; Manolis Gergatsoulis; Theodoros Mitakos
XML (eXtensible Markup Language) is emerging as a new standard for data representation and exchange over the Web [3]. It is a markup language that resembles HTML, but unlike HTML, it focuses on the structure of data rather than on their presentation. The extensibility of XML makes it an ideal candidate for integration and manipulation of Web data through a common data model. A large number of DTDs (XML Document Type Definitions) that target all sorts of information domains has already been developed, and new DTDs are released at a fast pace. XML claims to be the enabling technology for application interoperability and for a unified view of heterogeneous information.
AQUA (Advanced Query User Interface Architecture) BIBAFull-Text 372-375
  László Kovács; András Micsik; Balázs Pataki; István Zsámboki
AQUA is an experimental query interface which supports iterative query refinement. Currently, it can be used as an alternative query interface for NCSTRL (Networked Computer Science Technical Reports Library) and ETRDL (ERCIM Technical Reference Digital Library). As a demonstration of the extensibility of the AQUA user interface paradigm, a rating facility has been added to the system.
Fusion of Overlapped Result Sets BIBAKFull-Text 376-379
  Joaquim Macedo; António Costa; Vasco Freitas
The existence of replicated documents or indexes in distributed information retrieval introduces some level of overlap between its component databases. In this paper, a new data fusion method that takes the overlap information into account is compared with a conventional fusion method. The overall system effectiveness evaluation enables preliminary conclusions about the importance of such a parameter for data fusion in distributed information retrieval.
Keywords: Distributed Information Retrieval; Replication; Data Fusion
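To make the idea of overlap-aware fusion concrete, here is a small sketch that merges ranked lists from several databases. It is not the method of the paper, which the abstract does not specify. It uses the well-known CombSUM and CombMNZ score combinations as stand-ins, and the function name, the `overlap_bonus` flag, and the min-max normalisation are assumptions.

```python
from collections import defaultdict


def fuse_results(result_lists, overlap_bonus=True):
    """Merge ranked lists of (doc_id, score) pairs from several databases.

    Scores are min-max normalised per list and summed per document (CombSUM).
    With overlap_bonus=True the sum is multiplied by the number of lists that
    returned the document (CombMNZ), so a replicated document is rewarded
    once instead of being listed twice.
    """
    totals = defaultdict(float)
    hits = defaultdict(int)
    for results in result_lists:
        if not results:
            continue
        scores = [score for _, score in results]
        lo, hi = min(scores), max(scores)
        span = hi - lo
        for doc_id, score in results:
            norm = (score - lo) / span if span else 1.0
            totals[doc_id] += norm
            hits[doc_id] += 1
    fused = {d: totals[d] * (hits[d] if overlap_bonus else 1) for d in totals}
    # Highest fused score first; ties broken by document id for determinism.
    return sorted(fused.items(), key=lambda kv: (-kv[1], kv[0]))
```

For example, `fuse_results([[("d1", 9), ("d2", 5)], [("d2", 8), ("d4", 4)]])` returns each document once, with `d2` ranked first because both databases returned it.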
ActiveXML: Compound Documents for Integration of Heterogeneous Data Sources BIBAFull-Text 380-384
  João P. Campos; Mário J. Silva
We address the problem of automatic composition of XML documents with data from multiple sources and their presentation to large groups of users with different information requirements.
newsWORKS©, the Complete Solution for Digital Press Clippings and Press Reviews: Capture of Information in an Intelligent Way BIBAFull-Text 385-388
  Begoña Aguilera Caballero; Richard Lehner
A new software solution is presented, specially designed for the electronic handling of press clippings, in order to build press archives and to produce press reviews digitally. Based on the latest standard technology available on the market, newsWORKS© is able to automate the layout analysis of the newspaper and the recognition of the articles. It also offers the best OCR tools, together with manual tools for adding the intellectual work that in the end has to be done by specialists (intellectual indexing).

Internet Cataloguing

Effects of Cognitive and Problem Solving Style on Internet Search Tool BIBAFull-Text 389-394
  Lim Tek Yong; Tang Enya Kong
This paper presents a research proposal on a user-oriented evaluation method to compare the usability of Internet search tools. Cognitive style and problem solving style are the identified individual difference factors. Meta-search engines, portals and individual search engines are the Internet search tools available. The usability of each search tool, based on relevancy and satisfaction, is another factor of this study. The ultimate aim of the research is to contribute to the knowledge concerning individual differences and information retrieval technology. In particular we hope to get a better understanding of which presentation structures and user interface attributes work best and why.
Follow the Fox to Renardus: An Academic Subject Gateway Service for Europe BIBAFull-Text 395-398
  Lesly Huxley
Renardus is a collaborative project of the EU's Information Society Technologies programme with partners from national libraries, university research and technology centres and subject gateways Europe-wide. Its aim is to build a single search and browse interface to existing quality-controlled European subject gateways. The project will investigate related technical, information and organisational issues, build a pilot system and develop a fully-operational broker service. This paper provides an overview of the project, work in progress and anticipated results and outlines the opportunities and benefits for future collaboration in developing the service.
CORC: Helping Libraries Take a Leading Role in the Digital Age BIBAFull-Text 399-402
  Kay Covert
The OCLC Cooperative Online Resource Catalog is helping librarians thrive in the digital age. Librarians are using CORC to select, describe, maintain, and provide guided access to Web-based electronic resources. Librarians in more than 24 countries are using CORC, and all types of libraries, including public, academic, corporate, school, and government libraries, are contributing records to the CORC catalog. The CORC service offers a Web-based toolset for cataloging electronic resources, a robust database of high-quality resources, and a tool for building dynamic pathfinders. Developed by and for librarians, CORC blends three key elements -- technology, cooperation and librarianship -- to help librarians define the future of knowledge access management.
Automatic Web Rating: Filtering Obscene Content on the Web BIBAFull-Text 403-406
  Konstantinos Chandrinos; Ion Androutsopoulos; Georgios Paliouras; Constantine D. Spyropoulos
We present a method to automatically detect pornographic content on the Web. Our method combines techniques from language engineering and image analysis within a machine-learning framework. Experimental results show that it achieves nearly perfect performance on a set of hard cases.
The Bibliographic Management of Web Documents in Digital and Hybrid Libraries BIBAFull-Text 407-412
  Wallace C. Koehler, Jr.
Web documents present digital and hybrid librarians with a set of bibliographic management issues heretofore of no or minor significance for materials in print. These include frequent content change as well as the rate at which Web documents are removed or moved by their authors or creators. There have been a number of author-side and cataloger-side initiatives to assist in the management of Web documents, but these do not adequately address change. This paper explores some of those options. It addresses the impact of document change and demise on digital and hybrid collections. It offers suggestions on the management of the change and demise phenomena.

Technical Collections

The Economic Impact of an Electronic Journal Collection on an Academic Library BIBAFull-Text 413-417
  Carol Hansen Montgomery; John A. Bielec
This paper provides information on the economic impact of the transition from print to electronic journals in an academic library. The technological orientation of the university and a robust network infrastructure have made it possible for Drexel to make this transition more quickly than most, if not all, U.S. academic libraries. Shifts in costs occur in all budget areas: capital (space and network infrastructure), staffing, purchased services, materials, supplies and equipment. Overall, costs are higher, but preliminary data indicates that in Drexel's case per journal and per article costs are lower.
A Comparative Transaction Log Analysis of Two Computing Collections BIBAFull-Text 418-423
  Malika Mahoui; Sally Jo Cunningham
Transaction logs are invaluable sources of fine-grained information about users' search behavior. This paper compares the searching behavior of users across two WWW-accessible digital libraries: the New Zealand Digital Library's Computer Science Technical Reports collection (CSTR), and the Karlsruhe Computer Science Bibliographies (CSBIB) collection. Since the two collections are designed to support the same type of users -- researchers/students in computer science -- a comparative log analysis is likely to uncover common searching preferences for that user group. The two collections differ in their content, however; the CSTR indexes a full text collection, while the CSBIB is primarily a bibliographic database. Differences in searching behavior between the two systems may indicate the effect of differing search facilities and content type.
ERAM -- Digitisation of Classical Mathematical Publications BIBAFull-Text 424-427
  Hans Becker; Bernd Wegner
Longevity is typical of research achievements in mathematics. Hence, to improve the availability of the classical publications in that area and to enable users to get quick information on these, electronic literature information services and digital archives of the complete texts will be needed as important tools for mathematical research in the future. This will bring the holdings from the journal archives nearer to the user and will prevent loss of the papers through the deterioration of the paper with age.
The Electronic Library in EMIS -- European Mathematical Information Service BIBAFull-Text 428-431
  Bernd Wegner
The idea to develop the European Mathematical Information Service EMIS was born at the meeting of the executive committee of the EMS (European Mathematical Society) in Cortona/Italy, October 1994. It was decided to set up a system of electronic servers in Europe for Mathematics under the auspices of the EMS, and this was extended very soon to the current version of a central server collecting mathematical information and distributing this through a world-wide system of mirror servers. The installation of the central server began in March 1995 in co-operation with FIZ Karlsruhe at the editorial office of Zentralblatt für Mathematik in Berlin. In June 1995 EMIS went on-line under the URL http://www.emis.de/. The first mirrors were established very soon in Lisbon, Southampton and Marseilles.
Model for an Electronic Access to the Algerian Scientific Literature: Short Description BIBAFull-Text 432-436
  Bakelli Yahia
Within the Digital-IST project (undertaken by CERIST since 1998) we study a mechanism for introducing the PC, Multimedia and the Internet into the Algerian scientific publishing system, and how Electronic Publishing (EP) technology can be exploited for better access to and organization of the national scientific literature. A survey of 130 higher education teachers and researchers and a systematic analysis of academic websites [1] were carried out.

Cases 1

A Digital Library of Native American Images BIBAFull-Text 437-440
  Elaine Peterson
This paper summarizes the organizational and technical issues involved in creating a digital library of Native American images. Initial participants include a museum, an archives, and three university libraries. Using Oracle software, the shared images now constitute a database searchable by subject, date, photographer/artist, tribe, geographic location, and format of the material. Dynamic links are provided to the textual collections which house the physical images. The database resides at: http://libmuse.msu.montana.edu:4000/nad/nad.home
China Digital Library Initiative and Development BIBAFull-Text 441-444
  Michael Bailou Huang; Guohui Li
The construction of digital library systems and digital library resource databases by utilizing advanced information technologies has become not only a tremendous challenge but also an important opportunity for information enterprises and large-scale information resource collection centers such as national libraries in various countries. Over 20 countries and regions worldwide are actively involved in building digital libraries. China is no exception. Supported by the State Council and Ministry of Culture, China's digital library project is underway in earnest. This paper aims to present an overview of current research and development of the digital library in China. It will discuss in particular the preparation work and construction of the digital library resource databases and analyze critical problems that need to be solved in the construction of China's digital libraries.
Appropriation of Legal Information: Evaluation of Data Bases for Researchers BIBAFull-Text 445-448
  Céline Hembise
This paper presents the results of a study conducted as part of a project of the Region Rhône-Alpes entitled "Textual engineering and digital libraries". The aim of this survey is to bring to light how researchers in law appropriate legal texts, in order to create a workstation specific to the jurist that lets him access and appropriate a digital text easily. Online legal information services are still little used by researchers and have to be developed to help users query online information.
Publishing 30 Years of the Legislation of Brazil's São Paulo State in CD-ROM and Internet BIBAFull-Text 449-451
  Paulo Leme; Dilson da Costa; Ricardo Baccarelli; Maurício Barbosa; Andréa Bolanho; Ana Reis; Rose Bicudo; Eduardo F. Barbosa; Márcio Nunes; Innocêncio Pereira Filho; Guilherme Plonski; Sérgio Kobayashi
About 60,000 legal acts covering the last 30 years of São Paulo State legislation are published on CD-ROM and the Internet. The effective search engine implemented yields a swift and intuitive system that uses hypertext features for easy navigation through related legal acts. It is already bringing about a significant improvement in administrative and legal procedures in government offices, legislative and judiciary departments, municipal administrations as well as private law offices, corporations, universities, and members of the community.
Electronic Dissemination of Statistical Information at Local Level: A Cooperative Project between a University Library and Other Public Institutions BIBAFull-Text 452-455
  Eugenio Pelizzari
The European process of integration is designing a new institutional scene involving a progressive strengthening of international and local governments, with a gradual weakening of intermediate levels, especially the National States. Local communities, from an economic and social point of view, will probably have a more direct relationship with each other. We predict that inter-European competition will take place at the level of local economic and social systems.

Cases 2

Building Archaeological Photograph Library BIBAFull-Text 456-460
  Rei Atarashi; Masakazu Imai; Hideki Sunahara; Kunihiro Chihara; Tadashi Katata
The photographs taken at excavation sites are among the most important materials. These photographs are expected to be digitized and saved in a computer system because of the difficulty of managing many photographs and the problem of color loss. We designed and implemented a prototype of an archaeological photograph library. The library is designed based on the Dublin Core Metadata Element Set and XML. We describe the photograph library project and the design concept of the library.
EULER -- A DC-Based Integrated Access to Library Catalogues and Other Mathematics Information in the Web BIBAFull-Text 461-466
  Bernd Wegner
Literature databases, scientific journals and communication between researchers on the electronic level are rapidly developing tools in mathematics with a high impact on the daily work of mathematicians. They improve the availability of information on all important achievements in mathematics, speed up the publication and communication procedures and lead to enhanced facilities for the preparation and presentation of research in mathematics. The aim of this article is to give a more detailed report on one of these projects, the so-called EULER project, which is developing a search engine for distributed mathematical sources on the web. The main features of the EULER deliveries are uniform access to different sources, high precision of information, deduplication facilities, user-friendliness and an open approach enabling the participation of additional resources. The partners of the project represent different types of libraries and, moreover, different types of information on the web. The functionalities of the EULER engine will be described and a report will be given on the transition from the prototype developed in the project to a consortium-based service on the Internet.
Decomate: Unified Access to Globally Distributed Libraries BIBAFull-Text 467-470
  Thomas Place; Jeroen Hoppenbrouwers
The Decomate project enables mutual access to heterogeneous, distributed, and pooled digital resources of consortium members. Using a mediator architecture with a Broker and several back-end servers, a scalable and flexible system has been developed that is going into production at major European universities. Ongoing work focuses on access improvements using graphical browsing and thesaurus integration.
MADILIS, the Microsoft Access-Based Digital Library System BIBAFull-Text 471-474
  Scott Herrington; Philip Konomos
The ASU Libraries' staff had considerable experience creating digital library systems to satisfy the needs of a major university library. These systems were designed to be high performance, large scale systems, capable of supporting very large, multimedia databases, accessible to large numbers of simultaneous users. Using this experience, the staff set out to design a digital library system that could satisfy the needs of small libraries. A small digital library system cannot be simply a scaled back version of a large system. The primary factors driving the design of a small system are cost, scalability and technical support. The resulting digital library system, named MADILIS, is designed to satisfy all of the criteria for a fully functional digital library system, while also meeting the cost, scalability and technical support needs of small libraries.
Leveraging Electronic Content: Electronic Linking Initiatives at Arizona State University BIBAFull-Text 475-480
  Dennis Brunning
This paper presents an overview of electronic linking initiatives at Arizona State University Libraries. It covers existing commercial solutions. These solutions include SilverLinker from SilverPlatter Information and ISILINKS from the Institute for Scientific Information. Problems, advantages, and disadvantages of these initiatives are described and explored.

Cases 3

Asian Film Connection: Developing a Scholarly Multilingual Digital Library -- A Case Study BIBAFull-Text 481-484
  Marianne Afifi
In 1998, the staff of the Center for Scholarly Technology (CST) in the Information Services Division (ISD) at the University of Southern California (USC) was approached for assistance with a database/digital library project. Generally, the role of the Center is to assist with curricular technology projects. Although the project described below is somewhat outside the normal scope for the Center, the scholarly nature of the project was taken into account in granting assistance to it. The impetus for the project came from Jeanette Paulson Hereniko, Director of the Asia Pacific Media Center at the Annenberg Center for Communication and Founding Director of the Hawaii International Film Festival. Ms. Paulson organized a conference in Los Angeles in May of 1998 attended by members of NETPAC, the Network for the Promotion of Asian Cinema, a pan-Asian cultural organization involving critics, filmmakers, festival organizers and curators, distributors and exhibitors, and film educators. At the conference, a plan for the creation of a scholarly, multilingual digital library about film in Asia was first presented.
Conceptual Model of Children's Electronic Textbook BIBAFull-Text 485-489
  Norshuhada Shiratuddin; Monica Landoni
The first step in developing an electronic book is to build a conceptual model. The model described in this paper is designed by integrating Multiple Intelligences Theory with existing electronic book models. Emphasis is on integrating the content of a page with appropriate activities that meet and cater for the diversity of learning styles and intelligence in young children. We postulate that an additional feature for children's e-books would be to present contents by mixing different presentation modes and including various activities which support as many intelligences as possible.
An Information Food Chain for Advanced Applications on the WWW BIBAFull-Text 490-493
  Stefan Decker; Jan Jannink; Sergey Melnik; Prasenjit Mitra; Steffen Staab; Rudi Studer; Gio Wiederhold
The growth of the WWW has resulted in amounts of information beyond what is suitable for human consumption. Automated information processing agents are needed. However, with the current technology it is difficult and expensive to build automated agents. To facilitate automated agents on the web we present an information food chain for advanced applications on the WWW. Every part of the food chain provides information that enables the existence of the next part.
An Architecture for a Multi Criteria Exploration of a Documents Set BIBAFull-Text 494-497
  Patricia Dzeakou; Jean-Claude Derniame
This paper presents an architecture suitable for exploring a set of documents according to multiple criteria. During the exploration session, the user progressively builds a portfolio of relevant documents using semantic views.
An Open Digital Library Ordering System BIBAFull-Text 498-501
  Sarantos Kapidakis; Kostas Zorbadelos
In this paper we briefly describe an open ordering manipulation system with a WWW interface. The orders concern articles of scientific journals. Customer users can search in data sources from a variety of suppliers for articles of journals and order specific pages of the articles. Their search can also include electronic journals in which case their orders can be fulfilled, charged and delivered electronically as an e-mail attachment without needing an operator. The various suppliers can view orders made to them and service them. A customer can direct his order to several suppliers declaring an order of preference. We also introduce the issues involved and present our open system solution that separates the search from the order procedures. Searching can use any external interface provided by the various data sources and intercepts queries and their answers to search requests.

Special Workshop

Special NKOS Workshop on Networked Knowledge Organization Systems BIBAFull-Text 502-505
  Martin Doerr; Traugott Koch; Douglas Tudhope; Repke de Vries
This half-day workshop aims to provide an overview of research, development and projects related to the usage of knowledge organization systems in Internet based services and digital libraries. These systems can comprise thesauri and other controlled lists of keywords, ontologies, classification systems, taxonomies, clustering approaches, dictionaries, lexical databases, concept maps/spaces, semantic road maps etc.
Implementing Electronic Journals in the Library and Making them Available to the End-User: An Integrated Approach BIBAFull-Text 506-510
  Gerrit Alewaeters; Serge Gilen; Paul Nieuwenhuysen; Stefaan Renard; Marc Verpoorten
This short paper describes our strategy for implementing electronic journals (with embedded multimedia) in the library and making them available to the end-user. Together with the Technische Universiteit Eindhoven (TUE) in the Netherlands, we own the source code of the Vubis library information system, which allows development, customization and tighter integration of Vubis for our specific needs. The university library of the Vrije Universiteit Brussel (VUB) in Belgium is currently involved in a digital library project named CROCODIL (CROss-platform CO-operation for a DIgital Library 1999-2000) which is sponsored by the Flemish government (IWT). Our partners are the largest subscription agent in Europe (Swets Blackwell) and the distributor of the Vubis library information system (Geac). We are developing integrated access to electronic documents (with embedded multimedia) and information in different formats. We hope that our experience can be of interest to other libraries coping with the integration of electronic journals in their library system.