
ECDL'98: Proceedings of the European Conference on Digital Libraries

Fullname: ECDL'98: Research and Advanced Technology for Digital Libraries: Second European Conference
Editors: Christos Nikolaou; Constantine Stephanidis
Location: Heraklion, Crete, Greece
Dates: 1998-Sep-21 to 1998-Sep-23
Publisher: Springer Berlin Heidelberg
Series: Lecture Notes in Computer Science 1513
Standard No: DOI: 10.1007/3-540-49653-X; ISBN: 978-3-540-65101-7 (print), 978-3-540-49653-3 (online); hcibib: ECDL98
Papers: 91
Pages: 908
  1. Invited Talks
  2. Architectures for Digital Libraries
  3. Image Digital Libraries
  4. Multilinguality
  5. DL Technologies for Libraries
  6. Case Studies I
  7. Navigation and Digital Libraries
  8. Information Retrieval for Digital Libraries
  9. Querying in Digital Libraries
  10. Human Computer Interaction for Digital Libraries
  11. Natural Language Processing for Digital Libraries
  12. Posters
  13. Panels
  14. Treasure Chest or Pandora's Box: The Challenge of the Electronic Library as a Vehicle for Scholarly Communication
  15. Delos Workshop
  16. Demonstrations

Invited Talks

Intelligent Multimedia Communication, pp. 1-11
  Mark T. Maybury; Oliviero Stock; Wolfgang Wahlster
Multimedia communication is a part of everyday life and its appearance in computer applications is increasing in frequency and diversity. Intelligent or knowledge based computer supported communication promises a number of benefits including increased interaction efficiency and effectiveness. This article defines the area of intelligent multimedia communication, outlines fundamental research questions, summarizes the associated scientific and technical history, identifies current challenges and concludes by predicting future breakthroughs including multilinguality. We conclude by describing several new research issues that systems of systems raise.
Autonomous Search in Complex Spaces, pp. 12-28
  Erol Gelenbe
The search for information in a complex system space -- such as the Web or large digital libraries, or in an unknown robotics environment -- requires the design of efficient and intelligent strategies for (1) determining regions of interest using a variety of sensors, (2) detecting and classifying objects of interest, and (3) searching the space by autonomous agents. This paper discusses strategies for directing autonomous search based on spatio-temporal distributions. We discuss a model for search assuming that the environment is static, except for the effect of identifying object locations. Algorithms are designed and compared for autonomously directing a robot.
Keywords: Search; Optimal Strategies; Greedy and Infinite Horizon Algorithms
Scientific Digital Libraries in Germany: Global-Info, a Federal Government Initiative, pp. 29-39
  Erich J. Neuhold; Reginald Ferber
This talk will introduce and comment on the German Digital Libraries program Global-Info. It will start with a brief introduction to the way research is organized in Germany, followed by some background on ongoing and completed German projects related to Digital Libraries. Then the approach and organization of the Global-Info program are presented.
   Global-Info is an interdisciplinary program that includes producers, intermediaries and consumers of scientific information. They are represented by learned societies, publishers, universities, and libraries. The program started at the beginning of this year and will run for a 6-year period with individual project durations of two to three years. A main characteristic of Global-Info is that the more specific goals and its organization will be developed by the scientific community and the publishers (i.e. the funded organizations) in a bottom-up process in the course of the program.

Architectures for Digital Libraries

Flexible and Extensible Digital Object and Repository Architecture (FEDORA), pp. 41-59
  Sandra Payette; Carl Lagoze
We describe a digital object and repository architecture for storing and disseminating digital library content. The key features of the architecture are: (1) support for heterogeneous data types; (2) accommodation of new types as they emerge; (3) aggregation of mixed, possibly distributed, data into complex objects; (4) the ability to specify multiple content disseminations of these objects; and (5) the ability to associate rights management schemes with these disseminations. This architecture is being implemented in the context of a broader research project to develop next-generation service modules for a layered digital library architecture.
The Alexandria Digital Library Architecture, pp. 61-73
  James Frew; Michael Freeston; Nathan Freitas; Linda L. Hill; Greg Janee; Kevin Lovette; Robert Nideffer; Terence R. Smith; Qi Zheng
Since 1994, the Alexandria Digital Library Project has developed three prototype digital libraries for georeferenced information. This paper describes the most recent of these efforts, a three-tier client-server architecture that relies heavily on a middleware layer to present a single uniform set of interfaces to multiple heterogeneous servers. These standard interfaces, all of which are implemented in HTTP, support session management, collection discovery and evaluation, metadata searching, metadata retrieval, and online holding retrieval. An XML-based meta-data encoding scheme and a simple boolean query language have also been developed. The architecture described by these interfaces has been implemented at UCSB.
A Framework for the Encapsulation of Value-Added Services in Digital Objects, pp. 75-94
  Manolis Marazakis; Dimitris Papadakis; Stavros A. Papadakis
Container technology enables the encapsulation of information content together with rules and controls specifying the types of content usage permitted and the consequences of usage, such as triggering of report generation and payment. Containers have been proposed as a mechanism for securing intellectual property rights. This paper outlines other possible applications of container technology, including support for compound documents that incorporate active content, and automation of processes involving multi-party peer-to-peer interactions for the purposes of collaboration and commerce. Such value-added services are of particular interest in the context of digital libraries aiming to provide functionality extending beyond that of a simple repository of electronic documents. This paper presents the design of a container framework in the context of an architecture for network-centric applications.
A Management Architecture for Measuring and Monitoring the Behavior of Digital Libraries, pp. 95-114
  Sarantos Kapidakis; Sotirios Terzis; Jakka Sairamesh
In this paper, we investigate issues of performance management in Digital Libraries. We defined a management architecture for measuring and monitoring the behavior of digital libraries as they operate, so that we can make performance conclusions using real life digital library load. Our architecture can be easily applied on any digital library system, introducing minimal overhead to digital library performance, and requiring minimal changes to the digital library code. We implemented this architecture over a testbed of Dienst servers using real data and workload. We defined the relevant parameters for investigating the performance of the servers and we made visualization tools to study the performance results. We also demonstrated how the performance results can be used by the digital library itself, to produce advanced unattended operations, like load balancing and dynamic timeout adaptation.
Building HyperView Wrappers for Publisher Web-Sites, pp. 115-134
  Lukas Faulstich; Myra Spiliopoulou
Electronic journals are becoming a major source of scientific information. Researchers interested only in certain topics do not have time to scan all possibly relevant journals on a regular basis. A digital library can assist them by providing a uniform, searchable interface for electronic journals. To this purpose, a catalogue of metadata on the available journals such as authors and titles of articles must be established by the digital library. If there is no cooperation with journal publishers, this metadata must be extracted from the publishers' Web Sites, overcoming the intrinsic heterogeneity problems.
   Within the framework of the ongoing Natural Sciences Digital Library project at the Free University of Berlin, we have designed a wrapper-mediator mechanism that copes with the heterogeneity problems of automatic metadata acquisition. It is based on our generic HyperView methodology for integration of Web Sites. From this methodology it inherits two elegant and effective features. First, the structure of the publisher site is specified with abstract graph-schemata, instead of being hard-coded in scripts for data acquisition. Second, a powerful view concept based on declarative graph-transformation rules is used for information extraction.

Image Digital Libraries

The Application of Metadata Standards to Video Indexing, pp. 135-156
  Jane Hunter
This paper first outlines a multi-level video indexing approach based on Dublin Core extensions and the Resource Description Framework (RDF). The advantages and disadvantages of this approach are discussed in the context of the requirements of the proposed MPEG-7 ("Multimedia Content Description Interface") standard. Finally a hybrid approach is proposed based on the combined use of Dublin Core and the currently undefined MPEG-7 standard within the RDF which will provide a solution to the problem of satisfying widely differing user requirements.
Search and Progressive Image Retrieval from Distributed Image/Video Databases: The SPIRE Project, pp. 157-168
  Vittorio Castelli; Lawrence D. Bergman; Chung-Sheng Li; John R. Smith
In this paper, we describe the architecture and initial implementation of a content-based retrieval mechanism from heterogeneous image archives. In particular, we propose an architecture to produce local representation of the images stored in heterogeneous archives and a progressive framework that reorganizes the images into a hierarchical representation based on a multiresolution decomposition and an abstraction pyramid. Search operations can rely on this representation and be performed in a hierarchical fashion, thus significantly reducing the total amount of data that need to be processed. Dramatic speedup has been achieved for many search operations, such as template matching, texture feature extraction, and histogram extraction.
Keywords: image databases; satellite imagery; content-based retrieval
Improving the Spatial-Temporal Clue Based Segmentation by the Use of Rhythm, pp. 169-181
  Walid Mahdi; Liming Chen; Dominique Fontaine
Video is a major medium in the emerging information society. Unfortunately, full use of this medium is limited by the opaque character of video, which prevents content-based access. In this paper we improve our previous spatial-temporal clue-based semantic video segmentation technique, and propose the use of the rhythm within a video to more precisely capture temporal relations within a scene and between scenes in a video. Preliminary evidence based on a 7-minute video shows that our spatial-temporal clue-based segmentation technique, coupled with the rhythm consideration, fully detects the narrative structure of a video.

Multilinguality

Multilingual Information Retrieval Based on Document Alignment Techniques, pp. 183-197
  Martin Braschler; Peter Schäuble
A multilingual information retrieval method is presented where the user formulates the query in his/her preferred language to retrieve relevant information from a multilingual document collection. This multilingual retrieval method involves mono- and cross-language searches as well as merging their results. We adopt a corpus based approach where documents of different languages are associated if they cover a similar story. The resulting comparable corpus enables two novel techniques we have developed. First, it enables Cross-Language Information Retrieval (CLIR) which does not lack vocabulary coverage as we observed in the case of approaches that are based on automatic Machine Translation (MT). Second, aligned documents of this corpus facilitate merging the results of mono- and cross-language searches. Using the TREC CLIR data, excellent results are obtained. In addition, our evaluation of the document alignments gives us new insights about the usefulness of comparable corpora.
Experimental Studies on an Applet-based Document Viewer for Multilingual WWW Documents -- Functional Extension of and Lessons Learned from Multilingual HTML, pp. 199-214
  Shigeo Sugimoto; Akira Maeda; Myriam Dartois; Jun Ohta; Shigetaka Nakao; Tetsuo Sakaguchi; Koichi Tabata
The World Wide Web (WWW) covers the globe. However, the browsing functions for documents in multiple languages are not easily accessed by occasional users. Functions to display and input multilingual texts in digital libraries are clearly crucial. Multilingual HTML (MHTML) is a document browser technology for multilingual documents on the WWW. The authors developed a display function for multilingual documents based on MHTML technology and extended it to text inputs in multiple languages for off-the-shelf browsers and sample applications. This extension creates an environment for digital library end-users, wherein they can view and search multilingual documents using any off-the-shelf browser. This paper also discusses the lessons learned from the MHTML project.
Keywords: Multilingual Document Browsing; Off-the-Shelf WWW Browsers; Multilingual Texts Display and Input; Text Retrieval in Multiple Scripts
SIS -- TMS: A Thesaurus Management System for Distributed Digital Collections, pp. 215-234
  Martin Doerr; Irini Fundulaki
The availability of central reference information as thesauri is critical for correct intellectual access to distributed databases, in particular to digital collections in international networks. There is a continuous rise in interest in thesauri, and several thesaurus management systems have appeared on the market. The issue of how to integrate such central resources effectively into a multitude of client systems, and to maintain the consistency of reference in an information network, has not yet been satisfactorily solved. We present here a method and an actual thesaurus management system, which is specifically designed for this use, and implements the necessary data structures and management functions. The system handles multiple multilingual thesauri and can be adapted to all semantic thesaurus structures currently in use. Consistency-critical information is kept as history of changes in the form of backward differences. The system has been installed at several sites in Europe.
Parallel Text Alignment, pp. 235-260
  Charles B. Owen
Parallel Text Alignment (PTA) is the problem of automatically aligning content in multiple text documents originating or derived from the same source. The implications of this result in improving multimedia data access in digital library applications range from facilitating the analysis of multiple English language translations of classical texts to enabling the on-demand and random comparison of multiple transcriptions derived from a given audio stream, or associated with a given stream of video, audio, or images. In this paper we give an efficient algorithm for achieving such an alignment, and demonstrate its use with two applications. This result is an application of the new framework of Cross-Modal Information Retrieval recently developed at Dartmouth.

DL Technologies for Libraries

An Analysis of Usage of a Digital Library, pp. 261-277
  Steve Jones; Sally Jo Cunningham; Rodger J. McNab
As experimental digital library testbeds gain wider acceptance and develop significant user bases, it becomes important to investigate the ways in which users interact with the systems in practice. Transaction logs are one source of usage information, and the information on user behaviour can be culled from them both automatically (through calculation of summary statistics) and manually (by examining query strings for semantic clues on search motivations and searching strategy). We conduct a transaction log analysis on user activity in the Computer Science Technical Reports Collection of the New Zealand Digital Library, and report insights gained and identify resulting search interface design issues.
Illustrated Book Study: Digital Conversion Requirements of Printed Illustrations, pp. 279-293
  Anne R. Kenney; Louis H. Sharpe II; Barbara Berger
Cornell University Department of Preservation and Conservation and Picture Elements, Incorporated have undertaken a joint study for the Library of Congress to determine the best means for digitizing the vast array of illustrations used in 19th and early 20th century publications. This work builds on two previous studies. A Cornell study [1] characterized a given illustration type based upon its essence, detail, and structure. A Picture Elements study [2] created guidelines for deciding how a given physical content region type should be captured as an electronic content type. Using those procedures, appropriate mappings of different physical content regions (representing instances of different illustration processes) to electronic content types are being created. These mappings differ based on the illustration type and on the need to preserve information at the essence, detail, or structure level. Example pages that are typical of early commercial illustrations have been identified, characterized in terms of the processes used to create them (e.g., engraving, lithograph, halftone) and then scanned at high resolutions in 8-bit grayscale. Digital versions that retain evidence of information at the structure level have been derived from those scans and their fidelity studied alongside the paper originals. Project staff have investigated the available means for automatic detection of illustration content regions and methods for automatically discriminating different illustration process types and for encoding and processing them. A public domain example utility is being created which automatically detects the presence and location of a halftone region in a scan of an illustrated book page and applies special processing to it.
Structuring Facilities in Digital Libraries, pp. 295-313
  Peter J. Nürnberg; Uffe Kock Wiil; John J. Leggett
Digital libraries offer much promise for patrons and many challenges for system designers and implementers. One important issue that faces digital library system designers is the type of support provided to patrons for intellectual work. Although many researchers have noted the desirability of robust hypermedia structuring facilities in digital library systems, this research has tended to focus on navigational hypermedia (primarily used for associative storage and retrieval) only. Many other types of hypermedia, such as spatial, issue-based, and taxonomic, have been ignored. We briefly review some of our experiences with building digital library systems and discuss some of the lessons we learned from our initial prototypes. We then present a scenario of digital library work that illustrates many of the kinds of tasks we have observed users of our systems perform. We use this scenario to suggest a potential area of improvement for current hypermedia support in digital library systems and discuss some of our initial work in this area. Finally, we present some directions of future work and some concluding remarks.
E-Referencer: A Prototype Expert System Web Interface to Online Catalogs, pp. 315-333
  Christopher S. G. Khoo; Danny C. C. Poo; Teck-Kang Toh; Soon-Kah Liew; Anne N. M. Goh
An expert system Web interface to online catalogs called E-Referencer is being developed. An initial prototype has been implemented. The interface has a repertoire of initial search strategies and reformulation strategies that it selects and implements to help users retrieve relevant records. It uses the Z39.50 protocol to access library systems on the Internet. This paper describes the design of E-Referencer, and the development of search strategies to be used by the interface. A preliminary evaluation of the strategies is also presented.

Case Studies I

The Planetary Data System. A Case Study in the Development and Management of Meta-Data for a Scientific Digital Library, pp. 335-350
  J. Steven Hughes; Susan K. McMahon
The Planetary Data System (PDS) is an active science data archive managed by scientists for NASA's planetary science community. With the advent of the World Wide Web, the majority of the archive has been placed on-line as a science digital library for access by scientists, the educational community, and the general public. The meta-data in this archive, primarily collected to ensure that future scientists would be able to understand the context within which the science data was collected and archived, has enabled the development of sophisticated on-line interfaces. The success of this effort is primarily due to the development of a standards architecture based on a formal model of the planetary science domain. A peer review process for validating the meta-data and the science data has been critical in maintaining a consistent archive. In support of new digital library research initiatives, the PDS functions as a case study in the development and management of meta-data for science digital libraries.
Performing Arts Data Service -- An Online Digital Resource Library, pp. 351-366
  Steve Malloch; Carola Boehm; Celia Duffy; Catherine Owen; Stephen Arnold; Tony Pearson
The Performing Arts Data Service (PADS) aims to support research and teaching in UK Higher Education by collecting and promoting the use of digital data relating to the performing arts: music, film, broadcast arts, theatre, dance. The PADS is one of 5 service providers of the Arts and Humanities Data Service (AHDS) which will provide a single gateway for arts and humanities scholars wishing to search for datasets across various discipline areas. Data is indexed with Dublin Core metadata, will interoperate with other databases within the AHDS and beyond using Z39.50, and will be available via the Web. The diversity of data with which the PADS must deal is a major issue, and any information system for such a service must support text based, visual/image, time-based and complex data, and offer appropriate access over wide area networks. This paper focuses on the system requirements of such a system and briefly describes one implementation of those requirements.

Navigation and Digital Libraries

Learning User Communities for Improving the Services of Information Providers, pp. 367-383
  Georgios Paliouras; Christos Papatheodorou; Vangelis Karkaletsis; Constantine D. Spyropoulos; Victoria Malaveta
In this paper we propose a methodology for organising the users of an information providing system into groups with common interests (communities). The communities are built using unsupervised learning techniques on data collected from the users (user models). We examine a system that filters news on the Internet, according to the interests of the registered users. Each user model contains the user's interests on the news categories covered by the information providing system. Two learning algorithms are evaluated: COBWEB and ITERATE. Our main concern is whether meaningful communities can be constructed. We specify a metric to decide which news categories are representative for each community. The construction of meaningful communities can be used for improving the structure of the information providing system as well as for suggesting extensions to individual user models. Encouraging results on a large data-set lead us to consider this work as a first step towards a method that can easily be integrated in a variety of information systems.
Soft Navigation in Product Catalogs, pp. 385-396
  Markus Stolze
Current electronic product catalogs support only Hard Navigation in the product list. Products or product categories are displayed only if they match a criterion that a user has specified explicitly as a constraint or implicitly by following a navigation link. Hard navigation is problematic if users want to express soft preferences instead of hard constraints. Users will make sub-optimal buying decisions if they mistake soft preferences for hard requirements and focus only on products that match all their preferences.
   Soft Navigation is an alternative means to navigate product catalogs. Users express preferences which are used to evaluate products and display them in such a way that higher-scoring products are more visible than lower-scoring products. This paper presents a product scoring catalog (PSC) that supports soft navigation and allows users to express preferences and rate their importance by following a set of rules. The paper closes by outlining possible extensions to PSC and indicating research issues related to soft navigation product catalogs.
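   As a rough illustration of the soft-navigation idea in this abstract, the sketch below ranks products by a weighted preference score instead of filtering them out. It is not the paper's PSC: the `Product` fields, the `soft_score` and `soft_rank` helpers, and the weighted-average scoring rule are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Product:
    # Hypothetical catalog fields, chosen only for illustration.
    name: str
    price: float
    weight_kg: float


# A preference pairs an importance weight with a function that maps a
# product to a satisfaction value in [0, 1].
Preference = Tuple[float, Callable[[Product], float]]


def soft_score(product: Product, preferences: List[Preference]) -> float:
    """Weighted average of preference satisfaction (assumed scoring rule)."""
    total_weight = sum(weight for weight, _ in preferences)
    if total_weight == 0:
        return 0.0
    weighted = sum(weight * rate(product) for weight, rate in preferences)
    return weighted / total_weight


def soft_rank(products: List[Product],
              preferences: List[Preference]) -> List[Product]:
    """Order products by score, best first; no product is excluded."""
    return sorted(products,
                  key=lambda p: soft_score(p, preferences),
                  reverse=True)
```

   Unlike a hard filter, a product that misses one preference still appears in the result; it is simply ranked lower.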

Information Retrieval for Digital Libraries

Mixing and Merging for Spoken Document Retrieval, pp. 397-407
  Mark Sanderson; Fabio Crestani
This paper describes a number of experiments that explored the issues surrounding the retrieval of spoken documents. Two such issues were examined. First, attempting to find the best use of speech recogniser output to produce the highest retrieval effectiveness. Second, investigating the potential problems of retrieving from a so-called "mixed collection", i.e. one that contains documents from both a speech recognition system (producing many errors) and from hand transcription (producing presumably near perfect documents). The result of the first part of the work found that merging the transcripts of multiple recognisers showed most promise. The investigation in the second part showed how the term weighting scheme used in a retrieval system was important in determining whether the system was affected detrimentally when retrieving from a mixed collection.
An Integrated Approach to Semantic Evaluation and Content-Based Retrieval of Multimedia Documents, pp. 409-428
  Alois Knoll; Christian Altenschmidt; Joachim Biskup; Hans-Martin Blüthgen; Ingo Glöckner; Sven Hartrumpf; Hermann Helbig; Christiane Henning; Reinhard Lüling; Burkhard Monien; Thomas Noll; Norbert Sensen
We present an overview of a large combined querying and retrieval system that performs content-based on-line searches in a large database of multimedia documents (currently text, tables and colour images). Queries are submitted as sentences in natural language and are transformed into the language of the target database. The documents are analyzed semantically for their information content; in a data fusion step the individual pieces of information extracted from these documents are aggregated into cognitively adequate result documents.
   There is no pre-indexing necessary when new documents are stored into the system. This retains a high degree of flexibility with respect to the questions that may be asked. It implies, however, that both huge amounts of data must be evaluated rapidly and that intelligent caching strategies must be employed. It is therefore mandatory that the system be equipped with dedicated high-speed hardware processors.
   The complete system is currently available as a prototype; the paper outlines its architecture and gives examples of some real sample queries in the knowledge domain of weather data documents.
Taiscéalaí: Information Retrieval from an Archive of Spoken Radio News, pp. 429-442
  Alan F. Smeaton; M. Morony; Gerard Quinn; Ronan Scaife
In this paper we describe Taiscéalaí, a web-based system which provides content-based retrieval on an up-to-date archive of RTÉ radio news bulletins. Taiscéalaí automatically records and indexes news bulletins twice daily using a stream of phones recognised from the raw audio data. A user's typed query is matched against fixed length windows from the broadcasts. A user interface allows the news bulletins most likely to be relevant to be presented and the user to select sub-parts of the bulletins to be played. Many of the parameters we have chosen to use, such as the size and amount of overlap of windows and the weighting of phones within those windows, have been determined within the framework of the TREC Spoken Document Retrieval track and are thus well-founded. We conclude the paper with a walkthrough of a worked example retrieval and an outline of our plans for extending Taiscéalaí into an integrated digital library for news.

Querying in Digital Libraries

Semantic Structuring and Visual Querying of Document Abstracts in Digital Libraries, pp. 443-458
  Andreas Becks; Stefan Sklorz; Christopher Tresp
Digital libraries offer a vast source of very different information. To enable users to fruitfully browse through a collection of documents without necessarily having to state a complex query, advanced retrieval techniques have to be developed. Those methods have to be able to structure information in a semantic manner. This work presents some first steps in semantically organizing thematically pre-selected documents of a digital library. The semantic structure of the document collection will be expressively visualized by the proposed system. We illustrate our ideas using a database of medical abstracts from the field of oncology as a walking example.
Documentation, Cataloging and Query by Navigation: A Practical and Sound Approach, pp. 459-478
  F. J. M. Bosman; Peter Bruza; Theo P. van der Weide; L. V. M. Weusten
In this paper we discuss the construction of an automated information system for a collection of visual reproductions of art objects. Special attention is paid to the economical aspects of such a system, which appears to be mainly a problem of data entry. An approach is discussed to make this feasible, which also strongly promotes consistency between descriptions. Another main target of such a system is the capability for effective disclosure. This requires a disclosure mechanism on descriptions which is easy to handle by non-technical users. We show the usefulness of query by navigation for this purpose. It allows the searcher to stepwise build a query in terms of (semi-)natural language. At each step, the searcher is presented with context sensitive information.
   The resulting system is described and we discuss an experiment of its use.
Signature File Methods for Semantic Query Caching, pp. 479-498
  Boris Chidlovskii; Uwe M. Borghoff
In digital libraries accessing distributed Web-based bibliographic repositories, performance is a major issue. Efficient query processing requires an appropriate caching mechanism. Unfortunately, standard page-based as well as tuple-based caching mechanisms designed for conventional databases are not efficient on the Web, where keyword-based querying is often the only way to retrieve data. Therefore, we study the problem of semantic caching of Web queries and develop a caching mechanism for conjunctive Web queries based on signature files. We propose two implementation choices. A first algorithm copes with the relation of semantic containment between a query and the corresponding cache items. A second algorithm extends this processing to more complex cases of semantic intersection. We report results of experiments and show how the caching mechanism is successfully realized in the Knowledge Broker system.
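   To make the signature-file idea concrete, the sketch below shows one generic way a cache could test semantic containment for conjunctive keyword queries. It is not the authors' algorithm: the superimposed-coding parameters (`SIG_BITS`, `BITS_PER_TERM`), the `SemanticCache` class, and the assumption that a cached query with a subset of the new query's keywords holds a superset of its answers are all choices made for this example.

```python
import hashlib
from typing import FrozenSet, Iterable, List, Tuple

SIG_BITS = 64        # signature width in bits (assumed)
BITS_PER_TERM = 3    # bits set per keyword (assumed)


def term_signature(term: str) -> int:
    """Superimposed coding: hash a keyword to a few bit positions."""
    sig = 0
    for i in range(BITS_PER_TERM):
        digest = hashlib.sha256(f"{i}:{term.lower()}".encode("utf-8")).digest()
        sig |= 1 << (int.from_bytes(digest[:4], "big") % SIG_BITS)
    return sig


def query_signature(terms: Iterable[str]) -> int:
    """Signature of a conjunctive query: OR of its keyword signatures."""
    sig = 0
    for term in terms:
        sig |= term_signature(term)
    return sig


class SemanticCache:
    """Toy cache of conjunctive keyword queries and their results."""

    def __init__(self) -> None:
        self._entries: List[Tuple[int, FrozenSet[str], List[str]]] = []

    def store(self, terms: Iterable[str], results: Iterable[str]) -> None:
        keys = frozenset(t.lower() for t in terms)
        self._entries.append((query_signature(keys), keys, list(results)))

    def lookup_containing(
        self, terms: Iterable[str]
    ) -> List[Tuple[FrozenSet[str], List[str]]]:
        """Cached queries whose keywords are a subset of `terms`.

        The bitwise test is a cheap filter that may give false positives,
        so each candidate is confirmed against the stored keyword set.
        """
        keys = frozenset(t.lower() for t in terms)
        sig = query_signature(keys)
        return [
            (cached_keys, results)
            for cached_sig, cached_keys, results in self._entries
            if cached_sig & sig == cached_sig and cached_keys <= keys
        ]
```

   Under these assumptions, a hit means the new query could be answered by filtering the cached result list locally rather than contacting the remote repository.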
Introducing MIRA: A Retrieval Applications' Development Environment BIBAFull-Text 499-514
  José María Martínez Sanchez; Jesús Bescós; Guillermo Cisneros
MIRA (Multimedia Information Remote Access) is an implementation of a generic Application Development Model, demonstrating the scalability of such Model. MIRA allows session management and common working space (desktop) paradigms, allowing not only to retrieve information but also to have facilities for further handling of such information. The MIRA Tele-research application was based on specifications from Museum curators. Hence, MIRA is currently the basis of the BABEL network, which joins a number of Museums and Libraries.

Human Computer Interaction for Digital Libraries

Interacting With IDL: The Adaptive Visual Interface BIBAFull-Text 515-534
  Maria Francesca Costabile; Floriana Esposito; Giovanni Semeraro; Nicola Fanizzi; Stefano Ferilli
IDL (Intelligent Digital Library) is a prototypical intelligent digital library service that is currently being developed at the University of Bari. Among the characterizing features of IDL there are a retrieval engine and several facilities available for the library users. In this paper, we present the web-based visual environment we have developed with the aim of improving the user-library interaction. The IDL environment is equipped with some novel visual tools, that are primarily intended for inexperienced users, who represent most of the users that usually access digital libraries. Machine Learning techniques have been exploited in IDL for document analysis, classification, and understanding, as well as for building a user modeling module, that is the basic component for providing IDL with user interface adaptivity. This feature is also discussed in the paper.
Evaluating a Visual Navigation System for a Digital Library BIBAFull-Text 535-554
  Anton Leuski; James Allan
In this paper we investigate a general purpose interactive information organization system. The system organizes documents by placing them into 1-, 2-, or 3-dimensional space based on their similarity and a spring-embedding algorithm. We begin by designing a method for estimating the quality of the organization when it is applied to a set of documents returned in response to a query. We show how the relevant documents tend to clump with each other in space. We proceed by presenting a method for measuring the amount of structure in the organization and we explain how this knowledge can be used to refine the system. We also show that increasing the dimensionality of the organization generally improves its quality. We introduce two methods for modifying the organization based on the information obtained from the user and show how such feedback improves the organization. All the analysis is done off-line without direct user intervention.
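As a rough illustration of spring embedding, the sketch below places documents in 1, 2 or 3 dimensions so that their distances approximate given dissimilarities. It is a plain gradient-descent version written for this listing, not the system evaluated in the paper; the function name, step size, iteration count and starting positions are all assumptions.

```python
import math


def spring_layout(dist, dim=2, steps=500, lr=0.05):
    """Place len(dist) points in `dim` dimensions so distances approximate `dist`."""
    n = len(dist)
    # Deterministic, distinct starting positions (an arbitrary choice).
    pos = [[math.cos(i + d) * (i + 1) * 0.1 for d in range(dim)] for i in range(n)]
    for _ in range(steps):
        for i in range(n):
            force = [0.0] * dim
            for j in range(n):
                if i == j:
                    continue
                delta = [pos[i][d] - pos[j][d] for d in range(dim)]
                length = math.sqrt(sum(x * x for x in delta)) or 1e-9
                # Spring with rest length dist[i][j]: pulls when too far, pushes when too close.
                stretch = (length - dist[i][j]) / length
                for d in range(dim):
                    force[d] -= stretch * delta[d]
            for d in range(dim):
                pos[i][d] += lr * force[d]
    return pos


def distance(p, q):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


# Two pairs of similar documents (0,1) and (2,3); the pairs are dissimilar to each other.
dist = [
    [0.0, 0.2, 1.0, 1.0],
    [0.2, 0.0, 1.0, 1.0],
    [1.0, 1.0, 0.0, 0.2],
    [1.0, 1.0, 0.2, 0.0],
]
pos = spring_layout(dist, dim=2)
```

In the resulting layout, documents 0 and 1 end up closer to each other than to documents 2 and 3, which is the clumping effect the abstract describes for relevant documents.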
Visualizing Document Classification: A Search Aid for the Digital Library BIBAFull-Text 555-567
  Yew-Huey Liu; Paul Dantzig; Martin Sachs; James T. Corey; Mark T. Hinnebusch; Marc Damashek; Jonathan D. Cohen
The recent explosion of the internet has made digital libraries popular. The user-friendly interface of Web browsers allows a user much easier access to the digital library. However, to retrieve relevant documents from the digital library, the user is provided with a search interface consisting of one input field and one push button. Most users type in a single keyword, click the button, and hope for the best. The result of a query using this kind of search interface can consist of a large unordered set of documents, or a ranked list of documents based on the frequency of the keywords. Both lists can contain articles unrelated to the user's inquiry unless a sophisticated search was performed and the user knows exactly what to look for. More sophisticated algorithms for ranking the relevance of search results may help, but what is desperately needed are software tools that can analyze the search result and manipulate large hierarchies of data graphically. In this paper, we present a language-independent document classification system for the Florida Center for Library Automation to help users analyze the search query results. Easy access through the Web is provided, as well as a graphical user interface to display the classification results.

Natural Language Processing for Digital Libraries

A Linguistically Motivated Probabilistic Model of Information Retrieval BIBAKFull-Text 569-584
  Djoerd Hiemstra
This paper presents a new probabilistic model of information retrieval. The most important modeling assumption made is that documents and queries are defined by an ordered sequence of single terms. This assumption is not made in well known existing models of information retrieval, but is essential in the field of statistical natural language processing. Advances already made in statistical natural language processing will be used in this paper to formulate a probabilistic justification for using tf×idf term weighting. The paper shows that the new probabilistic interpretation of tf×idf term weighting might lead to better understanding of statistical ranking mechanisms, for example by explaining how they relate to coordination level ranking. A pilot experiment on the Cranfield test collection indicates that the presented model outperforms the vector space model with classical tf×idf and cosine length normalisation.
Keywords: Information Retrieval Theory; Statistical Information Retrieval; Statistical Natural Language Processing
The C-value/NC-value Method of Automatic Recognition for Multi-Word Terms BIBAFull-Text 585-604
  Katerina T. Frantzi; Sophia Ananiadou; Jun-ichi Tsujii
Technical terms (henceforth called simply terms), are important elements for digital libraries. In this paper we present a domain-independent method for the automatic extraction of multi-word terms, from machine-readable special language corpora.
   The method, (C-value/NC-value), combines linguistic and statistical information. The first part, C-value enhances the common statistical measure of frequency of occurrence for term extraction, making it sensitive to a particular type of multi-word terms, the nested terms. The second part, NC-value, gives: 1) a method for the extraction of term context words (words that tend to appear with terms), 2) the incorporation of information from term context words to the extraction of terms.
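The sensitivity to nested terms can be sketched as follows. The abstract does not state the formula; the code uses the commonly cited form of C-value (log2 of the term length times the frequency, reduced by the mean frequency of the longer candidates that contain the term), and the exact formulation in the paper may differ. The helper names and the sample frequencies are made up for illustration, and the NC-value context step is not covered.

```python
import math


def contains(longer, shorter):
    """True if the word tuple `shorter` occurs contiguously inside the longer tuple."""
    n, m = len(longer), len(shorter)
    return m < n and any(longer[i:i + m] == shorter for i in range(n - m + 1))


def c_value(freq):
    """Score candidates; `freq` maps word tuples (length >= 2) to corpus frequency."""
    scores = {}
    for a, f_a in freq.items():
        # Frequencies of the longer candidate terms in which `a` appears nested.
        nesting = [f_b for b, f_b in freq.items() if contains(b, a)]
        if nesting:
            adjusted = f_a - sum(nesting) / len(nesting)
        else:
            adjusted = f_a
        scores[a] = math.log2(len(a)) * adjusted
    return scores


# Made-up counts: "contact lens" occurs 8 times, 5 of them inside "soft contact lens".
freq = {
    ("soft", "contact", "lens"): 5,
    ("contact", "lens"): 8,
}
scores = c_value(freq)
```

With these counts the nested candidate "contact lens" is scored on the 3 occurrences it has outside the longer term, so raw frequency alone no longer decides the ranking.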
Comparing the Effect of Syntactic vs. Statistical Phrase Indexing Strategies for Dutch BIBAFull-Text 605-617
  Wessel Kraaij; Renée Pohlmann
In this paper we describe the results of experiments contrasting syntactic phrase indexing with statistical phrase indexing for Dutch texts. Our results showed that we at least need a compound splitting algorithm for good quality retrieval for Dutch texts. If we then add either syntactic or statistical phrases, performance generally improves, but this effect is never statistically significant. If we compare syntactic vs. statistical phrase indexing, syntactic phrases are slightly superior to statistical phrases, particularly at high precision. At higher recall levels syntactic and statistical phrases are equally effective. However, since a compound splitting algorithm requires a dictionary and knowledge about constraints on compound formation, a purely non-linguistic indexing strategy, with or without phrases, does not seem to be very effective for Dutch.
Reduction of Expanded Search Terms for Fuzzy English-Text Retrieval BIBAFull-Text 619-633
  Manabu Ohta; Atsuhiro Takasu; Jun Adachi
Optical character reader (OCR) misrecognition is a serious problem when OCR-recognized text is used for retrieval purposes in digital libraries. We have proposed fuzzy retrieval methods that, instead of correcting the errors manually, assume that errors remain in the recognized text. Costs are thereby reduced. The proposed methods generate multiple search terms for each input query term by referring to the confusion matrices, which store all characters likely to be misrecognized and the respective probability of each misrecognition. The proposed methods can improve recall rates without decreasing precision rates. However, in English fuzzy retrieval, occasionally a few million search terms are generated, which has an intolerable effect on retrieval speed. Therefore, this paper presents two heuristics to reduce the number of generated search terms by restricting the number of errors included in each expanded search term while maintaining retrieval effectiveness.
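The effect of capping the number of errors per expanded term can be shown with a short sketch. The confusion table, its probabilities and the function name `expand` are hypothetical, only substitutions are modelled, and this is not one of the two heuristics from the paper; it only shows how a limit on errors bounds the expansion.

```python
from itertools import combinations, product

# Hypothetical confusion table: true character -> [(misrecognised form, probability)].
CONFUSIONS = {
    "l": [("1", 0.05), ("i", 0.03)],
    "o": [("0", 0.04)],
    "m": [("rn", 0.02)],
}


def expand(term, confusions, max_errors=1):
    """Return {variant: weight} for variants with at most `max_errors` substitutions.

    The weight is the product of the substitution probabilities used; if two
    substitution choices yield the same string, the later weight overwrites the earlier.
    """
    variants = {term: 1.0}
    positions = [i for i, ch in enumerate(term) if ch in confusions]
    for k in range(1, max_errors + 1):
        for chosen in combinations(positions, k):
            options = [confusions[term[i]] for i in chosen]
            for picks in product(*options):
                chars = list(term)
                weight = 1.0
                for i, (wrong, p) in zip(chosen, picks):
                    chars[i] = wrong
                    weight *= p
                variants["".join(chars)] = weight
    return variants


one_error = expand("loom", CONFUSIONS, max_errors=1)
two_errors = expand("loom", CONFUSIONS, max_errors=2)
```

Even for this four-letter word and tiny table, allowing two errors instead of one more than doubles the number of search terms, which is why longer English terms can explode without such a limit.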

Posters

Internet Publications: Pay-per-Use or Pay-per-Subscription? BIBAFull-Text 635-636
  Roberto Zamparelli
Today, the prevailing philosophy for the electronic transmission of copyrighted material is pay-per-use, whose main features are:
  • 1. The user pays a fee for each item obtained, part of which goes as royalties
        to the item's creator.
  • 2. Legal users who give away unauthorised copies to other users, for free or
        for a fee, are prosecuted according to law.
  • 3. Each item purchased is registered and typically protected by encryption or
        other devices, to make unauthorised copying more difficult.
   In this poster, I claim that a pure pay-per-use model of electronic information dissemination is not sustainable, as it runs counter to current social and technological trends, and that an alternative model, pay-per-subscription, is superior. By "pay-per-subscription" I mean an abstract model where:
  • 1. The authorship of some work is registered.
  • 2. A user is allowed to download for private use any copyrighted item a
        'virtual library' has in store, any number of times, against payment of a
        flat, periodical subscription rate to the library.
  • 3. The subscription also gives right to a range of inherently inalienable
        (non-transferable) services.
  • 4. The author of a work given to the virtual library is rewarded in proportion
        to the number of times his or her work is downloaded by different users.
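The reward rule in the last item can be made concrete with a small sketch. It assumes one reading of the abstract: repeat downloads by the same user count once, and a fixed pool of subscription revenue is split among authors. The function name, the pool and the sample data are hypothetical.

```python
from collections import defaultdict


def royalty_shares(downloads, author_of, pool):
    """Split `pool` among authors in proportion to distinct downloaders per work."""
    users_per_work = defaultdict(set)
    for user, work in downloads:
        users_per_work[work].add(user)  # repeat downloads by one user count once
    counts = defaultdict(int)
    for work, users in users_per_work.items():
        counts[author_of[work]] += len(users)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {author: pool * n / total for author, n in counts.items()}


# Made-up log: user u1 downloads work w1 twice, which counts once.
downloads = [("u1", "w1"), ("u1", "w1"), ("u2", "w1"), ("u3", "w2")]
author_of = {"w1": "alice", "w2": "bob"}
shares = royalty_shares(downloads, author_of, pool=300.0)
```

Under these assumptions the author of w1 receives two thirds of the pool and the author of w2 one third, regardless of how often any single subscriber repeats a download.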
    Statistical Identification of Domain-Specific Keyterms for Text Summarisation BIBAFull-Text 637-638
      Budi Yuwono; Mirna Adriani
    We believe that in order to be useful a text summarisation technique must be domain dependent, in that the resulting summary must cover the important aspects and concepts specific to the subject matter's domain. The main problem with a typical domain-dependent text summarisation technique is the cost of acquiring and hand-coding the required domain-specific knowledge into the system, e.g., in the form of phrase-structure templates. To solve this problem, we propose a solution which uses automatically retrieved sample documents as the source of the domain-specific knowledge, and extracts the knowledge in the form of keyterms. These keyterms represent the key aspects and concepts (terminology) relevant to the input document. The sample documents are retrieved from a collection, called the base collection, which contains documents on various topics, based on their similarity to the input document. The input document is then summarised by extracting a number of sentences containing the keyterms.
       Our text summarisation technique is based on the statistical distribution of words among documents in the base collection, within individual documents, and among sentences in the input document. In particular, statistically based formulae are employed for scoring each of the candidate sample documents, keyterms, and key sentences. Our technique makes use of standard word or term distribution parameters that are commonly provided or can be easily obtained through the use of modern text retrieval systems.
    The STACS Electronic Submission Service BIBAFull-Text 639-640
      Jochen Bern; Christoph Meinel; Harald Sack
    One very popular software used to handle electronic conference submissions was written by the SIGACT Electronic Publishing Board. It was first used for the FOCS '95 conference, and later for a range of conferences including COCOON, FOCS, PODC, SODA, SPAA, STOC, and WDAG, staying basically unchanged.
       Another conference in computer science that uses electronic submission mechanisms is the Symposium on Theoretical Aspects in Computer Science. STACS is an international conference covering all aspects of Theoretical Computer Science. It has proven to be -- together with ICALP -- the main European exchange place for ideas in this area. The program committee is internationally of top rank, the number of submissions is high (typically 100 to 140 papers), and the acceptance rate is low. Researchers come from all over the world to attend.
       Our experience is that the problems in the design of conference submission services can be grouped into the following categories:
  • User Interface Problems -- First and foremost, selection of (a) portable file
       format(s) for submitted papers would be desirable; Failing that (e.g.,
       PostScript), portability problems should be reported to the submitter in an
       intelligible error message. Next, steps have to be taken to ensure reliable
       and unaltered delivery of submissions. Finally, the whole workflow needs to
       be intuitive and "familiar" to the submitter.
  • Security Related Problems -- Conference announcements nowadays are put onto
       the WWW and, thus, can be found with search engines, so submission services
       may be accessed by more people than the attendees.
    Integrating Article Databases and Full Text Archives into a Digital Journal Collection BIBAFull-Text 641-642
      Anders Ardö; Franck Falcoz; Tove Nielsen; Salam Baker Shanawa
    The aim of DTV's Article Database Service (DADS) is to offer our end users a whole new generation of library services with integrated search and browse facilities, a common user interface and direct electronic document delivery -- all accessible from their own desks. The system should handle bibliographic data, including abstracts as well as articles in full text and table-of-contents data.
       Using an electronic article database service, users can access a particular article or journal within seconds rather than the hours or days typical for paper-based collections. Large collections of material can be searched simultaneously and retrieved instantly. There is also the possibility for active dissemination of information based on "interest profiles" of users, i.e., current awareness services.
       Technical design considerations of the DADS system include:
  • Integration of data (full text documents, bibliographic records and
       table-of-contents data) from different vendors and information sources
  • Common user interface for searching and browsing
  • Simultaneous searching in multiple heterogeneous databases
  • Integrated ordering and delivery of documents
  • Accessibility 24 hours a day, 7 days a week, 365 days a year
  • The system should be based on open standards, wherever possible
  • Support of multiple hardware platforms and operating environments.
       Organizing, indexing and providing access to material from different information providers becomes a major task.
    Alerting in a Digital Library Environment: Do Channels Meet the Requirements? BIBAKFull-Text 643-644
      Daniel Faensen; Annika Hinze; Heinz Schweppe
    An Alerting Service (AS) informs its clients about new information provided by several suppliers. Special interests of clients can be defined as profiles. In the context of digital libraries, suppliers are the providers of documents. Providers are typically scientific publishers. In this paper we assume that the providers are known to the clients. A general model and architecture of an Alerting Service is given in [1]. Channel technology has been developed for broadcast of news and continuous streams of data like stock rates. For the digital library environment a finer granularity in profile definition than for common broadcasting is needed. In contrast to broadcast services, publishing events of multiple providers have to be presented to each client in a uniform way.
       In this summary we evaluate how the two competing approaches of Channel technology, Netscape's Netcaster [3] and Microsoft's Active Channels [2], meet these requirements.
       To satisfy users' needs, events have to be filtered by more or less complex profiles, e.g. a set of documents (like journals), a list of keywords (selected arbitrarily or from a thesaurus) or a query in a full-fledged query language like STARTS [4]. An easy-to-use and powerful profile definition language is one requirement for an AS. The second is a unified view, which means splitting the n:m-relationship between providers and clients.
       The use of both technologies strongly depends on how the content is to be filtered, i.e. how the user profile is to be defined.
    Keywords: digital libraries; push technology; alerting; channel; CDF; Netcaster
    NAIST Digital Video Library: Tools for Restructuring Video Data for the Content Based Information Retrieval -- A Representative Image of Shot Concept for the Internet BIBAFull-Text 645-646
      Yukiko Kawasaki; Rei Suzuki; Hideki Sunahara
    Nara Institute of Science and Technology started offering a digital library system for campus use in April 1996. This system includes a function to access digital video data. However, this function merely displays video data on the terminals. In order to make the digital video library easier to browse and search, we proposed a new structure for the digital video data.
       There are many elements relating to the video media, such as representative frames of each topic, the sound information that accompanies each representative frame, the text information that explains the contents, and so on. The new structure extracts some elements from the digital video data before storing the video data in the database (Fig. 1 (left)). This structure provides a useful function for retrieving, browsing and editing the target data in digital library systems for the Internet (Fig. 1 (right)).
       We use the following two terms: "shot" and "RS-frame" (Representative frame of Shot). A shot is a short sequence of the stream. A long video stream is composed of a large number of topics. For example, a news program about 30 minutes long includes a number of different topics of a few minutes each. Therefore, a motion video stream can be divided into a number of shots of only a few minutes each, depending on the contents. Users can reach the target scene directly by searching with shots, without playing the long video stream consecutively. An RS-frame is a representative frame of a shot, selected appropriately from the shot.
    LIBERATION: A Value-Added Digital Library BIBAFull-Text 647-648
      Robert Stubenrauch; Barbara Vickery; Ato Ruppert
    Within the framework of university settings the basic goal of LIBERATION was to add as much value to existing electronic publications from the scientific domain as possible by fully exploiting the sophisticated structuring and knowledge management features provided by an advanced underlying information management system. In that way all affected groups of users, from the production end (scientific publishers) through distributors (university libraries) to the consumers' end (student, scientists), should benefit significantly and the roles of information producers and consumers would blend to a large degree.
       The technical approach was to customise the Knowledge Management system Hyperwave which resulted in a fully Web compatible digital full-text library with the list of features including the following:
  • a highly flexible GUI, consistent over the complete library;
  • hierarchically structured, modular content encourages re-usage of material;
  • dynamically generated navigation facilities make hard-coded links
       unnecessary;
  • implicit structure of content reduces maintenance effort significantly;
  • arbitrary document formats, including multimedia;
  • meta-data of arbitrary type; actually employed is the Dublin Core scheme;
  • powerful search: full-text and meta-data search; scope selection; iterative
       search;
  • subscription to periodically performed individual searches; results sent via
       email;
  • server clustering: search across server boundaries; central pool of user
       accounts;
    A Methodology to Annotate Cultural Heritage Digital Video BIBAFull-Text 649-650
      Claudia Di Napoli; Mario Mango Furnari; Francesco Mele; Giovanni Minei
    The goal of our work is to provide a well-structured methodological approach for annotating digital video in the domain of cultural heritage, taking into account that the methodologies and technologies currently available do not allow the annotation process to be made completely automatic. The proposed approach always requires interaction with the user, who decides both the video segments to be annotated and the set of labels to be associated with a segment. This approach uses a predefined set of categories (world-view), hierarchically structured, which classifies the video subjects describing the contents shown in a video. This approach leads both to a way of integrating heterogeneous sources of information and to a way of structuring information so that video segments can be retrieved through a simple query language.
    BALTICSEAWEB -- Geographic User Interface to Bibliographic Information BIBAFull-Text 651-652
      Sauli Laitinen; Anssi Neuvonen
    Geographic user interfaces to bibliographic information on environmental conditions of the Baltic Sea have been created in a project, BALTICSEAWEB, within the Libraries sector of the EU Telematics Applications Programme. Two versions of map-based search interfaces have been developed which allow searches to be made in a database of more than 11000 bibliographic records. The searches can be modified by using a WWW based search form. In addition a number of original documents have been made available in electronic form so that the user can not only retrieve bibliographic records but also original documents. BALTICSEAWEB offers environmental information on the Baltic Sea through a user-friendly and well structured geographical interface. The home page of the project can be found at URL http://www.baltic.vtt
    Implementing Powerful Retrieval Capabilities in a Distributed Environment for Libraries and Archives BIBAFull-Text 653-655
      Chrisa Tsinaraki; George Anestis; Nektarios Moumoutzis; Stavros Christodoulakis
    An on-line distributed environment, which was implemented in the context of the VENIVA project for historical libraries and archives, is presented here. Emphasis is placed on presenting the powerful search capabilities provided to the end-users, which are typical for the end-users (either researchers or ordinary people) of any Digital Library environment.
       The information managed resides in a number of different relational databases in one or more institutions (i.e. Libraries and Historical Archives). The end-user of the system uses a WWW client to pose traditional boolean queries, similarity queries or complex queries containing both boolean and similarity terms on the contents of the databases. In the case of similarity queries, the end-user can also select the evaluation formula used to rank the objects that the system returns as the answer to his query. This gives a flexibility to experiment with alternative retrieval models without starting the implementation from scratch. A Graphical Query Editor is used in order to construct the queries.
       The innovative aspect of this work is that the similarity queries are translated by an appropriate component of the server into a series of traditional SQL queries, so that there is no need to have separate systems to support the various services offered. Only a standard relational DBMS (Database Management System) is used in the core of the system. The software layer that has been implemented on top gives all the additional flexibility. The implementation is based on a sound and flexible mathematical retrieval model.
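The idea of answering a similarity query with ordinary SQL can be sketched with Python's built-in sqlite3 module. The schema (`documents`, `postings`), the summed-weight ranking formula, the function name and the sample rows are assumptions for illustration; the VENIVA system translates a query into a series of SQL queries and lets the user choose the evaluation formula, neither of which is reproduced here.

```python
import sqlite3


def similarity_sql(n_terms, boolean_filter=""):
    """Build a parameterised SQL statement ranking documents by summed term weight.

    `boolean_filter` is trusted SQL text in this sketch; real code would build it
    from a parsed boolean query rather than from user input.
    """
    placeholders = ", ".join("?" for _ in range(n_terms))
    where = f"p.term IN ({placeholders})"
    if boolean_filter:
        where += f" AND ({boolean_filter})"
    return (
        "SELECT d.id, d.title, SUM(p.weight) AS score "
        "FROM documents d JOIN postings p ON p.doc_id = d.id "
        f"WHERE {where} "
        "GROUP BY d.id, d.title "
        "ORDER BY score DESC, d.id"
    )


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT, year INTEGER);
    CREATE TABLE postings (doc_id INTEGER, term TEXT, weight REAL);
    INSERT INTO documents VALUES (1, 'Doc A', 1540), (2, 'Doc B', 1610), (3, 'Doc C', 1580);
    INSERT INTO postings VALUES (1, 'trade', 0.5), (1, 'venice', 0.25),
                                (2, 'trade', 0.25),
                                (3, 'venice', 0.5), (3, 'harbour', 0.5);
""")

# Complex query: similarity terms "trade", "venice" plus a boolean condition on the year.
sql = similarity_sql(2, boolean_filter="d.year < 1600")
rows = conn.execute(sql, ("trade", "venice")).fetchall()
```

Because ranking is expressed as aggregation and ordering, the relational DBMS does all the work, and swapping the ranking formula only changes the generated SQL text.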
    OWL-Cat: A Web-Based OPAC Appealing End-Users to Exploit Library Resources BIBAFull-Text 657-658
      Silvana Mangiaracina; Stefano Ferrarini; Maria Grazia Balestri
    Databases have always been the most appropriate tool to catalogue library items, but their most important feature, to instantly retrieve sets of data which satisfy search criteria, has often represented an awkward task to be accomplished, depending on the different kinds of query languages, which vary with hardware and software architecture. WWW browsers and the development of the Internet have offered Information specialists a unique opportunity both to facilitate end-users in the search and retrieval process and to remove barriers which have always separated local from remote resources.
       At the crossover point between libraries' supply and users' requests, Web-based OPAC's allow not only the location of items which are physically owned by the library, but also access to remote resources (either full-text documents or databases, free and fee-based, or selected links relevant to the library community) and actually display or retrieve them on the desktop of any registered user.
       Indexing digital resources also means recording types of data which require appropriate database fields such as MARC 856, designed to accommodate URL (Uniform Resource Locator) addresses, as well as others which contain information accessible only to authorized users. This will allow librarians to distribute the appropriate data in a more timely and secure manner than alerting different classes of users to 'bits and pieces' of information at any occurring change (e.g. URL changes, passwords and so on), therefore saving time to implement other specific features.
    Towards a Framework for Building Collaborative Information Searching Systems BIBAFull-Text 659-660
      Brigitte Trousse; Michel Jaczynski; Rushed Kanawati
    If the World Wide Web (the web for short) is to become the world wide digital library, not only are effective and efficient information searching techniques needed, but also adequate collaboration support that enables people to cooperate and collaborate in locating relevant information just as they do in physical libraries. Collaboration support is required during information searching as well as for sharing the results of previous searching processes. Collaborative information searching (CIS) can be either direct or indirect. In direct collaboration, people communicate directly, in a synchronous or asynchronous manner, in order to show one another where to go to find a given piece of information or simply to send one another the required information. In indirect collaboration, information gathered from previous information searching processes conducted by a user is used to help other users in their searching activities. Recommender systems are an example of indirect CIS systems. In this work we address the problem of providing a framework that facilitates the design and the implementation of various CIS applications. An overview of this framework, called Broadway*Tools, is presented in the next section.
       Our approach consists in providing a set of object-oriented reusable components (or tools) that implement the main services required for CIS. Today, Broadway*Tools provides four groups of tools: - Server tools: these include a recommendation server, a user profile and session manager server, a page information server, and an annotation server.
    Toward a New Paradigm for Library Instruction in the Digital Library BIBAFull-Text 661-662
      Verlene J. Herrington
    Hypertext and the Internet have dramatically altered the world of information access. The digital library does not focus on ownership and holdings, but strives for instant global access to information. Text is no longer merely linear discourse; hypertext is branching, linking, interactive discourse with no absolute beginning or end, no boundaries and no permanence. Technology has thrust the library into the Information Age, but models of service delivery have not changed, especially in the area of library instruction. Academic librarians must examine their beliefs and values regarding the purpose and value of library instruction.
       Over 30 years ago, Thomas Kuhn wrote The Structure of Scientific Revolutions and coined the phrase "paradigm shift". Kuhn theorized that science may encounter a law so significantly different that a discipline is forced to alter its worldview or paradigm of its environment. A paradigm describes everything which the science is based on -- all of its laws, beliefs, procedures and methods. Until Kuhn, science was thought to be built on an accumulation of all that had been learned over history with each new law adding to the mass of scientific knowledge, not radically changing it. It has been 500 years since Gutenberg's printing press altered the paradigm of the world forever. Likewise, hypertext and the Internet are radically transforming the worldview of global information access, electronic publishing, scholarly collaboration, and resource sharing.
       Although technology has changed the "look" of the library, there has been no major paradigm change in the area of library instruction -- mainly because the underlying belief structure remains the same.
    Ontobroker in a Nutshell BIBAFull-Text 663-664
      Dieter Fensel; Stefan Decker; Michael Erdmann; Rudi Studer
    The World Wide Web (WWW) provides huge amounts of information in informal and semi-structured representations. This is one of the key factors that enabled its incredible success story. The representation formalisms are simple and retain a high degree of freedom in how to present the information. However, freedom in information representation and simple representation formalisms cause serious bottlenecks in accessing information from the web. We designed and implemented some tools necessary to enable the use of ontologies [2] for enhancing the web. We developed a broker architecture called Ontobroker [1] with three core elements: a query interface for formulating queries, an inference engine used to derive answers, and a webcrawler used to collect the required knowledge from the web. The strength of our approach is the tight coupling of informal, semi-formal and formal information and knowledge. This supports their maintenance and provides a service that can be used more generally for the purpose of knowledge management and for integrating knowledge-based reasoning and semi-formal representation of documents.
       The query formalism is oriented toward a frame-based representation of ontologies that defines the notion of instances, classes, attributes and values. The structure of the query language can be exploited to provide a tabular query interface as shown in Figure 1 which asks for the researchers with last name Benjamins and their email addresses. We also need support for selecting classes and attributes from the ontology. To allow the selection of classes, the ontology has to be presented in an appropriate manner.
    WAY: An Architecture for User Adapted Access to Z39.50 Servers Based on Intelligent Agents BIBAFull-Text 665-666
      Camino Fernández; Paloma Díaz; Ignacio Aedo
    The work presented in this poster is based on the combination of three well-known paradigms of Computer Science applied to Digital Libraries: User Adaptive Interfaces, Information Retrieval and Intelligent Agents. Its objective is to define a model providing intelligent information access by means of an adaptive user interface. The model, called WAY (see figure 1), is supported by a web-based architecture and counts on the help of intelligent agents in charge of studying the user characteristics in terms of previous behaviours, in order to provide an adapted interface and to guide the user's searching process through the servers available in the net. This model is currently being implemented in java using RMI (Remote Method Invocation) technology.
       Concerning user adaptive interfaces, our model proposes to offer a different initial interface for each type of user. These initial interfaces will be defined according to the classification of users, taking into account where the user is accessing the system from. Once the user begins to interact with a specific interface, it will go on changing depending on the actions performed by the user in two ways: the preferences shown explicitly by the user -- selected options in the configuration of the interface -- and the preferences shown implicitly -- by systematic actions. The following examples illustrate these ideas:
  • The type of user: in a university library environment, the possible types
       could be students, teachers and library staff, all of them with quite
       different goals and needs.
  • The user's mother tongue: if the user is connecting from Germany, he/she is
       likely to prefer a German interface.
  • The user's preferences: there are people who still prefer textual environments
       although their computers allow a good performance of visual ones.
  • The user's repeated actions: in a university, students usually are more
       interested in books than in journals, unlike teachers, whose main research
       tools are papers.
    ARIADNE -- Digital Library Architecture BIBAFull-Text 667-668
      Nuno Maria; Pedro Gaspar; Nuno Grilo; António Ferreira; Mário J. Silva
    We describe our approach for acquiring, preserving, organizing and disseminating information in a digital library of news publications.
       The digital publishing group at the Informatics Department of FCUL studies new paradigms for processing heterogeneous information in organizations, combining multiple information processing technologies under a common framework. In our current project, ARIADNE, developed jointly with Público, S.A. [2], a national daily newspaper, we are building a new digital publishing infrastructure where all the information used and produced by journalists is organized in a common database. From the information in this digital library, we generate publications in digital format.
       A main concern of the project is to build and maintain a large multimedia data repository, holding various collections of documents, from newsfeeds to newspaper articles, to databases of people, places and events. For some of the collections we only keep meta-data of various forms (from links to indexes and classification schemes).
       We find that some of the new publications we are developing are taking us into a new publishing paradigm where the notion of edition disappears. The same information item is reformatted and published in multiple forms, and accessed as part of information collections that do not constitute publications in the usual sense. As a result, we are studying robust forms for classifying this information so it may be reused in the future and ways to preserve its current presentation and organization.
       Organizing topic collections is another investigation topic associated with the preservation of information [1].
    Facts and Myths of Browsing and Searching in a Digital Library BIBAFull-Text 669-670
      K. F. Tan; Mike Wing; Norman Revell; Gary Marsden; C. Baldwin; Ross MacIntyre; Ann Apps; Ken D. Eason; S. Promfett
    In recent times, there has been increased interest in the querying of digital libraries (DLs). This is due in part to the development of the WWW, which enables easy access to both centralised and distributed digital library sources. The majority of published works on querying DLs are associated with information retrieval (IR), also known as digital querying. Information retrieval techniques are popular for querying DLs due to their flexibility in querying semi-structured data. In contrast, database querying of DLs has been largely ignored until recent years. The key aspects of database querying of DLs involve the integration of database querying with browsing or navigating techniques to query semi-structured data. Our interest lies in developing the relatively limited database query facilities currently available to users of DLs, and a key stage in this process is to define what kinds of searching and browsing typical users would like to perform.
       In this poster we will present the planning of an analysis of user browsing and searching strategies in the SuperJournal digital library (SJDL). The browse and search strategies suggested for the analysis are derived from the navigational strategies for a database (see table 1) proposed by Canter, Rivers and Storrs. This derivation of browse and search strategies (for a DL) from the navigational strategies (of a database) will serve to highlight the similarities and differences between browsing/searching in DLs and databases. The proposed analysis is based on the activity logfiles of users of the SJDL, which are logged as ASCII files and are converted to SPSS files for statistical analysis purposes. These logfiles represent over two years' worth of digital library browse and search activities.
    New Media Showcase BIBAFull-Text 671-672
      Michael Kreyche
    New Media Services is a relatively new unit of Kent State University Libraries and Media Services that provides technological and pedagogical support for instruction. Working closely with faculty, teaching assistants, and academic support staff, New Media Services supplies research, design, and production services and is engaged with a variety of technologies that support the Digital Library. Four types of projects are described here to illustrate the range of applications of new media technology.
       The "Stater Archive" (http://www.library.kent.edu/stater) is a full text archive of Kent State University's student newspaper. The Archive presently covers a period of five and a half years (from 1990 to 1996) and consists of a collection of SGML files converted (perhaps "salvaged" is a more descriptive term) from newspaper production files using a combination of batch text processing utilities and custom interactive editing software. The resulting files have been validated against a DTD written specifically for this application. Currently, a Web-based interface passes queries to a search engine and translates the SGML text to HTML for display within the browser. A disadvantage of this technique is the significant server overhead required to perform the necessary conversion for each request. This database is an excellent testbed for experimenting with the emerging XML/XSL standards. Eventually these technologies will provide a more refined delivery for the database, moving formatting tasks from the server to an advanced browser.
       Efficient, practical delivery of page images for issues prior to 1990 that only exist in paper or microform is being explored using the FlashPix image format.
    Creating a Collaborative Task-Specific Information Retrieval System BIBAFull-Text 673-674
      Aggis Simaioforidis; Jussi Karlgren; Anna-Lena Ereback
    Systems for information access today work well for some tasks, but are not suitable for general purpose usage. Our contention is that no system can be. Information seeking behavior is manifold and varying, and viable systems for information access will have to select among many different methods, designs, and underlying information analysis technology to cover all combinations of user preferences, background, task type, and domain variation.
       Specifically, systems today are specification oriented. The typical way to interact is to specify the information need by a number of topical terms. The system retrieves a set of documents. This set is inspected, and some documents are selected for further inspection. The system may allow the user to modify the set of search terms until the inspected documents are satisfactory, and finally some of the retrieved and inspected documents are selected for delivery. The standard approach has its limitations, many of which have to do with the rather narrow channels of information from the user to the system. The user is restricted to providing sets of terms, and these convey topical information only.
       In project Stockholm123 we address some of the above limitations by enriching the information given to the system by the user. The project task is finding a restaurant for a meal using a local online business telephone directory and general internet search tools. Our interface provides the two information services in parallel in two browser frames. The phone directory is organized in a disjunctive hierarchy, and consists practically only of contact information. While the information is reliable enough, to make an informed choice of restaurant one needs.
    EULER: An EU 'Telematics for Libraries' Project BIBAFull-Text 675-676
      Michael Jost; Anna Brümmer
    Since April 1998 the European Commission has been funding the EULER project (European Libraries and Electronic Resources in Mathematical Sciences) in the framework of the 'Telematics for Libraries' sector of the Telematics Applications programme.
       The main goal of EULER is to integrate different, electronically available information resources in the field of mathematics. EULER aims to construct a "digital library mathematics" from existing heterogeneous sources. The following existing publications-related information resources on mathematics will be covered by the EULER service:
  • scientific literature databases
  • library OPACs and document delivery services
  • electronic journals from academic publishers
  • archives of preprints and grey literature
  • quality controlled subject information gateways on the Internet
  • robot-generated indexes of other relevant Internet resources
       These resources are considered to be the most frequently used when conducting searches for scientific results and ongoing developments in the field of mathematics -- today users have to search them one by one. The intention of the EULER project is to offer a "one-stop-shopping site" for users interested in mathematics.
    A Virtual Community Library: SICS Digital Library Infrastructure Project BIBAFull-Text 677-678
      Andreas Rasmusson; Tomas Olsson; Preben Hansen
    In this project¹, we aim to create an agent-based digital library architecture for a Virtual Community Library (VCL) where each user has a personal library and, at the same time, is part of a larger community. The community is dynamically composed of the users' personal libraries and, through intermediators, other digital libraries.
       We want to stress the fact that the users participate in a large dynamic decentralised community where they continually interact with each other. Being a part of a community means that each user can benefit from the work put into the other libraries. For example, by obtaining documents through search queries or recommendations using social filtering, but also by getting help to organise the personal library.
       In the VCL, we try to combine the best aspects of the WWW, the library and the personal library: for example, ease of publishing documents, a personal information space, decentralised control of the document collection and the ability to search for documents.
       We have currently implemented two prototypes of the system, one for the personal library and one for visualising the information spread between the users.
       The foundation for the VCL is an agent architecture where the users are represented by self-interested agents. This is an open-ended knowledge system, which supports creation, inferring, manipulation and sharing of knowledge about information objects ("metadata"), and supports (enables automatisation of) interaction between agents pertaining to the information and knowledge management related business processes.
       Interaction (compatibility) with other systems in a number of formats and protocols will be investigated, such as Z39.50, MARC, DIENST, Dublin Core, BibTeX, etc. [1]
    Digital Libraries: Information Broker Roles in Collaborative Filtering BIBAFull-Text 679-680
      Annika Wærn; Mark Tierney; Åsa Rudström; Jarmo Laaksolahti; Torben Mård
    The main goal of the EdInfo project [1] is to utilize human information brokers, or editors, as a resource in adaptive information systems. An information broker can be any of the following:
  • The dedicated expert that collects and potentially reviews literature within
       a restricted area of interest;
  • The journalist that produces articles with specific reader groups in mind;
  • The librarian that organizes incoming information and directs readers to
       various sources;
  • The professional information broker, who processes specific information
       requests, seeks appropriate information sources, and produces summaries
       of the obtained information.
       The common characteristic of these roles is that the information broker has some kind of understanding of what his or her customers want, and is willing to adapt to these needs. Information brokers collect information from various sources, evaluate its relative importance and then choose whether to include the information as it is, disregard it, summarize it, or perhaps rewrite or illustrate it differently than in the original source.
       Many existing information services build upon user profiles, e.g. news services such as CNN Custom News. Users are allowed to explicitly set up their profiles by selecting a set of categories and subcategories that fit their interests.
    Virtual Reality and Agents in a Digital Library BIBAFull-Text 681-682
      Guadalupe Muñoz; Ignacio Aedo; Paloma Díaz
    The objective of this poster is to present the digital library VILMA (Virtual Intelligent Library using Multi-Agent Systems). A digital library has been described as a federated structure that provides humans both intellectual and physical access to the huge and growing world-wide networks of information encoded in multimedia digital formats. VILMA will provide physical access through its intelligent agents and will ease intellectual access by using virtual reality for its interface. The VILMA architecture (see figure 1) should be as flexible as possible because its environment is a big network, with users accessing it from different machines and protocols.
       The most efficient way to achieve modularity, flexibility and scalability is distributing little tasks among many specialised agents [1]. If we use a set of intelligent agents for the library's architecture instead of a main program to control it, we will not need to update a main program each time we enlarge the library. These Multi-Agent Systems (M.A.S.) will be in charge of the main functionalities in VILMA such as information retrieval from the world wide web according to readers and librarian requests and document cataloguing and indexing to include the obtained information in the proper database. To accomplish these tasks, they will have to communicate among themselves, coordinate their activities to determine organisational structure amongst a given group of agents and to allocate tasks and resources, and negotiate to solve conflicts.
       Also, an interface based on virtual reality provides a friendly environment for the user as shown in [2].
    The European Schoolnet: An Attempt to Share Information and Services BIBAFull-Text 683-684
      Charlotte A. Linderoth; Anders Bandholm; Birte Christensen-Dalsgaard; Gertrud Berger
    The European Schoolnet is a network of networks, created for schools in Europe. Among the objectives for the EUN initiative is to establish and test a shared repository of educational resources based on a distributed model. The unity will be established through use of protocols like Z39.50 for simultaneous search of heterogeneous databases and on defining core metadata elements to be filled out by all participants.
    The Document Management System Saros Mezzanine and the New Product AGORA as Key Component in a Digital Library Architecture at Göttingen University Library BIBAFull-Text 685-687
      Frank Klaproth; Norbert Lossau
    Current publications are increasingly available in electronic form. Nevertheless, there is still a clear predominance of printed material within library holdings. In order to facilitate online access to this information, the Center for Digitization at Göttingen University Library was initiated as an innovative service center for digitization work and techniques, financially supported by the German Research Foundation (DFG) and the Ministry of Culture of the Federal State of Lower Saxony. Setting up a Document-Management-System (DMS) as a key component of the digital library architecture is one of the overall goals of the Center. Starting with Saros Mezzanine, a traditional EDMS for companies from FileNet, we now present, as the result of a collaboration with the Satz-Rechen-Zentrum (SRZ) company in Berlin, a comfortable system named AGORA. The article gives a brief overview of the development and functionality of the new DMS for digital libraries, AGORA.
    Electronic Roads in the Information Society BIBAFull-Text 689-690
      Costas Zervos; Stathis Panis; Dionysis Dionysiou; Michaelis Dionysiou; Constantinos S. Pattichis; Andreas Pitsillides; George A. Papadopoulos; Antonis C. Kakas; Christos Schizas
    The objective of this study is to investigate an approach for dynamic construction of Electronic Roads. We envision Electronic Roads spanning a virtual multidimensional space, a distributed digital information repository, consisting primarily of video data of cultural heritage. A visitor (user) will be able to travel in this multidimensional space along different, semantically related historical, geographical, economic or cultural paths.
       Consider the following example. At a particular point in time our visitor (user) is located at a node of this multidimensional space and s/he has to decide how to proceed with his/her journey. A cluster of nearest neighbours is computed and presented to the user based on the current node but also the most recently visited nodes in order to postulate as to which semantic path the visitor is actually following. Even though the next node along this path has precedence the user will have the option to override this, thus migrating to a different path.
       The building block of the system is the information unit, which consists of the actual data (e.g. segment of video, image, sound or text) with an attached metadata index. All information units are elementary in granularity. That is, there is no hierarchical structure. The repository of information can be viewed as a pool of information units. There is of course a tradeoff as to what will be precisely the level of granularity, i.e. how elementary the information units will be.
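       The neighbour computation described above can be sketched as follows. This is only an illustration under strong assumptions, not the authors' method: each information unit's metadata index is reduced to a hypothetical numeric vector, the current node and the recently visited nodes are blended into one context vector with geometrically decaying weights, and unvisited units are ranked by Euclidean distance to that context. All names, vectors and the `decay` parameter are made up.

```python
import math


def context_vector(current, recent, decay=0.5):
    """Blend the current node's vector with recently visited ones.
    `recent` is ordered most recent first; older nodes get smaller weights."""
    vectors = [current] + list(recent)
    weights = [decay ** i for i in range(len(vectors))]
    total = sum(weights)
    return [
        sum(w * v[d] for w, v in zip(weights, vectors)) / total
        for d in range(len(current))
    ]


def nearest_units(units, current_id, recent_ids, k=3, decay=0.5):
    """Return the ids of the k unvisited units closest to the context vector."""
    ctx = context_vector(units[current_id], [units[i] for i in recent_ids], decay)
    visited = {current_id, *recent_ids}
    candidates = [
        (math.dist(ctx, vec), uid)
        for uid, vec in units.items()
        if uid not in visited
    ]
    return [uid for _, uid in sorted(candidates)[:k]]


# Hypothetical pool of information units with 2-dimensional metadata vectors.
units = {
    "castle": [1.0, 0.0],
    "harbour": [0.9, 0.1],
    "market": [0.0, 1.0],
    "monastery": [0.8, 0.3],
    "mine": [0.1, 0.9],
}

suggestions = nearest_units(units, "castle", ["harbour"], k=2)
```

       The first suggestion plays the role of the next node along the presumed path, while the remaining suggestions are the alternatives the visitor may choose in order to migrate to a different path.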
    The Technical Chamber of Greece Digital Library: The Vision of a Special Organisation to Save and Disseminate Its Information Work Through the Network BIBAFull-Text 691-692
      Katerina Toraki; Sarantos Kapidakis
    This presentation discusses the creation of a digital library which contains scientific work produced by the Technical Chamber of Greece and disseminated to its members through post or library services. The Technical Chamber of Greece (TEE) is the technical consultant to the Government as well as the professional organisation of Greek engineers (numbering more than 60,000). Among TEE activities, it conducts studies and organizes various meetings on technical and related aspects, like architecture, town planning, environment, information technology, chemical engineering, naval architecture, civil engineering and so on.
       It publishes three professional periodicals: the weekly information bulletin "Enimerotiko Deltio" which is sent free to all members and contains news, comments and activities of TEE, the bimonthly journal Technika Chronika, containing studies, reports and articles on general technical subjects and the scientific publication Technika Chronika, published in five separate sections addressed to the various engineering fields, with scientific and research articles in Greek including an extended summary in English. In addition, TEE publishes scientific and technical books and translations of foreign specifications.
       All the above works are used heavily by engineers as well as by other scientists and students. They are disseminated through the Documentation and Information Unit of TEE in Athens as well as the libraries of its 15 regional sections connected to a network through a virtual library system.
    User-Centered Design of Adaptive Interfaces for Digital Libraries BIBAFull-Text 693-694
      Tatiana Gavrilova; Alexander V. Voinov
    An approach to account for various user characteristics, such as professional status and physiological and psychological peculiarities, in both a Digital Library and any other adaptive system addressed to an end-user is described. This approach is based, first, on a battery of tests to formally measure the user's characteristics and, second, on a method of statistical mapping of these factors onto the appropriate adjustable characteristics of the adaptive system.
    A New Method for Segmenting Newspaper Articles BIBAFull-Text 695-696
      Basilios Gatos; N. Gouraros; S. L. Mantzaris; Stavros J. Perantonis; A. Tsigris; P. Tzavelis; Nikolaos Vassilas
    Digital preservation of old newspapers contributes greatly to the historical register of a country's social, political and economic events. At the same time, newspaper preservation is an imperative necessity because of the fast paper deterioration and the difficulty in tracing the overwhelming amount of information. Lambrakis Press S.A. owns a large collection of newspapers and periodicals that consists of 1,300,000 pages and covers a time period from 1890 up to date. This material is divided into approximately 600,000 A2 pages, 500,000 A3 tabloid pages and 200,000 A4 pages. Our team is working on all aspects of the transformation procedure from the printed material to an accessible digital archive (verification and quality control, digitization, cataloguing, search and retrieval, design and content presentation). The final digital documents form the foundation of our digital library.
       Preservation and processing of this precious material can be achieved by focusing on a series of problems related to the digitization of the printed material, such as: image enhancement by noise removal, isolation of newspaper articles by document understanding techniques (segmentation -- labeling). The successful tackling of these problems allows the subsequent efficient cataloguing by employing OCR, full text retrieval and information extraction techniques along with manual indexing.
       In our paper we will present the results of our research associated with the stage of segmentation of the various regions the image consists of, as well as the identification of text regions which have to be separated from other regions, i.e. figures, drawings or line regions.
    Broadway, A Case-Based Browsing Advisor for the Web BIBAFull-Text 697-698
      Michel Jaczynski; Brigitte Trousse
    The World Wide Web (WWW) is a hypermedia of heterogeneous and dynamic documents, frequently referred to as the world wide digital library. This virtual space is growing more and more every day, offering the user a huge amount of data. Two kinds of tasks can be handled to locate a relevant document through this space: querying and browsing. Querying is appropriate when the user has a clear goal which should usually be expressed through a list of keywords. Different servers on the WWW (such as Yahoo, Lycos, Altavista) can then be used to retrieve matching documents based on their indexing capability. Browsing is well suited when the user cannot express his goal explicitly or when query formulation by keywords is not adequate. Then, the user must navigate through this space, moving from one node to another, looking for a relevant document. These two tasks can be mixed so that querying gives a list of reasonable starting points for browsing.
       However, the huge size and the structure of this space make it difficult to index the documents as required by querying access methods, and could disorient the user during a browsing session. This poster focuses on the assistance given to a group of users during their browsing session, and more precisely on the design of a browsing advisor or recommendation system. A browsing advisor is able to follow the user during a browsing session to infer his goal, and then must advise him of potentially relevant documents to visit next. Three main approaches for recommendation computation can be distinguished: content-based recommendation, profile-similarity based recommendation and behaviour-similarity based recommendation.
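       To make the third approach more concrete, a behaviour-similarity based recommendation can be sketched as follows. This is a deliberately simplified illustration, not the Broadway system: past browsing sessions are treated as stored cases, the similarity between the current path and a past session is assumed to be the Jaccard overlap of visited pages, and the advisor suggests pages from the most similar session that the user has not visited yet. Function names and page identifiers are hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap between two collections of page identifiers."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def recommend(current_path, past_sessions, k=2):
    """Suggest up to k unvisited pages from the past session whose
    set of visited pages is most similar to the current path."""
    best = max(past_sessions, key=lambda s: jaccard(current_path, s))
    seen = set(current_path)
    return [page for page in best if page not in seen][:k]


# Hypothetical browsing sessions of other users in the group.
past_sessions = [
    ["home", "dl", "ecdl98", "posters"],
    ["home", "sports", "scores"],
    ["home", "dl", "metadata"],
]

advice = recommend(["home", "dl", "ecdl98"], past_sessions)
```

       A content-based variant would compare document contents instead of navigation paths, and a profile-similarity variant would compare user profiles.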
    A Digital Library Model for the Grey Literature of Academic Institutes BIBAFull-Text 699-700
      Vassilios Chrissikopoulos; D. Georgiou; Nectarios Koziris; K. Toraki; Panayotis Tsanakas
    A great amount of the scientific work carried out at universities is disseminated through non-commercially published types of literature, the so-called Grey literature, such as dissertations, internal technical reports, teaching material, deliverables of research projects etc. The systematic storage, processing and dissemination of the intellectual work produced in the Greek higher education institutions is an objective which can be achieved with the use of new information and communication technologies.
       A digital library model for the Grey literature of the Greek higher education is proposed, based on the DIENST architecture, leading to an integrated, distributed digital library system. The system will allow the easy and fast entry, access and location of primary information contained and produced in any participating institution. The technical and organisational issues related to the design and implementation of such a system are presented.
    Cross-Language Web Querying: The EuroSearch Approach BIBAFull-Text 701-702
      Martin Braschler; Carol Peters; Eugenio Picchi; Peter Schäuble
    Initially comprising services from Italy (Arianna), Spain (OIZ) and Switzerland (EuroSpider), EuroSearch will provide a multilingual searching functionality. Each national site is responsible for maintaining and operating a search service for its own languages, so that the needs of distinct language communities can be catered for by native speakers of that language. The languages covered are currently French, German, Italian and Spanish, plus English. Differences in the partners' document collections and indexing mechanisms have led to the implementation of different search strategies, depending on the collection to be queried. The cross-language search component of EuroSearch thus consists of an integration of lexicon- and corpus-based search mechanisms, and two distinct types of searching will be activated:
  • query translation using a multilingual lexicon, enhanced by an experimental
       corpus-based mechanism
  • similarity thesaurus technology

    Panels

    Interaction Design in Digital Libraries BIBAFull-Text 703
      Constantine Stephanidis
    In recent years, the field of Human-Computer Interaction (HCI) has made significant advances, penetrating an increasing number and range of computer-mediated human activities. In this context, interaction design has become a critical component of advanced interactive applications and telematic services as well as an increasingly complex challenge to meet.
    Beyond Navigation as Metaphor BIBAFull-Text 705-716
      David Benyon
    With the development of large information spaces such as digital libraries, the notion of user navigation through such spaces has gained prominence. The popular view of navigation is that it is a conscious, goal directed activity in which someone is trying to reach a destination. Such a view of navigation is essentially individualistic, objectivist and cognitive. A semiotic analysis of space recognises that there are many different views of space and that space is a subjectively defined concept. There is a context to space which needs to be communicated, negotiated and understood between people. More than just space, there is the idea of place. People produce or construct their places at different times and there is a knock on effect from one place to another. In this paper some implications of taking this different view of information space are explored.
    User Interaction in Digital Libraries: Coping with Diversity through Adaptation BIBAKFull-Text 717-735
      Constantine Stephanidis; Demosthenes Akoumianakis; Alex Paramythis
    User interface adaptations can be used to address several user interaction challenges in the development of digital library systems. To this end, this paper: (a) examines some of the intrinsic characteristics of digital library systems; (b) identifies some of the key Human-Computer Interaction (HCI) challenges; and (c) develops an argumentation for adaptations in digital library systems. By drawing parallels to recent work in HCI research into adaptable and adaptive user interaction, the paper illustrates potential areas in which user interface adaptation can provide a useful technique for advancing the quality of human interaction with a digital library system.
    Keywords: digital libraries; user interface adaptation; unified interface development
    Metadata and Content-Based Approaches to Resource Discovery BIBAFull-Text 737-738
      Thomas Baker; Judith Klavans
    Researchers in multilingual information retrieval and natural language processing are making progress on algorithms for guessing what a text is about and for translating queries between languages; some day we may have reliable programs for recognizing musical melodies or the content of images. In contrast, researchers in metadata focus more on seeking consensus on the meanings of categories for describing resources; on finding ways to allow simple schemas to interoperate with complex ones; and on designing frameworks for managing the messy equivalencies between metadata models in different fields and languages. Do the two perspectives form a continuum? What problems do they solve best? For resource discovery, what is the best balance between human and machine?
    Architectures and Services for Cultural Heritage Information BIBAFull-Text 739
      Panos Constantopoulos
    The electronic processing of cultural heritage information involves stages of acquisition/production, storage, indexing, use/exploitation and transfer. The value of information is compounded by usage and association to other information. It is thus important to ensure the linking of information from disparate sources or held in different systems, access according to multiple viewpoints and needs, and exchange among different systems and user groups. Key issues are the selection of architectures that ensure interoperability and the development of services that support the recording, safeguarding, scientific study and promotion of cultural heritage, including artifacts and information sources. This panel will address these issues with a view towards the shaping of a DL domain.
    Federated Scientific Data Repositories for the Environment Towards Global Scalable Management of Environmental Information: How Useful Will They Be? What Is Their Potential Impact? Shall We Save the Environment? BIBAFull-Text 741-742
      Catherine E. Houstis
    We are currently witnessing a proliferation of the Internet, the World Wide Web, and new distributed systems technologies. The idea of federated scientific data repositories, built from organizationally and geographically distributed units, is becoming an area of increasing interest to both scientists and authorities. The general public could benefit from them as well with appropriate access. Such a development would pave the way towards global sharing and combination of environmental information, thereby promoting co-operation among scientists and considerably supporting environmental information management and decision making. From a technical perspective, building federated scientific repositories involves confronting problems such as heterogeneity, distribution, integration, knowledge, authentication, appropriate interfaces and performance. From an organizational perspective it involves interdisciplinary co-operation, cross-organizational management, as well as important legal and economic issues. Within the computer science arena solutions exist to each of the technical problems, but their integration is still elusive. Notably, the organizational problems that have to be addressed at the management level may prove far more challenging. So, given that the ingredients exist, how are we to proceed? And will such an effort be worthwhile? In the end shall we save the environment?

    Treasure Chest or Pandora's Box: The Challenge of the Electronic Library as a Vehicle for Scholarly Communication

    Preservation and Access: Two Sides of the Same Coin 743-752
      Pieter J. D. Drenth
    The number of conferences, meetings and publications on digital and electronic information over the past years has increased exponentially. Digital records seem to replace to a large extent the classical paper collections and paper records, and concern about availability and accessibility of paper material seems to vanish in the shining light of the explosion of present day's electronic communication and digitization developments.
       Let us bring in some realism. In a recent interview in Der Spiegel, Klaus Dieter Lehmann, the director of the Deutsche Bibliothek, the German deposit library, mentioned under the appropriate heading "Books do have advantages" that of the 300,000 acquisitions per year only 2,000-3,000, that is less than 1%, are in digital form. It is obvious that if such a large proportion of large libraries is still in paper format, concern for this type of material does not reflect a quaint preference for an old fashioned and outdated medium.
       An organization which is concerned with the preservation and access of our collective memory in all its forms is the European Commission on Preservation and Access, and it is a privilege for me as its chairman to present to the conference the aims and objectives and some of the past and future activities of this commission. The ECPA was established in 1994 by a group of librarians, archivists and scholars out of concern for the fate of millions of books and documents threatened by acidification and embrittlement in Europe.
    Access Versus Holdings: The Paradox of the Internet 753-759
      Derek Law
    It is the purpose of this paper to argue that librarians have been blinded to its basic flaws by the gaudiness of the Internet and that we are confusing sources and resources. The Internet shows none of the features required for scholarly communication and whether or not we believe this will change, we should be developing models which offer electronic services as a viable and reliable resource.
       Although the Internet is of some age in the dog years which pass for computing time, the World Wide Web is relatively new, with the first web browser dating only from 1994. In the four years after that it achieved a phenomenal acceptance, in what Paul Evan Peters called the largest mass migration in human history. It was adopted by fifty million users in fifty months. Radio took 38 years to gain such an audience and television some thirteen years. Currently it has some seventy million users. And yet it lacks the important elements of sustainability necessary for scholarship:
  • Permanence
  • Availability
  • Accessibility
       One of the unremarked triumphs of librarianship in the last forty years is that we have created a system which allows the researcher reliably and persistently to identify and retrieve any document published anywhere in the world. This has been a long term project which is a bedrock for scholarship. The Web, on the other hand, is in fact a four year old experiment, not a robust service. Not for nothing is it called the World Wide Wait.
    The Human Factors in Digital Library Development 761-769
      Andrew McDonald
    In developing successful digital libraries there are considerable human resource challenges for our institutions, library management and library staff. There is growing evidence from a number of research projects and from the experience of library managers that the management culture and attitudes required for digital libraries are quite different than those appropriate for print-based libraries. The critical human, cultural and organisational factors needed to create the environment in which digital delivery can be effective and sustained are explored. These include the cultural and organisational shifts in parent institutions; new service culture and values in the library; changes in organisational structure and management style in the library; and the effect of the digital library on service staff. Digital libraries need digital librarians who possess particular skills, knowledge and experience and these may be very different from those required of the "analogue" librarian. Regrettably, the importance of human factors in digital library development is often underestimated in relation to the technological and information challenges involved.
    Keywords: Human factors; digital libraries; digital librarians; library management; organisational culture; service culture and values; library staff
    Digital Information Management Within Modern Library Systems, Consortia and e-journals 771-776
      Friedrich W. Froben
    In our modern world of fast changing technology, let us look back for a minute and remember the beginning of communication. Not so long ago, only language was available for interaction and only memory for storage. A few years ago -- relatively speaking -- the print culture started, first on stone and later on papyrus. That was around 1900 years ago, and it became possible to write, to read, to store and to register. Most likely, this helped to promote and spread our culture, literature, philosophy, medical and natural science, but also technical and human mistakes.
       Today, one of the interesting questions is whether the digital information, the new superhighway of fast data transfer, will help to develop new values of our society, restore some of the old values, or if the new information will only be used to promote injustice. As long as libraries have existed (remember the Alexandria library in ancient Egypt), they have always been centres of information, of study, storage and innovation. Now their role is changing towards using modern technology and distributing knowledge via digital data networks, and we will discuss some of this new role and its advantages and disadvantages (1-4).
       Consortium is one of these modern words: there are consortia for construction, for banking, of airlines... and for libraries -- a larger unit of libraries to coordinate acquisition and to cooperate in many aspects. In this context, the consortium is meant to cooperate for joint access to online material, mainly e-journals. It is intended as a framework for the scientific libraries and allows individual members to participate in contracts with publishers or agents.

    Delos Workshop

    METU-Emar: An Agent-Based Electronic Marketplace on the Web 777-790
      Asuman Dogac; Ilker Durusoy; Sena Nural Arpinar; Esin Gokkoca; Nesime Tatbul; Pinar Koksal
    In this paper, we describe a scenario for a distributed marketplace on the Web where resource discovery agents find out about resources that may want to join the marketplace and electronic commerce is realized through buying agents representing the customers and the selling agents representing the resources like electronic catalogs. We propose a possible architecture which is based on the emerging technologies and standards. In this architecture, the resources expose their metadata using Resource Description Framework (RDF) to be accessed by the resource discovery agents and their content through Extensible Markup Language (XML) to be accessed by the selling agents by using Document Object Model (DOM). The marketplace contains Document Type Definitions (DTDs) and a dictionary of synonyms to be used by the buying agents to help the customer to specify the item s/he wishes to purchase. Distribution infrastructure is CORBA and Web on which the buying and selling agents find out about each other using Trading Object Services. The modifications necessary to the proposed architecture considering only the available technology are also discussed.
    Electronic Commerce for Software 791-800
      Tsuneo Ajisaka
    Since software is itself electronic, the features of electronic commerce that stand out when software is its domain can be discussed by comparison with tangible kinds of merchandise. Several spectra of e-commerce are first investigated in terms of two broad types of software and its distribution, i.e., shrink-wrapped package software and custom software developed by contract. E-commerce for software is basically service- or process-oriented, in which a brokering service for software packages or components is particular. The architectures of the business-to-business software e-commerce, or the Software CALS, are discussed next. Business, logical, and physical architectures are investigated in terms of their structure and components, services, and data models. Middleware engineering for the UI, communication, and data servers will be one of the most important agenda items for the deployment of software e-commerce.
    RainMaker: Workflow Execution Using Distributed, Interoperable Components 801-818
      Santanu Paul; Edwin Park; David Hutches; Jarir K. Chaar
    As individuals and enterprises interconnect via wide area networks, workflows that span them seamlessly will become increasingly valuable. It is likely that heterogeneous participants -- humans, applications, organizations -- that are physically dispersed over such networks will share workflows that cut across organizational and geographic boundaries. We address the problem of designing a distributed workflow infrastructure that supports such scenarios in the presence of heterogeneous workflow systems and components. We present RainMaker, a workflow framework based on a service requestor/service provider execution model. RainMaker defines a core set of abstract interfaces that can be implemented by distributed workflow components. Together, the RainMaker execution model and interfaces provide a foundation for the interoperability of workflow systems and components.
    Design Criteria for a Virtual Market Place (ViMP) 819-832
      Simon Field; Christian Facciorusso; Yigal Hoffner; Andreas Schade; Markus Stolze
    This paper considers the requirements customers and providers have from a virtual insurance market place, and proposes a set of desirable features to satisfy them. A design implementing these features is proposed, based on a logical structuring of the information needed to support the dialogue between providers and customers. The applicability of this design for market places trading products other than insurance is discussed, and further research to consider the particular features of business services is suggested.
    Intellectual Property Right and the Global Information Network 833-838
      Nikos K. Lakoumentas; Emmanuel N. Protonotarios
    Digitizing copyright works and other protected objects brings many benefits for users, especially simplified access; for the rights owners, however, it can represent both an opportunity and a threat. Materials can be distributed speedily on the networks, and new markets are opened up, but there is also the danger of loss of sales through unauthorized use and exploitation of these same materials. Different legal aspects and technologies cope with Intellectual Property Rights on the Global Information Network.
    NetBazaar: Networked Electronic Markets for Trading Computation and Information Services 839-856
      Jakka Sairamesh; Christopher F. Codella
    In this paper, we present the design and implementation of NetBazaar, which is a distributed, federated electronic trading system (Marketplace) for buying and selling network resources and services and information products and services distributed across the Internet. The trading system provides mechanisms for suppliers to advertise information about their services and attribute-value pairs, and for consumers to query for information about service offerings by the suppliers. In addition, the trading system either performs the trades on behalf of the consumers or offers the consumers a list of suppliers to contact. In order to recover costs and profit, the trading system charges a small fee to the suppliers and consumers for every trade that occurs. The charges could vary depending on the complexity of the trade, such as the overheads of payment, transaction and contract enforcement. NetBazaar has been designed to support a variety of business models, pricing and market mechanisms, searching and matching algorithms, fast negotiation mechanisms for a high volume of trades, and distributed access for consumers and suppliers to the trading system. An initial version of NetBazaar has been implemented using CORBA and Java components.
    Keywords: Electronic Marketplace; Distributed Markets; Searching and Matching; Pricing; Product Differentiation; Industrial Organization
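    The advertise/query/trade cycle described in the NetBazaar abstract can be illustrated with a minimal sketch. This is not the NetBazaar implementation (which the abstract says uses CORBA and Java components); the class names, the flat per-party fee, and the cheapest-first ordering are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Offer:
    """A supplier's advertisement: attribute-value pairs plus a price in cents."""
    supplier: str
    attributes: dict
    price_cents: int


class Marketplace:
    """Toy trading system: suppliers advertise, consumers query, trades incur a fee."""

    def __init__(self, fee_cents: int):
        self.fee_cents = fee_cents  # hypothetical flat fee charged to each party
        self.offers = []
        self.fees_collected = 0

    def advertise(self, offer: Offer) -> None:
        self.offers.append(offer)

    def query(self, **wanted) -> list:
        """Return offers matching every requested attribute-value pair, cheapest first."""
        matches = [
            o for o in self.offers
            if all(o.attributes.get(k) == v for k, v in wanted.items())
        ]
        return sorted(matches, key=lambda o: o.price_cents)

    def trade(self, offer: Offer) -> int:
        """Record a trade and return what the consumer pays (price plus consumer fee)."""
        self.fees_collected += 2 * self.fee_cents  # supplier fee + consumer fee
        return offer.price_cents + self.fee_cents


market = Marketplace(fee_cents=2)
market.advertise(Offer("alpha", {"service": "storage", "region": "eu"}, 120))
market.advertise(Offer("beta", {"service": "storage", "region": "eu"}, 95))
market.advertise(Offer("gamma", {"service": "compute", "region": "eu"}, 300))

best = market.query(service="storage", region="eu")[0]
cost = market.trade(best)
```

    A fee that varies with the complexity of the trade, as the abstract suggests, could replace the flat `fee_cents` with a function of the offer.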
    The Shift Towards Electronic Commerce: Market Transformation and Employment Impact 857-872
      Panayiotis Miliotis; Angeliki Poulymenakou; Georgios I. Doukidis
    Electronic commerce is a technology enabled market phenomenon having an increasing impact on how markets are changing. Electronic markets are in a transition characterized by the entrance of strong market players promoting concentration of market activity, and by a slower move, compared to early predictions, towards open electronic markets. Market change has significant impact on the levels, sources and nature of employment. The types of firms that will survive or enter the emerging market places will affect the sources of employment in the market, while the nature of knowledge and skills required within electronic market places will affect the types of employment that will be offered in the future. This paper provides an understanding regarding future employment conditions by tracing the processes of market change and by providing a detailed explanation of the underlying rationale for these changes. Furthermore, a link has been developed between business transformation phenomena incurred by market change and transformation of the nature and types of work. The analysis model used to discuss specific changes to jobs and skills is applied illustratively to the case of the commerce sector.
    The Coyote Project: Framework for Multi-party E-Commerce 873-889
      Asit Dan; Daniel M. Dias; Thao Nguyen; Marty Sachs; Hidayatullah Shaikh; Richard P. King; Sastry Duri
    The Internet provides the opportunity for quickly setting up deals between businesses for promoting each other's products, and to jointly offer new services. Specification and enforcement of such deals stretch traditional transaction processing concepts in several directions since they involve independent businesses with their own internal processes. First, the greater variability in response time in business to business interaction creates a need for asynchronous and event-driven processing, in which correct handling of reissued and cancelled requests is critical. Second, a new transaction processing paradigm is required that supports different views of a unit of business for all participants, i.e., service providers as well as end consumers. Between any two interacting parties, there may be several related interactions dispersed in time, creating a long running conversation. This paper describes our approach (Coyote) to solving these problems including use of a service contract for specifying the rules of interaction across businesses, and directly generating code for enforcement of the contract. We finally describe the architecture and a prototype of a system which implements the Coyote concepts.
    MarketNet: Using Virtual Currency to Protect Information Systems 891-902
      Yechiam Yemini; Apostolos Dailianas; Danilo Florissi
    This paper describes novel market-based technologies for systematic, quantifiable and predictable protection of information systems against attacks. These technologies, incorporated in the MarketNet system, use currency to control access to information systems resources and to account for their use. Clients wishing to access a resource must pay in currency acceptable to the domain that owns it. An attacker must thus pay to access the resources used in an attack. Therefore, the opportunities to attack and the damage that can be caused are strictly limited by the budget available to the attacker. A domain can control its exposure to attacks by setting the prices of critical resources and by limiting the currency that it makes available to potential attackers. Currency carries unique identifiers, enabling a domain to pinpoint the sources of attacks. Currency also provides a resource-independent instrumentation to monitor and correlate access patterns and to detect intrusion attacks through automated, uniform statistical analysis of anomalous currency flows. These mechanisms are resource-independent, and admit unlimited scalability for very large systems consisting of federated domains operated by mutually distrustful administrations. They uniquely establish quantifiable and adjustable limits on the power of attackers; enable verifiable accountability for malicious attacks; and admit systematic, uniform monitoring and detection of attacks.
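    The budget-limited access idea in the MarketNet abstract can be sketched in a few lines. This is an illustrative toy under stated assumptions, not the MarketNet system: the `Wallet` and `Domain` classes, the integer prices, and the helper `count_accesses_until_broke` are all hypothetical names introduced here.

```python
from dataclasses import dataclass, field


class InsufficientFunds(Exception):
    """Raised when a client cannot pay the price of a resource."""


@dataclass
class Wallet:
    """Currency held by one client; the owner id makes payments traceable."""
    owner_id: str
    balance: int

    def pay(self, amount: int) -> None:
        if amount > self.balance:
            raise InsufficientFunds(f"{self.owner_id} cannot pay {amount}")
        self.balance -= amount


@dataclass
class Domain:
    """A domain that prices its resources and logs every payment it receives."""
    name: str
    prices: dict
    ledger: list = field(default_factory=list)

    def access(self, wallet: Wallet, resource: str) -> None:
        price = self.prices[resource]
        wallet.pay(price)  # access is refused if the client cannot pay
        self.ledger.append((wallet.owner_id, resource, price))

    def spending_by_client(self) -> dict:
        """Aggregate currency flows per client, a basis for anomaly detection."""
        totals = {}
        for owner_id, _resource, price in self.ledger:
            totals[owner_id] = totals.get(owner_id, 0) + price
        return totals


def count_accesses_until_broke(domain: Domain, wallet: Wallet, resource: str) -> int:
    """Repeat an access until the wallet is exhausted; return how many succeeded."""
    count = 0
    try:
        while True:
            domain.access(wallet, resource)
            count += 1
    except InsufficientFunds:
        return count


domain = Domain("example-domain", {"login": 1, "db_query": 5})
attacker = Wallet("client-42", balance=12)
successful = count_accesses_until_broke(domain, attacker, "db_query")
```

    With a budget of 12 and a price of 5, only two accesses succeed, so raising the price of a critical resource directly lowers the number of attempts a given budget can buy, and the ledger ties each payment to the paying client.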

    Demonstrations

    CiBIT: Biblioteca Italiana Telematica -- A Digital Library for the Italian Cultural Heritage 903-904
      Eugenio Picchi; Lisa Biagini; Davide Merlitti
    The primary objective of the CiBIT digital library is the widest possible diffusion of information on Italian art, literature and history. The aim of CiBIT is to allow scholars and other interested users throughout the world to access and study multimedia documentation (voice and image data are also included) containing all kinds of information on and describing different aspects of Italian culture. The following main areas of artistic and cultural interest are covered: philology, literature, linguistics, medieval and modern history, legal history, musicology.
       The intention is to make the CiBIT service available to students and scholars in university libraries and research institutes, in the first place, but also in libraries serving the general public. Main aims of the project are to allow users throughout the world (through Internet) to access and consult Italian cultural data, and to provide them with a series of sophisticated tools to process and analyse the textual material in various ways.
       The CiBIT query system, which is based on that of the DBT system, has been completely developed in Java so that it can be used by all existing Web browsers and can operate on all kinds of hardware and software platforms. Java applets have been implemented to provide the same level of functionality as the stand-alone DBT system. A multiserver middleware engine automatically selects the best host site depending on the particular requirements of a given query. The CiBIT digital library can be accessed through the consultation of a single text, of a set of texts, or of an entire corpus. Relevant subsets of texts in a corpus can be defined dynamically using the bibliographic data associated with the texts.
    The ERCIM Technical Reference Digital Library 905-906
      Stefania Biagioni; José Luis Borbinha; Reginald Ferber; Preben Hansen; Sarantos Kapidakis; László Kovács; Frank Ross; Anne-Marie Vercoustre
    Within the context of the DELOS Working Group, eight institutions of the European Research Consortium for Informatics and Mathematics (ERCIM) are currently collaborating on the installation of an ERCIM Technical Reference Digital Library (ETRDL). The aim is to implement and test a prototype infrastructure for networked access to a distributed multi-format collection of technical documents produced by ERCIM members. The collection is managed by a set of interoperating servers, based on the Dienst system developed by a US consortium led by Cornell University and adopted by NCSTRL (Networked Computer Science Technical Reference Library). Pilot server sites have already been set up at half of the 14 ERCIM national labs. Servers are expected to be installed at the other centres soon. The aim is to assist ERCIM scientists to make their research results immediately available world-wide and provide them with appropriate on-line facilities to access the technical documentation of others working in the same field. Public access to this reference service is provided through Internet.
       In addition to the basic service provided by the DIENST system, some additional functionalities are being implemented in the ETRDL common user interface in order to meet the particular needs of the European IT scientific community. An author submission form has been included to facilitate the insertion of new documents by the users themselves. The service can be accessed through the DELOS Web site.
    NAIST Digital Library 907-908
      Hideki Sunahara; Rei Atarashi; Toru Nishimura; Masakazu Imai; Kunihiro Chihara
    NAIST Digital Library provides practical library services for the faculty, students and staff of Nara Institute of Science and Technology. Operation of this system started in April 1996 [1]. The system mainly consists of the database system including electronic books, journals, magazines, NAIST technical reports, etc., the digital video library system, the browser system, and the information retrieval system. Our system allows users to browse electronic books, journals, transactions, and magazines through networks. In addition, the video browser is also provided.
       Fig. 1 shows the overview of our current system. File servers have a total capacity of 5TB for storing library data. The hierarchical storage works as UNIX file systems. Furthermore, files on the servers migrate to the suitable storage according to how frequently they are accessed.
       The library search engine enables users to easily obtain what they need in the library database. This search engine works as a WWW server. Fig. 2 shows an example of what you can actually get from NAIST Digital Library.
       The Data Input System generates data for the digital library with scanners, OCR software, CD-ROM drives, and MPEG-2 encoders.
       The current system provides the following services. -- search function includes bibliographical information search and whole text search.