[1]
What is the IQ of your data transformation system?
DB track: probabilistic and uncertain data
/
Mecca, Giansalvatore
/
Papotti, Paolo
/
Raunich, Salvatore
/
Santoro, Donatello
Proceedings of the 2012 ACM Conference on Information and Knowledge
Management
2012-10-29
p.872-881
© Copyright 2012 ACM
Summary: Mapping and translating data across different representations is a crucial
problem in information systems. Many formalisms and tools are currently used
for this purpose, to the point that developers typically face a difficult
question: "what is the right tool for my translation task?" In this paper, we
introduce several techniques that contribute to answering this question. Among
these are a fairly general definition of a data transformation system, a new and
very efficient similarity measure to evaluate the outputs produced by such a
system, and a metric to estimate user effort. Based on these techniques, we
are able to compare a wide range of systems on many translation tasks, to gain
interesting insights about their effectiveness, and, ultimately, about their
"intelligence".
[2]
EDITED BOOK
Search Computing: Broadening Web Search
Lecture Notes in Computer Science 7538
/
Ceri, Stefano
/
Brambilla, Marco
2012
n.16
p.254
Springer Berlin Heidelberg
DOI: 10.1007/978-3-642-34213-4
== Extraction and Integration ==
Web Data Reconciliation: Models and Experiences (1-15)
+ Blanco, Lorenzo
+ Crescenzi, Valter
+ Merialdo, Paolo
+ Papotti, Paolo
A Domain Independent Framework for Extracting Linked Semantic Data from Tables (16-33)
+ Mulwad, Varish
+ Finin, Tim
+ Joshi, Anupam
Knowledge Extraction from Structured Sources (34-52)
+ Unbehauen, Jörg
+ Hellmann, Sebastian
+ Auer, Sören
+ Stadler, Claus
Extracting Information from Google Fusion Tables (53-67)
+ Brambilla, Marco
+ Ceri, Stefano
+ Cinefra, Nicola
+ Das Sarma, Anish
+ Forghieri, Fabio
+ et al.
Materialization of Web Data Sources (68-81)
+ Bozzon, Alessandro
+ Ceri, Stefano
+ Zagorac, Srdan
== Query and Visualization Paradigms ==
Natural Language Interfaces to Data Services (82-97)
+ Guerrisi, Vincenzo
+ La Torre, Pietro
+ Quarteroni, Silvia
Mobile Multi-domain Search over Structured Web Data (98-110)
+ Aral, Atakan
+ Akin, Ilker Zafer
+ Brambilla, Marco
Clustering and Labeling of Multi-dimensional Mixed Structured Data (111-126)
+ Brambilla, Marco
+ Zanoni, Massimiliano
Visualizing Search Results: Engineering Visual Patterns Development for the Web (127-142)
+ Morales-Chaparro, Rober
+ Preciado, Juan Carlos
+ Sánchez-Figueroa, Fernando
== Exploring Linked Data ==
Extending SPARQL Algebra to Support Efficient Evaluation of Top-K SPARQL Queries (143-156)
+ Bozzon, Alessandro
+ Della Valle, Emanuele
+ Magliacane, Sara
Thematic Clustering and Exploration of Linked Data (157-175)
+ Castano, Silvana
+ Ferrara, Alfio
+ Montanelli, Stefano
Support for Reusable Explorations of Linked Data in the Semantic Web (176-190)
+ Cohen, Marcelo
+ Schwabe, Daniel
== Games, Social Search and Economics ==
A Survey on Proximity Measures for Social Networks (191-206)
+ Cohen, Sara
+ Kimelfeld, Benny
+ Koutrika, Georgia
Extending Search to Crowds: A Model-Driven Approach (207-222)
+ Bozzon, Alessandro
+ Brambilla, Marco
+ Ceri, Stefano
+ Mauri, Andrea
BetterRelations: Collecting Association Strengths for Linked Data Triples with a Game (223-239)
+ Hees, Jörn
+ Roth-Berghofer, Thomas
+ Biedert, Ralf
+ Adrian, Benjamin
+ Dengel, Andreas
An Incentive-Compatible Revenue-Sharing Mechanism for the Economic Sustainability of Multi-domain Search Based on Advertising (240-254)
+ Brambilla, Marco
+ Ceppi, Sofia
+ Gatti, Nicola
+ Gerding, Enrico H.
[3]
Automatically building probabilistic databases from the web
Demo session
/
Blanco, Lorenzo
/
Bronzi, Mirko
/
Crescenzi, Valter
/
Merialdo, Paolo
/
Papotti, Paolo
Proceedings of the 2011 International Conference on the World Wide Web
2011-03-28
v.2
p.185-188
© Copyright 2011 ACM
Summary: A significant number of web sites publish structured data about recognizable
concepts (such as stock quotes, movies, restaurants, etc.). There is a great
opportunity to create applications that rely on a huge amount of data taken from the
Web. We present an automatic and domain-independent system that performs all
the steps required to benefit from these data: it discovers data-intensive web
sites containing information about an entity of interest, extracts and
integrates the published data, and finally performs a probabilistic analysis to
characterize the impreciseness of the data and the accuracy of the sources. The
results of the processing can be used to populate a probabilistic database.
[4]
Exploiting information redundancy to wring out structured data from the web
WWW posters
/
Blanco, Lorenzo
/
Bronzi, Mirko
/
Crescenzi, Valter
/
Merialdo, Paolo
/
Papotti, Paolo
Proceedings of the 2010 International Conference on the World Wide Web
2010-04-26
v.1
p.1063-1064
Keywords: data extraction, data integration, wrapper generation
© Copyright 2010 ACM
Summary: A large number of web sites publish pages containing structured information
about recognizable concepts, but these data are only partially used by current
applications. Although such information is spread across a myriad of sources,
the scale of the Web implies significant redundancy. We present a domain-independent
system that exploits the redundancy of information to automatically extract and
integrate data from the Web. Our solution concentrates on sources that provide
structured data about multiple instances from the same conceptual domain, e.g.,
financial data or product information. Our proposal is based on an original
approach that exploits the mutual dependency between the data extraction and
the data integration tasks. Experiments confirmed the quality and the
feasibility of the approach.