Query Optimization for the Semantic Web


A workshop at Universidad Rey Juan Carlos supported by the Spanish Agencia Española de Cooperación Internacional (AECI) under the project "Mecanismos de Optimización y Evaluación para consultar eficientemente la Web Semántica en la Administración Pública Electrónica".

May 31, 2007, URJC, Campus Móstoles, Departamental II, Room 070




Speakers

Abstracts

YARS2: Indexing and Query Evaluation in a Federated Environment

Andreas Harth, DERI, National University of Ireland, Galway

Traditional logic programming systems allow for evaluation of complex logical expressions on a confined number of statements, while traditional Web search engines employ keyword-based searches over massive amounts of hypertext. We integrate ideas from both systems engineering and research in logic into our prototype system. To arrive at a scalable infrastructure for processing massive amounts of semantic data, we introduce a distributed indexing framework for graph-structured data rooted in search systems engineering, and layer increasingly advanced query processing techniques on top of highly optimised atomic lookup operations. We describe the architecture of our initial query processor implementation, and highlight challenges and areas for future work.
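The idea of layering query processing on top of atomic lookups can be illustrated with a minimal in-memory sketch. It is not the YARS2 design: the class TripleIndex, the function join_on_object, and the sample data are hypothetical, and a real system would use distributed, on-disk indexes.

```python
from collections import defaultdict


class TripleIndex:
    """Toy in-memory index over (subject, predicate, object) triples."""

    def __init__(self, triples):
        self.all = set()
        self.by_subject = defaultdict(set)
        self.by_predicate = defaultdict(set)
        self.by_object = defaultdict(set)
        for triple in triples:
            s, p, o = triple
            self.all.add(triple)
            self.by_subject[s].add(triple)
            self.by_predicate[p].add(triple)
            self.by_object[o].add(triple)

    def lookup(self, s=None, p=None, o=None):
        """Atomic lookup; None acts as a wildcard for that position."""
        # Pick one index to narrow the candidates, then filter the rest.
        if s is not None:
            candidates = self.by_subject.get(s, set())
        elif p is not None:
            candidates = self.by_predicate.get(p, set())
        elif o is not None:
            candidates = self.by_object.get(o, set())
        else:
            candidates = self.all
        return {
            t for t in candidates
            if (s is None or t[0] == s)
            and (p is None or t[1] == p)
            and (o is None or t[2] == o)
        }


def join_on_object(index, p1, p2, o2):
    """Subjects s with (s, p1, x) and (x, p2, o2): a join built from lookups."""
    return sorted(
        s for (s, _, x) in index.lookup(p=p1)
        if index.lookup(s=x, p=p2, o=o2)
    )


# Hypothetical sample data.
TRIPLES = [
    ("alice", "worksAt", "deri"),
    ("bob", "worksAt", "urjc"),
    ("deri", "locatedIn", "galway"),
    ("urjc", "locatedIn", "madrid"),
]
index = TripleIndex(TRIPLES)
```

With this data, join_on_object(index, "worksAt", "locatedIn", "galway") returns ["alice"], using nothing but repeated calls to lookup.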

Answer Set Programming vs very large datasets

Giovambattista Ianni, Università della Calabria, Rende, Italy

Answer Set Programming (ASP) is a versatile family of rule languages originally conceived as a quick specification tool for optimization and combinatorial problems. Recent research has tackled the variety of technical problems related to the adoption of ASP within practical applications such as Semantic Web Data Management, Data Integration, and Web Services. Besides interoperability, real applications require a level of scalability capable of dealing with very large datasets. The talk is divided into three parts. First, the audience is introduced to ASP languages and their special specification constructs, with particular focus on the DLV system. Second, applications of ASP towards database interoperability are surveyed, with particular focus on the DLV-DB system, a remarkable example of smart interfacing of an ASP system with a professional DBMS. Third, we illustrate current lines of research related to RDF data querying and update, where an ASP system is coupled with a native RDF triplestore management system. In this scenario one can conceive of a) a quick way of enabling RDFS entailment capabilities, b) the introduction of a rule-based mapping specification language, and ... more and more.

A Directed Hypergraph Model for RDF

Amadís Antonio Martínez Morales, Universidad de Carabobo, Venezuela, and
María Esther Vidal, Universidad Simón Bolívar, Venezuela

RDF is a proposal of the W3C to express metadata about resources on the Web. The RDF data model allows several representations, each with its own limitations in expressive power and in support for the tasks of query answering and semantic reasoning. In this paper, we present a directed hypergraph model for RDF to represent RDF documents efficiently. We compare our approach with other proposals and we study its impact on the tasks of query answering and semantic inference. Finally, we explain the objectives that we plan to achieve in the context of this PhD dissertation.

Exploiting Equivalences in Logic Programs for Query Optimization

David Pearce, Universidad Rey Juan Carlos, Madrid, Spain

(to be announced)

Evaluating SPARQL++ - preliminary results

Axel Polleres, DERI, National University of Ireland, Galway / Universidad Rey Juan Carlos, Madrid, Spain

In this talk we will discuss preliminary results on translating SPARQL++ to Logic Programming under the Answer Set Semantics. SPARQL++ is an extended version of the RDF query language SPARQL enriching its basic concepts by ontology mappings, RDFS inferences, built-in functions and aggregates. We will show that all these extensions nicely fit with an evaluation in Datalog engines. Since the proposed added constructs in SPARQL++ increase the expressivity of SPARQL, new challenges arise for efficient query evaluation algorithms.
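As a rough illustration of how a SPARQL query can map onto a Datalog rule, the sketch below translates only a basic graph pattern with a SELECT clause. It is an assumption-laden simplification and not the SPARQL++ translation itself: the function names, the triple/3 and answer predicates, and the quoting of constants are all hypothetical choices made for this example.

```python
def term_to_datalog(term):
    """Map a SPARQL term to Datalog syntax.

    ?x becomes the variable X (first letter upper-cased);
    any other term becomes a quoted constant.
    """
    if term.startswith("?"):
        return term[1].upper() + term[2:]
    return '"' + term + '"'


def bgp_to_rule(select_vars, triple_patterns):
    """Build one Datalog rule for a SELECT over a basic graph pattern."""
    head_args = ", ".join(term_to_datalog(v) for v in select_vars)
    body = ", ".join(
        "triple(" + ", ".join(term_to_datalog(t) for t in pattern) + ")"
        for pattern in triple_patterns
    )
    return "answer(" + head_args + ") :- " + body + "."


# SELECT ?name WHERE { ?p rdf:type foaf:Person . ?p foaf:name ?name }
rule = bgp_to_rule(
    ["?name"],
    [("?p", "rdf:type", "foaf:Person"), ("?p", "foaf:name", "?name")],
)
print(rule)
```

The printed rule is answer(Name) :- triple(P, "rdf:type", "foaf:Person"), triple(P, "foaf:name", Name). Constructs such as OPTIONAL, UNION, filters, and aggregates need more than a single positive rule and are left out here.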

SPARQL Cost-based Query Optimization

Edna Ruckhaus and María-Esther Vidal, Universidad Simón Bolívar, Venezuela

Large ontologies need an efficient SPARQL query language. In general, an ontology comprises explicit knowledge and inference axioms. Efficient evaluation strategies are needed in Web ontology query engines. In this work we propose a cost-based optimization technique for SPARQL queries. In our approach, ontologies are formalized as a deductive database called a Deductive Ontology Base (DOB). The extensional database comprises all the ontology language statements that represent the explicit ontology knowledge. The intensional database corresponds to the set of deductive rules which define the semantics of the ontology language. Specifically, we represent an OWL Lite ontology as a DOB knowledge base, and a SPARQL query as a DOB query. In this work we present a DOB algebra that comprises several operators according to the different SPARQL syntactic options: basic patterns, optional patterns, union of patterns, and filters. We only consider Select queries without nested patterns. Each of these logical operators may correspond to one or more physical operators. We have defined an algorithm that implements each physical operator. Since our cost metric focuses on the number of intermediate triples that need to be inferred in order to answer the query, the cost function for these physical operators has been determined accordingly. The objective of our optimization technique is to find an order of the predicates in the body of the query such that the number of intermediate inferred triples is reduced.
We propose a hybrid cost model that combines two techniques for cost estimation: (1) a sampling technique is applied for the estimation of the evaluation cost and cardinality of intensional predicates and of built-in predicates, and (2) a cost model à la System R is used for the estimation of the cost and cardinality of extensional predicates and the cost of conjunctive queries involving basic and optional patterns, and union of patterns. In [SPARQL Rules!] a translation from SPARQL to Datalog with stratified negation as failure is presented; the idea is to implement SPARQL on top of existing rule query engines. We follow the same translation ideas but we add efficient physical operators and cost-based evaluation techniques in our system.
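The goal of ordering query predicates to keep intermediate results small can be sketched as follows. This is a deliberately simple illustration, not the authors' DOB cost model: the pattern names, the cardinalities, the constant JOIN_SELECTIVITY, and the exhaustive search over permutations are all assumptions made for the example.

```python
from itertools import permutations

# Hypothetical patterns: (name, variables used, estimated cardinality).
PATTERNS = [
    ("type", {"x"}, 1000),
    ("worksAt", {"x", "y"}, 200),
    ("locatedIn", {"y"}, 10),
]

# Assumed constant selectivity for a join on at least one shared variable.
JOIN_SELECTIVITY = 0.01


def plan_cost(order):
    """Sum of estimated intermediate result sizes, evaluated left to right."""
    _, first_vars, size = order[0]
    bound = set(first_vars)
    total = size
    for _, variables, cardinality in order[1:]:
        if bound & variables:
            size = size * cardinality * JOIN_SELECTIVITY
        else:
            size = size * cardinality  # no shared variable: Cartesian product
        bound |= variables
        total += size
    return total


def best_order(patterns):
    """Exhaustive search; only feasible for a handful of patterns."""
    return min(permutations(patterns), key=plan_cost)


best = best_order(PATTERNS)
print([name for name, _, _ in best])
```

Under these assumed numbers the cheapest order starts with the most selective pattern, locatedIn, then joins worksAt and finally type. An optimizer in the System R tradition would typically replace the exhaustive search with dynamic programming over sub-plans.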

Program & Material

Venue Information

Universidad Rey Juan Carlos, Campus Móstoles, Departamental II, room 070.

Tulipán s/n
28933 Móstoles
Madrid, España



Reachable from Madrid with public transport:


Organized by

Universidad Rey Juan Carlos, Universidad Simón Bolívar, DERI

Contact person: Axel Polleres

