Alberto Mendelzon Workshop School 2016

AMW SCHOOL

Panama City, Panama

6-10 June, 2016

ABOUT

Alberto Mendelzon Workshop School (AMWS):

For the past three years, AMW has been preceded by a two-day “Summer School”, with international speakers invited to present three-hour tutorials to a mix of students and other interested attendees.

The high-level goals of the AMW School are two-fold:

To host tutorials targeted at students (advanced undergraduate or postgraduate level) or other early-term researchers interested in the area of Data Management.

To provide a venue where young Latin American researchers can meet, discuss, learn and seek feedback on their topics, thus reinforcing research networks (of the future) in the area.

This year, the 4th AMW School is organized by Juan Sequeda (Capsenta Labs, USA) and Domagoj Vrgoc (Pontificia Universidad Católica de Chile).

Please note that since Computer Science is an area in development in Panama, this year the first day of the School will consist of four introductory classes held in Spanish in order to attract local students. In previous years we have observed that younger students are not inclined to attend classes held exclusively in English, and this year we hope to avoid this by holding the first day of the school in Spanish. This will be followed by a day in English, which will cater both to international visitors and to students who were attracted to the workshop by the prospect of learning about database topics in their native language. At the end of each day, we will also have a poster session, which is intended to encourage the exchange of ideas between students and the researchers participating in the school and the workshop.

The two days of the school are arranged such that the introductory courses from the first day prepare the students for the more advanced topics that will be presented the day after. This order is also maintained within the first day, where each lecture builds on the previous one. We have a total of four lectures for the first day of the school and two advanced tutorials for day 2. The broad focus of each day is:

Day 1 (Databases and the Semantic Web): An introduction to foundational aspects of Databases, Description Logics, and the Semantic Web.

Day 2 (Big data and Ontologies): Talks about entity resolution in Big Data and how to answer queries in the presence of Ontologies.

We have arranged for six speakers of international repute in the area of Data Management to provide six lectures at the school.

Tutorial: Introduction to the Semantic Web and the Web of Linked Data

Oscar Corcho, Facultad de Informática

Universidad Politécnica de Madrid

The Semantic Web and the Web of Linked Data have become a reality in recent years, partly due to the W3C consortium’s effort to standardize the formats for representing data and ontologies (RDF, RDF Schema, RDFa, JSON-LD, OWL), and to standardize query languages (SPARQL). Likewise, many organisations have decided to start publishing linked data, to annotate their Web content using these formats, and to use public data in all types of applications. This lecture will introduce the fundamental notions of the Semantic Web and the Web of Linked Data and describe the main research challenges we face in this area.

Speaker Bio: Oscar Corcho is an Associate Professor at Departamento de Inteligencia Artificial (Facultad de Informática, Universidad Politécnica de Madrid), and he belongs to the Ontology Engineering Group. His research activities are focused on Semantic e-Science and Real World Internet, although he also works in the more general areas of Semantic Web and Ontological Engineering. Previously, he worked as a Marie Curie research fellow at the University of Manchester, and was a research manager at iSOCO. He holds a degree in Computer Science, an MSc in Software Engineering and a PhD in Computational Science and Artificial Intelligence from UPM. He was awarded the Third National Award by the Spanish Ministry of Education in 2001. He has published several books, of which “Ontological Engineering” can be highlighted, as it is used as a reference book in a good number of university lectures worldwide, and more than 100 papers in journals, conferences and workshops. He regularly participates in the organisation or in the programme committees of relevant international conferences and workshops.

Tutorial: Introduction to Databases

Juan L. Reutter: Pontificia Universidad Católica de Chile

Databases are at the core of commercial software applications, and are essential for any application that requires storing, updating or consulting volumes of data in an efficient way. The purpose of this talk is to introduce participants to the theoretical foundations that go along with the development of relational databases. We shall discuss the relationship of the Relational Algebra and Relational Calculus with SQL, the query language used in all relational database systems. We will study some of the fundamental properties of these languages, and show how these languages are in turn used in database systems to solve problems such as query optimisation or data integration.

Speaker Bio: Juan L. Reutter is an assistant professor at the Computer Science Department at Pontificia Universidad Católica de Chile and an associate investigator of the Chilean Center for Semantic Web Research. He received his PhD from the University of Edinburgh in May 2013. His research interests are in data management and automata theory. He was the recipient of the Ramon Salas Award for the best Chilean work in engineering and the best paper award at the ACM PODS conference in 2011, and in 2014 he won the BCS distinguished dissertation competition and received the Cor Baayen Award from ERCIM, the European Research Consortium for Informatics and Mathematics. He has served on the program committees of various conferences and workshops, including SIGMOD and AAAI.

Tutorial: An introduction to Description Logics and Ontology Languages

Magdalena Ortiz: TU Wien

Recent years have seen enormous progress in the development of ontologies, which are sharable, machine-readable domain conceptualizations. Ontologies are making the Web smarter, improving research and practice in life-sciences, and opening the door for a new generation of semantic awareness in information systems. Description Logics (DLs) are a well-established family of languages for Knowledge Representation and Reasoning. They provide the formal foundations of the OWL languages for writing ontologies, and for the automated tools that are available for creating and using these ontologies. In this tutorial, we will survey the basics of representing knowledge in DLs and get to know the DLs that underlie the most popular OWL profiles. We will also give an overview of the main reasoning services needed for creating and using ontologies, and some of the algorithmic techniques that DL researchers have devised for providing these services.

Speaker Bio: Magdalena Ortiz is a tenure-track assistant professor for Knowledge Representation and Reasoning in the Institute of Information Systems of TU Wien, where she is also a Hertha Firnberg Scholar and principal investigator in the project “Recursive Queries over Semantically Enriched Data Repositories”. Her research interests are centered around logics for knowledge representation and reasoning, with a focus on description logics and their application to data access and management. She has received several prizes and awards, including the EMCL Distinguished Alumni Award for outstanding contributions to the field of Computational Logic, the Award of Excellence of the Austrian Federal Ministry for Science and Research, the Förderpreis of the Austrian Computer Society, the OeGAI Prize of the Austrian Society for Artificial Intelligence, and the Google Europe Anita Borg Memorial Scholarship.

Tutorial: Federation in SPARQL

Carlos Buil-Aranda

Universidad Técnica Federico Santa María

The amount of RDF data available on the Web has increased dramatically over the last years. This RDF data is stored in distributed database systems, allowing everybody to access it via dedicated query services called SPARQL endpoints. Examples of these endpoints are DBpedia (the Wikipedia in RDF format) or Bio2RDF, a set of more than 30 distributed RDF databases storing different types of biomedical data (such as medical publications data or gene data). This tutorial aims to provide participants with an overview of SPARQL Query Federation, i.e. how to query all these RDF databases as if they were a single one. We will discuss the foundations of the W3C SPARQL Federated Query recommendation, we will present several algorithms that can be implemented in order to allow such data federation, and finally we will present an overview of the systems implementing such algorithms.

Speaker Bio: Carlos Buil-Aranda is an assistant professor and researcher at Universidad Técnica Federico Santa María, Chile. His work is focused on federated query processing for SPARQL and on improving SPARQL user queries by analyzing endpoint query logs. Carlos received his Ph.D. degree in 2012 from Universidad Politécnica de Madrid, and obtained the best Computer Science Ph.D. Thesis Award from that university. He received the Best Paper Award at the Extended Semantic Web Conference 2011 and the Best Evaluation Paper Award at the International Semantic Web Conference 2013.

Tutorial: Entity Resolution in Big Data

Lise Getoor: University of California Santa Cruz

Entity resolution (ER), the problem of extracting, matching and resolving entity mentions in structured and unstructured data, is a long-standing challenge in database management, information retrieval, machine learning, natural language processing and statistics. Accurate and fast entity resolution has huge practical implications in a wide variety of commercial, scientific and security domains. Despite the long history of work on entity resolution, there is still a surprising diversity of approaches, and a lack of guiding theory. Meanwhile, in the age of big data, the need for high quality entity resolution is growing, as we are inundated with more and more data, all of which needs to be integrated, aligned and matched, before further utility can be extracted. In this tutorial, I’ll bring together perspectives on entity resolution from a variety of fields, including databases, information retrieval, natural language processing and machine learning, to provide, in one setting, a survey of a large body of work. I’ll discuss both the practical aspects and theoretical underpinnings of ER. I’ll describe existing solutions, current challenges and open research problems. In addition to giving attendees a thorough understanding of existing ER models, algorithms and evaluation methods, the tutorial will cover important research topics such as scalable ER, active and lightly supervised ER, and query-driven ER.

Speaker Bio: Lise Getoor is a Professor in the Computer Science Department at UC Santa Cruz. Her research areas include machine learning, data integration and reasoning under uncertainty, with an emphasis on graph and network data. She is an AAAI Fellow, serves on the Computing Research Association and International Machine Learning Society Boards, was co-chair of ICML 2011, and is a recipient of an NSF Career Award and ten best paper and best student paper awards. She received her PhD from Stanford University, her MS from UC Berkeley, and her BS from UC Santa Barbara, and was a Professor at the University of Maryland, College Park from 2001 to 2013.

Tutorial: An introduction to Description Logics and Ontology Languages

Diego Calvanese: Free University of Bozen-Bolzano

TBA

Speaker Bio: Diego Calvanese is a full professor at the Research Centre for Knowledge and Data (KRDB), Faculty of Computer Science, Free University of Bozen-Bolzano, where he teaches graduate and undergraduate courses on knowledge bases and databases, ontologies, theory of computing, and formal languages. He received a PhD from Sapienza University of Rome in 1996. His research interests include formalisms for knowledge representation and reasoning, ontology based data access and integration, description logics, Semantic Web, graph data management, data-aware process verification, and service modeling and synthesis. He has been actively involved in several national and international research projects in the above areas (including FP6-7603 TONES, FP7-257593 ACSI, FP7-318338 Optique).

He is the author of more than 300 refereed publications, including ones in the most prestigious international journals and conferences in Databases and Artificial Intelligence, with more than 22000 citations and an h-index of 62, according to Google Scholar. He is one of the editors of the Description Logic Handbook. He has served in over 100 program committee roles for international events, and he is a member of the editorial board of JAIR and of Big Data Research. In 2012-2013 he was a visiting researcher at the Technical University of Vienna as Pauli Fellow of the “Wolfgang Pauli Institute”. He is the program chair of the 34th ACM Symposium on Principles of Database Systems (PODS 2015), program co-chair of the 28th Description Logic Workshop (DL 2015), and the general chair of the 28th European Summer School in Logic, Language and Information (ESSLLI 2016). He was nominated ECCAI Fellow in 2015.

Table 1: Tentative Schedule for AMW School 2016

Time          Day 1 (June 6)                    Day 2 (June 7)
09:00–09:15   Introduction                      Announcements
09:15–10:45   Tutorial by Oscar Corcho          Tutorial by Lise Getoor (I)
10:45–11:00   Coffee                            Coffee
11:00–12:15   Tutorial by Carlos Buil-Aranda    Tutorial by Lise Getoor (II)
12:15–14:00   Lunch                             Lunch
14:00–15:30   Tutorial by Juan L. Reutter       Tutorial by Diego Calvanese (I)
15:30–15:45   Coffee                            Coffee
15:45–17:00   Tutorial by Magdalena Ortiz       Tutorial by Diego Calvanese (II)
17:00–18:00   Poster Session                    Poster Session

Tentative Schedule

The tentative schedule for the school is given in Table 1. Each tutorial will last three hours with a 15-minute break in the middle. Lunch is given additional length to allow for discussion amongst participants. On the first evening, there will be one hour of lightning talks, where all participants will give a short talk on the topic of their interest; this slot was very successful last year, when all postgraduate students in attendance spoke briefly about their topics, later connecting with more senior researchers to talk further. On the second evening, we plan to hold an informal social event.

If you have any questions about the event, please don’t hesitate to contact the AMWS chairs:

Juan Sequeda (juanfederico@gmail.com) and Domagoj Vrgoc (domagojvrgoc@gmail.com).