Theory and Practice of Data Provenance

Seminar Summer Semester 2013 (2+2 SWS)

Prof. Dr. Bertram Ludäscher, University of California at Davis

(Visiting Professor at the Database Systems Research Group)

 

The origin and processing history of an artifact is known as its provenance. Interest in the theory and practice of data provenance has grown in recent years, driven by important practical applications, e.g., in data quality, curation, warehousing, and workflows. In addition, elegant theoretical foundations of provenance have emerged that unify different prior notions such as why-provenance, how-provenance, and lineage. The objective of this seminar is to give an overview of the most recent technical results and practical developments in the field of data provenance. The provenance approaches discussed include:

● algebraic (provenance semirings, provenance polynomials),

● logic-based (proof trees, derivations),

● game-theoretic (provenance games),

● argumentation-theoretic.

For example, for positive queries, provenance semirings provide an elegant unifying framework for prior work on lineage, why-provenance, and c-tables (conditional tables), among others, but they do not account for special kinds of provenance, most notably why-not provenance. In addition to theoretical foundations of provenance, we will also cover practical and engineering aspects, e.g., how to query large provenance graphs effectively and efficiently.
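To make the semiring idea concrete, here is a minimal sketch in Python. It is not seminar material: the names Semiring, join, project, and two_step, the small edge relation, and the tuple identifiers e1 to e4 are all assumptions made for illustration. The sketch annotates each input tuple with a semiring value, multiplies annotations when tuples are joined, and adds them when several derivations yield the same output tuple. Swapping the semiring changes the kind of provenance obtained without changing the query.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Tuple


@dataclass(frozen=True)
class Semiring:
    """A commutative semiring (K, plus, times, zero, one). Illustrative only."""
    zero: Any
    one: Any
    plus: Callable[[Any, Any], Any]
    times: Callable[[Any, Any], Any]


# Counting semiring (N, +, *, 0, 1): annotations are bag multiplicities.
COUNTING = Semiring(zero=0, one=1,
                    plus=lambda a, b: a + b,
                    times=lambda a, b: a * b)

# Why-provenance: an annotation is a set of witnesses, each witness being
# a set of input tuple ids that together suffice to derive the output.
WHY = Semiring(zero=frozenset(),
               one=frozenset({frozenset()}),
               plus=lambda a, b: a | b,
               times=lambda a, b: frozenset(x | y for x in a for y in b))

Relation = Dict[Tuple[str, ...], Any]  # tuple -> semiring annotation


def join(sr: Semiring, r: Relation, s: Relation) -> Relation:
    """Join r(x, y) with s(y, z) on y; annotations are multiplied."""
    out: Relation = {}
    for (x, y), k1 in r.items():
        for (y2, z), k2 in s.items():
            if y == y2:
                key = (x, y, z)
                out[key] = sr.plus(out.get(key, sr.zero), sr.times(k1, k2))
    return out


def project(sr: Semiring, rel: Relation, cols: Tuple[int, ...]) -> Relation:
    """Keep the given columns; annotations of merged tuples are added."""
    out: Relation = {}
    for tup, k in rel.items():
        key = tuple(tup[c] for c in cols)
        out[key] = sr.plus(out.get(key, sr.zero), k)
    return out


def two_step(sr: Semiring, edges: Relation) -> Relation:
    """Positive query Q(x, z) :- E(x, y), E(y, z)."""
    return project(sr, join(sr, edges, edges), (0, 2))


# Hypothetical input: four edges, each with an assumed tuple id.
EDGES = {("a", "b"): "e1", ("b", "c"): "e2",
         ("a", "d"): "e3", ("d", "c"): "e4"}

why_edges = {t: frozenset({frozenset({tid})}) for t, tid in EDGES.items()}
count_edges = {t: 1 for t in EDGES}

why_result = two_step(WHY, why_edges)
count_result = two_step(COUNTING, count_edges)
```

In this example the only answer is (a, c). Under the why-provenance semiring it carries the two witnesses {e1, e2} and {e3, e4}, which mirrors the provenance polynomial e1·e2 + e3·e4; under the counting semiring it carries the multiplicity 2. The query uses only join and projection, so it stays within the positive fragment the paragraph refers to.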

Time and Location: Wednesday, 2-4pm, SR 26, INF 329 (First meeting: 17.4.2013)

Required Background: Some prior knowledge of logic, AI, or databases will be helpful but is not required.

 

Literature: Information about the seminar topics and the corresponding literature (conference and journal papers) will be given during the first meeting. All papers will also be distributed through the Moodle website for this seminar.

Credit Points: To receive the 4 ECTS credit points for this seminar, students have to (1) participate in all presentations, (2) give a presentation (about 40 minutes), and (3) prepare a technical report covering the topic they presented in the seminar. More details will be given during the first meeting.

Participants: Students majoring in Computer Science, Mathematics, or Scientific Computing.

Further Information: Prof. Dr. Bertram Ludäscher, ludaesch@ucdavis.edu, INF 348, Room 19.