Andreas Spitz

Phone: +49 (0) 6221 / 54 - 14309
Fax: +49 (0) 6221 / 54 - 14351
Office: INF 205, room 1/333 (first floor)
Email: spitz(at)informatik.uni-heidelberg(dot)de
Office hours: by appointment


News

2018-05-09:
Our paper Efficient Anti-community Detection in Complex Networks was accepted for presentation at the 30th International Conference on Scientific and Statistical Database Management (SSDBM'18), July 9-11, Bolzano-Bozen, Italy.

2018-04-14:
Our paper Heterogeneous Subgraph Features for Information Networks was accepted for presentation at the Workshop on Graph Data Management Experiences & Systems and Network Data Analytics (GRADES-NDA'18) in conjunction with SIGMOD'18, 10 - 15 June, Houston, USA.

2018-02-19:
Our paper Predicting Document Creation Times in News Citation Networks (together with Jannik Strötgen) was accepted for presentation at the Temporal Web Analytics Workshop (TempWeb'18) in conjunction with the Web Conference (WWW'18), 23 - 27 April, Lyon, France.

2018-02-15:
Our paper Exploring Entity-centric Networks in Entangled News Streams was accepted for presentation at the Journalism, Misinformation, and Fact Checking alternate track at the Web Conference (WWW'18), 23 - 27 April, Lyon, France.

2017-12-11:
Our paper Entity-centric Topic Extraction and Exploration: A Network-based Approach was accepted for presentation at the European Conference on Information Retrieval (ECIR'18), March 26 - 29, Grenoble, France.


About

I studied Applied Computer Science with a minor in Computational Linguistics at the Ruprecht-Karls-University Heidelberg and received my Master of Science in December 2014 for the analysis of citation networks between news articles. Currently, I am working as a researcher and PhD student at the Institute of Computer Science in the Database Systems Research group of Prof. Dr. Michael Gertz.

My research is centered on the extraction of implicit (entity) networks as an efficient and versatile representation of large document collections or streams of news articles that supports a variety of subsequent information retrieval tasks. Potential applications include the entity-centric search, exploration, and extractive summarization of news articles or Wikipedia, the detection and visualization of network topics, the exploration of evolving entangled news streams, and entity disambiguation or entity linking.

As a focal point of the DFG-funded EventAE project, I investigate the detection and modeling of event descriptions through implicit network substructures in large-scale document collections. In the SCIDATOS project, which is funded by the Klaus Tschira Foundation, we investigate the application of implicit networks to the domain of medical texts.

Online profiles: [google scholar] [dblp] [acm digital library]


Research Interests

  • Information Retrieval / Information Extraction
  • Network Analysis / Graph Algorithms
  • Data Mining / Text Mining
  • Machine Learning
  • Formal Languages


Short CV

Education

  • 01/2015 - present Researcher and PhD student at the Institute of Computer Science at Heidelberg University
  • 10/2012 - 12/2014 Applied computer science (M.Sc., graduated with distinction) at Heidelberg University, minor in computational linguistics. Title of thesis: Analysis and exploration of centrality and referencing patterns in networks of news articles, Advisor: M. Gertz.
  • 10/2009 - 09/2012 Applied computer science (B.Sc.) at Heidelberg University, minor in computational linguistics. Title of thesis: An evaluation of similarity measures for the projection of bipartite networks, Advisors: G. Reinelt, K. Zweig.

Stipends & Scholarships

  • 04/2016 - 09/2016 Microsoft Azure for Research Award; project title: heiLIGHTS: A Heidelberg approach to Learning Influential GrapH Topology Structures.
  • 10/2013 - 09/2014 Germany Scholarship (Deutschlandstipendium), Heidelberg University
  • 10/2012 - 09/2013 Karl-Steinbuch scholarship for supporting innovative IT- and media-related projects, MFG Baden-Württemberg, Germany; project title: AFFINE: A network analytic approach to identifying characteristics of cinematic milestones

Professional Experience

  • 04/2015 - present Visiting Lecturer, Baden-Württemberg Cooperative State University Mannheim
  • 01/2015 - present Research Assistant, Heidelberg University
  • 10/2010 - 03/2012 Student Assistant, Heidelberg University

Voluntary Activities

  • 03/2015 - 02/2017 Doctoral student representative in the Council for Graduate Studies, Heidelberg University
  • 11/2015 - 10/2016 Representative of the natural sciences in the executive committee of the Heidelberg University PhD student government (Doktorandenkonvent)
  • 03/2015 - 10/2015 Member of the commission for founding the Heidelberg University PhD student government

Teaching

  • Lectures (lecturer):
    Operating Systems [summer 2015, 2016]
  • Lectures (assistant):
    Complex Network Analysis [winter 2016/2017]
    Knowledge Discovery in Databases [winter 2015/2016]
    Introduction to Databases [summer 2015, 2016, 2017]
    Introduction to Software Engineering [winter 2011/2012]
    Algorithms and Data Structures [summer 2011]
    Introduction to Applied Computer Science [winter 2010/2011]
  • Seminars (co-supervisor):
    Fake News, Fact Checking, and Filter Bubbles [summer 2018]
    Embeddings for Data Analysis [summer 2018]
    Information Networks [summer 2017]
    Similarity Search [summer 2016]
    Event Detection [summer 2015]
  • Software practicals (supervisor / co-supervisor):
    various topics [summer 2015 - present]
  • Bachelor- / Master-Theses (supervisor / co-supervisor):
    various topics [summer 2015 - present]



Publications

2018

  • Sebastian Lackner, Andreas Spitz, Matthias Weidemüller, and Michael Gertz.
    Efficient Anti-community Detection in Complex Networks.
    In: Proceedings of the 30th International Conference on Scientific and Statistical Database Management (SSDBM'18), Bozen-Bolzano, Italy, July 9-11. 2018
    [pdf] [acm] [bibtex] [data] [code] [slides]
  • Andreas Spitz, Diego Costa, Kai Chen, Jan Greulich, Johanna Geiß, Stefan Wiesberg, and Michael Gertz.
    Heterogeneous Subgraph Features for Information Networks.
    In: Proceedings of the First International ACM SIGMOD Workshop on Graph Data Management Experiences & Systems and Network Data Analytics (GRADES-NDA '18), Houston, TX, USA, June 10 - 15. 2018, 7:1–7:9
    [pdf] [acm] [bibtex] [code] [slides] [poster]
  • Andreas Spitz and Michael Gertz.
    Exploring Entity-centric Networks in Entangled News Streams.
    In: Proceedings of the 27th International Conference on World Wide Web (WWW '18) Companion, Lyon, France, April 23-27. 2018, 555–563
    [pdf] [acm] [bibtex] [data] [code] [slides]
  • Andreas Spitz, Jannik Strötgen, and Michael Gertz.
    Predicting Document Creation Times in News Citation Networks.
    In: Proceedings of the 27th International Conference on World Wide Web (WWW '18) Companion, Lyon, France, April 23-27. 2018, 1731–1736
    [pdf] [acm] [bibtex] [data/code] [slides]
  • Andreas Spitz and Michael Gertz.
    Entity-centric Topic Extraction and Exploration: A Network-based Approach.
    In: Proceedings of the 40th European Conference on IR Research (ECIR '18), Grenoble, France, March 26-29. 2018, 3–15
    [pdf] [springer] [bibtex] [data] [code] [slides]
  • Erich Schubert, Andreas Spitz, and Michael Gertz.
    Exploring Significant Interactions in Live News.
    In: Proceedings of the Second International Workshop on Recent Trends in News Information Retrieval (NewsIR '18), Grenoble, France, March 26. 2018, 39–44
    [pdf] [ceur] [bibtex] [slides] [poster]

2017

  • Andreas Spitz, Gloria Feher, and Michael Gertz.
    Extracting Descriptions of Location Relations from Implicit Textual Networks.
    In: Proceedings of the 11th Workshop on Geographic Information Retrieval (GIR '17), Heidelberg, Germany, November 30 - December 1. 2017, 1:1–1:9
    [pdf] [acm] [bibtex] [data] [slides]
  • Ludwig Richter, Johanna Geiß, Andreas Spitz, and Michael Gertz.
    HeidelPlace: An Extensible Framework for Geoparsing.
    In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP '17), Copenhagen, Denmark, September 7-11. 2017, 85–90
    [pdf] [acl] [bibtex] [code] [poster]
  • Johanna Geiß, Andreas Spitz, and Michael Gertz.
    NECKAr: A Named Entity Classifier for Wikidata.
    In: Proceedings of the International Conference of the German Society for Computational Linguistics and Language Technology (GSCL '17), Berlin, Germany, September 13-14. 2017, 115–129
    [pdf] [springer] [bibtex] [code] [slides]
  • Erich Schubert, Andreas Spitz, Michael Weiler, Johanna Geiß, and Michael Gertz.
    Semantic Word Clouds with Background Corpus Normalization and t-distributed Stochastic Neighbor Embedding.
    In: CoRR abs/1708.03569. 2017
    [pdf] [arXiv] [bibtex]
  • Andreas Spitz, Satya Almasian, and Michael Gertz.
    EVELIN: Exploration of Event and Entity Links in Implicit Networks.
    In: Proceedings of the 26th International Conference on World Wide Web (WWW '17) Companion, Perth, Australia, April 3-7. 2017, 273–277
    [pdf] [acm] [bibtex] [poster] [demo]

2016

  • Andreas Spitz, Johanna Geiß, Michael Gertz, Stefan Hagedorn, and Kai-Uwe Sattler.
    Refining Imprecise Spatio-temporal Events: A Network-based Approach.
    In: Proceedings of the 10th Workshop on Geographic Information Retrieval (GIR '16), Burlingame, California, USA, October 31. 2016, 5:1–5:10
    [pdf] [acm] [bibtex] [data] [slides]
  • Andreas Spitz and Michael Gertz.
    Terms over LOAD: Leveraging Named Entities for Cross-Document Extraction and Summarization of Events.
    In: Proceedings of the 39th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '16), Pisa, Italy, July 17-21. 2016, 503–512
    [pdf] [acm] [bibtex] [data] [code] [slides]
  • Andreas Spitz, Johanna Geiß, and Michael Gertz.
    So Far Away and Yet so Close: Augmenting Toponym Disambiguation and Similarity with Text-based Networks.
    In: Proceedings of the Third International ACM SIGMOD Workshop on Managing and Mining Enriched Geo-Spatial Data (GeoRich '16), San Francisco, California, USA, June 26 - July 1. 2016, 2:1–2:6
    [pdf] [acm] [bibtex] [data] [slides]
  • Andreas Spitz, Vaibhav Dixit, Ludwig Richter, Michael Gertz, and Johanna Geiß.
    State of the Union: A Data Consumer's Perspective on Wikidata and Its Properties for the Classification and Resolution of Entities.
    In: Wiki, Papers from the 2016 ICWSM Workshop, Cologne, Germany, May 17. 2016
    [pdf] [aaai] [bibtex] [poster]
  • Andreas Spitz, Anna Gimmler, Thorsten Stoeck, Katharina Anna Zweig, and Emőke-Ágnes Horvát.
    Assessing Low-Intensity Relationships in Complex Networks.
    In: PLoS ONE 11 (4). 2016, e0152536
    [pdf] [plos] [bibtex]

2015

  • Johanna Geiß, Andreas Spitz, Jannik Strötgen, and Michael Gertz.
    The Wikipedia Location Network: Overcoming Borders and Oceans.
    In: Proceedings of the 9th Workshop on Geographic Information Retrieval (GIR '15), Paris, France, November 26-27. 2015, 2:1–2:3
    [pdf] [acm] [bibtex] [data] [slides]
  • Andreas Spitz and Michael Gertz.
    Breaking the News: Extracting the Sparse Citation Network Backbone of Online News Articles.
    In: Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM '15), Paris, France, August 25 - 28. 2015, 274–279
    [pdf] [acm] [bibtex] [data] [slides]
  • Christian Brugger, André Lucas Chinazzo, Alexandre Flores John, Christian de Schryver, Norbert Wehn, Andreas Spitz, and Katharina Anna Zweig.
    Exploiting Phase Transitions for the Efficient Sampling of the Fixed Degree Sequence Model.
    In: Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM '15), Paris, France, August 25 - 28. 2015, 308–313
    [pdf] [acm] [bibtex]
  • Johanna Geiß, Andreas Spitz, and Michael Gertz.
    Beyond Friendships and Followers: The Wikipedia Social Network.
    In: Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM '15), Paris, France, August 25 - 28. 2015, 472–479
    [pdf] [acm] [bibtex] [data]
  • Andreas Spitz, Jannik Strötgen, Thomas Bögel, and Michael Gertz.
    Terms in Time and Times in Context: A Graph-based Term-Time Ranking Model.
    In: Proceedings of the 24th International Conference on World Wide Web (WWW '15) Companion, Florence, Italy, May 18-22. 2015, 1375–1380
    [pdf] [acm] [bibtex] [data] [slides]

2014

  • Andreas Spitz and Emőke-Ágnes Horvát.
    Measuring Long-term Impact Based on Network Centrality: Unraveling Cinematic Citations.
    In: PLoS ONE 9 (10). 2014, e108857
    [pdf] [plos] [bibtex]
  • Andreas Spitz and Emőke-Ágnes Horvát.
    A Cookbook of Cinematic Delicacies That Do Not Expire.
    In: Leonardo 47 (3). 2014, 271
    [pdf] [mit press] [bibtex]

2013

  • Andreas Spitz, Katharina Anna Zweig, and Emőke-Ágnes Horvát.
    SICOP: Identifying Significant Co-interaction Patterns.
    In: Bioinformatics 29 (19). 2013, 2503–2504
    [pdf] [oxford] [bibtex] [code]



Posters

  • A. Spitz, D. Costa, K. Chen, J. Greulich, J. Geiß, S. Wiesberg and M. Gertz
    Heterogeneous Subgraph Features for Information Networks.
    1st GRADES-NDA Workshop at SIGMOD'18, Houston, TX, USA, 2018. [pdf]
  • E. Schubert, A. Spitz and M. Gertz
    Exploring Significant Interactions in Live News.
    NewsIR'18 at ECIR'18, Grenoble, France, 2018. [pdf]
  • L. Richter, J. Geiß, A. Spitz and M. Gertz
    HeidelPlace: An Extensible Framework for Geoparsing.
    EMNLP'17, Copenhagen, Denmark, 2017. [pdf]
  • A. Spitz, S. Almasian and M. Gertz
    EVELIN: Exploration of Event and Entity Links in Implicit Networks.
    WWW'17, Perth, Australia, 2017. [pdf] [demo]
  • A. Spitz, V. Dixit, L. Richter, M. Gertz and J. Geiß
    State of the Union: A Data Consumer's Perspective on Wikidata and Its Properties for the Classification and Resolution of Entities.
    2nd Wiki Workshop at ICWSM'16, Cologne, Germany, 2016. [pdf]
  • A. Spitz
    Identifying Events from Co-Occurrences and Context across Large Document Collections.
    3rd Heidelberg Laureate Forum, Heidelberg, Germany, 2015. [pdf]
  • A. Spitz and E.-Á. Horvát
    Cinematic Delicacies that do not Expire: Long-term Impact of Films Based on Network Centrality.
    6th European Summer University in Digital Humanities Culture & Technology, Leipzig, Germany, 2015. [pdf]


Talks

  • Heterogeneous Subgraph Features for Information Networks
    First International ACM SIGMOD Workshop on Graph Data Management Experiences & Systems and Network Data Analytics (GRADES-NDA '18), Houston, Texas, June 10, 2018. [slides]
  • Exploring Entity-centric Networks in Entangled News Streams
    27th International Conference on World Wide Web (WWW'18), Lyon, France, April 23-27, 2018. [slides]
  • Predicting Document Creation Times in News Citation Networks
    8th Temporal Web Analytics Workshop (TempWeb'18), Lyon, France, April 23, 2018. [slides]
  • Entity-centric Topic Extraction and Exploration: A Network-based Approach
    40th European Conference on IR Research (ECIR '18), Grenoble, France, March 26-29, 2018. [slides]
  • Exploring Significant Interactions in Live News
    Second International Workshop on Recent Trends in News Information Retrieval (NewsIR '18), Grenoble, France, March 26, 2018. [slides]
  • Extracting Descriptions of Location Relations from Implicit Textual Networks
    11th Workshop on Geographic Information Retrieval (GIR'17), Heidelberg, Germany, November 30 - December 1, 2017. [slides]
  • NECKAr: A Named Entity Classifier for Wikidata
    International Conference of the German Society for Computational Linguistics and Language Technology (GSCL'17), Berlin, Germany, September 13-14, 2017. [slides]
  • Applications of Latent Entity Networks in Information Retrieval
    Workshop Internationale Klima- und Energiediskurse, Darmstadt, Germany. May 26, 2017. [slides]
    (invited talk, Prof. Marcus Müller)
  • Refining Imprecise Spatio-temporal Events: A Network-based Approach
    10th Workshop on Geographic Information Retrieval (GIR'16) at SIGSPATIAL'16, San Francisco, USA. October 31, 2016. [slides]
  • Extraction and Applications of Implicit Networks from Unstructured Text
    Max Planck Institute for Informatics, Saarbrücken, Germany. September 14, 2016. [slides]
    (invited talk, Dr. Jannik Strötgen)
  • Terms over LOAD: Leveraging Named Entities for Cross-Document Extraction and Summarization of Events
    39th International Conference on Research and Development in Information Retrieval (SIGIR '16), Pisa, Italy. July 20, 2016. [slides]
  • So Far Away and Yet so Close: Augmenting Toponym Disambiguation and Similarity with Text-Based Networks
    3rd International ACM SIGMOD Workshop on Managing and Mining Enriched Geo-Spatial Data (GeoRich '16), San Francisco, USA. June 26, 2016. [slides]
  • The Wikipedia Location Network - Overcoming Borders and Oceans
    9th Workshop on Geographic Information Retrieval (GIR '15), Paris, France. November 26, 2015. [slides]
  • Networks in Information Extraction
    Heidelberg Institute for Theoretical Studies, Heidelberg, Germany. September 9, 2015.
    (invited talk, Prof. Michael Strube)
  • A Book in a Minute: Identifying Times, Events, and Context Across Large Document Collections
    6th European Summer University in Digital Humanities Culture & Technology, Leipzig, Germany. August 6, 2015.
  • Breaking the News: Extracting the Sparse Citation Network Backbone of Online News Articles
    Statistical Natural Language Processing Colloquium, Heidelberg, Germany. June 26, 2015. [slides]
  • Terms in Time and Times in Context: A Graph-based Term-Time Ranking Model
    5th Temporal Web Analytics Workshop (TempWeb '15), Florence, Italy. May 18, 2015. [slides]
  • From Text to Context: Extracting Date-Term-Networks from Wikipedia
    DHBW Mannheim, IMBIT. Mannheim, Germany. March 11, 2015.
    (invited talk, Prof. Peter Mayr)
  • AFFINE: Analyzing the Features of the Film Reference Network
    MFG Foundation, Karl-Steinbuch Stipend Certificate Ceremony. Stuttgart, Germany. November 19, 2013.
    (invited talk, Dr. Christian Förster)