Dennis Aumiller

Phone: +49 (0) 6221 / 54 - 14353
Fax: +49 (0) 6221 / 54 - 14351
Office: INF 205, room 1/312 (first floor)
Email: aumiller(at)informatik.uni-heidelberg(dot)de
Office hours: By appointment


2021-07-01: I will be joining Amazon Research Berlin as an Applied Scientist Intern later this year.

2021-04-15: Our paper, Structural Text Segmentation of Legal Documents, has been accepted at the 18th International Conference on Artificial Intelligence and Law. This work builds on the previously announced pre-print.

2020-12-07: A pre-print of our work, Topical Change Detection in Documents via Embeddings of Long Sequences, is now available on arXiv.

2020-11-09: Our shared task submission, uniHD@CL-SciSumm 2020: Citation Extraction as Search, was among the best-performing submissions and has been selected for oral presentation at the 1st Workshop on Scholarly Document Processing (SDP@EMNLP 2020).

2020-08-08: Our paper, TiCCo: Time-Centric Content Exploration, has been accepted as a demonstration paper at the 29th ACM International Conference on Information and Knowledge Management (CIKM 2020).

2020-05-08: Our paper, A Versatile Hypergraph Model for Document Collections, has been accepted as a long paper at the 32nd International Conference on Scientific and Statistical Database Management (SSDBM 2020).

2020-02-28: Our paper, Time-Centric Exploration of Court Documents, has been accepted for presentation at the International Workshop on Narrative Extraction from Texts held in conjunction with the 42nd European Conference on Information Retrieval (Text2Story@ECIR'20).

2019-10-15: I will be presenting my joint work with Artem Sokolov, Comments on the Error-Bound of Behavioral Cloning & the Performance of a Non-Aggregated DAgger Algorithm, at the Amazon Research Day 2019 in Berlin.

2019-09-04: Our paper, DNA accessibility of chromatosomes quantified by automated image analysis of AFM data, has been published in Scientific Reports.


I studied Applied Computer Science with a minor in Computational Linguistics at Heidelberg University and finished my Master of Science in May 2019. For my thesis project, I investigated hypergraphs as a structure for more efficient representations of document collections. Since June 2019, I have been working as a researcher and PhD student at the Institute of Computer Science in the Database Systems Research group of Prof. Dr. Michael Gertz.

My main interest is the automated processing of large document collections. Specifically, I investigate suitable models for interactive text exploration and knowledge representation. We previously experimented with visualization interfaces that allow for denser information content, and with restructuring original document sections in a more coherent manner. As a related sub-task, this involves learning efficient paragraph similarities across documents, as well as producing topically targeted summaries or keyphrases for individual document sections.

If you are a Heidelberg University student and interested in NLP or IR with a focus on summarization and text representation, feel free to message me about potential practicals or theses.

Reviewing Activities


Teaching

  • Lecture Assistant for graduate lecture "Text Analytics" (Winter 2020)
  • Lecture Assistant for "Databases 1" (Summer 2019, Summer 2020, Summer 2021)
  • Head Teaching Assistant for graduate course "Complex Network Analysis" (Winter 2018)
  • Teaching Assistant for "Databases 1" (Summer 2016, Summer 2017)
  • Head Teaching Assistant for graduate lecture "Computer Graphics" (Winter 2016, Prof. Dr. Filip Sadlo)
  • Student Practical supervision (undergraduate/graduate semester research projects, since Summer 2019)

Supervised Master Theses:

  • Fabio Becker: "A Generative Model for Dynamic Networks with Community Structures" (co-supervised, Winter 2020)

Supervised Undergraduate Theses:

  • Jan-Gabriel Mylius: "Visual Analysis of Paragraph Similarity" (co-supervised, Winter 2020)
  • Stefan Hickl: "Automatisierte Generierung von Inhaltsverzeichnissen aus PDF-Dokumenten" (co-supervised, Summer 2020)

Research Interests

  • (Multi-)Document Summarization
  • Keyphrase Extraction
  • Text Exploration
  • (Temporal) Information Retrieval
  • Machine Learning / Natural Language Processing

Publications

  • Dennis Aumiller, Satya Almasian, Sebastian Lackner, and Michael Gertz.
    Structural Text Segmentation of Legal Documents.
    In: Eighteenth International Conference on Artificial Intelligence and Law (ICAIL'21), June 21–25, 2021, São Paulo, Brazil. 2021
    [pdf] [code] [DOI:10.1145/3462757.3466085]
  • Philip Hausner, Dennis Aumiller, and Michael Gertz.
    TiCCo: Time-Centric Content Exploration.
    In: Mathieu d'Aquin, Stefan Dietze, Claudia Hauff, Edward Curry, and Philippe Cudré-Mauroux (eds.), CIKM '20: The 29th ACM International Conference on Information and Knowledge Management, Virtual Event, Ireland, October 19-23, 2020. 2020, 3413–3416
    [online] [acm] [demo] [code] [bibtex]
  • Philip Hausner, Dennis Aumiller, and Michael Gertz.
    Time-centric Exploration of Court Documents.
    In: Ricardo Campos, Alípio Mário Jorge, Adam Jatowt, and Sumit Bhatia (eds.), Proceedings of Text2Story - Third Workshop on Narrative Extraction From Texts co-located with 42nd European Conference on Information Retrieval, Text2Story@ECIR 2020, Lisbon, Portugal, April 14th, 2020 [online only] 2593. 2020, 31–37
    [online] [bibtex]
  • Dennis Aumiller, Satya Almasian, Philip Hausner, and Michael Gertz.
    UniHD@CL-SciSumm 2020: Citation Extraction as Search.
    In: Muthu Kumar Chandrasekaran, Anita de Waard, Guy Feigenblat, Dayne Freitag, Tirthankar Ghosal, Eduard H. Hovy, Petr Knoth, David Konopnicki, Philipp Mayr, Robert M. Patton, and Michal Shmueli-Scheuer (eds.), Proceedings of the First Workshop on Scholarly Document Processing, SDP@EMNLP 2020, Online, November 19, 2020. 2020, 261–269
    [online] [aclweb] [bibtex]
  • Andreas Spitz, Dennis Aumiller, Bálint Soproni, and Michael Gertz.
    A Versatile Hypergraph Model for Document Collections.
    In: Proceedings of the 32nd International Conference on Scientific and Statistical Database Management (SSDBM '20), Vienna, Austria, July 7-9. 2020
    [pdf] [acm]
  • Martin Würtz, Dennis Aumiller, Lina Gundelwein, Philipp Jung, Christian Schütz, Kathrin Lehmann, Katalin Tóth, and Karl Rohr.
    DNA accessibility of chromatosomes quantified by automated image analysis of AFM data.
    In: Scientific Reports 9 (1). 2019
    [online] [code] [DOI:10.1038/s41598-019-49163-4]