Dr. Johanna Geiß

Phone: +49 (0) 6221 / 54 - 14309
Fax: +49 (0) 6221 / 54 - 5684
Office: INF 205, room 01.333
Email: johanna.geiss(at)informatik.uni-heidelberg(dot)de
Office hours: Monday 10:30am-11:30am (by prior appointment via e-mail only).

News

  • 30.11.2018: Good Bye!
  • 30.08.2017: I am on parental leave and will be back in October 2018! See you then.

  • 03.07.2017: Our demo paper "HeidelPlace: An Extensible Framework for Geoparsing" was accepted at the Conference on Empirical Methods in Natural Language Processing (EMNLP'17), September 7-11, Copenhagen, Denmark. The Java framework is available for download.
  • 08.05.2017: Our EventAE Event Repository is now accessible through our Event Repository API. You can search for specific event data and export it. The events can be filtered by keyword, time range, and location, or by any combination of these filters.
  • 15.04.2017: The new base datasets for persons, locations, and organizations are available for download. They were extracted from the Wikidata dump from 20 March 2017. They can be downloaded from the dataset page or the NECKAr page.
  • 16.11.2016: We launched our new project website for EventAE, a DFG project about Event Exploration of Linked Open Data.
  • 14.11.2016: The grades for the Python exam are now in Moodle. To say this much up front: everyone passed! Congratulations!
  • 26.09.2016: Our paper Refining Imprecise Spatio-temporal Events: A Network-based Approach was accepted for the 10th Workshop on Geographic Information Retrieval (GIR'16) at ACM SIGSPATIAL, San Francisco, USA.

About me

Currently, I am working as a postdoctoral researcher at the Institute of Computer Science in the Database Systems Research Group of Prof. Dr. Michael Gertz. Since January 2016, I have been involved in the research project SCIDATOS (Scientific Computing for Improved Detection and Therapy of Sepsis), together with the Center for Scientific Computing (IWR) Heidelberg and the University Medical Center Mannheim (UMM). The project is funded by the Klaus Tschira Foundation.

I studied Computational Linguistics, Near Eastern Archaeology and Jewish Studies at the Ruprecht-Karls-University Heidelberg and received my Magister Artium in January 2006. In 2011 I was awarded a PhD by the University of Cambridge. My dissertation is entitled “Latent Semantic Sentence Clustering for Multi-Document Summarization” and is available as a Technical Report from the Computer Laboratory, Cambridge.

My last name is correctly written with a "ß" (sharp s or eszett). In German, ß is used only after long vowels and diphthongs, while ss is written after short vowels. The HTML entity for ß is &szlig;. In TeX and LaTeX, \ss produces ß.
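For anyone who needs to handle the name in software, here is a minimal Python sketch of the conversions mentioned above. The helper names (to_html_entity, to_latex, match_key) are made up for this illustration; only the standard-library html module and str.casefold() are real APIs. The last helper is an assumption about how one might match the "Geiß" and "Geiss" spellings, not a prescribed method.

```python
import html
from html.entities import name2codepoint

SURNAME = "Geiß"


def to_html_entity(text: str) -> str:
    """Replace every ß with its named HTML entity."""
    return text.replace("ß", "&szlig;")


def to_latex(text: str) -> str:
    r"""Replace every ß with the LaTeX macro \ss.

    The macro is wrapped in braces so that letters following it
    are not read as part of the macro name.
    """
    return text.replace("ß", r"{\ss}")


def match_key(text: str) -> str:
    """Return a key for comparing spellings; casefold() maps ß to ss."""
    return text.casefold()


if __name__ == "__main__":
    print(to_html_entity(SURNAME))
    print(to_latex(SURNAME))
    # Decoding the entity gives back the original character.
    print(html.unescape(to_html_entity(SURNAME)) == SURNAME)
    # The named entity refers to code point U+00DF.
    print(hex(name2codepoint["szlig"]))
    # Both spellings of the name map to the same comparison key.
    print(match_key("Geiß") == match_key("Geiss"))
```

Comparing on the casefolded key is only meant for matching records; for display, the spelling with ß should be kept.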

Teaching

SoSe 2017

  • Information Networks (Seminar)
  • Software Practical: Data and Text Mining (for advanced students)

WiSe 2016/2017

  • Python Course
  • Software Practical: Data and Text Mining (for advanced students)

SoSe 2016

  • Similarity Search (Seminar), Th, 2:00-4:00pm
  • Software Practical: Data and Text Mining (for advanced students)

WiSe 2015/2016

  • Python Course
  • Information System Engineering Project [LSF-Info]
  • Software Practical: Data and Text Mining (for advanced students)

SoSe 2015

  • Event Detection (Seminar)
  • Software Practical: Data and Text Mining (for advanced students)

Research Projects

  • SCIDATOS (Klaus Tschira Stiftung): Scientific Computing for Improved Detection and Therapy of Sepsis, together with the Center for Scientific Computing (IWR) Heidelberg and the University Medical Center Mannheim (UMM). The goal of the project is the reliable diagnosis of sepsis and its timely therapy in critically ill patients.
  • EventAE: Event-based Exploration of Linked Open Data (DFG)

Research Interests

  • Event Detection
  • Information Extraction (IE)
  • Information Retrieval (IR)
  • Natural Language Processing

Short Curriculum Vitae

  • Since 12/2014 Postdoc at the Institute of Computer Science at Heidelberg University
  • 08/2013 - 08/2014 Parental Leave
  • 05/2011 - 09/2014 Computational Linguist at Lingenio GmbH, Heidelberg, Germany
  • 09/2011 - 02/2012 Visiting Marie Curie Fellow, University of Leeds, UK
  • 01/2006 - 03/2007 Software Developer at SAP AG, Walldorf, Germany

Education

  • 04/2007 - 10/2011 PhD Student at University of Cambridge, UK
  • 09/1998 - 01/2006 M.A. at Heidelberg University (Computational Linguistics, Near Eastern Archaeology, Jewish Studies)

Awards & Scholarships

  • 2010 Lundgren Research Award
  • 2009 ACL-IJCNLP 2009 Student Travel Grant
  • 2009 Travel Award, St Edmund’s College, Cambridge
  • 2007 - 2009 Cambridge European Trust Bursary
  • 2007 - 2009 Charter Studentship of St Edmund’s College, Cambridge
  • 2007 - 2009 Engineering and Physical Sciences Research Council (EPSRC) Doctoral Training Account fees-only Award
  • 2007 - 2009 Departmental Award, Computer Laboratory, University of Cambridge

Publications

2018

  • Andreas Spitz, Diego Costa, Kai Chen, Jan Greulich, Johanna Geiß, Stefan Wiesberg, and Michael Gertz.
    Heterogeneous Subgraph Features for Information Networks.
    In: Proceedings of the First International ACM SIGMOD Workshop on Graph Data Management Experiences & Systems and Network Data Analytics (GRADES-NDA '18), Houston, TX, USA, June 10 - 15. 2018
    [pdf] [acm] [code]

2017

  • Johanna Geiß, Andreas Spitz, and Michael Gertz.
    NECKAr: A Named Entity Classifier for Wikidata.
    In: Proceedings of the International Conference of the German Society for Computational Linguistics and Language Technology (GSCL '17), Berlin, Germany, September 13-14. 2017
    [pdf] [code] [slides]
  • Ludwig Richter, Johanna Geiß, Andreas Spitz, and Michael Gertz.
    HeidelPlace: An Extensible Framework for Geoparsing.
    In: Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (EMNLP '17), Copenhagen, Denmark, September 7-11. 2017
    [pdf] [acl] [bibtex] [code] [poster]
  • Erich Schubert, Andreas Spitz, Michael Weiler, Johanna Geiß, and Michael Gertz.
    Semantic Word Clouds with Background Corpus Normalization and t-distributed Stochastic Neighbor Embedding.
    In: CoRR abs/1708.03569. 2017
    [pdf] [arXiv] [bibtex]

2016

  • Andreas Spitz, Johanna Geiß, Michael Gertz, Stefan Hagedorn, and Kai-Uwe Sattler.
    Refining Imprecise Spatio-temporal Events: A Network-based Approach.
    In: Christopher B. Jones, and Ross Purves (eds.), Proceedings of the 10th Workshop on Geographic Information Retrieval, GIR 2016, Burlingame, California, USA, October 31, 2016. 2016, 5:1–5:10
    [pdf] [acm] [bibtex] [data] [slides]
  • Andreas Spitz, Vaibhav Dixit, Ludwig Richter, Michael Gertz, and Johanna Geiss.
    State of the Union: A Data Consumer's Perspective on Wikidata and Its Properties for the Classification and Resolution of Entities.
    In: Robert West, Leila Zia, Dario Taraborelli, and Jure Leskovec (eds.), Wiki, Papers from the 2016 ICWSM Workshop, Cologne, Germany, May 17, 2016 WS-16-17. 2016
    [pdf] [aaai] [bibtex] [poster]
  • Andreas Spitz, Johanna Geiß, and Michael Gertz.
    So Far Away and Yet so Close: Augmenting Toponym Disambiguation and Similarity with Text-based Networks.
    In: Andreas Züfle, Benjamin Adams, and Dingming Wu (eds.), Proceedings of the Third International ACM SIGMOD Workshop on Managing and Mining Enriched Geo-Spatial Data, GeoRich@SIGMOD 2016, San Francisco, California, USA, June 26 - July 1, 2016. 2016, 2:1–2:6
    [pdf] [acm] [bibtex] [data] [slides]
  • Johanna Geiß, and Michael Gertz.
    With a Little Help from my Neighbors: Person Name Linking Using the Wikipedia Social Network.
    In: Jacqueline Bourdeau, Jim Hendler, Roger Nkambou, Ian Horrocks, and Ben Y. Zhao (eds.), Proceedings of the 25th International Conference on World Wide Web, WWW 2016, Montreal, Canada, April 11-15, 2016, Companion Volume. 2016, 985–990
    [pdf] [acm] [bibtex] [data] [poster] [slides]

2015

  • Johanna Geiß, Andreas Spitz, Jannik Strötgen, and Michael Gertz.
    The Wikipedia Location Network: Overcoming Borders and Oceans.
    In: Ross S. Purves, and Christopher B. Jones (eds.), Proceedings of the 9th Workshop on Geographic Information Retrieval, GIR 2015, Paris, France, November 26-27, 2015. 2015, 2:1–2:3
    [pdf] [acm] [bibtex] [data] [slides]
  • Johanna Geiß, Andreas Spitz, and Michael Gertz.
    Beyond Friendships and Followers: The Wikipedia Social Network.
    In: Jian Pei, Fabrizio Silvestri, and Jie Tang (eds.), Proceedings of the 2015 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining, ASONAM 2015, Paris, France, August 25 - 28, 2015. 2015, 472–479
    [pdf] [acm] [bibtex] [data] [slides]

2012

  • Kurt Eberle, Johanna Geiß, Mireia Ginesti-Rosell, Bogdan Babych, Anthony Hartley, Reinhard Rapp, Serge Sharoff, and Martin Thomas.
    Design of a hybrid high quality machine translation system.
    In: Proceedings of the Joint Workshop on Exploiting Synergies between Information Retrieval and Machine Translation (ESIRMT) and Hybrid Approaches to Machine Translation (HyTra) at EACL-2012, Avignon, France, April 23, 2012. 2012, 101–112
    [pdf] [acm]

2011

  • Johanna Geiss.
    Latent semantic sentence clustering for multi-document summarization.
    PhD thesis, University of Cambridge, UK. 2011
    [pdf] [acm] [bibtex] [online]

2009

  • Johanna Geiss.
    Creating a Gold Standard for Sentence Clustering in Multi-Document Summarization.
    In: ACL 2009, Proceedings of the 47th Annual Meeting of the Association for Computational Linguistics and the 4th International Joint Conference on Natural Language Processing of the AFNLP, 2-7 August 2009, Singapore, Student Research Workshop. 2009, 96–104
    [pdf] [acm] [bibtex]

2008

  • Johanna Geiß.
    Latent Semantic Indexing and Information Retrieval - a quest with BosSE.
    Vdm Verlag Dr. Müller. 2008, ISBN: 978-3639003949

2006

  • Johanna Geiß.
    Latent Semantic Indexing and Information Retrieval - a quest with BosSE.
    Master's thesis, Heidelberg University. 2006
    [online]

Theses

Johanna Geiß
Latent semantic sentence clustering for multi-document summarization
PhD thesis (Technical Report 802), Computer Laboratory, University of Cambridge, 2011.

[pdf]

Johanna Geiß
Latent Semantic Indexing and Information Retrieval: A Quest with BosSE
Master's thesis, Vdm Verlag Dr. Müller, Saarbrücken 2008, ISBN: 978-3639003949.

[pdf]

Posters

Kurt Eberle, Johanna Geiß, Mireia Ginestí-Rosell, Bogdan Babych, Martin Thomas, Serge Sharoff, Anthony Hartley and Reinhard Rapp

HyghTra – A Hybrid High Quality Translation System, 10th International Workshop on Treebanks and Linguistic Theories (TLT10), Heidelberg, Germany, 2012

Talks

  • Working with Electronic Health Records 
    Statistical Natural Language Processing Colloquium, Heidelberg, Germany, 9 November 2016.
  • Beyond Friendships and Followers: The Wikipedia Social Network
    International Conference on Advances in Social Networks Analysis and Mining, Paris, 25 August 2015
  • The Wikipedia Social Network
    Statistical Natural Language Processing Colloquium, Heidelberg, Germany, 17 July 2015.
  • Sentence Clustering for Multi-Document Summarization
    Computer Science Talk, King’s College, Cambridge, 2009
  • Natürlichsprachige Suchanfragen über strukturierte Daten
    TaCoS Saarbrücken, 2006
  • Multilinguales Information Retrieval mit Latent Semantic Analysis
    TaCoS Saarbrücken, 2006
  • Tutorial zu Latent Semantic Indexing
    TaCoS Giessen, 2003

Professional Activities

Member of the Equal Opportunities Commission of the Faculty of Mathematics and Computer Science (since 2016)

Member of the admissions examination committee for the Bachelor's degree programs in Computer Science (winter semester 2016/17)

Program committee member

Joint Workshop on Exploiting Synergies between Information Retrieval and Machine Translation (ESIRMT) and Hybrid Approaches to Machine Translation (HyTra) 2012

External reviewer

Computational Intelligence 2011

Computer Speech and Language 2010 & 2014

International Journal of Geographical Information Science 2017