Jan Philip Wahle

I am a PhD candidate in computer science and natural language processing at the University of Göttingen in Germany, advised by Prof. Gipp and Dr. Ruas. I received my master's degree in computer science from the University of Wuppertal and worked for the automotive company Aptiv PLC before continuing with my PhD studies. I have been a visiting researcher at the National Research Council (NRC) Canada. My research has been presented at ACL, EMNLP, EACL, and COLING.

What I'm up to

I'm fascinated by human intelligence, particularly the role of language in cognitive abilities. Human language is not only indispensable for disseminating information; it also opens up outer worlds (social communication with others) and inner worlds (reflection and thought). Language lets us send thoughts back and forth in time, create hypothetical scenarios, and generate new ideas. Because I believe language is key to human intelligence, I have dedicated my professional life to understanding it and to making computers learn about it to build more intelligent systems. Lately, I have come to realize that these intelligent systems could also harm humanity, so I'm also working on making sure they don't.


Here are updates about recent activities, such as talks, awards, travels, and workshops.

Jul 2024: I will give a talk to the Volkswagen Foundation about AI tools in the funding process.

Apr 2024: I will give a talk about NLP innovations for businesses to the company Eschbach.

Feb 2024: I will give a talk at the University of Groningen to the Jantina Tammes School of Digital Society, Technology and AI organized by Tommaso Caselli about "Insights, Findings, and Recommendations for Paraphrasing and Plagiarism in the Age of LLMs."

Dec 2023: I will attend EMNLP in Singapore.

Nov 2023: I will give a talk at LMU Munich to Barbara Plank's MaiNLP group about our EMNLP paper "We are Who We Cite: Bridges of Influence Between Natural Language Processing and Other Academic Fields."

Jul 2023: I will attend ACL in Toronto.

Apr 2023: I was awarded a six-month scholarship by the DAAD to visit the National Research Council Canada and work with Saif M. Mohammad.

Mar 2023: I am visiting the group of Benjamin Roth in Vienna.

Mar 2023: I will give a talk at the Open Science Workshop organized by Birgit Schmidt about "AI in the Scientific Writing Process."

Dec 2022: I will attend EMNLP in Abu Dhabi.


I love conducting research, especially on innovations that enable humans to do things they were not able to do before. I also care deeply about the future of humanity and human safety, because I love us and our little planet. Below are a few projects that I've worked on (with the help of other great human beings). You can also find more research on Google Scholar.

CiteAssist: A System for Automated Preprint Citation and BibTeX Generation
SDProc @ ACL 2024
Lars Kaesberg, Terry Ruas, Jan Philip Wahle, Bela Gipp
[pdf] [bibtex] [code] [demo]
MAGPIE: Multi-Task Media-Bias Analysis of Generalization of Pre-Trained Identification of Expressions
Tomáš Horych, Martin Wessel, Jan Philip Wahle, Terry Ruas, Jerome Waßmuth, André Greiner-Petter, Akiko Aizawa, Bela Gipp, Timo Spinde
[pdf] [bibtex] [code]
Text-Guided Image Clustering
EACL 2024 (Oral)
Andreas Stephan, Lukas Miklautz, Kevin Sidak, Jan Philip Wahle, Bela Gipp, Claudia Plant, Benjamin Roth
[pdf] [bibtex] [code]
Paraphrase Types for Generation and Detection
EMNLP 2023
Jan Philip Wahle, Bela Gipp, Terry Ruas
[pdf] [bibtex] [code] [demo]
AI Usage Cards: Responsibly Reporting AI-generated Content
JCDL 2023
Jan Philip Wahle, Terry Ruas, Saif M. Mohammad, Norman Meuschke, Bela Gipp
[pdf] [bibtex] [template] [demo]
We are Who We Cite: Bridges of Influence Between NLP and Other Academic Fields
EMNLP 2023 (Oral)
Jan Philip Wahle, Terry Ruas, Mohamed Abdalla, Bela Gipp, Saif M. Mohammad
[pdf] [bibtex] [code] [demo] [blog]
The Elephant in the Room: Analyzing the Presence of Big Tech in NLP Research
ACL 2023 (Oral)
Mohamed Abdalla, Jan Philip Wahle, Terry Ruas, Aurelie Névéol, Fanny Ducel, Saif M. Mohammad, Karen Fort
[pdf] [bibtex] [code]
How Large Language Models are Transforming Machine-Paraphrase Plagiarism
EMNLP 2022 (Oral)
Jan Philip Wahle, Terry Ruas, Frederic Kirstein, Bela Gipp
[pdf] [bibtex] [code]
Analyzing Multi-Task Learning for Abstractive Text Summarization
GEM @ EMNLP 2022
Frederic Kirstein, Jan Philip Wahle, Terry Ruas, Bela Gipp
[pdf] [bibtex] [code]
D3: A Massive Dataset of Scholarly Metadata for Analyzing the State of Computer Science Research
LREC 2022 (Oral)
Jan Philip Wahle, Terry Ruas, Saif M. Mohammad, Bela Gipp
[pdf] [bibtex] [code]
Identifying Machine-Paraphrased Plagiarism
iConference 2022
Jan Philip Wahle, Terry Ruas, Tomas Foltýnek, Norman Meuschke, Bela Gipp
[pdf] [bibtex] [code]
Testing the Generalization of Neural Language Models for COVID-19 Misinformation Detection
iConference 2022
Jan Philip Wahle, Nischal Ashok, Terry Ruas, Norman Meuschke, Tirthankar Ghosal, Bela Gipp
[pdf] [bibtex] [code]
Are Neural Language Models Good Plagiarists? A Benchmark for Neural Paraphrase Detection
JCDL 2021
Jan Philip Wahle, Terry Ruas, Norman Meuschke, Bela Gipp
[pdf] [bibtex] [code+data]


I enjoy sharing ideas that excite me and getting you excited about them too. Check out where I've been chatting lately – who knows, you might catch me at the next conference buffet. You can find more on YouTube.

We Are Who We Cite: Bridges Of Influence Between NLP And Other Fields

A talk I gave at EMNLP 2023 in Singapore on the cross-field influence of NLP and other fields.

The Elephant In The Room: Analyzing Big Tech Presence In NLP Research

A talk I gave at ACL 2023 in Toronto on the influence of Big Tech companies on NLP research.

AI Usage Cards: Responsibly Reporting AI-Generated Content

A pre-recorded talk for JCDL 2023 in New Mexico about a system to document AI use in research papers.

D3: A Massive Dataset Of Scholarly Metadata

A pre-recorded talk for LREC 2022 in Marseille on a dataset for scholarly metadata.

Identifying Machine-Paraphrased Plagiarism

A pre-recorded talk, given in hybrid format during COVID, on detecting machine-paraphrased plagiarism.


University of Göttingen
Papendiek 14, Office 0.209
37073 Göttingen, Germany
wahle {at} uni-goettingen {dot} de

Other things

Also text me if you like any of these: table tennis, chess, FPV drones, piano, techno music.