Jan Philip Wahle

Research Scientist at the University of Göttingen

Jan Philip is a computer science researcher at the University of Göttingen, Germany, with a focus on NLP and AI research. Currently pursuing his Ph.D. in Göttingen, he has also worked with leading scholars in the field at the National Research Council, Canada.

Jan’s research interests lie in two main applications of NLP: paraphrase detection and plagiarism detection. His work in paraphrase detection involves developing algorithms that identify when two phrases have the same meaning despite lexical variation, characterized through paraphrase types. This work has applications in NLP tasks such as text summarization, misinformation detection, and plagiarism detection.

In the area of plagiarism detection, Jan is focused on developing techniques to identify automatically generated content from models like ChatGPT, which are becoming increasingly prevalent as AI and machine learning advance. Jan’s research aims to provide transparency to this type of content and distinguish it from content created by humans.

Jan has published numerous papers in top-tier NLP and AI venues and has written several blog posts on society and ethics in AI. He is also a sought-after speaker and has given talks on NLP and AI around the world. To stay up to date with Jan Philip’s research, you can follow him on Twitter and LinkedIn. When not working on research, Jan enjoys outdoor activities like hiking, skiing, and flying FPV drones, as well as playing chess and spending time with his family.

Breaking Down Barriers in NLP and AI

NLP Breakthroughs and Advancing AI Practices

I specialize in paraphrase detection, plagiarism detection, and content generation. With my expertise in lexical semantics and semantic similarity, I’m dedicated to finding solutions that provide transparency and lead to responsible AI practices for applications such as text summarization and misinformation detection. For a short summary of my CV, see my ORCID or LinkedIn.

Plagiarism Detection

Research in plagiarism detection aims to distinguish between content generated by AI models and that created by humans, in order to provide transparency and ensure proper attribution of user-generated content.

Content Generation

This research explores the capabilities and limitations of AI models in generating human-like content, and develops techniques to distinguish automatically generated content from that created by humans.

Paraphrase Generation

This research focuses on using NLP, machine learning, and deep learning techniques to generate different paraphrases of a given text while preserving its original meaning. The goal is to improve the accuracy of various NLP tasks, advancing the field of AI research.

Lexical Semantics

Research in lexical semantics involves studying the meanings of words and how they are used in context, which is crucial for many NLP tasks, such as machine translation and sentiment analysis.


Text Summarization

Work in text summarization involves developing models that can automatically generate condensed versions of longer texts, which can be useful in a variety of machine learning and NLP tasks.

Semantic Similarity

Research in semantic similarity focuses on developing algorithms that accurately measure the similarity between two pieces of text based on their underlying meaning, using NLP, machine learning, and deep learning techniques. The ultimate goal is to advance AI research and improve the performance of various NLP tasks.