Seminar Applied Artificial Intelligence

Content

Selected topics from the field of socio-technical knowledge (see topics of the lecture Collaborative Intelligence). The seminar teaches the students how to write and present a scientific paper on a specific topic. Students are also introduced to doing a literature review of scientific papers. The final presentations will be carried out in the form of a block event, which will be in-person at the DFKI site in Kaiserslautern. More details will be given in the mandatory introductory meeting.

Requirements

This seminar is offered to both Bachelor and Master students. Registration via OLAT is required for this seminar; the access code will be given in the introductory meeting. As we provide each student with a topic and a tutor, the number of seminar places is limited to the number of topics available this semester (see below).

Materials

You can find all course materials, news and information in OLAT. The access code for the OLAT course will be given in the introductory meeting.

Organisation

The seminar presentations of all students take place as a block event at the end of the semester. This will be organized in-person at the DFKI site in Kaiserslautern.

There will be a mandatory introductory online meeting via BigBlueButton (BBB), where the form and topics of the seminar will be presented, and the OLAT access code will be published:

Introductory meeting: 23.04.2024 (Tuesday), 11:00 - 12:00
Meeting link: https://bbb.rlp.net/b/nat-g1u-mfm-mdv
Participant Access code: 391026

During the lecture period, students will work on the assigned seminar topics with guidance from their supervisors. Discussions with their supervisor will take place individually. Meetings with the supervisors and organizers take place virtually. If required, we offer a two-hour course on scientific writing and working with LaTeX.

The paper is to be written in English and should be 10 pages long (Bachelor: 8 pages). The presentations, which are also given in English, take place at the end of the semester and last about 25 minutes each (Bachelor: 20 minutes), including questions. Students should follow the provided templates for their seminar paper.

Topics

List of topics with a short description, the corresponding supervisor, and the preferred level of student (Bachelor/Master/Any). Bachelor students are eligible for topics marked as Bachelor or Any.

[Any] Datasets and NN Architectures for SITS data (Francisco Mena)
Earth observation has been an open field for deep learning research. As a challenging field in which the data is quite heterogeneous and comes in various structures (metadata, time series, static or temporal images), it has allowed the exploration of quite diverse neural network (NN) architectures. In particular, the Satellite Image Time Series (SITS) is a data structure that can be handled with different models depending on the task: classification, segmentation, or regression. Satellite images are different from standard RGB images since they can contain multi-spectral information (several bands in the sunlight spectrum) or be of an entirely different type, like radar or lidar images. In this seminar, the student will search for datasets that use SITS as input data for different predictive tasks. In addition, the student has to identify different models (NN architectures but also any other) that can handle this data type.
[Any] From aerial images to city structure types using AI (Julia Mayer)
Maps play a major role in the planning of cities. There are 52 differentiated area types that are available as mapping units for the urban structure and are summarized in 16 general structure types. How can this classification be (partially) automated using aerial images and AI? The aim of this study is to find existing approaches and assess their applicability to German standards.
[Any] Automated indicator recognition to assess the local climate (Julia Mayer)
Many different indicators are used to assess the local climate. These include, for example, the degree of sealing, shading and topography. The aim of this study is to investigate which of these indicators can be recorded (partially) automatically and under what exact conditions.
[Master] Knowledge Graph-enhanced Prompting (Desiree Heim)
Large Language Models (LLMs) currently receive a lot of attention and are often utilized to solve natural language processing tasks. A big advantage of LLMs is that they are easily accessible even to non-experts, since they allow humans to interact with them using natural language. However, humans often formulate their prompts to LLMs imprecisely or leave out important contextual information that is clear to them but whose absence might lead to ambiguous prompts. Knowledge graphs (KGs), as an established solution for explicit knowledge representation, can be utilized to expand or enhance prompts in order to better guide the LLM output generation, clarify ambiguous passages, and add missing information. The goal of this seminar is to create an overview of existing methods that enhance prompts with information from a KG and to classify the approaches according to suitable criteria.
[Any] Deep Transfer Learning in Time Series Anomaly Detection (Ensiye Tahaei)
Deep transfer learning is emerging as an approach for improving anomaly detection in time series data. This seminar aims to explore integrating deep transfer learning into anomaly detection and discover how this technique can lead to more accurate, efficient, and reliable detection methods. The goal is to provide an in-depth survey of current methodologies and applications in the field.
[Master] Classification of Learning Agents in Social Simulation (Veronika Kurchyna)
The Journal of Artificial Societies and Social Simulation (JASSS) is the leading journal of this field. Machine Learning’s recent popularity has led to an influx of research integrating learning mechanisms into agent technologies. However, from a traditional theory-driven agent perspective: what concepts of learning are being used? Are the agents truly learning as part of the model, or is learning a meta-step before the actual simulation? The goal of this topic is to learn about the major types of learning as understood in psychological theory and to classify the usage of these learning theories in publications in the JASSS, either by manual screening of relevant articles or by utilising novel, partially automated literature-screening approaches.
[Master] A Systematic Review of AI Approaches in Biomarker Discovery (Ahtisham Fazeel Abbasi)
More than 10,000 diseases contribute to millions of deaths annually. Modern medicine's inability to offer generalized treatments has hindered efforts to reduce this mortality rate. Precision medicine, however, presents personalized solutions by delving into patients' genetic makeup and providing targeted interventions. A pivotal aspect of precision medicine lies in biomarker discovery, aiming to pinpoint genetic factors underlying diseases. This seminar aims to consolidate studies on biomarker discovery, creating a cohesive repository for comprehensive information. The objective is to encompass various diseases, data modalities, feature engineering methods, machine or deep learning models, and interpretability or explainability approaches under one unified platform.
[Any] Deep Learning for Audio Data Classification (Philipp Engler)
Deep learning methods can make sense of patterns in complex signals and are able to deal with vast amounts of data. They can be used to interpret and classify audio signals such as music, speech or noise. For audio signals the representation or preprocessing of the data is a critical choice, as the raw data can be difficult to analyze with sampling rates often as high as 48 kHz and lots of periodicities. Depending on the task either timings, frequencies or combinations of such may be important cues for a deep learning model. Preprocessing in different ways may help to extract the salient features or may destroy important bits of information. Further, various neural network architectures exist that can be used, such as convolutional neural networks (CNNs), recurrent neural networks (RNNs) and transformer architectures - even combinations are possible. We aim to find and compare interesting and promising methods published over recent years for classifying audio data with deep learning models.
[Master] Quantitative Evaluation of Explainability Methods in Artificial Intelligence (Adriano Lucieri)
The field of Artificial Intelligence (AI) has produced advanced models capable of performing tasks with accuracy surpassing human capabilities in certain domains. However, the complexity and opacity of these models, particularly in deep learning, have raised concerns regarding their interpretability and explainability. Explainable AI (XAI) seeks to address these concerns by developing methodologies that provide insights into the decision-making processes of AI models, thereby fostering trust and transparency. A critical aspect of XAI research involves the quantification of explainability methods, which aims to evaluate and measure the effectiveness and quality of explanations provided by AI systems. This seminar will delve into the current state of research on quantitative evaluation methods for explainability in AI, focusing on the methodologies that assess the goodness of explanations in a measurable manner.
[Master] Recent Advancements in OOD (Out Of Distribution) Detection (Dayananda Herurkar)
OOD detection refers to the task of identifying data points that differ significantly from the training distribution of a machine learning model. This seminar involves the exploration of various techniques and challenges associated with out-of-distribution detection, highlighting its importance in ensuring the reliability and robustness of machine learning systems. Techniques such as uncertainty estimation are discussed, along with their applications across diverse domains including computer vision and natural language processing.
[Master] Advancements in AI-based Voice Analysis for Disease Detection (Adriano Lucieri)
The field of artificial intelligence (AI) has seen remarkable advancements in the realm of voice analysis integrated in many smart assistants like Amazon's Alexa or Apple's Siri. Voice analysis, however, also bears the potential to support medical diagnosis for early detection of diseases. This seminar will provide a comprehensive review of the current state of the art in AI-driven voice analysis technologies, their applications in healthcare, and the methodologies employed in disease detection and diagnosis. You will explore the intricacies of voice as a biomarker, the volume of data required for robust analysis, the variety of machine learning and deep learning methods utilized, and the challenges and opportunities that lie ahead in this rapidly evolving field.
[Any] Foundational Models for Time Series data (Francisco Mena)
Foundational and pre-trained models have been widely studied in the natural language processing and vision domains. Since time series data is quite diverse across applications and use cases, foundational models for it have been only sparsely explored. The purpose of this seminar is to search for and study pre-trained and foundational models for general time series data in different applications and tasks.
[Master] Exploiting Diffusion Models for Uncertainty Quantification (Miro Miranda Lorenz)
Diffusion models offer a versatile framework for understanding complex data interactions, while also providing tools to assess and manage uncertainty. By leveraging diffusion processes, these models enable a comprehensive exploration of data relationships, enhancing both model performance and decision-making processes. This seminar investigates the significance of diffusion models in uncertainty quantification across various domains, with a special focus on multimodal and time series data.
[Any] Quantum Computing applications for Remote Sensing (Supreeth Mysore Venkatesh)
This seminar explores the integration of quantum computing with satellite-based Earth Observation/remote sensing. The seminar will focus on how quantum computing can address complex challenges in satellite operations, including, but not limited to, mission planning, scheduling, constellation formation, orbit optimization, and collision avoidance. Participants will review theoretical frameworks and discuss potential quantum algorithms that could enhance the efficiency and accuracy of managing satellite systems. This seminar is suited for students interested in the convergence of advanced computing technologies and aerospace applications, providing insights into the future of space technology.
[Master] Deep Learning Approaches for Requirement Analysis (Summra Saleem)
Following the success of artificial intelligence approaches in diverse application areas (energy, NLP and bioinformatics), the software development industry is trying to utilize the power of deep learning methods for the development of more accurate and reliable software. Specifically, in software development, requirement engineering through deep learning based approaches is an active area of research. The prime objective of this project is to utilize deep learning approaches to empower the process of requirement analysis.
[Master] Deep Learning Approaches for Bio-chemistry (Hina Ghafoor)
Following the success of artificial intelligence (AI) in various application domains such as energy, natural language processing (NLP), and biochemistry, the biomedical industry is increasingly turning to the power of deep learning methods to enhance the accuracy and reliability of tools. Specifically, within the realm of biochemistry, compound classification using deep learning approaches has emerged as a prominent area of research. The primary goal of this project is to leverage deep learning methodologies to advance compound classification techniques in the field of biochemistry, thereby empowering researchers with AI-driven tools for more precise and efficient analysis.
[Bachelor] Denoising Bio-Images with Deep Learning (Brian Moser)
Bio-Images offer vital insights into dynamic biological processes, crucial for understanding health and disease mechanisms. However, the imaging of live samples often requires low light conditions to minimize light toxicity, resulting in noisy images that are difficult to interpret. Deep Learning has proven highly effective in noise reduction while preserving essential signals. Deep Learning methods learn from example data, offering a content-aware solution. In this seminar, we will discuss various types of noise commonly found in Bio-images and define denoising challenges. We then explore leading Deep Learning techniques, highlighting their benefits and limitations to aid users in selecting the most suitable methods for their specific needs.
[Master] Segment Anything in 3D (Fabian Schmeisser)
Image segmentation is a staple task in the field of computer vision. The accurate segmentation of objects in images has an incredible range of applications, from automated driving to biomedical image analysis and many more. The Segment Anything Model (SAM) is a foundation model for image segmentation which has revolutionized the field since its publication in 2023. Beyond its remarkable zero-shot performance across a wide variety of image modalities, many studies fine-tune it for a specific domain. The aim of this seminar topic is to compile a comprehensive review of adaptations of SAM that tackle the task of 3D image segmentation.
[Any] Transformer based Image Segmentation (Tobias Nauen)
Transformers constitute the state of the art in NLP and CV, including image classification. However, their application to dense prediction tasks, like semantic segmentation, has presented unique challenges. Unlike CNNs, which excel at capturing local features, transformers struggle because of the inherent bias introduced by cutting up images into patches. This seminar delves into how researchers are overcoming these hurdles by designing transformer-based architectures specifically for semantic segmentation. The student will understand the key challenges in bringing transformers to semantic segmentation and what approaches have been proposed to overcome these. The student will look into the current state-of-the-art transformers for semantic segmentation and compare and contrast the approaches taken to recover fine-grained image details.
[Master] Handwriting Generation through Stylistic Adaptation between Similarly Written Languages (Nauman Riaz)
The evolution of generative artificial intelligence (AI) has paved the way for innovative applications across diverse fields, including linguistic transcription and script adaptation. Recent developments in AI-driven generative models present a remarkable capacity to bridge gaps between languages and scripts, making them an especially valuable tool in language learning and digital document processing. The study will focus on the intricate challenge of adapting the handwriting styles of English writers to German characters using generative AI models. Considering the uniqueness of individual handwriting and the distinct characteristics of German script, we will explore how contemporary techniques such as Generative Adversarial Networks (GANs) and Transformer models can be harnessed to replicate and transfer personal handwriting styles from one similarly written alphabetic script to another. By delving into the specifics of neural network-based style transfer and character generation, the aim is not only to achieve stylistic adaptation but also to maintain the legibility and authenticity of handwritten texts.
[Master] Statistical Heterogeneity in Federated Learning (Ahmed Anwar)
In statistical machine learning, one of the fundamental assumptions is that data is independent and identically distributed (IID). However, in certain scenarios such as federated learning, this assumption is challenged due to the potential diversity of data sources used for training or at inference. This problem is called Statistical Heterogeneity or the “Non-IID” problem, and is regarded as one of the main obstacles to efficient federated learning. In this seminar, we aim to show the significance of the heterogeneity problem by briefly shedding light on the IID assumption in statistical machine learning algorithms. After identifying the problem, we look through the literature at the common approaches to mitigating it, especially in the federated learning setup.
[Master] Large Language Models for Data Management (Marc Gänsler)
This seminar project explores the role of Large Language Models (LLMs) in data management. The student will review current research to understand how LLMs are changing the way we handle large databases. Topics include the impact of LLMs on data retrieval, query processing, natural language interfaces, data integration, and knowledge discovery. The project aims to identify challenges, opportunities, and implications of using LLMs in data management and to suggest future research directions.
[Master] Fake News Detection (Tahseen Rizvi)
This seminar aims to conduct a comprehensive literature review of the evolving landscape of fake news detection, exploring its current status, emerging opportunities, and persistent challenges. Through a meticulous analysis of recent literature, we will gain insights into the methodologies, technologies, and strategies employed in identifying and combating misinformation. This seminar seeks to contribute to a more in-depth understanding of the complexities surrounding fake news detection in contemporary discourse by critically examining the intersection of technological advancements and societal implications.
[Master] Survey of Multi-Modal GenAI Tools in the Educational and Creative Sector (Tahseen Rizvi)
This seminar examines the intersection of open-source and commercial Generative AI tools, focusing on their applications in the educational and creative sectors, such as text, image and music generation. This topic will enable us to explore the efficacy and adaptability of these tools across diverse data modalities such as audio, video, images, and avatars. Through comprehensive analysis, we will gain valuable insights into the practical implications, potential benefits, and limitations of incorporating Generative AI technologies into cutting-edge solutions designed for the educational and creative sectors.
[Master] Analyzing aspects relevant for Training LLMs from Scratch (Tahseen Rizvi)
The seminar aims to explore the profound implications arising from training Large Language Models (LLMs) from scratch. The topic additionally delves into factors such as dataset curation, regulatory frameworks such as the AI Act, associated expenses, and the evolving architectural paradigms pertinent to the training of LLMs. Furthermore, the seminar will critically examine ethical considerations, including biases, privacy concerns, and societal impacts. This will necessitate a nuanced understanding of the responsible development and deployment of these powerful AI technologies.
[Master] Generative AI in the Financial Sector (Tahseen Rizvi)
This seminar topic investigates the transformative potential of Generative AI, along with its applications and implications within the financial sector. This topic aims to explore how Generative AI models can streamline processes, enhance risk management strategies, and optimize decision-making frameworks within financial institutions. Furthermore, this topic will provide an extensive review of the scholarly discourse on the integration of Generative AI, gaining critical insights into its impact on risk assessment, portfolio management, and customer-centric strategies within the financial sector.
[Master] Generative AI in the Medical Domain (Tahseen Rizvi)
The multifaceted realm of integrating Generative Artificial Intelligence into healthcare is explored in this topic, focusing on innovative applications like drug discovery and personalizing treatment plans for individuals. In this topic, ethical dilemmas, such as safeguarding patient privacy, as well as navigating legal frameworks concerning regulation and liability, are also examined comprehensively. By adopting this holistic perspective, we aim to navigate the complex landscape of integrating generative AI responsibly and ethically into medical practice, fostering advancements while upholding patient well-being and legal compliance.
[Master] Exploring Retrieval Augmented Generation (RAG) Evaluation Frameworks (Tahseen Rizvi)
This seminar intends to provide a detailed exploration of the complexities inherent in evaluating Retrieval Augmented Generation (RAG) systems, encompassing a wide array of frameworks, metrics, and implementations currently in use. This topic also aims to explore the intricate interplay between retrieval-based and generative models, emphasizing the importance of comprehensive evaluation frameworks to measure their performance accurately. We aim to gain insights into the complexities of evaluating hybrid systems, enabling us to discern the most effective strategies for improving the quality and effectiveness of AI-generated content.
[Master] Knowledge Graph-Based Assistants in Finance (Michael Schulze)
Knowledge graphs are increasingly adopted in organizations and represent a key technology for applying AI to heterogeneous industry data. Consequently, they are also employed for realizing smart assistants to support knowledge workers. This seminar topic aims to analyze and compare such assistants in the finance domain that leverage knowledge graph technologies.
[Any] Preprocessing Techniques in Microscopic Image Analysis (Nabeel Khalid)
Preprocessing techniques in microscopic image analysis are crucial for improving the quality of images and the accuracy of subsequent segmentation and tracking processes. The goal of this seminar is to review the current state of and advancements in preprocessing techniques used in microscopic image analysis to improve the accuracy and efficiency of cell segmentation and tracking.
[Pre-assigned] Finetuning Methods for LLMs (Pervaiz Khan)
Large Language Models (LLMs) have shown success in various NLP tasks such as text generation and text summarization. Sometimes their performance is sub-optimal on new data, so one needs to finetune them on that data. However, finetuning of LLMs differs from traditional finetuning methods, as LLMs require huge computational resources. Several methods exist in the literature to reduce the computational costs associated with finetuning LLMs. The aim of this topic is to study various finetuning methods for LLMs and the pros and cons of each method.
[Pre-assigned] Appearance-based Gaze Estimation (Jayasankar Santhosh)
Appearance-based gaze estimation offers a promising technology for enhancing e-learning experiences by analyzing eye movements through facial features captured by a webcam. This information is valuable for e-learning platforms as it allows them to understand a student's focus and engagement and to provide real-time feedback if a student's gaze strays from the learning materials. The goal of this seminar is to investigate state-of-the-art approaches that utilize appearance-based gaze estimation for e-learning.
[Pre-assigned] Mitigating limitations of perturbation-based explanation methods (Hiba Najjar)
A certain type of explanation method in artificial intelligence relies on perturbing input samples and passing them through networks. The change in their prediction can serve as a proxy to estimate the importance of the perturbed features. One commonly criticized limitation of this approach is the creation of out-of-distribution (OOD) samples, which raises questions about the reliability of the feature importance estimates it provides. In this seminar, the student will explore various perturbation-based methods for explainability introduced in the literature and investigate the proposed solutions to mitigate the OOD limitation.

Topics marked as pre-assigned have already been assigned to students who have previously worked with their DFKI supervisors. These topics are not available for assignment.