Thesis Topics
This list includes topics for potential bachelor or master theses, guided research, projects, seminars, and other activities. Use Ctrl+F to search for keywords of interest, e.g., ‘machine learning’.
PLEASE NOTE: If you are interested in any of these topics, click the respective supervisor link to send a message with a simple CV, grade sheet, and topic ideas (if any). We will answer shortly.
Of course, your own ideas are always welcome!
Object-Background Combination: A Novel Data Augmentation for Object Detection
Type of Work:
- Master
Keywords:
- data augmentation
- object detection
- synthetic data
- vision transformer
Description:
Object detection, the task of identifying and localizing instances of objects within an image by predicting bounding boxes and class labels, is a cornerstone of many computer vision applications such as surveillance, autonomous navigation, visual search, and robotics. The performance of deep learning models for object detection is critically dependent on the availability of large, diverse datasets with accurate bounding box annotations. Manually creating such datasets is a laborious and costly endeavor.
Data augmentation techniques are indispensable for synthetically enlarging training datasets, thereby improving model robustness and generalization. This thesis will investigate a novel data augmentation method that segments objects from existing images and recombines them with various background images. A significant advantage of this approach is that ground-truth bounding boxes for the newly composed scenes are generated automatically, which directly addresses a major bottleneck in creating object detection datasets.
The aim is to explore this object-background recombination technique specifically for the task of object detection. We hypothesize that it can substantially enhance the performance, robustness (e.g., to scale, occlusion, and varied contexts), and domain adaptation capabilities of object detection models.
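To illustrate why the bounding box comes for free, here is a minimal sketch of the compositing step. It assumes a binary segmentation mask is already available and that the object fits inside the background; the function name `paste_object` and its arguments are hypothetical and not taken from the method under study.

```python
import numpy as np

def paste_object(background, obj_rgb, obj_mask, top, left):
    """Paste a segmented object onto a background image.

    background: (H, W, 3) array, obj_rgb: (h, w, 3) array,
    obj_mask: (h, w) boolean array marking object pixels.
    Returns the composed image and the bounding box
    (x_min, y_min, x_max, y_max) in inclusive pixel coordinates.
    """
    out = background.copy()
    h, w = obj_mask.shape
    # Slicing gives a view, so writing into `region` modifies `out`.
    region = out[top:top + h, left:left + w]
    region[obj_mask] = obj_rgb[obj_mask]
    # The tight box follows directly from the mask and the paste offset.
    ys, xs = np.nonzero(obj_mask)
    bbox = (left + xs.min(), top + ys.min(), left + xs.max(), top + ys.max())
    return out, tuple(int(v) for v in bbox)

# Toy example: a 2x3 white object pasted onto a black 8x8 background.
background = np.zeros((8, 8, 3), dtype=np.uint8)
obj_rgb = np.full((3, 4, 3), 255, dtype=np.uint8)
obj_mask = np.zeros((3, 4), dtype=bool)
obj_mask[1:3, 1:4] = True
image, bbox = paste_object(background, obj_rgb, obj_mask, top=2, left=3)
```

A realistic pipeline would additionally handle scaling, blending at the object boundary, and occlusion between several pasted objects.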
NLP-Based Protein Design
Type of Work:
- Guided Research
- Master
- Project
Keywords:
- bioinformatics
- CLIP models
- multi-modal learning
- natural language processing
- protein design
- protein engineering
- text-guided design
Description:
This project explores using natural language descriptions to guide protein design and engineering. Instead of only using protein sequences and structures, this approach incorporates human knowledge about protein functions written in text format, such as “binds to DNA” or “catalyzes glucose breakdown.”
The thesis will investigate multi-modal frameworks that can understand both textual descriptions of desired protein functions and protein sequence data. Students will work on developing models that can take text descriptions like “design an enzyme that breaks down plastic” and generate corresponding protein sequences with those properties. This approach bridges the gap between high-level functional descriptions that humans understand and the complex molecular details needed for actual protein design.
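One common way to align two modalities is a CLIP-style contrastive objective, which the keywords above hint at. The sketch below is an assumption about how text and protein embeddings could be aligned, not the approach of the referenced papers; the encoders are omitted, and `clip_style_loss` and the temperature value are hypothetical.

```python
import numpy as np

def clip_style_loss(text_emb, prot_emb, temperature=0.07):
    """Symmetric contrastive loss over a batch of paired embeddings.

    Row i of text_emb is assumed to describe the protein in row i of prot_emb.
    """
    # L2-normalize so the dot product is a cosine similarity.
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    p = prot_emb / np.linalg.norm(prot_emb, axis=1, keepdims=True)
    logits = t @ p.T / temperature

    def diag_cross_entropy(l):
        # Cross-entropy where the correct class for row i is column i.
        l = l - l.max(axis=1, keepdims=True)
        log_probs = l - np.log(np.exp(l).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    # Average the text-to-protein and protein-to-text directions.
    return 0.5 * (diag_cross_entropy(logits) + diag_cross_entropy(logits.T))

# Toy check: correctly paired embeddings give a lower loss than shuffled ones.
emb = np.eye(4)
aligned = clip_style_loss(emb, emb)
shuffled = clip_style_loss(emb, np.roll(emb, 1, axis=0))
```

Generating sequences from text requires a decoder on top of such a shared embedding space, which this sketch does not cover.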
The goal is to make protein design more accessible by allowing researchers to describe what they want a protein to do in plain English, rather than requiring deep expertise in protein structure and biochemistry. This could accelerate the development of new enzymes for biotechnology, medicine, and environmental applications.
References:
- A Text-guided Protein Design Framework
- Natural Language Prompts Guide the Design of Novel Functional Protein Sequences
Plant-Genetics-Based AI-Driven Breeding Tool
Type of Work:
- Master
Keywords:
- agriculture AI
- crop optimization
- genetics
- genomics
- machine learning
- phenotype prediction
- plant breeding
Description:
This project aims to develop an AI tool that helps plant breeders create better crops faster and more efficiently. Traditional plant breeding takes many years of trial and error to develop crops with desired traits like higher yield, disease resistance, or drought tolerance.
The thesis will focus on building machine learning models that can predict which plant crosses will produce the best offspring based on genetic data. Students will work with plant genomic datasets to train AI models that can suggest optimal breeding strategies. The tool will analyze genetic markers and predict traits like crop yield, nutritional content, or environmental adaptability before plants are actually grown.
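As a rough illustration of predicting traits from genetic markers, the following sketch fits a ridge regression with one effect per marker and ranks candidate crosses by the mean predicted value of the two parents. The additive model, the mid-parent scoring rule, the function names, and the synthetic data are all assumptions made for this example.

```python
from itertools import combinations

import numpy as np

def fit_marker_effects(markers, trait, lam=1.0):
    """Closed-form ridge regression: estimates one additive effect per marker."""
    n_markers = markers.shape[1]
    gram = markers.T @ markers + lam * np.eye(n_markers)
    return np.linalg.solve(gram, markers.T @ trait)

def rank_crosses(parent_markers, effects):
    """Rank parent pairs by the mean of their predicted trait values."""
    values = parent_markers @ effects
    scores = {
        (i, j): 0.5 * (values[i] + values[j])
        for i, j in combinations(range(len(values)), 2)
    }
    return sorted(scores, key=scores.get, reverse=True)

# Synthetic data: 200 plants, 20 markers coded as 0, 1, or 2 allele copies.
rng = np.random.default_rng(0)
markers = rng.integers(0, 3, size=(200, 20)).astype(float)
true_effects = rng.normal(size=20)
trait = markers @ true_effects + rng.normal(scale=0.5, size=200)

effects = fit_marker_effects(markers, trait)
best_pair = rank_crosses(markers[:10], effects)[0]
```

Real genomic datasets have far more markers than plants, which is where regularization and more expressive models become important.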
This approach could significantly shorten the time needed to develop new crop varieties, from decades to just a few years. The AI system will help farmers and researchers make data-driven decisions about which plants to breed together, supporting more sustainable agriculture and better food security.
References:
- Genomic selection in plant breeding: methods, models, and perspectives
- Machine learning for plant breeding and biotechnology
RNA Sequence Design using Deep Learning
Type of Work:
- Master
Keywords:
- bioinformatics
- deep learning
- neural networks
- RNA design
- sequence optimization
Description:
The goal of this project is to use deep learning methods to design RNA sequences with specific desired functions. Traditional RNA design relies on complex rules and manual optimization, which can be slow and limited. This thesis will explore how neural networks can learn patterns from existing RNA data to automatically generate new sequences that fold into target structures or perform specific biological tasks.
The project will focus on training deep learning models on RNA sequence-structure datasets and developing methods to generate functional RNA molecules. The approach will combine sequence generation techniques with structure prediction to ensure the designed RNAs can actually fold correctly and work as intended.
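A generated sequence can be screened with a simple necessary condition before running a structure predictor: every base pair in the target structure must be formed by complementary nucleotides. The sketch below assumes the target is given in dot-bracket notation without pseudoknots; the helper names are hypothetical.

```python
# Watson-Crick pairs plus the G-U wobble pair.
ALLOWED_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def parse_dot_bracket(structure):
    """Return the list of paired positions (i, j) of a dot-bracket string."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced closing bracket")
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError(f"unexpected character {ch!r}")
    if stack:
        raise ValueError("unbalanced opening bracket")
    return sorted(pairs)

def is_compatible(sequence, structure):
    """Check that every paired position holds an allowed base pair."""
    if len(sequence) != len(structure):
        return False
    return all(
        (sequence[i], sequence[j]) in ALLOWED_PAIRS
        for i, j in parse_dot_bracket(structure)
    )
```

Passing this check does not mean the sequence actually folds into the target; that still requires a structure prediction step.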
References:
- RNA design rules from a massive open laboratory
- Improved RNA secondary structure prediction by maximizing expected accuracy
Importance-Sampled Coresets via Neural Image Compression
Type of Work:
- Guided Research
- Master
Keywords:
- coreset selection
- deep learning
- neural image compression
Description:
The goal of this project is to explore the intersection of coreset selection [1] and neural image compression [2] for data-efficient training in deep learning. Specifically, the thesis will investigate importance-sampled coresets based on the compressibility of input samples. The core idea is that the ease with which a neural compression model can compress an image may reflect how redundant or informative that image is. By analyzing the latent representations and compression performance (e.g., reconstruction error, bitrate) of a neural compressor, the project will aim to define an importance metric. This metric will then be used to select a subset of the training data, the coreset, that is representative yet compact.
References:
- [1] A Coreset Selection of Coreset Selection Literature: Introduction and Recent Advances
- [2] COOL-CHIC: Coordinate-based Low Complexity Hierarchical Image Codec
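The selection step described above could look roughly like the following sketch. It assumes that each training image already has a non-negative importance score, for example its bitrate under a neural compressor, and that a higher score means a more informative sample; both the score definition and its direction are open questions of the project, and `sample_coreset` is a hypothetical name.

```python
import numpy as np

def sample_coreset(scores, k, seed=0):
    """Draw k distinct sample indices with probability proportional to score."""
    scores = np.asarray(scores, dtype=float)
    probs = scores / scores.sum()
    rng = np.random.default_rng(seed)
    # Sampling without replacement keeps the coreset free of duplicates.
    idx = rng.choice(len(scores), size=k, replace=False, p=probs)
    return np.sort(idx)

# Toy scores in bits per pixel (made up for illustration).
bits_per_pixel = [0.0, 0.4, 1.2, 2.5, 0.9, 3.1]
coreset_idx = sample_coreset(bits_per_pixel, k=3)
```

If the coreset is used to estimate the full-data loss, the selected samples would usually also be reweighted, which is left out here.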
Efficient Optimization with Multi-Level Gradient Accumulation
Type of Work:
- Master
Keywords:
- machine learning
- optimization
Description:
Multi-level methods are widely used in numerical analysis to solve problems efficiently by combining solutions across coarse and fine resolutions (levels). This project explores how a similar idea can be applied to gradient-based optimization in deep learning: gradients are first computed on coarse levels (e.g., low resolution or small size) using a large batch size, then refined using residual gradients from finer levels. The goal is to improve the quality of gradient estimates while reducing the computational cost of high-resolution training. The student will implement this approach in JAX and test it on models for classification or generative tasks. A background in deep learning and an interest in optimization techniques are important; familiarity with Python, JAX/PyTorch, and NumPy is a plus but not strictly required.
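The two-level case of this idea can be sketched as a telescoping estimator: a coarse gradient on a large batch plus a fine-minus-coarse correction on a small batch. The example below uses NumPy with analytic gradients of a toy least-squares model to stay self-contained; in JAX the gradients would come from `jax.grad`. The feature maps, function names, and batch sizes are assumptions made for illustration.

```python
import numpy as np

def features(x, level, d=4):
    """Map signals of shape (batch, 4 * d) to d features at the given level."""
    blocks = x.reshape(x.shape[0], d, -1)
    if level == "coarse":
        # Downsample first, then apply the nonlinearity (cheap).
        return blocks.mean(axis=2) ** 2
    # Apply the nonlinearity at full resolution, then pool (expensive).
    return (blocks ** 2).mean(axis=2)

def mse_grad(w, x, y, level):
    """Gradient of the mean squared error of a linear model on the features."""
    f = features(x, level)
    return 2.0 * f.T @ (f @ w - y) / len(y)

def multilevel_grad(w, big_batch, small_batch):
    """Coarse gradient on a large batch plus a residual from the fine level."""
    (xb, yb), (xs, ys) = big_batch, small_batch
    coarse = mse_grad(w, xb, yb, "coarse")
    residual = mse_grad(w, xs, ys, "fine") - mse_grad(w, xs, ys, "coarse")
    return coarse + residual

rng = np.random.default_rng(0)
w = rng.normal(size=4)
big = (rng.normal(size=(64, 16)), rng.normal(size=64))
small = (rng.normal(size=(8, 16)), rng.normal(size=8))
g = multilevel_grad(w, big, small)
```

The correction term is evaluated at both levels on the same small batch, so it vanishes whenever the coarse and fine levels agree on those samples.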
Pruning Image Super-Resolution Models by Removing Unnecessary ReLU Activations
Type of Work:
- Guided Research
- Master
Keywords:
- deep learning
- image processing
- image super-resolution
Description:
This work investigates the optimization of image super-resolution neural network architectures by removing ReLU and other noise-canceling activation layers. Once an activation layer is removed, the convolution layers surrounding it form a purely linear sequence, so the resulting method should combine them into a single convolution layer, reducing redundancy and improving computational efficiency. As a starting point, the selection of ReLU layers for removal will be based on an analysis of their activation distributions (non-noise-canceled vs. noise-canceled) using a representative dataset.
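The merging step relies on the fact that two consecutive convolutions without a nonlinearity in between equal a single convolution with a combined kernel. The following 1D, single-channel sketch demonstrates this with `np.convolve`; real super-resolution layers are 2D and multi-channel, with padding and biases, so this is only an illustration of the principle, and the kernel values are made up.

```python
import numpy as np

def merge_kernels(k1, k2):
    """Combine two consecutive convolution kernels into one equivalent kernel."""
    return np.convolve(k1, k2)

x = np.array([1.0, -2.0, 3.0, 0.5, -1.0, 4.0])
k1 = np.array([0.5, -1.0, 0.25])
k2 = np.array([1.0, 2.0, -0.5])

# Two layers applied one after the other, with no activation in between.
two_layers = np.convolve(np.convolve(x, k1), k2)
# One layer with the merged kernel gives the same output.
one_layer = np.convolve(x, merge_kernels(k1, k2))

# With a ReLU in between, the equivalence generally breaks.
with_relu = np.convolve(np.maximum(np.convolve(x, k1), 0.0), k2)
```

The merged kernel has length `len(k1) + len(k2) - 1`, so merging trades depth for a larger receptive field per layer.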
Combining Dynamic Attention-Guided Diffusion and Wavelet-Based Diffusion for Image Super-Resolution
Type of Work:
- Guided Research
- Master
Keywords:
- deep learning
- single image super-resolution
- vision transformer
Description:
This thesis focuses on merging two techniques developed in our group [1, 2]. The first component, Dynamic Attention-Guided Diffusion, allows selective diffusion across regions of interest in the image, driven by time-dependent attention mechanisms. Only certain parts of the image are diffused at specific time steps, which enhances the focus on critical image regions. The second component, Wavelet-Based Diffusion, applies diffusion in the frequency domain via the discrete wavelet transform (DWT) instead of in the pixel domain, capturing and enhancing multiscale image details. By combining these approaches, this work will explore the synergy of frequency-domain wavelet transforms with dynamic, time-based attention in diffusion models. The research aims to produce sharper, high-resolution images by diffusing across relevant areas in both the spatial and frequency domains, leading to more efficient and accurate super-resolution (SR) results.
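For readers unfamiliar with the DWT, the sketch below implements a single-level 2D Haar transform and its inverse in NumPy. It assumes a grayscale image with even height and width; it is not the group's implementation, and the diffusion model that would operate on the resulting subbands is omitted. The function and variable names are hypothetical.

```python
import numpy as np

def haar_dwt2(img):
    """Single-level 2D Haar transform of an image with even height and width."""
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    low = (a + b + c + d) / 2      # coarse approximation (half resolution)
    d_cols = (a - b + c - d) / 2   # differences between neighboring columns
    d_rows = (a + b - c - d) / 2   # differences between neighboring rows
    d_diag = (a - b - c + d) / 2   # diagonal differences
    return low, d_cols, d_rows, d_diag

def haar_idwt2(low, d_cols, d_rows, d_diag):
    """Invert haar_dwt2 and recover the original image."""
    h, w = low.shape
    img = np.empty((2 * h, 2 * w), dtype=low.dtype)
    img[0::2, 0::2] = (low + d_cols + d_rows + d_diag) / 2
    img[0::2, 1::2] = (low - d_cols + d_rows - d_diag) / 2
    img[1::2, 0::2] = (low + d_cols - d_rows - d_diag) / 2
    img[1::2, 1::2] = (low - d_cols - d_rows + d_diag) / 2
    return img

image = np.arange(16.0).reshape(4, 4)
subbands = haar_dwt2(image)
restored = haar_idwt2(*subbands)
```

Each subband has a quarter of the original pixels, which is one reason operating in the wavelet domain can reduce the cost per diffusion step.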