Seminar Applied Artificial Intelligence

Content

Selected topics from the field of socio-technical knowledge (see topics of the lecture Collaborative Intelligence). On the basis of the sources, the seminar teaches the students how to create and present a scientific paper. Students are also introduced to the review of papers. The final presentation is carried out in the form of a block event. More details will be given in the obligatory kick-off meeting.

Requirements

This seminar is offered to both Bachelor and Master students. Registration via OLAT is required for this seminar. As we provide students with subjects as well as tutors, the number of seminar places is limited accordingly.

Materials

You can find all course materials, news and information in OLAT. The access code for the OLAT course is given in the kick-off meeting.

Organisation

The seminar takes place as a block event at the end of the semester. On Tuesday, April 26, 2022, there will be a mandatory introductory online meeting at 11:00 a.m. via BigBlueButton (BBB), where the form and topics of the seminar will be presented and the OLAT access code will be published.

During the lecture period, students will work on their topics. Discussions with their supervisor will take place individually. We offer a mandatory two-hour course on scientific writing and working with LaTeX.

The paper is to be written in English and should consist of 10 (Bachelor: 8) pages at the end. The presentations, which are also given in English, take place at the end of the semester and last about 25 (Bachelor: 20) minutes each, including questions. Students will receive obligatory templates for their seminar paper as well as for the presentation slides.

Topics

List of topics with a short description and the corresponding supervisor:

  • Feature Encoding for Genomic Sequences in ML/DL (Ahtisham Fazeel): Feature extraction/encoding (FE) is a critical step when applying ML/DL models to genomic sequences. As it is a pressing topic, a lot of research is going on in this area. In this work, a survey of such techniques should be presented to provide an overview of FE methods for genomic sequences.
  • Pedestrian Tracking for Autonomous Driving (Abdul Hannan Khan): One of the major objectives of an autonomous vehicle is to avoid collisions with pedestrians by detecting and tracking them. A lot of research has been done in recent years to improve pedestrian tracking in order to improve the overall perception of an autonomous vehicle. This work will focus on a survey of recent pedestrian tracking techniques, with an analysis focusing on the tradeoff between accuracy and inference time.
  • On Privacy-Preserving Machine Learning - Understanding the Risks (Dominique Mercier): This seminar is about privacy-preserving deep learning and the impact of different attack and defense mechanisms. Deep learning has proven to achieve incredible results in different areas. However, data sensitivity plays a pivotal role when it comes to the real-world deployment of artificial intelligence in safety-critical domains. For example, to utilize the full potential of deep neural networks, it is crucial to understand the attack mechanisms and evaluate their impact on the proposed model. In conjunction with the attacks, it is important to understand the impact of possible defense mechanisms.
  • Explainable and Privacy Preserved Document Information Extraction (Saifullah, pre-assigned): Extracting information from documents is one of the most important tasks in document analysis; it involves extracting text, image, or layout information from semi-structured documents. As in many other areas, deep learning has been shown to achieve exceptional results in this area, but when it comes to explainability and privacy of the models, research still lags far behind. Both explainability and privacy are equally important for the application of deep learning models in the real world. Therefore, it is crucial to understand how these two points can be achieved in the field of document information extraction and what approaches currently exist to address them.
  • GANs for XAI (Adriano Lucieri): Generative Adversarial Networks (GANs) are powerful models for the synthesis of new, realistic data samples. Architectures like StyleGAN are able to produce photorealistic images of non-existing persons, scenes and objects, even with the ability to precisely condition the generation process with respect to specific attributes. Apart from their ability to create new content, GANs are also promising in their function to explain existing classifiers, or as intrinsically interpretable models. The goal of this seminar is to conduct a literature search of applications of GANs in the context of explainable AI (XAI).
  • Satellite radar data for ML models (Cristhian Sanchez): Optical satellite images are conventionally used in ML models because of the information one can acquire across different parts of the electromagnetic spectrum, yet optical images are limited by weather conditions. This limitation can be overcome by using synthetic aperture radar (SAR) imaging data, e.g. Sentinel-1 products, alone or in combination with optical data. On the other hand, the complexity of radar data can be challenging. Therefore, the goal of this seminar is to provide an overview of the ML techniques used to handle SAR data and their fields of application (e.g. agriculture, terrain deformations).
  • Survey on Recent Optical Flow Methods (Christiano Gava, pre-assigned): Optical flow is a well-known research topic in Computer Vision. With the recent advances in Deep Learning, some learning-based algorithms yield impressive results. These algorithms are, however, developed exclusively for perspective images and cannot be directly applied to spherical images. This seminar aims to provide insights into which ideas, network architectures and design choices could be combined or adapted to develop a learning-based approach for optical flow estimation on spherical images.
  • Self Supervised Learning in Earth Observation (Marco Stricker): Satellites are constantly observing the earth and generating large amounts of data at a high frequency. Due to this scale, the data needs to be analyzed automatically with AI. However, the generated data is unlabeled, and finding annotated datasets for the large number of possible use cases is challenging; even if such datasets exist, the number of samples is mostly limited. In order to alleviate this problem, several researchers have started using self-supervised learning. The goal of this seminar is to present several ways in which researchers have applied self-supervised learning to earth observation tasks.
  • Gaussian Processes in Remote Sensing (Hiba Najjar): Multi-spectral satellite images with high spatial and temporal resolution are often used in environmental and human sustainability applications, such as crop yield prediction, species distribution modeling, climate change modeling and natural disaster prevention. A recent and promising trend has focused on exploring the potential of using Gaussian Processes (GP) to address these applications, under the assumption that they constitute a solid Bayesian framework, consistently formulating many function approximation problems. Among other goals, they are often used for probabilistic prediction, uncertainty estimation, parameter retrieval and model inversion. The focus of the seminar will thus be to review the different variants of GP developed to address remote sensing and earth observation applications.
  • Physics-Based Modeling using Machine Learning (Miro Miranda): Data-driven machine learning has proven to be highly successful, but sometimes lacks accuracy when solving complex systems. There is a growing consensus that the solution to complex systems requires novel methodologies that integrate the advantages of ML methods with traditional physics-based models. A prominent example is the solution and discovery of ordinary or partial differential equations. But there are also other applications such as the physics-guided design of model architectures. The topic of the seminar will be to review the different methods for physics-guided machine learning. The focus will be on remote sensing data such as Sentinel-2 imagery, elevation models, or climate data. The goal will be to find potential application areas within these fields of research but also to get familiar with physics-informed machine learning.
  • ML based Sentiment Analysis (Marc Gänsler): Sentiment analysis based on machine learning is a subfield of text mining. It refers to the automatic evaluation of texts with the goal of identifying an expressed attitude towards something (company, brand, product, person, project, ...) as positive or negative. This mood / public opinion towards a certain topic usually fluctuates over time. With a sufficiently large number of news articles, it is possible to create sentiment time series on any topic, accurate to the day. The goal of this paper review is to compile the current state of research on this topic: Which machine learning methods are currently used in the field of sentiment analysis? Which of them are particularly suitable and why? Where are the biggest challenges at the moment? In which direction is research moving? What new types of applications for sentiment analysis are currently emerging or could emerge in the future?
  • Federated Learning for Non-IID data (Dayananda Herurkar): Federated learning (FL) is a distributed machine learning framework for solving privacy and security issues. This approach involves training multiple client models on their local machines, aggregating their updates into a final global model, and then transferring the global updates back to the clients. For all its advantages with respect to privacy, FL suffers from Non-IID data, which results in performance deterioration of FL models (some clients gain no benefit, as the global model is not as good as the local models trained on their local machines). This seminar explores a few methods that have already been proposed to overcome Non-IID problems in FL.
  • Geospatial Metadata (Julia Mayer): When working with many different large datasets, we can’t get around well-organized and established handling of metadata to ensure searchability and interchangeability and to keep an eye on various aspects of data quality. Here we will explore the standards, formats, common tools and problems when collecting data about datasets in a metadatabase, with a strong focus on geospatial data.
  • Kalman Filter for assimilation of satellite data in process based crop models (Marcela Charfuelan): In many situations and problems we are interested in modeling how a phenomenon changes over time. One popular methodology to do so is to use state space models, which are a representation of some physical system where input, output and state variables are related by first-order differential equations. In these systems it is important to estimate the so-called belief state, given observed data. The Kalman Filter (KF) and its many variants are among the methods used to track changes; essentially, the KF allows us to estimate the state of a process. The KF has been used in many applications, from navigation and tracking of objects to robotic motion and data assimilation in crop models. Data assimilation is the technique whereby remote sensing data (e.g. satellite data) are used as inputs in crop models, to adjust or reset state variables. The objective of this seminar topic is to review recent publications related to the assimilation of satellite data into crop models, in particular the use of variants of the Kalman filter.
  • Attention Mechanism in Deep Learning for Earth Observation (Francisco Mena Toro): Earth Observation (EO) allows the study and analysis of different aspects of human life and natural resources, usually under multiple observations of Remote Sensing (RS). Deep learning models are often a suitable option for modeling the complex and heterogeneous relationships in RS data. Attention mechanisms, meanwhile, have proven to be a key factor in different deep learning problems, as they can self-adapt the input in different layers based on its own information. With this, the models become more dynamic and adaptive. For instance, in a meteorological time series, the model might attend to the first values, as they contain the most important information for prediction in a particular field, whereas it might learn to attend to the last values in another field. The advantages of the attention mechanism in deep learning have not been fully exploited and explored in EO. The objective will be to provide a review of publications on deep learning methods that use attention mechanisms in EO applications, taking note of the advantages, disadvantages and limitations of their applicability.
  • Sensor-fusion in Autonomous Driving (Vikas Rajashekar): This seminar topic is intended to perform a literature survey on recent trends in sensor-fusion of camera vision, LiDAR sensors, and Radar sensors.
  • Self-supervised Representation Learning for Time Series Data (Deepak Pathak): Discriminative feature learning is essential for any deep learning model to perform well on any downstream task such as classification, regression, etc. Recently, self-supervised methods have achieved better results in computer-vision problems from unlabeled data. Motivated by contrastive learning and self-supervised learning approaches, several research studies show promising results in learning "good" features from unlabeled time series data. This research can be helpful in problems where labeled time series data is limited. Hence, this seminar aims to provide a literature review of the methods and research available on self-supervised representation learning for time series data.
  • Solving Partial Differential Equations by learning solution operators (Dinesh Krishna Natarajan): To improve the modeling and simulation of physical systems governed by partial differential equations (PDEs), deep learning is being increasingly used to replace the slow traditional PDE solvers. Initially, physics-informed neural networks (PINNs) were used to enforce the physical laws on the predictions of the neural network. Recently, the focus of physics-informed deep learning has evolved from learning the mapping between simulation data (PINNs) towards the learning of the solution operator from simulation data, as in Deep Operator Networks (DeepONets) and Physics-informed Neural Operators (PINO). The goal of this seminar is to review the latest state-of-the-art methods in learning solution operators and their applications in PDE-based fluid simulations.
  • Rule-based Approaches for Knowledge Graph Completion (Michael Schulze): Knowledge graphs are typically incomplete, and numerous approaches exist for predicting missing links. This work surveys recent non-latent, rule-based approaches for knowledge graph completion that are able to provide interpretability and explanations.
  • Deep Learning Approaches for File Carving (Mirjam Wehr): When deleting a digital file, fragments of that file remain on the disk. To recreate that file, one must find which fragments belong together. This is called file carving. Currently, most approaches in that area try to find characteristic markers and patterns, but generate a lot of false positives in doing so. To improve the classification of file fragments, deep learning approaches are used.
  • Named Entity Recognition with External Knowledge (Mirjam Wehr): Training new and more specialized named entity recognizers requires a lot of labeled data, which is very hard to come by. Therefore, there is a lot of research on approaches for training NER models without labeled datasets or with only small ones. One of these approaches is to take external knowledge, such as knowledge graphs, gazetteers, or knowledge from other domains, into account.
  • Survey on Graph Neural Networks (Christoph Balada): The applications of GNNs are manifold and range from the prediction of material properties in chemistry, to the prediction of protein interactions in the research of new drugs, to the application on social graphs, such as Facebook. However, unlike ConvNets, where standard architectures such as the popular ResNet have become established, the field of GNNs is still open and no one-size-fits-all solution has yet been found. GNNs can differ in the way the convolution is computed, how input is provided (node and/or edge features) and what kind of output is generated. The goal of this seminar is to give an overview of the state of the art in GNNs. For this overview, two aspects in particular will be considered: first, the way a convolution is calculated, and second, the way edge and node features are taken into account.
  • Machine Learning Approaches for Evaluating or Generating Documents (Ko Watanabe): Nowadays, tools such as DeepL or Grammarly exist to support editors and writers in correcting the grammar of their sentences. As an extension of this research, we want to identify whether there are ways to evaluate the writing itself, or whether there is any way to automatically generate the documents you want to write from a given vocabulary. We hope to see whether you can evaluate your own survey using the algorithms you have found. We look forward to seeing the findings.
  • Review of methods for Cell tracking (Nabeel Khalid): Object tracking has many applications in various fields, especially in the biomedical domain. Cell tracking can be used to detect changes in cell movement patterns, which can then be used to detect diseases, help in the development of medicine, etc. There are different approaches available for cell tracking, using both traditional computer vision and deep learning. This survey will focus on a comparison of different approaches for cell tracking.
  • From Text to Knowledge Graphs (Marc Gänsler): Auto-generating knowledge graphs from text (e.g. from news articles on the web) is a difficult challenge and faces several problems. The goal of this paper review is to compile an overview of the current research in this area: What kind of approaches are currently used to auto-generate triples / KGs from text, or from single sentences? How precise are these approaches? What are the main challenges? What kind of machine learning technologies play an important role here? Where is the research focus going?

Topics marked as pre-assigned have already been assigned to students who have worked with DFKI before.

Contact