Interest in autonomous driving is growing around the globe. The assistance systems of current production cars already provide conditional driving automation. Technological advances make higher levels of driving automation achievable, up to the autonomous performance of the entire dynamic driving task. Nonetheless, the safe deployment of autonomous vehicles beyond research laboratory environments, in real traffic on public roads, necessitates further development, which makes the technologies around autonomous vehicles an area of active scientific research. Robust, comprehensive environmental perception and understanding are basic requirements for derived actions and safe driving behavior. They can only be achieved through the combination of different multi-modal sensing technologies and corresponding data fusion.
An analysis of current trends shows that camera and LiDAR sensing technologies, combined with deep artificial neural network architectures and semantic segmentation in the context of autonomous driving, form a set of current and challenging topics, equally interesting from an industrial and a research perspective, to be addressed in this master's thesis in computer science.
With the MIG (Made In Germany), the Free University of Berlin maintains an adequately equipped autonomous vehicle that serves as a research platform with permission to operate in real-world traffic. As part of this thesis, a deep neural network framework suitable for semantic segmentation is integrated into the MIG platform. Furthermore, an appropriate deep artificial neural network for pixelwise semantic segmentation of the vehicle's camera images is selected and implemented. Finally, a system is developed and integrated into the MIG that performs cross-modal transfer of pixelwise semantic labels from 2D images to the corresponding 3D point clouds generated by a LiDAR scanner. All implementations and their underlying technologies are then assessed with regard to their suitability in the context of autonomous driving.
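The cross-modal label transfer described above can be sketched as a standard pinhole-camera projection: each LiDAR point is projected into the segmented image and inherits the class label of the pixel it lands on. This is a minimal illustrative sketch, not the thesis's actual implementation; the function name and the calibration matrices `K` and `T` are assumptions.

```python
import numpy as np

def transfer_labels(points, K, T, label_img):
    """Project 3D LiDAR points into the camera image and copy the
    per-pixel semantic label onto every point that falls inside it.

    points    -- (N, 3) array of LiDAR points in the LiDAR frame
    K         -- (3, 3) camera intrinsic matrix
    T         -- (4, 4) extrinsic transform from LiDAR to camera frame
    label_img -- (H, W) integer array of per-pixel class ids
    Returns an (N,) label array; points outside the image get -1.
    """
    n = points.shape[0]
    homo = np.hstack([points, np.ones((n, 1))])           # homogeneous coordinates
    cam = (T @ homo.T).T[:, :3]                           # points in the camera frame
    labels = np.full(n, -1, dtype=int)
    in_front = cam[:, 2] > 0                              # keep points ahead of the camera
    pix = (K @ cam[in_front].T).T
    pix = (pix[:, :2] / pix[:, 2:3]).round().astype(int)  # perspective divide
    h, w = label_img.shape
    valid = (pix[:, 0] >= 0) & (pix[:, 0] < w) & (pix[:, 1] >= 0) & (pix[:, 1] < h)
    idx = np.where(in_front)[0][valid]
    labels[idx] = label_img[pix[valid, 1], pix[valid, 0]]
    return labels
```

In practice the intrinsics and extrinsics come from the camera–LiDAR calibration of the vehicle; points behind the camera or outside the image keep the sentinel label -1.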
Representation learning, a set of techniques for automatically discovering features useful for further processing, significantly influences the final results of a task. However, most current representation learning methods cannot be applied to time series because of their temporal nature. In order to extract features from this type of data, we propose three representation learning methods. The first is based on time-delay embeddings, the Wasserstein distance, and multidimensional scaling. The second is based on the variational autoencoder, a powerful deep learning model that learns a representation of the data and can even generate data of a similar pattern. The third learns the representation efficiently based on mutual information and belongs to the field of semi-supervised learning. In this thesis, we first explain these three methods mathematically and use examples to illustrate them. Then we implement the methods and test them on standard datasets. Finally, we train the models on time series data from ThyssenKrupp and demonstrate that the representations given by the three methods share a certain consistency. This master's thesis is based on a project with ThyssenKrupp; the time series data is generated by the rotation of the products manufactured by ThyssenKrupp.
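The first method's starting point, the time-delay embedding, can be sketched in a few lines: a one-dimensional series is mapped into points in a higher-dimensional space whose coordinates are time-shifted copies of the signal. This is a generic Takens-style sketch under assumed parameter names, not the thesis's implementation; the subsequent Wasserstein-distance and multidimensional-scaling steps would operate on the resulting point clouds.

```python
import numpy as np

def delay_embed(series, dim, tau):
    """Takens-style time-delay embedding: map a 1-D series x into
    points (x[t], x[t+tau], ..., x[t+(dim-1)*tau]) in R^dim."""
    n = len(series) - (dim - 1) * tau
    return np.stack([series[i * tau : i * tau + n] for i in range(dim)], axis=1)
```

For example, with `dim=2` and `tau=1` the series [0, 1, 2, 3, 4] becomes the four 2-D points (0,1), (1,2), (2,3), (3,4).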
To accomplish the task of highly automated driving, it is necessary to understand the scene surrounding the autonomous vehicle. One step towards maneuver recognition in the field of intelligent vehicles is closely related to the task of recognizing human actions. Approaches found in the literature solve the problem of action recognition for everyday activities based on RGB and optical flow images. Multi-stream networks with 3D-convolutional layers are frequently used for this task; the fusion methods as well as the inputs vary in the literature. Besides this, long short-term memory cells are often found in the area of action recognition. Compared to 3D convolutions, they are able to recognize long-term motions.
This thesis discusses whether approaches for action recognition can be used in the field of maneuver recognition. To this end, different fusion methods for multi-stream networks as well as different input images were examined. Furthermore, the impact of adding a long short-term memory cell was tested and evaluated.
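One of the simplest fusion methods examined in such multi-stream setups, score-level (late) fusion, can be sketched as follows; this is a generic illustration with assumed names, not one of the specific fusion variants evaluated in the thesis.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def late_fusion(rgb_logits, flow_logits, w=0.5):
    """Score-level fusion of an RGB stream and an optical-flow stream:
    softmax each stream's class scores, then take a weighted average."""
    return w * softmax(rgb_logits) + (1 - w) * softmax(flow_logits)
```

Earlier fusion variants would instead combine feature maps inside the network, e.g. by concatenation before the final layers, rather than averaging class probabilities.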
Collective behavior research is a vast and complex field. Understanding the varying interaction patterns between members of a group and simulating them with the help of models is a long and difficult process. In recent years, however, biomimetic robots have opened up new opportunities for evaluating and verifying the state of the art in this field.
Leading is one key aspect of collective behavior. Deciphering its mechanism would make it possible to steer a whole group in a desired direction, which in turn can be used in experimental setups for investigating social interactions such as foraging.
A biomimetic robot is ideally suited to this research question. An implemented model can be verified immediately in wet-lab experiments by checking whether its underlying logic elicits the desired reaction from real living animals.
In this work, the model is a modified version of Couzin's zone model, adapted for leading behavior. The model assumes that individuals try to keep the members of their group within a close zone around them. It was implemented in the framework of the RoboFish project as an automatic behavior module and tested with guppies (Poecilia reticulata).
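The core of a Couzin-style zone model can be sketched as a rule that turns neighbour positions into a desired heading: neighbours inside an inner repulsion zone push the individual away, while neighbours inside an outer attraction zone pull it closer. This is a minimal two-zone sketch with assumed names and radii, not the modified model implemented in RoboFish.

```python
import numpy as np

def desired_direction(self_pos, neighbor_pos, r_repulsion, r_attraction):
    """Minimal two-zone variant of Couzin's model: steer away from
    neighbours inside the repulsion zone, otherwise steer towards
    neighbours inside the attraction zone. Returns a unit vector,
    or None when no neighbour is in range."""
    offsets = neighbor_pos - self_pos
    dists = np.linalg.norm(offsets, axis=1)
    too_close = dists < r_repulsion
    if too_close.any():                                   # repulsion has priority
        d = -(offsets[too_close] / dists[too_close, None]).sum(axis=0)
    else:
        in_range = dists < r_attraction
        if not in_range.any():
            return None
        d = (offsets[in_range] / dists[in_range, None]).sum(axis=0)
    norm = np.linalg.norm(d)
    return d / norm if norm > 0 else None
```

A leader-like behavior emerges in such models by enlarging the radii at which the robot reacts, so it tolerates larger distances before turning back towards the group.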
The results show that leaders can keep larger distances from other members of the group before being forced to draw closer again, whereas followers need to stay near other individuals to feel comfortable inside the group. Because of this distance relationship, leaders have more freedom in their choices and need to keep less track of the actions of the other members. As the robot is allowed to move farther away from the fish than the average guppy before being forced to travel back, it was indeed able to take on the role of a leader several times.
As this work presents an implementation that concentrates on a few core aspects, the module promises further insight into the mechanisms of leading once future expansions add additional factors to its code base.
Autonomous cars have long ceased to be a rarity. Major companies such as Audi, BMW and Google have been researching this field successfully for years, and universities such as Princeton or the FU Berlin are also among the leaders. The main focus is on deep learning algorithms. However, these have the disadvantage that as a situation becomes more complex, enormous amounts of data are needed; in addition, testing safety-relevant functions becomes increasingly difficult. Both problems can be transferred to the virtual world: there, an infinite amount of data can be generated, and we are independent of, for example, weather conditions. This thesis presents a data generator for autonomous driving that produces ideal and undesired driving behavior in a 3D environment without the need for manually generated training data. A test environment based on a circular track was built using the Unreal Engine and AirSim. Then, a mathematical model for calculating a weighted random angle to drive alternative routes is presented. Finally, the approach was tested with NVIDIA's CNN by training a model and connecting it to AirSim.
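The idea of a weighted random steering angle can be illustrated as follows: deviations from the ideal lane-keeping angle are drawn from a distribution that makes small departures frequent and large ones rare, so that both ideal and undesired behavior appear in the generated data. This is a hypothetical sketch using a Gaussian weighting and assumed parameter names; the thesis's actual mathematical model may differ.

```python
import random

def perturbed_steering(ideal_angle, sigma=5.0, max_angle=25.0, rng=random):
    """Hypothetical weighted random steering angle (degrees): draw a
    deviation from a Gaussian centred on the ideal angle, so small
    departures from the lane are frequent and large ones rare, then
    clamp to the vehicle's steering range."""
    angle = rng.gauss(ideal_angle, sigma)
    return max(-max_angle, min(max_angle, angle))
```

Driving with such perturbed angles also yields the recovery maneuvers back towards the lane centre that a purely ideal driver would never demonstrate.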
Decision-making is an important task in autonomous driving. Especially in dynamic environments, it is hard to be aware of every unpredictable event that could affect a decision. To face this issue, an autonomous car is equipped with different types of sensors. LiDAR laser sensors, for example, can create a three-dimensional 360° representation of the environment with precise depth information. But like humans, laser sensors have their limitations: the field of view can be partly blocked by a truck, leaving the area behind the truck completely out of view. Such occluded areas in the environment increase the uncertainty of a decision, and ignoring this uncertainty would increase the risk of an accident. This thesis presents an approach to estimate such areas from the data of a LiDAR sensor. To this end, different methods are discussed and the preferred method is evaluated.
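A simple way to see what "occluded" means for a LiDAR scan: along each beam, every region behind the first measured return is invisible to the sensor. The sketch below marks such cells on a polar grid; it is a minimal illustration with assumed array shapes, not the method evaluated in the thesis.

```python
import numpy as np

def occluded_mask(ranges, grid_radii):
    """For a 2-D LiDAR scan given as one range per beam, mark which
    radial grid cells along each beam lie behind the first return
    and are therefore occluded.

    ranges     -- (B,) measured range per beam
    grid_radii -- (R,) radius of each grid ring
    Returns a (B, R) boolean array, True where the cell is occluded.
    """
    # A cell is occluded if its ring radius exceeds the beam's return range.
    return grid_radii[None, :] > ranges[:, None]
```

A full 3-D treatment additionally has to account for the sensor's vertical beam layout and for ground returns, which is where the methods compared in the thesis come in.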
The system for the simple and safe control of industrial robots was developed to enable flexible communication between different services, with particular emphasis placed on simplicity, performance and extensibility. The robots are operated over the network by means of a HoloLens, which displays holograms for visualization; with their help, the user can align the robot via defined gestures. Augmented reality makes it possible to interact with the robot intuitively and have it carry out tasks. Finally, the concept was validated by means of an experiment and software tests.
Supervised training methods require a lot of labeled training data, which often does not exist in sufficient amounts because the data has to be annotated manually, a laborious task. This thesis describes a novel approach to unsupervised domain adaptation that turns labeled simulated data samples into labeled realistic data samples by using unlabeled real data samples for training. It is shown that the proposed model is able to generate realistic labeled images of human faces out of simulated face models generated from the Basel Face Model.
I present my investigations into explicitly mapping dependencies between image areas and identify limitations of that approach. I continue with an analysis of variational autoencoders. Building on this insight, I train VAEs on image patches of different sizes extracted from the CIFAR-10 and CelebA datasets. The patch VAEs are used to compute a similarity metric between individual patches. As an evaluation, linear models are trained to predict the labels given the unsupervised representations. The patch VAEs yield better classification accuracies than the representations of a global VAE on CIFAR-10 and CelebA. I find that for VAEs the features of the earlier layers give better linear classification performance than their representations.
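The patch pipeline described above can be sketched in two steps: extract overlapping patches from an image, then compare two patches by the distance between their latent codes. The sketch below is illustrative; `encode` stands in for the mean output of a trained patch-VAE encoder, and all names are assumptions rather than the thesis's actual interface.

```python
import numpy as np

def extract_patches(img, size, stride):
    """Slide a square window over a 2-D image and collect all patches."""
    h, w = img.shape[:2]
    return np.array([img[y:y + size, x:x + size]
                     for y in range(0, h - size + 1, stride)
                     for x in range(0, w - size + 1, stride)])

def patch_similarity(encode, patch_a, patch_b):
    """Similarity of two patches as the negative Euclidean distance
    between their latent codes; `encode` stands in for the mean
    output of a trained patch-VAE encoder."""
    za, zb = encode(patch_a), encode(patch_b)
    return -float(np.linalg.norm(za - zb))
```

With a trained encoder, identical patches score 0 (the maximum) and increasingly dissimilar patches score more negative values, which is what makes the latent distance usable as a similarity metric.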