Humans perceive the real world through their sensory organs: vision, taste, hearing, smell, and touch. In terms of information, we consider these different modes, also referred to as different channels of information or modalities. Considering multiple channels of information at the same time is referred to as multimodal, and the input as multimedia. By their very nature, multimedia data are complex and often involve intertwined instances of different kinds of information. We can leverage this multimodal perspective to extract meaning and understanding of the world. This is comparable to how our brain processes these multiple channels: we learn how to combine them and extract meaningful information from them. In this thesis, the learning is done by computer programs and smart algorithms, which is referred to as artificial intelligence. To that end, we have studied multimedia information, with a focus on vision and language information representation for semantic mapping. The aims of the semantic mapping learning in this thesis are: (1) visually supervised word embedding learning; (2) fine-grained label learning for vision representation; (3) kernel-based transformation for image and text association; (4) visual representation learning via a cross-modal contrastive learning framework.
Remote sensing has a long and successful track record of detecting and mapping archaeological traces of human activity in the landscape. Since the early twentieth century, the tools and procedures of aerial archaeology evolved gradually, while earth observation remote sensing experienced major steps of technological and methodological advancements and innovation that today enable the monitoring of the earth’s surface at unprecedented accuracy, resolution and complexity. Much of the remote sensing data acquired in this process potentially holds important information about the location and context of archaeological sites and objects. Archaeology has started to make use of this tremendous potential by developing new approaches for the detection and mapping of archaeological traces based on digital remote sensing data and the associated tools and procedures. This chapter reviews the history, tools, methods, procedures and products of archaeological remote sensing and digital image analysis, emphasising recent trends towards convergence of aerial archaeology and earth observation remote sensing.
Zingman, I.; Saupe, D.; Penatti, O.A.B.; Lambers, K. 2016
We develop an approach for the detection of ruins of livestock enclosures (LEs) in alpine areas captured by high-resolution remotely sensed images. These structures are usually of approximately rectangular shape and appear in images as faint, fragmented contours against a complex background. We address this problem by introducing a rectangularity feature that quantifies the degree of alignment of an optimal subset of extracted linear segments with a contour of rectangular shape. The rectangularity feature has high values not only for perfectly regular enclosures but also for ruined ones with distorted angles, fragmented walls, or even a completely missing wall. Furthermore, it has a zero value for spurious structures with fewer than three sides of a perceivable rectangle. We show how the detection performance can be improved by learning a linear combination of the rectangularity and size features from just a few available representative examples and a large number of negatives. Our approach allowed detection of enclosures in the Silvretta Alps that were previously unknown. A comparative performance analysis is provided. Among other features, our comparison includes the state-of-the-art features that were generated by pretrained deep convolutional neural networks (CNNs). The deep CNN features, although learned from a very different type of images, provided the basic ability to capture the visual concept of the LEs. However, our handcrafted rectangularity-size features showed considerably higher performance.
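The idea of learning a linear combination of two features from a few positives and many negatives can be illustrated with a minimal sketch. The abstract does not state which learning method was used, so the class-balanced logistic loss below is an assumption chosen for illustration, not the authors' procedure. The function name `fit_linear_score`, the synthetic data, and the scaling of both features to [0, 1] are likewise hypothetical.

```python
import numpy as np


def fit_linear_score(X, y, steps=2000, lr=0.5):
    """Fit w, b for score = X @ w + b with a class-balanced logistic loss.

    Assumption: logistic loss stands in for whatever learning method the
    paper actually used; it is not taken from the abstract.
    """
    n_pos = y.sum()
    n_neg = len(y) - n_pos
    # Weight examples so the few positives and many negatives
    # contribute equally to the total loss.
    sample_w = np.where(y == 1, 0.5 / n_pos, 0.5 / n_neg)
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # predicted probability
        grad = sample_w * (p - y)                # weighted residuals
        w -= lr * (X.T @ grad)
        b -= lr * grad.sum()
    return w, b


rng = np.random.default_rng(0)
# Hypothetical synthetic data: columns are (rectangularity, size),
# both assumed to be scaled to [0, 1].
pos = rng.uniform(0.6, 1.0, size=(5, 2))    # a few representative examples
neg = rng.uniform(0.0, 0.5, size=(500, 2))  # a large number of negatives
X = np.vstack([pos, neg])
y = np.concatenate([np.ones(len(pos)), np.zeros(len(neg))])

w, b = fit_linear_score(X, y)
scores = X @ w + b  # higher score = more enclosure-like candidate
```

Candidates can then be ranked by `scores`. The class weighting is one simple way to keep the large negative set from dominating the fit when only a handful of positives is available.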