This is the new book by my colleague Horst Eidenberger, with whom I collaborated in several workshops of our multimedia metadata group. The book explores the frontiers of machine intelligence for multimedia understanding.
Frontiers of Media Understanding: The Common Methods of Audio Retrieval, Biosignal Processing, Content-Based Image Retrieval, Face Recognition, Music Classification, Speech Recognition, Text Retrieval and Video Surveillance
Author
Horst Eidenberger
Vienna University of Technology
http://www.ims.tuwien.ac.at/hme/
Abstract
Media understanding is the science/art of identifying semantic structures in digital media objects such as audio, biosignals, images, text and videos. This volume ends the work started in "Fundamental Media Understanding" and continued in "Professional Media Understanding" (atpress, 2011/12). It investigates the scientific frontiers of multimedia information retrieval. Soft frontier areas such as the influence of media theory and psychophysical research are considered as well as core topics such as semantic template matching, Kalman filtering, the limits of learning, dynamic aspects of categorization, human-like similarity perception and developing a neural view on the machine learning problem. In contrast to related publications, this book does not focus on one type of media but considers all the above-named as well as a few others. The author endeavors to identify similarities between the methods employed in audio retrieval, image understanding, text summarization and many other research domains. It turns out that a number of significant parallels do exist. Structuring the methods along common criteria and discussing their similarities and differences breaks the ground for a new research discipline: true computational understanding of multimedia content.
Link
http://www.amazon.com/Frontiers-Media-Understanding-Horst-Eidenberger/dp/3848210924/ref=sr_1_1?ie=UTF8&qid=1350041726&sr=8-1&keywords=horst+eidenberger
Table of Contents
1 Reflection of Professional Methods
1.1 Conclusions from Advanced Methods
1.2 Building Blocks of Categorization
1.3 Which Methods When?
1.4 Overview Over Scientific Frontiers
2 Media Philosophies
2.1 The Image in Philosophy
2.2 Media Theories
2.3 Semiotics
2.4 Media and Information
3 Perception and Psychophysics
3.1 Human Perception and Cognition
3.2 Perceptual and Cognitive Errors
3.3 Psychophysical Theory
3.4 Psychoacoustics and Psychophysics of Vision
4 Description by Templates
4.1 Convolution Everywhere
4.2 Templates for One-Dimensional Media
4.3 Static Visual Templates
4.4 Dynamic Template Adaptation Models
5 Semantic Descriptions and Applications
5.1 The Semantic Scale
5.2 Semantic Feature Transformations
5.3 Semantics in Audio, Biosignals and Text
5.4 Visual Semantic Applications
6 Convergent Filtering
6.1 Models of Convergence
6.2 Vector Quantization
6.3 The Kalman Filter
6.4 Associative Memories
7 Frontiers of Learning Machines
7.1 Analysis of Categorization Methods
7.2 Limits of Learning
7.3 Dynamical Systems Theory
7.4 Oscillating Classifiers
8 Human-Like Similarity Perception
8.1 Similarity as Measurement
8.2 Similarity as Counting
8.3 Dual Process Models
8.4 Similarity as Alignment and Transformation
9 Neural Media Understanding
9.1 Neural Foundations
9.2 Artificial Neural Networks
9.3 Neural Description and Filtering
9.4 Neural Networks for Categorization
10 Finale and Future
10.1 Summary
10.2 Essential Findings
10.3 Critical Review
10.4 Outlook: To Do List
Appendix A Mathematical Notation
Appendix B Similarity Models
Description of Chapters
Chapter 1 lists the major findings of the second part, names major potentials of the professional methods, develops a set of categorization building blocks, sketches best combinations of media understanding methods and provides an overview of the third part.
Chapter 2 discusses the relationship of perception and reality, theories of media content and media usage, the semiotic analysis of arbitrary symbol systems and the potential for merging media theory, semiotics and information theory for the benefit of better media understanding.
Chapter 3 lists fundamental aspects of human perception, shows where perceptual and cognitive insufficiencies of the human brain lie, gives an introduction to the psychophysical model and discusses psychophysical aspects of hearing and vision.
Chapter 4 revisits the fundamental convolution problem, links it to human similarity measurement, lists templates for audio, biosignals and stock data, and introduces static and dynamic models for visual media representation.
Chapter 5 introduces the semantic scale, describes the usage of low-level descriptions for semantic enhancement and semantic applications in the audio and the visual domain.
Chapter 6 develops a model of convergence for iterative filtering processes, discusses learning vector quantization, the Kalman filter for scalar quantization and quantization by associative memories such as the Hopfield network and the Boltzmann machine.
Chapter 7 reviews commonalities of categorization methods, presents a system of learning bounds, introduces fundamental methods of dynamical systems and applies these methods to dynamic classifiers.
Chapter 8 explains distance-based similarity, the improvements reached through the usage of predicate-based models, their integration in dual process models and the new perspectives gained from structural alignment and transformational similarity.
Chapter 9 analyzes the building blocks of human cognition, explains how these are imitated in artificial neural networks and discusses practical networks for description, filtering and categorization, including the spike response model, radial basis function networks and cascade correlation.
Chapter 10 summarizes the findings of the book, emphasizes the most important points, estimates the practical applicability of some important ideas and sketches a vision of future media understanding research.
Monday, 15 October 2012
New Book on the Frontiers of Multimedia Information Retrieval
Posted on 01:13 by Unknown