eNTERFACE’05
July 18- August 12, 2005 – Faculté Polytechnique de Mons, Belgium
A Multimodal (Gesture+Speech) Interface for 3D Model Search and Retrieval Integrated in a Virtual Assembly Application
Project Title: A Multimodal (Gesture+Speech) Interface for 3D Model Search and Retrieval Integrated in a Virtual Assembly Application
Principal investigator:
Dr. Dimitrios Tzovaras (ITI-CERTH)
Candidates:
Konstantinos Moustakas
Date:
10/3/2005
Abstract:
The goal of the project is the development of a multimodal interface for content-based
search of 3D objects based on sketches. This user interface will integrate graphical,
gesture and speech modalities to help the user sketch the outline of the 3D object
he/she wants to retrieve from a large database. Finally, the system will be integrated
into a virtual assembly application, in which the user will be able to assemble a
machine from its spare parts using only speech and specific gestures.
Project objective:
Search and retrieval of 3D objects is a very challenging problem with applications in
numerous areas, such as object recognition in computer vision and mechanical
engineering, and content-based search in e-commerce and edutainment applications.
These application fields will expand in the near future, since 3D model databases are
growing rapidly thanks to recently developed scanning hardware and modeling
software.
The difficulty of expressing multimedia, and especially three-dimensional, content via
text-based descriptors reduces the ability of text-based search engines to retrieve the
desired multimedia content efficiently and effectively. To resolve this problem, 3D
content-based search and retrieval (S&R) has drawn a lot of attention in recent years.
A typical S&R system evaluates the similarity between query and target objects
according to low-level geometric features. However, the requirement of a query model
for search-by-example often reduces the applicability of an S&R platform, since in
many cases the user knows what kind of object he/she wants to retrieve but does not
have a 3D model to use as the query.
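The descriptor-based matching mentioned above can be illustrated with a minimal sketch. The descriptor values, model names, and the use of a plain Euclidean distance are all hypothetical stand-ins for whatever low-level geometric features an actual S&R system extracts:

```python
import numpy as np

def shape_distance(desc_a, desc_b):
    """Euclidean distance between two low-level geometric feature
    vectors (e.g. normalized shape-distribution histograms)."""
    a = np.asarray(desc_a, dtype=float)
    b = np.asarray(desc_b, dtype=float)
    return float(np.linalg.norm(a - b))

def rank_database(query_desc, database):
    """Sort database entries by ascending distance to the query descriptor."""
    return sorted(database, key=lambda item: shape_distance(query_desc, item[1]))

# Toy database: (name, descriptor) pairs standing in for 3D models.
db = [
    ("cube",   [0.1, 0.4, 0.3, 0.2]),
    ("piston", [0.5, 0.2, 0.2, 0.1]),
    ("sphere", [0.2, 0.3, 0.3, 0.2]),
]
query = [0.5, 0.2, 0.2, 0.1]      # descriptor derived from the user's sketch
ranking = rank_database(query, db)
print(ranking[0][0])              # best match first
```

In a sketch-based interface, the query descriptor would be computed from the user's drawn outline rather than from an existing 3D model, which is exactly what removes the need for a query model.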
Imagine the following use case: the user of a virtual assembly application is trying to
assemble an engine from its spare parts. He inserts some rigid parts into the virtual
scene and places them in the correct positions. At some point he needs to find a piston
and assemble it into the engine. In this case, he has to search the database manually
to find the piston. It would be faster and much easier if the user could sketch the
outline of the piston using specific gestures combined with speech in order to
perform the search.
In the context of this project, the integration of speech and gestures into the S&R
platform will be addressed. Speech commands will be used to select categories of
objects to search and to request the automatic sketching of simple geometrical
objects. The system will also use gesture information to deform the geometrical
shapes, to sketch new lines and curves, and to connect them, so that complex object
outlines can be designed.
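The division of labor just described (speech creates or selects primitives, gestures refine them) can be sketched as a tiny event dispatcher. The class name, command vocabulary, and data layout are all hypothetical; a real system would sit on top of speech-recognition and hand-tracking front ends:

```python
class SketchCanvas:
    """Hypothetical sketch surface fusing speech and gesture events."""

    def __init__(self):
        self.primitives = []   # each primitive: {"shape": str, "points": list}

    def handle_speech(self, command):
        # Speech requests the automatic sketching of a simple primitive,
        # e.g. the spoken command "draw circle".
        if command.startswith("draw "):
            self.primitives.append({"shape": command[5:], "points": []})

    def handle_gesture(self, point):
        # Gesture positions deform/extend the outline of the most
        # recently created primitive.
        if self.primitives:
            self.primitives[-1]["points"].append(point)

canvas = SketchCanvas()
canvas.handle_speech("draw circle")   # spoken request for a primitive
canvas.handle_gesture((0.2, 0.5))     # tracked hand positions refine it
canvas.handle_gesture((0.4, 0.6))
print(canvas.primitives[0]["shape"], len(canvas.primitives[0]["points"]))
```

The point of the split is that each modality does what it is best at: speech names discrete categories and primitives, while continuous hand motion supplies the geometry that would be tedious to describe verbally.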
In addition to its research part, the project also has an application part, since the
final test bed of the system will be a virtual assembly application.
In particular, the objective