Sketch recognition is a discipline which has gained an increasing
interest in the last 20 years. This is due to the appearance of new
devices such as PDA, Tablet PC's or digital pen \& paper protocols.
From the wide range of sketched documents we focus on those that
represent structured documents such as: architectural floor-plans,
engineering drawing, UML diagrams, etc. To recognize and understand
these kinds of documents, first we have to recognize the different
compounding symbols and then we have to identify the relations
between these elements. From the way that a sketch is captured,
there are two categories: on-line and off-line. On-line input modes
refer to draw directly on a PDA or a Tablet PC's while off-line
input modes refer to scan a previously drawn sketch.
This thesis is an overlapping of three different areas on Computer
Science: Pattern Recognition, Document Analysis and Human-Computer
Interaction. The aim of this thesis is to interpret sketched
documents independently on whether they are captured on-line or
off-line. For this reason, the proposed approach should contain the
following features. First, as we are working with sketches the
elements present in our input contain distortions. Second, as we
would work in on-line or off-line input modes, the order in the
input of the primitives is indifferent. Finally, the proposed method
should be applied in real scenarios, its response time must be slow.
To interpret a sketched document we propose a syntactic approach. A
syntactic approach is composed of two correlated components: a
grammar and a parser. The grammar allows describing the different
elements on the document as well as their relations. The parser,
given a document checks whether it belongs to the language generated
by the grammar or not. Thus, the grammar should be able to cope with
the distortions appearing on the instances of the elements.
Moreover, it would be necessary to define a symbol independently of
the order of their primitives. Concerning to the parser when
analyzing 2D sentences, it does not assume an order in the
primitives. Then, at each new primitive in the input, the parser
searches among the previous analyzed symbols candidates to produce a
Taking into account these features, we have proposed a grammar based
on Adjacency Grammars. This kind of grammars defines their
productions as a multiset of symbols rather than a list. This allows
describing a symbol without an order in their components. To cope
with distortion we have proposed a distortion model. This distortion
model is an attributed estimated over the constraints of the grammar
and passed through the productions. This measure gives an idea on
how far is the symbol from its ideal model. In addition to the
distortion on the constraints other distortions appear when working
with sketches. These distortions are: overtracing, overlapping, gaps
or spurious strokes. Some grammatical productions have been defined
to cope with these errors. Concerning the recognition, we have
proposed an incremental parser with an indexation mechanism.
Incremental parsers analyze the input symbol by symbol given a
response to the user when a primitive is analyzed. This makes
incremental parser suitable to work in on-line as well as off-line
input modes. The parser has been adapted with an indexation
mechanism based on a spatial division. This indexation mechanism
allows setting the primitives in the space and reducing the search
to a neighbourhood.
A third contribution is a grammatical inference algorithm. This
method given a set of symbols captures the production describing it.
In the field of formal languages, different approaches has been
proposed but in the graphical domain not so much work is done in
this field. The proposed method is able to capture the production
from a set of symbol although they are drawn in different order. A
matching step based on the Haussdorff distance and the Hungarian
method has been proposed to match the primitives of the different
symbols. In addition the proposed approach is able to capture the
variability in the parameters of the constraints.
From the experimental results, we may conclude that we have proposed
a robust approach to describe and recognize sketches. Moreover, the
addition of new symbols to the alphabet is not restricted to an
expert. Finally, the proposed approach has been used in two real
scenarios obtaining a good performance.
|Date of Award||18 Jun 2010|
|Supervisor||Josep Llados Canet (Director) & Gemma Sánchez Albaladejo (Director)|