This is an expanded, updated, annotated and otherwise messed-around-with version of a paper presented at Smart Sketches 2004. I like to think it's also a good introduction to the state of the art of interpreting line drawings.
Convert Sketch to Line Drawing
Frontal Geometry:
Beautification of Solid Models
If your interest is in the history of line drawing interpretation rather than the current state of the art, [Company et al] is a good place to start.
Can computers interpret line drawings of engineering objects? In principle, they cannot: any line drawing is the 2D representation of an infinite number of possible 3D objects. Fortunately, a counter-argument suggests that computers should be able to interpret line drawings. Human engineers use line drawings to communicate shape in the clear expectation that the recipient will interpret the drawing in the way the originator intended. It is believed [Lipson, Varley] (and considerable anecdotal evidence supports) that human interpretation of line drawings is a skill which can be learned. If such skills could be translated into algorithms, computers could understand line drawings.
There are good reasons why we want computers to interpret line drawings. Studies such as [Jenkins] have shown that it is common practice for design engineers to sketch ideas on paper before entering them into a CAD package. Clearly, time and effort could be saved if a computer could interpret the engineer's initial concept drawings as solid models. Furthermore, if this conversion could be done within a second or two, it would give helpful feedback, further enhancing the designer's creativity [Grimstead].
The key problem is to produce a model of the
3D object the engineer would regard as the most reasonable
interpretation of the 2D drawing, and to do so quickly.
While there are infinitely many objects which could result in drawings
corresponding to (for example) the following two drawings
(adapted from [Yankee]), in
practice, an engineer would be in little doubt as to which was the
correct interpretation.
For this reason, the problem is as much heuristic as geometric:
it is not merely to find a geometrically-realisable solid which corresponds
to the drawing, but to find the one which corresponds to the engineer's
expectations.
We suggest the following fully automatic approach, requiring no user
intervention; our implementation verifies its utility for many drawings of
polyhedral objects.
(Elsewhere [Varley et al],
I have outlined an approach for interpreting certain drawings of curved
objects with minimal user intervention. Much of the work on curved objects
was done by my colleagues at the
Suzuki Laboratory.)
A solid model of a 3D object describes the topology and geometry of its faces, edges and vertices. Topology records connectivity between e.g. vertices and edges; geometry gives shape and positions e.g. the spatial coordinates of vertices.
A natural line drawing [Sugihara] is a 2D drawing which represents the object as viewed from some viewpoint, and comprises lines (corresponding to visible or partially-visible edges) and junctions (where lines meet---most, but not all, junctions correspond to visible vertices of the object). Loops of lines and junctions form regions, which correspond to visible or partially-visible faces of the object. Note the careful distinction between 2D ideas (drawings, regions, lines, junctions) and 3D ideas (objects, faces, edges, vertices).
A frontal geometry is an intermediate stage between 2D drawing and 3D object (and thus is sometimes called "2½D"). In a frontal geometry, everything visible in the natural line drawing is given a position in 3D space, but the occluded part of the object, not visible from the chosen viewpoint, is not present.
A polyhedron is trihedral if three faces meet at each vertex. It is extended trihedral [Parodi et al] if three planes meet at each vertex (there may be four or more faces if some are coplanar). It is tetrahedral if no more than four faces meet at any vertex. It is a normalon if all edges and face normals are aligned with one of three main perpendicular axes.
Junctions of different shapes are identified by letter: junctions where two lines meet are L-junctions, junctions of three lines may be T-junctions, W-junctions or Y-junctions, and junctions of four lines may be K-junctions, M-junctions or X-junctions. Vertex shapes follow a similar convention: for example, when all four edges of a K-vertex are visible, the drawing has four lines meeting at a K-junction.
When reconstructing an object from a drawing, we take the correct object to be the one which a human would decide to be the most plausible interpretation of the drawing.
Convert Sketch to Line Drawing
For drawings of polyhedral objects, we believe it to be most convenient for the designer to input straight lines directly, and our own prototype system, RIBALD, includes such an interface. However, it could be argued that freehand sketching is more "intuitive", corresponding to a familiar interface: pen and paper. Several systems exist which are capable of converting freehand sketches into natural line drawings - see, for example, [Zeleznik et al], [Mitani], and [Sezgin and Stahovich]. More recently, [Shesh and Chen] have demonstrated a system, SMARTPAPER, which as well as converting freehand sketches into wireframe drawings, can also inflate simple wireframes.
Which Lines are Convex/Concave?
Line labelling is the process of determining whether each line in the drawing
represents a convex, a concave, or an occluding edge.
For drawings of trihedral objects with no hole loops,
the line labelling problem was essentially solved by
[Huffman] and
[Clowes], who elaborated the
catalogue of valid trihedral junction labels.
This turns line labelling into a discrete constraint satisfaction
problem with 1-node constraints that each
junction must have a label in the catalogue and 2-node constraints that
each line must have the same label throughout its length.
The Clowes-Huffman catalogue for L-, W- and Y-junctions
is shown below:
+ indicates a convex edge,
- indicates a concave edge, and an arrow indicates an occluding edge
with the occluding face on the right-hand side of the arrow.
In trihedral objects, T-junctions (see next figure)
are always occluding.
For trihedral objects, algorithms for Clowes-Huffman line labelling, such as those of [Waltz] and [Kanatani], although theoretically taking O(2n) time, are usually O(n) in practice [Parodi and Torre]. It is believed that the time taken is more a function of the number of legal labellings than of the algorithm, and for trihedral objects there is often only a single legal labelling. For example, the first illustrative drawing above has only one valid labelling if the trihedral (Clowes-Huffman) catalogue is used.
Extending line labelling algorithms to non-trihedral normalons is fairly
straightforward [Parodi et al]. The
additional legal junction labels are those shown below. Note, however, that a
new problem has been introduced: the new T-junctions are not occluding.
Extension to the 4-hedral general case is less straightforward. The
catalogue of 4-hedral junction labels is much larger
[Varley and Martin] - for
example, the junctions below are just the possibilities for
W-junctions.
Because the 4-hedral catalogue is no longer sparse, there are often
many valid labellings for each drawing. Non-trihedral line labelling
using the previously mentioned algorithms is now O(2n) in
practice as well as in theory, and thus too slow. Furthermore, choosing the
best labelling from the valid ones is not straightforward either,
although there are heuristics which can help
(see [Varley and Martin]).
Instead, an alternative labelling method is to use a relaxation algorithm. Label probabilities are maintained for each line and each junction; these probabilities are iteratively updated. If a probability falls to 0, that label is removed; if a probability reaches 1, that label is chosen and all other labels are removed. In practice, this method is much faster - labels which are possible but very unlikely are removed quickly by relaxation, whereas they are not removed at all by combinatorial algorithms. However, relaxation methods are less reliable (the heuristics developed for choosing between valid labellings when using combinatorial methods are reasonably effective). In test we performed on 535 line drawings [Varley], combinatorial labelling labelled 428 entirely correctly, whereas relaxation labelling only labelled 388 entirely correctly.
The most serious problem with either approach is that in treating line
labelling as a discrete constraint satisfaction problem, the geometry of the
drawing is not taken into account.
For example, the next two drawings are labelled the same.
The problems created by ignoring geometry become much worse in drawings with
several non-trihedral junctions (see
[Varley et al]), and for these,
other methods are required.
A new approach to labelling outlined in that paper and subsequently developed further
[Varley et al]
makes use of an idea previously proposed for inflation
[Lamb and Bandopadhay]:
On its own, this method does not work well: e.g. it is difficult to specify a threshold distance d between two faces such that a distance greater than d corresponds to a step, and hence an occluding line, while if the distance is less than d the planes meet and the line is convex or concave. However, using the predictions made by this method as input to a relaxation labelling algorithm provides far better results than using arbitrary initialisation in the same algorithm.
This idea can be combined with many inflation ideas when producing a provisional geometry. Various variants of the idea have been considered (see [Varley et al]), particularly with reference to how the i,j,k axes are identified in a 2D drawing, without as yet any firm conclusions as to which is best overall. Another strength is that the idea uses the relaxation labeller to reject invalid labellings while collating predictions made by other approaches. This architecture allows additional approaches to labelling, such as Clowes-Huffman labelling for trihedral objects, to make a contribution in those cases where they are useful [Varley et al].
Even so, the current state-of-the-art only labels approximately 90% of non-boundary edges correctly in a representative sample of drawings of engineering objects [Varley et al].
Note that any approach which uses catalogue-based labelling can only label those drawings whose vertices are in a catalogue. Although [Huffman] gives a method for determining whether any given junction label is geometrically possible, I know of no fast algorithm based on this idea.
It seems unlikely that (for example) 7-hedral and 8-hedral extended
K-type vertices of the type shown here
will be catalogued in the near future.
In view of this, one must question whether line labelling is needed. Humans are skilled at interpreting line drawings, and introspection tells us that line labelling is not always a part of this process - it may even be that humans interpret the drawing first, and then (if necessary) determine which lines are convex, concave and occluding from the resulting mental model.
Our investigations indicate that line labelling is needed, at least at present. We are investigating interpreting line drawings without labelling, based on identifying aspects of drawings which humans are known or believed to see quickly, such as line parallelism [Lipson and Shpitalni], cubic corners [Perkins] and major axis alignment [Lamb and Bandopadhay]. Current results are disappointing. Better frontal geometry can be obtained if junction labels are available. More importantly, the frontal geometry is topologically unsatisfactory. Distinguishing occluding from non-occluding T-junctions without labelling information is unreliable, and as a result, determination of hidden topology is unlikely to be successful.
Determining which lines in a drawing are intended to be
parallel in 3D is surprisingly difficult.
It is, for example, obvious to a human which lines in the two drawings below are
intended to be parallel and which are not, but determining this algorithmically
presents problems.
[Sugihara] attempted to define this problem away by using a strict definition of the general position rule: the user must choose a viewpoint such that if lines are parallel in 2D, the corresponding edges in 3D must be parallel. This makes no allowance for the small drawing errors which inevitably arise in a practical system.
[Grimstead]'s bucketing approach,
grouping lines with similar orientations, works well for many drawings, but
fails for both drawings above.
My own bundling approach [Varley],
although somewhat more reliable, fares no better with these two drawings.
The basic idea used in bundling is that edges are parallel if they look
parallel unless it can be deduced from other information that they cannot
be parallel.
The latter is problematic for two reasons.
Firstly, if other information means labelling, identification of parallel
lines must occur after labelling, limiting the system organisation for computing
frontal geometry.
Secondly, to cover increasingly rare exceptional cases, we must add extra, ever
more complex, rules for deducing which lines may or may not be parallel. This is
tedious and rapidly reaches the point of diminishing returns.
For example, a rule which can deduce that the accidental coincidence in the next
figure should not result in parallel lines would be both complicated to
implement and of no use in many other cases.
Furthermore, there are also cases where it is far from clear even to humans
which edges should be parallel in 3D. The next drawing is a case in point:
should the roof of the car be parallel to the bonnet or the bodywork?
And what about the bottom edge?
In view of these problems, more recent approaches to frontal geometry (such as [Varley et al]) simply ignore the possibility that some lines which appear parallel in 2D cannot in fact be parallel in 3D. Initially, it is assumed that they are parallel; this information is then re-checked after inflation.
Inflation is the process of converting a flat 2D drawing into 2½D by assigning z-coordinates (depth coordinates) to each vertex, producing a frontal geometry. The approach taken here is the simplest: we use compliance functions [Lipson and Shpitalni] to generate equations linear in vertex depth coordinates, and solve the resulting linear least squares problem. Many compliance functions can be translated into linear equations in z-coordinates. Of these, the most useful are:
Cubic Corners [Perkins], sometimes
called corner orthogonality, assumes that a W-junction or
Y-junction corresponds to a vertex at which three orthogonal faces meet.
A linear equation relates depth coordinates
zV and zA to angles F and G.
[Nakajima] reports successful creation of
frontal geometry solely by using a compliance function similar to corner
orthogonality, albeit with a limited set of test drawings in which orthogonality
predominates.
Line Parallelism uses two edges assumed to be parallel in 3D. The linear equation relates the four z-coordinates of the vertices at either end of the two edges. Line parallelism is not, by itself, inflationary: there is a trivial solution (z=0 for all vertices).
Vertex Coplanarity uses four vertices assumed to be coplanar. The linear
equation relating their z-coordinates is easily obtained from 2D
geometry. Vertex coplanarity is also not, by itself, inflationary, having the
trivial solution z=0 for all vertices.
General use of four-vertex coplanarity is not recommended
([Lipson] notes that if three vertices on
a face are collinear, four-vertex coplanarity does not guarantee a planar face).
However, it is invaluable for cases like the following drawings, to link
vertices on inner and outer face loops: without it the linear system of depth
equations would be disjoint, with infinitely many solutions.
[Lipson and Shpitalni] list the above and several other compliance functions; the following is my own invention.
Junction-Label Pairs [Varley]
assume that pairs of junctions with identified labels have the same depth
implications they would have in the simplest possible drawing containing such a
pair.
An equation is generated relating the vertex depths at each end of the line
based on the junction labels of those vertices.
For example, see the next figure: this pair of junction labels can be found
in an isometric drawing of a cube, and the implication is that the Y-junction is nearer
to the viewer than the W-junction, with the ratio of 2D line length to
depth change being sqrt(2):1.
Note that two of the most successful compliance functions, line parallelism and
junction line pairs, require input information (parallel lines and line labels
respectively) which, as we have seen, cannot always reliably be obtained.
However, when this input information is both available and correct,
inflation using the above compliance functions is the most
reliable of the stages of processing described in this paper. Although it
occasionally fails to determine correctly which end of a line should be
nearer the viewer, such failures arise in cases like the one in the next
drawing where a human would also have difficulty. The only
systematic case where using a linear system of compliance
functions fails is for Platonic and Archimedean solids, but a known
special-case method ([Marill]'s MSDA)
is successful for these.
In order to make the frontal geometry process more robust when the input information (especially line labelling) is incorrect, we have experimented with two approaches to inflation which do without some or all of this information.
The first is the "preliminary inflation" described above: find the i,j,k axes in the drawing, inflate the object in i,j,k space, then determine the transformation between i,j,k and x,y,z spaces. This requires parallel line information (to group lines along the i,j,k axes). Where such information is misleading, the quality of inflation is unreliable, but this is not always a problem. For example, the left-hand problem drawing is labelled correctly despite incorrect parallel line information. Once (i) a drawing has been labelled correctly and (ii) there is reason to suppose that the parallel line information is unreliable, better-established inflation methods can be used to refine the frontal geometry. However, the right-hand problem drawing is one of those which is not labelled correctly, precisely because of the misleading parallel line information.
A second promising approach, needing further work, attempts to emulate what is known or hypothesised about human perception of line drawings. It allocates merit figures to possible facts about the drawing; these, and the geometry which they imply, are iteratively refined using relaxation:
As yet, further work is needed on this approach, but I believe that it holds promise.
A third, simpler, approach assumes that numerically correct geometry is not required at this early stage of processing, and identifying relative depths of neighbouring vertices is sufficient. [Schweikardt and Gross]'s work, although limited to objects which can be labelled using the Clowes-Huffman catalogue, and not extending well to non-normalons, suggests another possible way forward.
Ideally, one method should work for all drawings of polyhedral objects; identification of special cases should not be necessary. However, the state-of-the-art is well short of this ideal - in practice it is useful to identify certain frequent properties of drawings and objects. Identification of planes of mirror symmetry is particularly useful. Knowledge of such a symmetry can help both to construct hidden topology and to beautify the resulting geometry Identification of centres of rotational symmetry is less useful [Varley], but similar methods could be applied.
The technique adopted is straightforward: for each possible bisector of each face, create a candidate plane of mirror symmetry, attempt to propagate the mirror symmetry across the entire visible part of the object, and assess the results using the criteria of (i) to what extent the propagation attempt succeeded, (ii) whether there is anything not visible which should be visible if the plane of mirror symmetry were a genuine property of the object, and (iii) how well the frontal geometry corresponds to the predicted mirror symmetry.
Classification of commonly-occurring types of objects (examples include extrusions, normalons, and trihedral objects) is also useful~[Varley], as will be seen in Section~\ref{TOPOLOGY.
One useful combination of symmetry and classification is quite common in engineering practice (see, for example, the two drawings in the Introduction): a semi-normalon (where many, but not all, edges and face normals are aligned with the major axes) also having a dominant plane of mirror symmetry aligned with one of the object's major axes [Varley]. The notable advantage of this classification is that during beautification it provides constraints on the non-axis-aligned edges and faces.
I recommend that symmetry detection and classification should be performed after creation of the frontal geometry. Detecting candidate symmetries without line labels is unreliable, and assessing candidate symmetries clearly benefits from the 3D information provided by inflation. The issue is less clear for classification. Some classifications (such whether the object is a normalon) can be done directly from the drawing, without creating the frontal geometry first. However, others cannot, so for simplicity it is preferable to classify the object after creating its frontal geometry.
Once the frontal geometry has been determined, the next stage of processing is to add the hidden topology. The method is essentially that presented in [Varley et al]: firstly, add extra edges to complete the wireframe, and then add faces to the wireframe to compete the object, as follows:
While the wireframe is incomplete:
The process of completing the wireframe topology varies in difficulty according to the type of object drawn. I illustrate two special-case object classes, extrusions and normalons, and the general case. In some cases (e.g. if the object is symmetrical or includes a recognised feature), more than one vertex can be added in one iteration, as described in [Varley]. Such cases increase both the speed and reliability of the process of completing the wireframe.
Completing the topology of extrusions from a known front end cap is
straightforward. The illustration shows a drawing and the corresponding
completed extrusion wireframe.
Knowing that the object is a normalon simplifies reconstruction of the
wireframe, since when hypothesised edges are projected along axes,
there is usually only one possibility from any particular incomplete
vertex. The next illustration shows a drawing of a normalon
(adapted from [Yankee]) and the
corresponding completed wireframe.
Similarly, if the object is trihedral, there can be at most one new edge from
each incomplete vertex, simplifying reconstruction of the correct wireframe.
The next illustration shows a drawing of a trihedral object
(again adapted from [Yankee])
and the corresponding completed wireframe.
However, in the general case, where the object is neither a normalon nor
trihedral, there is the significant difference that hypothesised edges may be
projected in any direction parallel to an existing edge. Even after eliminating
edges which would be visible, there may be several possibilities at any
given incomplete vertex. The large number of possible options
rapidly becomes confusing and it is easy to choose an incorrect
crossing-point at an early stage. Although such errors can sometimes be
rectified by backtracking, the more common result is a valid but unwanted
wireframe. Only very simple drawings can be processed reliably.
The next illustration shows a general-case object and the corresponding
completed wireframe; simple as it looks, this represents the limit of the
current state of the art.
One particular problem, a specific consequence of the approach of completing the wireframe before faces, is that there is no assurance that the local environments of either end of a new edge match. It may happen that a sector around a new edge is solid at one end and empty at the other. This is perhaps the single most frequent cause of failure at present, and is especially problematic in that the resulting completed wireframe can appear correct. We aim to investigate faster and more reliable ways of determining the correct hidden topology of an object, starting with approaches aimed at correcting this conflicting local environment problem.
Adding additional faces to the completed wireframe topology for which the frontal geometry is already known is straightforward. I use repeated applications of Dijkstra's Algorithm to find the best loop of unallocated half-edges for each added face, where the merit for a loop of half-edges is based both on the number of half-edges required (the fewer, the better) and their geometry (the closer to coplanar, the better). I have not known this approach to fail when the input is a valid wireframe (which, as noted above, is not always the case).
Note, however, that this is not necessarily a full solution to the problem of adding faces to a wireframe, since I always start with at least one known face in addition to the wireframe.
Here are some completed wireframes with faces filled in.
Beautification of Solid Models
As can be seen from the illustrations in the previous section, even when topologically correct, the solid models produced may have (possibly large) geometric imperfections. They require beautification. More formally, given a topologically-correct object and certain symmetry and regularity hypotheses, we wish to translate these hypotheses into constraints, and update the object geometry so that it maximises some merit function based on the quantity and quality of constraints enforced.
In order to make this problem more tractable, we decompose it into determination of face normals and determination of face distances from the origin; once faces are known, vertex coordinates may be determined by intersection. The rationale for this partitioning [Kumar and Yu] is that changing face normals can destroy satisfied distance constraints, but changing face distances cannot destroy satisfied normal constraints.
However, there are theoretical doubts about this subdivision, related to the resolvable representation problem [Sugihara] of finding a resolution sequence in which information can be fixed while guaranteeing that no previous information is contradicted. For example, determining face equations first, and calculating vertex coordinates from them, is a satisfactory resolution sequence for many polyhedra, including all trihedral polyhedra. Similarly, fixing vertex coordinates and calculating face planes from them is a satisfactory resolution sequence for deltahedra and triangulated mesh models. [Sugihara] proved that:
Thus, there are two resolvable representation issues:
Currently, neither problem has been solved satisfactorily.
Thus, although there are objects which have resolution sequences, but for which determining face normals, and then face distances, and finally vertex coordinates, is not a satisfactory resolution sequence, the frequency of occurrence of such objects has yet to be determined. If low, the pragmatic advantages of such an approach are perhaps more important than its theoretical inadequacy.
My overall beautification algorithm is [Varley]:
I use numerical methods for constraint processing, as this seems to be the approach which holds most promise. Alternatives, although unfashionable for various reasons, may become more viable as the state of the art develops: see [Lipson and Shpitalni] for a discussion.
In addition to the resolvable representation problem, there is a further theoretical doubt about this approach. When attempting to satisfy additional constraints, it is necessary to know how many degrees of freedom are left in the system once previous, already-accepted, constraints are enforced. This apparently-simple problem appears to have no fully-reliable solution. One solution proposed by [Li et al] perturbs the variables slightly and detects which constraints have been violated. However, this is slow, and not necessarily theoretically sound either (for example, a constraint relating face distances A and B may allow them to move together, but not independently of one another).
Two differing approaches have been tried to the problem of finding whether or not a geometry exists which satisfies a new constraint while continuing to satisfy previously-accepted constraints.
The first encodes constraint satisfaction as an error function (the lower the value, the better-satisfied the constraint), and face normals and/or face distances as variables, using a downhill optimiser to minimise the error function [Ge et al, Langbein et al, Varley]. Such algorithms use a `greedy' approach, in which the constraint with the highest figure of merit is always accepted and enforced, and then for each other constraint, in descending order of merit: if the constraint is already satisfied numerically by the object, it is accepted; otherwise we attempt to adjust the variables numerically to accommodate the new constraint as well as all previous accepted constraints. If this succeeds, the new variable values are stored and the constraint is accepted; otherwise, the constraint is rejected and the previous variable settings are restored. The use of a downhill optimiser guarantees that progress is always downhill. However, progress can often be slow, and there is the possibility (unlikely in practice if the initial geometry is reasonable [Varley]) that the downhill optimiser will be trapped in a local minimum.
To speed up the algorithm, a logical reasoning stage may be used, capable of directly accepting or rejecting constraints which duplicate or contradict previously-accepted constraints. On one hand, I have had difficulties with this approach [Varley], in view of (i) the unsolved resolvable representation and degrees-of-freedom problems, and (ii) the difficulty of reasoning whether constraints of different types duplicate or contradict one another. Such reasoning becomes harder as extra constraint types are added to a system: even simple 2D systems allowing only distance constraints require 30 inference rules [Suzuki et al]; the number of inference rules required for 3D systems allowing both angular and distance constraints is certainly much greater. On the other hand, [Langbein et al] report considerable success with a logical reasoning approach used in a reverse engineering system. Logical reasoning to accept or reject constraints is clearly faster than an entirely numerical approach, and represents the current state of the art.
An alternative which may warrant further study is to replace the downhill optimisation step above by a faster approach which iteratively updates the variables (face normals and/or distances) geometrically, using their current values and geometric inferences based on the constraint set under consideration to predict next-iteration values. While it is harder to guarantee that such updates are converging towards a solution, by making direct use of geometric information they have the potential to be faster.
There has been recent progress in solving systems of geometric constraints using Cayley-Menger determinants (such as [Porta et al]), but such work has concentrated on problems where all constraints can be represented as distance constraints. It is not clear that it can be extended to all constraints relevant to sketching, in particular "macro"-constraints such as enforcement of a plane of mirror symmetry.
Apart from numerical methods, several other possible techniques may be relevant, too many to list here; two with unexplored potential are described. Firstly, although the simple constructivist approach of [Suzuki et al] cannot be recommended---the chains of reasoning required form so-called "loops" - more recent graph-based methods such as [Fudos and Hoffmann] may make it possible to disentangle these loops. Auxiliary construction methods such as those of Lipson et al and Benko et al are still experimental - there is, as yet, no reliable algorithm for deciding which auxiliary constructs to add - but have the merit that seem to correspond to the way a human would approach the same problem.
So, can computers interpret line drawings? Yes, to some extent, but they are nowhere near as skilful as human engineers. On the one hand, there is a whole category of line drawings (extrusions) which computers can interpret flawlessly. On the other hand, there is another category (those with extended K vertices) which, as yet, cannot be interpreted at all. In between, the proportion of objects which can be interpreted grows steadily as existing methods are refined and extended.
However, if there is to be a breakthrough, rather than incremental improvement, we must recall our aims: to duplicate the ability of humans to interpret line drawings. By and large, those aspects of this skill which have been identified have already been translated into algorithms. The fact that these algorithms are not enough makes it clear that there are more aspects of this skill still waiting to be identified.
The support of the Computer Science Department of Cardiff University and Japan Society for the Promotion of Science Fellowship number P03717, providing the funding for my own research described here, is acknowledged with gratitude.
H.G. Barrow and J.M. Tenenbaum, Interpreting Line Drawings as Three-Dimensional Surfaces, Artificial Intelligence 17, 75-116, 1981.
L. Bauer, Elimination with Weighted Row Combinations for Solving Linear Equations and Least Squares Problems, Handbook for Automatic Computation II, Linear Algebra, eds. J.H.Wilkinson and C.Reinsch, Springer-Verlag 1971.
P. Benko, G. Kos, T. Varady, L. Andor and R.R. Martin, Constrained Fitting in Reverse Engineering, Computer Aided Geometric Design, 19, 173-205, 2002.
M.B. Clowes, On Seeing Things, Artificial Intelligence, 2, 79-116, 1970.
P. Company, A. Piquer, and M. Contero, On the Evolution of Geometrical Reconstruction as a Core Technology to Sketch-Based Modeling, Sketch-Based Interfaces and Modeling: Eurographics Symposium Proceedings, 97-106, 2004.
E.W. Dijkstra, A Note on Two Problems in Connexion with Graphs, Numerische Mathematik I, 269-271, 1959.
S.W. Draper, Reasoning about depth in line-drawing interpretation, PhD Thesis, Sussex University, 1980.
I. Fudos and C.M. Hoffmann, A Graph-Constructive Approach to Solving Systems of Geometric Constraints, ACM Transactions on Graphics, 16(2), 179-216, 1997.
J.X.Ge, S.C.Chou and X.S.Gao, Geometric Constraint Satisfaction using Optimization Methods, Computer-Aided Design 31(14), 867-879, 1999.
I.J. Grimstead, Interactive Sketch Input of Boundary Representation Solid Models, PhD Thesis, Cardiff University, 1997.
D.A. Huffman, Impossible Objects as Nonsense Sentences, Machine Intelligence 6, 295-323, New York American Elsevier, 1971.
D.A. Huffman, Realizable Configurations of Lines in Pictures of Polyhedra, In eds. E.W.Elcock and D.Michie, Machine Intelligence, Vol. 8, 493-509, Ellis Horwood, 1977.
T. Igarashi, S. Matsuoka and H. Tanaka, Teddy: A Sketching Interface for 3D Freeform Design, in Proceedings of ACM SIGGRAPH '99, ACM Press, 409-416, 1999.
D.L. Jenkins, The Automatic Interpretation of Two-Dimensional Freehand Sketches, PhD Thesis, University of Wales College of Cardiff, 1992.
K. Kanatani, Group-Theoretical Methods in Image Understanding, Number 20 in Springer Series in Information Sciences, Springer-Verlag, 1990.
A.V. Kumar and L. Yu, Sequential Constraint Imposition for Dimension-Driven Solid Models, Computer-Aided Design 33, 475-486, 2001.
D. Lamb and A. Bandopadhay, Interpreting a 3D Object From a Rough 2D Line Drawing, In ed. A.E.Kaufman, Proceedings of the First IEEE Conference on Visualization '90, 59-66, IEEE, 1990.
F.C. Langbein, A.D. Marshall and R.R. Martin, Numerical Methods for Beautification of Reverse Engineered Geometric Models, Proc. GMP 2002.
F.C. Langbein, A.D. Marshall and R.R. Martin, Choosing Consistent Constraints for Beautification of Reverse Engineered Geometric Models, Computer Aided Design, 36(3), 261-278, 2004.
Y.G. Leclerc and M.A. Fischler, An Optimization-based Approach to the Interpretation of Single Line Drawings as 3D Wire Frames, International Journal of Computer Vision, 9(2) 113-136, 1992.
Y.T. Li, S.M. Hu and J.G. Sun, On the Numerical Redundancies of Geometric Constraint Systems, in ed. H. Suzuki, A. Rockwood and L.P. Kobbelt, Ninth Pacific Conference on Computer Graphics and Applications (PG'01), 118--123, IEEE Comp Soc Press, 2001.
H. Lipson and M. Shpitalni, Optimization-based Reconstruction of a 3D Object from a Single Freehand Line Drawing, Computer-Aided Design 28(8), 651-663, 1996.
H. Lipson, Computer Aided 3D Sketching for Conceptual Design, PhD Thesis, Technion-Israel Institute for Technology, Haifa, 1998.
H. Lipson, F. Kimura and M. Shpitalni, Solving Geometric Constraints by Auxiliary Constructions, 1999. Previously here, but now vanished.
H. Lipson and M. Shpitalni, Correlation-Based Reconstruction of a 3D Object from a Single Freehand Sketch, 2002 AAAI Spring Symposium on Sketch Understanding, 99-104, AAAI Press, 2002.
J. Malik, Interpreting Line Drawings of Curved Objects, International Journal of Computer Vision 1, 73-103, 1987.
T. Marill, Emulating the Human Interpretation of Line-Drawings as Three-Dimensional Objects, International Journal of Computer Vision 6(2) 147-161, 1991.
J. Mitani, A Study of 3D Sketching used for Aiding Engineering Product Design, MSc Thesis, The University of Tokyo, 2000. In Japanese.
C. Nakajima, An Inference Method of Three-dimensional Shape by Depth Perception from One Image, IPSJ Journal 40(8) (Special Issue on Meeting on Image Recognition and Understanding), 3179-3187, 1999. In Japanese.
P. Parodi, R.Lancewicki, A. Vijh and J.K.Tsotsos, Empirically-Derived Estimates of the Complexity of Labeling Line Drawings of Polyhedral Scenes, Artificial Intelligence 105, 47-75, 1998.
P. Parodi and V. Torre, On the Complexity of Labeling Perspective Projections of Polyhedral Scenes, Artificial Intelligence 70, 239-276, 1994.
D.N. Perkins, Cubic Corners, Quarterly Progress Report 89, 207--214, MIT Research Laboratory of Electronics, 1968.
J.M. Porta, L. Ros, F. Thomas and C. Torras, A Branch-and-Prune Algorithm for Solving Systems of Distance Constraints, Proceedings of the 2003 IEEE International Conference on Robotics and Automation, 2003.
M.M. Samuel, A.A.G. Requicha and S.A. Elkind, Methodology and Results of an Industrial Parts Survey. Technical Memorandum 21, Production Automation Project, University of Rochester NY USA, 1976.
E. Schweikardt and M.D. Gross, Digital Clay: Deriving Digital Models from Freehand Sketches, Automation in Construction 9, 107-115, 2000.
T.M. Sezgin and T. Stahovich, Sketch Based Interfaces: Early Processing for Sketch Understanding, 2001.
A. Shesh and B. Chen, SMARTPAPER: An Interactive and User Friendly Sketching System, Computer Graphics Forum 23(3), 301-310, 2004.
K. Sugihara, Machine Interpretation of Line Drawings, MIT Press, 1986.
K. Sugihara, Resolvable Representations of Polyhedra, Discrete and Computational Geometry 21(2), 243-255, 1999.
H. Suzuki, H. Ando and F. Kimura, Geometric Constraints and Reasoning for Geometrical CAD Systems, Computing and Graphics 14(2), 211-224, 1990.
T. Varady and R.R. Martin, Reverse Engineering, in: The Handbook of Computer Aided Design, eds. G. Farin, J. Hoschek, M.-S. Kim, 651--681, Elsevier, 2002.
P.A.C. Varley, H. Suzuki, J. Mitani and R.R. Martin, Interpretation of Single Sketch Input for Mesh and Solid Models, International Journal of Shape Modelling 6(2), 207-241, 2000.
P.A.C. Varley and R.R. Martin, The Junction Catalogue for Labelling Line Drawings of Polyhedra with Tetrahedral Vertices, International Journal of Shape Modeling 7 (1) 23-44, 2001.
P.A.C. Varley and R.R. Martin, Estimating Depth from Line Drawings, in ed. K.Lee and N.Patrikalakis, Proc. 7th ACM Symposium on Solid Modeling and Applications, SM02, 180-191, ACM Press, 2002.
P.A.C. Varley, Automatic Creation of Boundary-Representation Models from Single Line Drawings, PhD Thesis, Cardiff University, 2003.
P.A.C. Varley, Sketches of Polyhedral Solids, 2003.
P.A.C. Varley and R.R. Martin, Deterministic and Probabilistic Approaches to Labelling Line Drawings of Engineering Objects, International Journal of Shape Modelling 9(1), 79-99, 2003.
P.A.C. Varley, H. Suzuki and R.R. Martin, Interpreting Line Drawings of Objects with K-Junctions, Proc. Geometric Modeling and Processing 2004, Eds. S.-M. Hu, H. Pottmann, 249--358, 2004.
P.A.C. Varley, R.R. Martin and H. Suzuki, Making the Most of Using Depth Reasoning to Label Line Drawings of Engineering Objects, in ed. G. Elber, N. Patrikalakis and P. Brunet, 9th ACM Symposium on Solid Modeling and Applications SM'04, 191--202, 2004.
P.A.C. Varley, R.R. Martin and H. Suzuki, Can Machines Interpret Line Drawings?, in ed. J.F. Hughes and J.A. Jorge, Sketch-Based Interfaces and Modeling: Eurographics Symposium Proceedings, 107-116, 2004.
P.A.C. Varley, Y. Takahashi, J. Mitani and H. Suzuki, A Two-Stage Approach for Interpreting Line Drawings of Curved Objects, in ed. J.F. Hughes and J.A. Jorge, Sketch-Based Interfaces and Modeling: Eurographics Symposium Proceedings, 117-126, 2004.
T. Vetter and T. Poggio, Symmetric 3D Objects are an Easy Case for 2D Object Recognition, Human Symmetry Perception, ed C.W.Tyler, 349-359, 1996.
D.M. Waltz, Generating Semantic Descriptions from Drawings of Scenes with Shadows, Tech Rept AI-TR-271, M.I.T., Cambridge USA, 1972.
H.W. Yankee, Engineering Graphics, Prindle, Weber and Schmidt, 1985.
R.C. Zeleznik, K.P. Herndon and J.F. Hughes, SKETCH: An Interface for Sketching 3D Scenes, Proceedings of SIGGRAPH '96. 163-70, ACM Press, 1996.