Monday, August 12, 2013

Drawing Like a Child

Most non-photorealistic rendering techniques leave the proportions and projection of the representation essentially unchanged. Evidence from studies of drawings by children and by naive adults (untrained in making art) shows that the human process of making images works very differently from any of these techniques. As Martin Gardner writes, "[A] child wants...and is perhaps driven to invent, graphic equivalents for those categories that occupy her thought processes; and so it becomes natural for her to develop a formula, or prototypical schema, which can represent or stand for the full range of instances of this category." It is only with artistic training and education about the effects of perspective, tricks for seeing the scene as blobs of color rather than individual objects, and measuring and copying proportions that artists are able to reproduce images in true proportion, the way that is most easily realized by automatic computer techniques.

This figure shows 21 images of a particular street scene drawn by children ranging from age 3 to age 13. The oldest children (one of the eight-year-olds, one of the 10-year-olds, and both children 11 and older) attempted to capture the strong perspective on the street by drawing converging rather than parallel edges. The younger children's drawings all show the street as seen from directly above, yet none of them has ever seen this particular street from that perspective. Instead, the drawing process the children are engaged in seems to be something like the following:

  1. Look at the picture.
  2. Notice and recognize the most salient objects. (These may differ considerably from one observer to another. Johnny (age 4) apparently noticed the vehicle, the fence, and the sidewalk, while Maggie (also age 4) noticed the two figures, the tree, two buildings, and the sidewalk and road.)
  3. Use pre-learned techniques for representing these objects, introducing only small variations to match what is present in the scene. (For example, Elena (age 6) draws conventional stick figures, but extends the arm of one around the shoulder of the other to match what she believed she saw in the image.)

The younger children have learned such a technique for drawing a house but not for a building, so they use the house representation because it is conceptually the closest they have. All but one of the children age 5 or older also copied the road markings, perhaps because they had the ability to represent them more or less accurately.
Even in the drawings of the older children, traces of this method are still present. In the drawings by the 11- and 13-year-olds, attempts are made to capture the perspective on the buildings. Yet certain angles that ought to be acute in the perspective projection are instead drawn as right angles. This is likely because the artists knew from past experience with buildings that these angles must be 90 degrees, and this knowledge informed their drawing, forcing accuracy in other areas to be compromised as each artist tried to reconcile the two implicit perspectives. Daniel (age 10) spontaneously commented while drawing that he "didn't know how" to draw the cars from an angle.
In 1997, J. Willats performed similar experiments in which young children depicted a scene of a die. They often did this by including all sides of the die they could see within a single containing rectangle. He wrote, "it suggests that analogies between picture production by either photography or computer graphics and human picture production... are grossly oversimplified.... pictures can be derived from either object-centered descriptions or viewer-centered descriptions... the presence of characteristic anomalies...may provide the only available evidence about the nature of the production process."
Discussing these experiments, Fredo Durand wrote, "This might seem like a very odd example due to the lack of skill. In fact, this is a caricatural but paradigmatic demonstration of a very fundamental principle of depiction: Depiction is not about projecting a scene onto a picture, it is about mapping properties in the scene to properties in the picture. Projection happens to be a very powerful means to obtain relevant mappings, but it is not the only one, and it is not necessarily the best one." 

Automatic Generation of Child-like Visual Representation
A simplified model of the production processes that children employ in making drawings of a scene was easily implemented within the automated image comprehension framework. Using the scene parsing software, a description of the objects in the scene and some information about their spatial relationships are stored in a microtheory in the Cyc ontology. Each object category has an associated pre-stored schema for generating images of that category. (In this case, the schema is simply a stored bitmap, but a more realistic model would store it as a series of hand motions to create particular shapes with strokes of the pen and the spatial relationships of those shapes.) Using these schemas and the information about how the objects are distributed around the image relative to each other, a child-like representation is then generated. An example generated image and its associated input is shown in the figure below.
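
To make the pipeline above concrete, here is a minimal sketch in Python. It is not the actual implementation: the scene parser and the Cyc microtheory are replaced by a hand-written list of objects, the stored bitmaps are replaced by tiny character glyphs, and every name in it (SceneObject, SCHEMAS, CLOSEST, schema_for, render) is made up for illustration. The CLOSEST table is an added assumption that mimics the children's habit of drawing a house when they have no schema for a building.

```python
from dataclasses import dataclass


@dataclass
class SceneObject:
    category: str  # label from a (hypothetical) scene parser
    x: float       # horizontal centre, 0.0 = left edge, 1.0 = right edge
    y: float       # vertical centre, 0.0 = top edge, 1.0 = bottom edge


# Hypothetical stored schemas: character glyphs standing in for bitmaps.
SCHEMAS = {
    "person": [" o ", "/|\\", "/ \\"],
    "house": [" ^ ", "/_\\", "|_|"],
    "tree": [" @ ", "@@@", " | "],
}

# Assumed fallback table: a category with no schema borrows the
# conceptually closest one, as the younger children did.
CLOSEST = {"building": "house"}


def schema_for(category):
    """Return the glyph for a category, its closest stand-in, or None."""
    return SCHEMAS.get(category) or SCHEMAS.get(CLOSEST.get(category))


def render(objects, width=40, height=12):
    """Stamp each object's schema onto a blank canvas at its rough position."""
    canvas = [[" "] * width for _ in range(height)]
    for obj in objects:
        glyph = schema_for(obj.category)
        if glyph is None:
            continue  # nothing learned for this category, so it is left out
        gh, gw = len(glyph), len(glyph[0])
        # Centre the glyph on the object's position, clamped to the canvas.
        top = min(max(int(obj.y * height) - gh // 2, 0), height - gh)
        left = min(max(int(obj.x * width) - gw // 2, 0), width - gw)
        for r, row in enumerate(glyph):
            for c, ch in enumerate(row):
                if ch != " ":
                    canvas[top + r][left + c] = ch
    return ["".join(row) for row in canvas]


if __name__ == "__main__":
    scene = [
        SceneObject("building", 0.25, 0.25),
        SceneObject("person", 0.5, 0.5),
        SceneObject("car", 0.75, 0.75),  # no schema: silently omitted
    ]
    print("\n".join(render(scene)))
```

Note that the projection of the original image plays no role here: only the categories and their rough positions survive, which is the feature of the children's drawings this sketch tries to imitate.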

Although this model doesn't capture everything a child is doing as she creates a depiction of a scene, it does capture one important aspect of that process. The point here is not to create a new way of generating cute images, but to suggest a new approach to NPR that makes use of this important feature of human art. Consider the following goals of various artists:
  • A cartoonist may want to reduce the number of individuals in a crowd to simplify the depiction.
  • A painter may concentrate brushstrokes on the face of the focal character, while leaving the background depicted with broad strokes.
  • A painter who wants his charming landscapes to be hung above the couches in people's homes may leave out garbage blowing down the street and modern automobiles present in the scene.
  • A fantasy artist may wish to depict horses with wings.
Any NPR system that only considers the input image in terms of simple primitives such as contours and regions of color would be incapable of being directly extended to handle these kinds of problems. A system including image comprehension with a semantic knowledge base, however, could be extended to deal with any of these problems in a fairly straightforward way.
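
As a rough illustration of why a semantic description makes such extensions straightforward, the sketch below treats three of the goals above as simple transformations on a list of recognized objects, applied before anything is drawn. It is a hypothetical example, not part of the system described in this post: the SceneObject type, the function names, and the category labels (including "winged horse") are all invented for the purpose.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class SceneObject:
    category: str  # label assumed to come from a scene parser
    x: float       # rough horizontal position, 0.0 to 1.0
    y: float       # rough vertical position, 0.0 to 1.0


def thin_crowd(objects, category="person", keep_every=3):
    """Cartoonist: keep only every n-th member of a crowded category."""
    kept, seen = [], 0
    for obj in objects:
        if obj.category != category:
            kept.append(obj)
            continue
        if seen % keep_every == 0:
            kept.append(obj)
        seen += 1
    return kept


def omit(objects, unwanted):
    """Landscape painter: leave out categories that spoil the picture."""
    return [obj for obj in objects if obj.category not in unwanted]


def substitute(objects, mapping):
    """Fantasy artist: swap one category's schema for another's."""
    return [
        replace(obj, category=mapping.get(obj.category, obj.category))
        for obj in objects
    ]


if __name__ == "__main__":
    scene = [SceneObject("person", i / 10, 0.5) for i in range(6)]
    scene += [
        SceneObject("garbage", 0.3, 0.9),
        SceneObject("car", 0.7, 0.8),
        SceneObject("horse", 0.9, 0.6),
    ]
    edited = substitute(
        omit(thin_crowd(scene), {"garbage", "car"}),
        {"horse": "winged horse"},
    )
    for obj in edited:
        print(obj)
```

Each function is only a few lines long because it operates on categories rather than on contours or regions of color, which is the point of the comparison above.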

1 comment:

  1. For years now companies have found a competitive edge in oversaturating colors. This drives professional imaging guys nuts, since they strive for accuracy, and dislike the "cartoonish" look. But consumers love it.

    My personal opinion is that the oversaturation makes an image feel more real to us, since it bridges some of the gap between looking at a picture of a sunset and the real thing.

    So I think you're on the right track here. I'd love to see more examples of augmented sensing that presents images of things the way we experience them, rather than how the photons hit the sensor.