James Gurney (author and illustrator of Color and Light and Dinotopia) regularly explores the science behind various aspects of painting at his blog. Here he looks at a system that uses image parsing to choose from among various painting techniques in different parts of the same image.
Saturday, December 29, 2012
Sunday, November 11, 2012
visual sugar
This painting by Leonid Afremov has, a couple of times, appeared on the front page of the website Reddit. Reddit has a system for voting up submissions, so something that a lot of people like will rise to the top. It is relatively rare for any painting to be upvoted so highly, especially more than once.
From an artistic standpoint, the painting has little to offer. Everything that is good about this painting is good on the surface. It uses a variety of simple tricks to attract the viewer. It is the visual equivalent of McDonald's food: it tastes good because it has abundant fat and sugar, which have a direct connection to the pleasure centers in the brain. Here are a few of the tricks:
The colors are all saturated.
There is a rainbow progression from yellow at the center of the light, through orange, red, and purple to blue in the sky. This is a comforting pattern.
The lines of perspective and brighter center draw the eye to the couple, who are expressing love and comfort. (The artist doesn't really have a great feel for perspective, though. What is going on with that bench in the foreground?)
There is a cute dog.
The reflections in the puddles and canal make a kind of symmetry.
The palette-knife strokes allow pleasing contrasts between adjacent colors, and add a kind of visual texture to the image.
These are predictable responses of people to images. Good artists make use of the same techniques. Van Gogh's Starry Night, for example, uses some of the same techniques of contrast, saturation, and lines guiding the eye. I think that anyone, human or otherwise, learning to be an artist has to pass through the stage where they are learning these tricks. Eventually, they become tools that can be deployed as desired with an understanding of the effect they will have on the viewer, in order to get across a more subtle emotional message.
Wednesday, October 31, 2012
Machinamenta in Old English
Reading a little about the etymology in Lord of the Rings, I came across the word that was used in Old English to translate the Latin word "machinamenta." It was Orþanc; the letter that looks like a cross between p and b (called thorn) is pronounced 'th.' This meaning of ingenious devices and siege engines was the reason Tolkien picked the name Orthanc for the tower of Saruman (whose name similarly means 'cunning mind.')
Tuesday, October 23, 2012
Gödel and Leibniz
Gödel was fascinated by Leibniz's ideas, to the point that others felt he was obsessed: he checked out every book on Leibniz from his university library. He believed (correctly, I would say) that Leibniz's most important ideas (the characteristica universalis) had been nearly forgotten by society; but he also believed that this was due to a shadowy conspiracy meant to prevent the intellectual advancement of mankind. While one could make up a marvelous conspiracy theory about this, involving Newton, the Illuminati, the Invisible College, and so forth, it was more likely due to the fact that many of Leibniz's writings have never been published, and that Leibniz himself never completed the project.
At any rate, Gödel wanted to achieve Leibniz's dream of an exact, computational philosophy, able to come to provable conclusions. Gödel wrote, "There are systematic methods for the solution of all problems (also art, etc.)" Leibniz believed that the natural world arose out of a network of binary relations. This idea of a mathematical world underlying the world we see, a kind of Platonic realism, was appealing to Gödel as well, and he saw his work as pointing in that direction. Gödel, like Leibniz, believed that the study of mathematics could tell us ultimate truths about the nature of reality. Since, as he proved, it is impossible to prove certain true facts about the mathematical universe, those truths must exist, he thought, somewhere outside of proof.
Gödel's more mathematical ideas were very important to people like Alan Turing, Stanislaw Ulam, and John von Neumann. Gödel's famous proof of the incompleteness theorem needs to be able to make statements about mathematics using mathematics itself, and this required the invention of something very much like a programming language. Turing's key paper "On Computable Numbers, with an Application to the Entscheidungsproblem" uses the incompleteness theorem, proved five years before, to prove that it is impossible to decide algorithmically whether a given Turing machine will ever halt.
There are two main ideas I tried to get across in Machinamenta, both of which I laid out in the introduction. One is the idea of the kaleidoscope pattern, which I'm not going to go into here. The other is that the history of computers is not just the history of the development of mechanical math machines. There has also been, for a long, long time, a desire to make machines that can take ideas, and combine them with other ideas, to come up with new ideas. You see this in divination machines, which inspired Ramon Llull. Llull's own devices inspired Leibniz to develop a much more ambitious and realistic plan. This in turn was taken up by Babbage and Gödel, who were direct influences on the people who built the first electronic computers. The dream of machine intelligence was already fully present through this chain of influence at the birth of the computer.
Thursday, October 4, 2012
AI and children's drawings
I've been reading about attempts to write software that approximates the steps that children go through when they create drawings. This gets at some of the fundamental differences between how machines currently generate images in "artistic" styles (using brushstroke filters in Photoshop, for example) and how an artist paints. By trying to shortcut past the early representational stages, we have failed to capture some of the important things about what it means for a person to create a painting.
One of the best papers I've found is called Thoughtful Drawings: A Computational Model of the Cognitive Nature of Children’s Drawing. It describes a piece of software called Rose (Representation Of Spatial Experience). The author states, "Rose is not intended to be a model of any part of a child’s mind. Rose is a representation of personal ideas about just a few vital elements of the human experience of drawing."
Rose takes as input a 3D form, composed of triangular surfaces.
From this form, it recognizes certain parts that are joined together: the legs, the neck, and the tail are all joined to the body, and the head is joined to the neck. Each of these body parts is fitted with a cylinder.
Then the program attempts to draw the shape. It attempts to create closed curves (by moving a "pen" around with an imperfect control algorithm) whose length and width are determined by the proportions of the cylindrical body parts, and are connected in the same way.
This projection doesn't take into account perspective, viewpoint or occlusion: it simply copies the connectivity of the graph of how the body parts join together. This seems to me to be how children begin to draw. We could test this by giving children an unfamiliar shape and seeing if the drawings tend to follow this rule. I would also like to know when a child will use a closed curve and when they will simply use a line to represent a part.
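The connectivity-only drawing step described above can be illustrated with a small sketch. This is not Rose's code; the part table, the join angles, and the function names (`wobbly_closed_curve`, `draw_figure`) are all hypothetical, and the "imperfect control algorithm" is reduced here to random wobble on an ellipse. The only point it tries to show is that the layout follows the graph of joins and the cylinder proportions, with no perspective, viewpoint, or occlusion.

```python
import math
import random
from collections import deque

# Hypothetical part table: name -> (length, width) of the fitted cylinder.
PARTS = {
    "body": (4.0, 2.0), "neck": (1.5, 0.6), "head": (1.2, 1.0),
    "tail": (2.0, 0.4), "front_leg": (2.0, 0.5), "back_leg": (2.0, 0.5),
}
# Hypothetical joins: (parent, child, page angle in degrees at which the child leaves the parent).
JOINS = [
    ("body", "neck", 45), ("neck", "head", 20), ("body", "tail", 170),
    ("body", "front_leg", -70), ("body", "back_leg", -110),
]

def wobbly_closed_curve(cx, cy, length, width, angle, rng, wobble=0.05, steps=24):
    """Trace an ellipse with small random radius errors, standing in for an imperfect pen."""
    points = []
    for i in range(steps):
        t = 2 * math.pi * i / steps
        r = 1.0 + rng.uniform(-wobble, wobble)
        x = r * (length / 2) * math.cos(t)
        y = r * (width / 2) * math.sin(t)
        # Rotate the point to the part's orientation and move it to the part's centre.
        points.append((cx + x * math.cos(angle) - y * math.sin(angle),
                       cy + x * math.sin(angle) + y * math.cos(angle)))
    points.append(points[0])  # return the pen to the start so the curve is closed
    return points

def draw_figure(parts, joins, root="body", seed=0):
    """Lay out one closed curve per part, following only the graph of joins."""
    rng = random.Random(seed)
    children = {}
    for parent, child, degrees in joins:
        children.setdefault(parent, []).append((child, math.radians(degrees)))
    placed = {root: (0.0, 0.0, 0.0)}  # name -> (centre x, centre y, orientation)
    queue = deque([root])
    while queue:
        name = queue.popleft()
        px, py, parent_angle = placed[name]
        a, b = parts[name][0] / 2, parts[name][1] / 2
        for child, angle in children.get(name, []):
            # Distance from the parent's centre to its outline in the join direction.
            rel = angle - parent_angle
            reach = a * b / math.hypot(b * math.cos(rel), a * math.sin(rel))
            dist = reach + parts[child][0] / 2  # child's near end touches the parent
            placed[child] = (px + dist * math.cos(angle),
                             py + dist * math.sin(angle), angle)
            queue.append(child)
    return {name: wobbly_closed_curve(x, y, *parts[name], angle, rng)
            for name, (x, y, angle) in placed.items()}

curves = draw_figure(PARTS, JOINS)
```

Each entry in `curves` is a list of (x, y) points that could be handed to any plotting tool. Swapping a closed curve for a single line when a part is thin enough would be one way to experiment with the closed-curve-versus-line question.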
I would like to do something similar to this, but add in an extra step. I would like to give the program a computer vision capability that allows it to look at its own drawings and see how much they resemble realistic line drawings of the same subjects (or, more practically, semantic contour detection on photos of the same subjects). Depending on how good the resemblance is, it could choose what lines to keep and what lines to erase and try again. This extra judgement step is an important aspect of how people create art.
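A minimal sketch of that judgement step, under heavy assumptions: the "computer vision capability" is replaced by a symmetric chamfer distance between point lists, the reference contour is simply given, and the imperfect pen is modelled as Gaussian noise. The names `chamfer`, `noisy_stroke`, and `draw_with_judgement` are invented for this illustration.

```python
import math
import random

def chamfer(a, b):
    """Symmetric mean nearest-point distance between two point lists (lower is closer)."""
    def one_way(p, q):
        return sum(min(math.dist(u, v) for v in q) for u in p) / len(p)
    return 0.5 * (one_way(a, b) + one_way(b, a))

def noisy_stroke(target, noise, rng):
    """Stand-in for an imperfect pen: the intended contour plus Gaussian error."""
    return [(x + rng.gauss(0, noise), y + rng.gauss(0, noise)) for x, y in target]

def draw_with_judgement(target, threshold=0.15, noise=0.3, max_tries=50, seed=0):
    """Draw, compare with the reference, and erase and retry until it is close enough."""
    rng = random.Random(seed)
    best, best_distance = None, float("inf")
    for attempt in range(1, max_tries + 1):
        stroke = noisy_stroke(target, noise, rng)
        distance = chamfer(stroke, target)
        if distance < best_distance:      # keep this line, discard the previous best
            best, best_distance = stroke, distance
        if best_distance <= threshold:    # good enough: stop redrawing
            break
    return best, best_distance, attempt

# Hypothetical reference contour: a circle sampled at 20 points.
target = [(math.cos(2 * math.pi * i / 20), math.sin(2 * math.pi * i / 20)) for i in range(20)]
stroke, distance, tries = draw_with_judgement(target)
```

A real version would need a resemblance measure that tolerates differences in scale and position, which plain chamfer distance does not.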
Tuesday, August 7, 2012
Monday, August 6, 2012
Pareidoloop
This is a program that satisfies, at a very simple level, the creation/evaluation loop that I felt would be necessary for any machine to make something we would be able to find creative. It randomly places triangles until it finds a face using face detection software. A few results are pictured above. I see a lot of character in these faces that I think wouldn't have shown up in a more direct synthesis from a face model.
The author is Phil McCarthy.
The face detection method is the really rich part of this program that does most of the heavy lifting. It is based on this paper: High-Performance Rotation Invariant Multiview Face Detection by Chang Huang, Haizhou Ai, Yuan Li, and Shihong Lao, from Tsinghua University. The paper advances both machine learning techniques and optimization methods to make it fast. The training set is Labeled Faces in the Wild. The faces in the training set form the algorithm's idea of what a face should be.
The program was inspired by Greg Borenstein running a face tracker on the flickr pool Hello Little Fella!, and by Evolution of the Mona Lisa by Roger Alsing.
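The creation/evaluation loop can be sketched in a few lines. This is not Pareidoloop's code, and it does not use a real face detector: `face_score` is a hypothetical stand-in that compares a 16x16 grayscale canvas with a crude hand-made template, where the actual program relies on trained face detection software. Only the shape of the loop matches the description above: propose a random triangle, and keep it only if the evaluator likes the result better.

```python
import random

SIZE = 16  # the canvas is SIZE x SIZE grayscale values in [0, 1]

def make_template():
    """Crude light face with dark eyes and mouth; a stand-in for a trained detector."""
    template = [[0.8] * SIZE for _ in range(SIZE)]
    for y in range(4, 7):
        for x in list(range(3, 6)) + list(range(10, 13)):
            template[y][x] = 0.1  # eyes
    for x in range(5, 11):
        template[11][x] = 0.1     # mouth
    return template

def face_score(canvas, template):
    """Hypothetical evaluator: negative mean squared error, so higher is more face-like."""
    err = sum((canvas[y][x] - template[y][x]) ** 2
              for y in range(SIZE) for x in range(SIZE))
    return -err / (SIZE * SIZE)

def edge(a, b, p):
    """Signed area test: which side of the line a->b the point p lies on."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])

def draw_triangle(canvas, tri, shade):
    """Return a copy of the canvas with one filled triangle painted on it."""
    a, b, c = tri
    out = [row[:] for row in canvas]
    for y in range(SIZE):
        for x in range(SIZE):
            p = (x + 0.5, y + 0.5)  # pixel centre
            d1, d2, d3 = edge(a, b, p), edge(b, c, p), edge(c, a, p)
            if (d1 >= 0 and d2 >= 0 and d3 >= 0) or (d1 <= 0 and d2 <= 0 and d3 <= 0):
                out[y][x] = shade
    return out

def pareidolia_loop(iterations=400, seed=1):
    """Place random triangles, keeping only those that raise the evaluator's score."""
    rng = random.Random(seed)
    template = make_template()
    canvas = [[0.5] * SIZE for _ in range(SIZE)]
    start = best = face_score(canvas, template)
    kept = 0
    for _ in range(iterations):
        tri = [(rng.uniform(0, SIZE), rng.uniform(0, SIZE)) for _ in range(3)]
        candidate = draw_triangle(canvas, tri, rng.random())
        score = face_score(candidate, template)
        if score > best:  # evaluation step: discard anything that does not help
            canvas, best = candidate, score
            kept += 1
    return canvas, start, best, kept

canvas, start, best, kept = pareidolia_loop()
```

With a fixed template the results converge toward one picture; the variety in the real program's faces presumably comes from the detector accepting many different images as faces.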
The name of the program, Pareidoloop, comes from the word pareidolia, used to describe our tendency to see faces in the clouds or in the rocks. As a kid, I had this poster by Bev Doolittle on my bedroom wall, which plays with this idea:

(The poster came from Ranger Rick magazine, which I had a subscription to for years. I learned a ton of science from that magazine; it built up a lot of the structural framework in my memory to which later ideas about ecology, botany, and so forth could attach.)