Wednesday, February 22, 2017

Estimating photos from sketches


When I finished writing Machinamenta six years ago, I suggested some things that could be done to make artificial creativity go beyond simple kaleidoscope patterns. In many ways, deep learning software has surpassed the suggestions I put forward. Here is another example. 
This work comes out of the Berkeley AI Research lab at the University of California, Berkeley. Alexei Efros is a familiar name: he worked on image quilting and automatic photo pop-up, and he was at CMU (along with Martial Hebert and Abhinav Gupta) during the period when I worked with them professionally.
The way this works is that a neural network is trained on pairs of images. The right-hand image of the pair is a photograph of a cat; the left-hand image is an automatic edge detection run on that photograph, using the HED edge detector. This means that no humans were needed to create the training data-- important because of how many training samples are needed. It does mean that the edges it is looking for are not necessarily the ones people perceive as most important, but modern edge detectors like HED do a much better job of that than the Canny edge detector, which was the best available when I first started working on computer vision.
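To make the data-generation step concrete, here is a minimal sketch of how such pairs could be produced automatically. It uses OpenCV's Canny detector as a simple stand-in for HED (which needs a pretrained network), and the file paths and side-by-side pair layout are illustrative assumptions, not anything from the paper.

```python
import cv2
import numpy as np
from pathlib import Path

def make_edge_photo_pair(photo_path, out_dir):
    """Build one (edge map, photo) training pair from a single photograph.

    Canny stands in here for HED; the point is that the "sketch" side of
    the pair is generated automatically, with no human labeling.
    """
    photo = cv2.imread(str(photo_path))                 # BGR photo, H x W x 3
    gray = cv2.cvtColor(photo, cv2.COLOR_BGR2GRAY)
    edges = cv2.Canny(gray, 100, 200)                   # binary edge map, H x W
    edges_bgr = cv2.cvtColor(edges, cv2.COLOR_GRAY2BGR)

    # Concatenate sketch and photo side by side (left: edges, right: photo).
    pair = np.concatenate([edges_bgr, photo], axis=1)
    out_path = Path(out_dir) / f"{Path(photo_path).stem}_pair.png"
    cv2.imwrite(str(out_path), pair)
    return out_path

# Example (hypothetical folders): build a pair for every cat photo.
# for p in Path("cat_photos").glob("*.jpg"):
#     make_edge_photo_pair(p, "train_pairs")
```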
The sketch contains far less information than the photograph. The only reason it is possible to do this at all is that the system has a great prior model of what cats look like, and does its best to fit that model to the constraints of the sketch. I wonder if drawing a Siamese profile would be enough of a hint to give it Siamese coloration? What happens if you try to draw a dog, or a pig, or a house instead?
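One common way to train such a model (and, as far as I can tell, roughly the approach taken in the paper) is a conditional GAN with an added L1 reconstruction term: the adversarial part supplies the learned prior over what cat photos look like, and the L1 part keeps the output pinned to the particular sketch. Here is a rough sketch of that combined objective; the two networks below are trivial placeholders, not the real generator and discriminator architectures.

```python
import torch
import torch.nn as nn

# Placeholder networks standing in for a real generator and discriminator;
# only the shape of the objective matters here.
G = nn.Sequential(nn.Conv2d(3, 3, 3, padding=1))   # edge map -> fake photo
D = nn.Sequential(nn.Conv2d(6, 1, 3, padding=1))   # (edge map, photo) -> real/fake score map

bce = nn.BCEWithLogitsLoss()
l1 = nn.L1Loss()
lambda_l1 = 100.0   # weight on the reconstruction term (assumed value)

def generator_loss(edges, real_photo):
    """Adversarial term pulls the output toward the prior over cat photos;
    the L1 term ties it to the specific paired photograph."""
    fake_photo = G(edges)
    pred_fake = D(torch.cat([edges, fake_photo], dim=1))
    adv = bce(pred_fake, torch.ones_like(pred_fake))   # try to fool the discriminator
    recon = l1(fake_photo, real_photo)                 # stay close to the paired photo
    return adv + lambda_l1 * recon

# Toy check with random tensors in place of a real batch.
edges = torch.randn(1, 3, 256, 256)
photo = torch.randn(1, 3, 256, 256)
print(generator_loss(edges, photo).item())
```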

Try it out yourself; it's a lot of fun.

The original paper