Friday, June 12, 2020

Poetry generation with rhyme, meter, and phrase constraints

In mid-February 2020, I was stuck in a small, crowded airport for several hours on the way home from a meeting in upstate New York. (This was the last international meeting I would attend before Covid-19 shut everything down.) I had been working with GPT-2 to create text adventures in a fundamentally different way than AI Dungeon 2. So while I was in the airport and on the plane ride home, I wrote a program that generated rhyming poetry using GPT-2. I had already written another program to create rhyming word pairs using word2vec, so I knew that rhymes could be checked using the CMU pronunciation dictionary.
This first program worked on a token-by-token basis. Only whole words that were also single tokens could be generated. The last token-word in the second line had to rhyme with the last token-word in the first line. I made the selection more efficient by dividing the candidate words into sets in which every word rhymes with every other, a trick I had used earlier for the word-pairs program.
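To make the rhyme-set trick concrete, here is a minimal sketch in Python. It is an illustration rather than the code from the repository: the six-word dictionary is a tiny sample written in the CMU dictionary's notation, the function names are made up for this example, and I am assuming the usual definition of a perfect rhyme (two words rhyme when they match from the last stressed vowel onward).

```python
from collections import defaultdict

# Tiny sample in CMU-dictionary notation (ARPAbet phonemes, stress digits
# on vowels). A real run would load the full dictionary instead.
PRONUNCIATIONS = {
    "night": ["N", "AY1", "T"],
    "light": ["L", "AY1", "T"],
    "delight": ["D", "IH0", "L", "AY1", "T"],
    "day": ["D", "EY1"],
    "away": ["AH0", "W", "EY1"],
    "moon": ["M", "UW1", "N"],
}

def rhyme_key(phonemes):
    """Return the phonemes from the last stressed vowel to the end of the word."""
    for i in range(len(phonemes) - 1, -1, -1):
        if phonemes[i][-1] in "12":  # primary or secondary stress
            return tuple(phonemes[i:])
    return tuple(phonemes)  # no stressed vowel: fall back to the whole word

def build_rhyme_sets(pronunciations):
    """Group words so that every word in a set rhymes with every other."""
    sets = defaultdict(set)
    for word, phonemes in pronunciations.items():
        sets[rhyme_key(phonemes)].add(word)
    return dict(sets)

RHYME_SETS = build_rhyme_sets(PRONUNCIATIONS)

def rhyming_candidates(word):
    """Look up the words that rhyme with `word`, excluding the word itself."""
    return RHYME_SETS[rhyme_key(PRONUNCIATIONS[word])] - {word}

print(rhyming_candidates("night"))
```

Because the sets are built once, finding the allowed line-ending words is a single dictionary lookup instead of a comparison against every word in the vocabulary.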
While the results were interesting, I was unsatisfied because the program lacked meter constraints. So in May I again used the CMU pronunciation dictionary, this time to generate all possible continuations of a prompt over a certain probability using a depth-first search. The search followed the most probable continuations first, and when it wasn't able to satisfy a meter, rhyme, or probability constraint, it backed up and tried again.
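The backtracking search can be sketched roughly as follows. This is a toy, not the real program: `TOY_MODEL` is a hypothetical hand-written table standing in for GPT-2's next-word probabilities, the probability threshold is applied to each step (an assumption on my part for this example), and the only constraint checked at the end of the line is a caller-supplied test on the final word.

```python
# Hypothetical stand-in for a language model: for each previous word,
# a list of (next_word, probability) pairs.
TOY_MODEL = {
    "the": [("moon", 0.5), ("night", 0.3), ("sky", 0.2)],
    "moon": [("is", 0.6), ("was", 0.4)],
    "night": [("is", 0.7), ("was", 0.3)],
    "sky": [("is", 0.9), ("was", 0.1)],
    "is": [("gone", 0.5), ("bright", 0.3), ("light", 0.2)],
    "was": [("gone", 0.6), ("bright", 0.4)],
}

def continuations(words, remaining, min_prob, final_ok):
    """Depth-first search over continuations, most probable first.

    Yields every word list that adds `remaining` words to `words`, keeps each
    step's probability at or above `min_prob`, and ends in a word accepted
    by `final_ok`.
    """
    if remaining == 0:
        if final_ok(words[-1]):
            yield words
        return  # either way, control returns to the caller (backing up)
    candidates = sorted(TOY_MODEL.get(words[-1], []), key=lambda c: -c[1])
    for word, prob in candidates:
        if prob < min_prob:
            break  # sorted, so every later candidate is less probable still
        yield from continuations(words + [word], remaining - 1, min_prob, final_ok)

def rhymes_with_night(word):
    return word in {"bright", "light"}

# The search first tries "the moon is gone", fails the rhyme test,
# backs up one step, and then finds "the moon is bright".
first = next(continuations(["the"], 3, 0.25, rhymes_with_night), None)
print(first)
```

Writing the search as a generator means the caller can stop at the first acceptable line or keep asking for more.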
This worked pretty well, but it was slow. So again I created something like the rhyme sets, but this time for meter. Any word in the pronunciation dictionary that could be consistent with the next three stresses was allowed in the set associated with those stresses. Since line ends also needed to be accounted for, I also included one- and two-stress sets.
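Here is one way the meter sets could be built, as a hedged sketch rather than the repository's code. Several details are my assumptions for the example: secondary stress counts as stressed, one-syllable words are allowed in either stressed or unstressed positions, a word longer than a three-stress window only has its first three syllables checked, and a word must fit entirely inside a one- or two-stress window because those mark the end of a line. The five-word dictionary is again a tiny sample in CMU notation.

```python
from itertools import product

# Tiny sample in CMU-dictionary notation; a real run would load the full file.
PRONUNCIATIONS = {
    "the": ["DH", "AH0"],
    "moon": ["M", "UW1", "N"],
    "delight": ["D", "IH0", "L", "AY1", "T"],
    "silver": ["S", "IH1", "L", "V", "ER0"],
    "beautiful": ["B", "Y", "UW1", "T", "AH0", "F", "AH0", "L"],
}

def stress_pattern(phonemes):
    """Stress string for a word: '0' for unstressed vowels, '1' otherwise."""
    return "".join("0" if p[-1] == "0" else "1"
                   for p in phonemes if p[-1].isdigit())

def fits(pattern, window):
    """Can a word with this stress pattern start at this window of stresses?"""
    if len(pattern) == 1:
        return True  # assumption: monosyllables can take either stress
    if len(window) < 3 and len(pattern) > len(window):
        return False  # short windows mark a line end: the word must fit
    n = min(len(pattern), len(window))
    return pattern[:n] == window[:n]

def build_meter_sets(pronunciations):
    """Map every stress window of length 1 to 3 to the words allowed there."""
    patterns = {w: stress_pattern(p) for w, p in pronunciations.items()}
    sets = {}
    for n in (1, 2, 3):
        for bits in product("01", repeat=n):
            window = "".join(bits)
            sets[window] = {w for w, pat in patterns.items() if fits(pat, window)}
    return sets

METER_SETS = build_meter_sets(PRONUNCIATIONS)

# Words that may start where the meter calls for unstressed-stressed-unstressed.
print(METER_SETS["010"])
```

As with the rhyme sets, the point is that the expensive comparison against the dictionary happens once, and the search only does lookups keyed by the upcoming stresses.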
The biggest remaining problem was that sentences often ended in the middle of lines, and the poem had no notion of ending at the end of the poem; it was as if it were clipped arbitrarily from a longer poem. So I added what I call phrase constraints. A phrase constraint says that a line must end in some punctuation ("?", "!", ".", ",", ";", or ":") and that the punctuation must be sufficiently probable. If it is not, the search must back off and try again. This does an adequate job of making the meaningful phrases line up with the lines of the poem. (It prevents some perfectly acceptable poems from being generated, though.)
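The phrase constraint amounts to a small check at each line end. The sketch below is illustrative only: the function name, the dictionary of next-token probabilities, and the threshold values are all hypothetical, and choosing the single most probable mark is my simplification.

```python
# The punctuation marks that are allowed to close a line.
LINE_END_PUNCTUATION = {"?", "!", ".", ",", ";", ":"}

def line_end_punctuation(next_token_probs, min_prob):
    """Return the most probable allowed punctuation mark for the line end.

    `next_token_probs` maps candidate next tokens to probabilities.
    Returns None when no allowed mark reaches `min_prob`, which tells the
    search to back off and try a different line.
    """
    allowed = {token: prob for token, prob in next_token_probs.items()
               if token in LINE_END_PUNCTUATION and prob >= min_prob}
    if not allowed:
        return None
    return max(allowed, key=allowed.get)

# A comma is probable enough here, so the line may end.
print(line_end_punctuation({"and": 0.5, ",": 0.3, ".": 0.1}, min_prob=0.2))

# Here the model strongly wants to keep the sentence going, so the check fails.
print(line_end_punctuation({"and": 0.9, ".": 0.05}, min_prob=0.2))
```

A higher threshold gives cleaner phrase boundaries but rejects more lines, which is the trade-off behind the acceptable poems that never get generated.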
If you want to try it out, here is the GitHub repository.

