Project 2 – Twitter Haikus
Adam Kidder, Kevin Shih, Cassandra Yaple
I. Conceptual Overview:
This project aims to create art by challenging one of the deepest forms of poetic expression with what is widely considered one of the most pointed, shallow forms of communication on the web. By creating haikus computationally with machine learning algorithms, we are teaching the computer how to write poetry. The artistic statement of this project is not simply the use of complex algorithms, but rather to see what kind of depth, if any, can be drawn from such a contrived medium as the source of words. The project scrapes public Twitter feeds for sentences to build language models; phrases of the correct length, with the proper syllable count and perceived relevance, are chosen to form the lines of the haiku. The goal is to generate thought-provoking, interesting, or even controversial haikus from tweets that would otherwise be lost in the mix of endless blurbs of quick and dirty updates. The haikus are presented both by tweeting them and through a graphical representation of the algorithms at work. Tweeting these haikus back into the same space from which they were generated provides a closure for what might have been: a dim flicker of light left to fade on a sea of uniformity.
II. Technical Overview:
The language model was generated using Markov chains of length 3. In layman’s terms, this means that we model the probability of any word appearing as the conditional probability of its appearance given the three words that appeared immediately before it. We experimented with chains of length 2 as well, and decided that a chain length of three gave slightly more meaningful phrases and could be generated without too much repetition from fewer than 50 tweets.
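As a minimal sketch of this kind of model (in Python rather than the Processing/RiTa code the project actually used), a length-3 chain can be stored as a map from each three-word context to the words observed immediately after it:

```python
import random
from collections import defaultdict

def build_model(tweets, order=3):
    """Map each `order`-word context to the words observed after it."""
    model = defaultdict(list)
    for tweet in tweets:
        words = tweet.split()
        for i in range(len(words) - order):
            context = tuple(words[i:i + order])
            model[context].append(words[i + order])
    return model

def next_word(model, context):
    """Sample the next word in proportion to its observed frequency;
    duplicates in the list act as frequency weights."""
    choices = model.get(tuple(context))
    return random.choice(choices) if choices else None
```

Sampling repeatedly from `next_word`, sliding the context window forward each time, produces phrases in the style of the source tweets.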
We used a very simple algorithm to generate the actual lines of the haiku. To get a line of N syllables, we first ask the model for a phrase N words long. If that phrase has exactly N syllables, we use it; if not, we generate a phrase of N-1 words, and so on. This process continues until the phrase length hits zero, at which point we give up for now and try again later. As you can probably see, this method is biased towards longer phrases. This matters because, had we started from the other end, we would often have ended up with haikus of one word per line. One must also note that tweets are generally riddled with misspellings and missing spaces, so a single 7-syllable “word” could easily be several words strung together by typing errors.
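The backoff above can be sketched as follows. Here `count_syllables` is a crude vowel-group stand-in for a real syllable counter (the project used the RiTa library for language analysis), and `make_phrase` is assumed to be any callable that returns a phrase of the requested word length from the Markov model:

```python
import re

def count_syllables(word):
    # Crude stand-in for a real syllable counter: count groups of
    # consecutive vowels, with a minimum of one per word.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def generate_line(make_phrase, target_syllables):
    """Try a phrase of target_syllables words, backing off one word at
    a time; return None if nothing fits so the caller can retry."""
    for length in range(target_syllables, 0, -1):
        phrase = make_phrase(length)  # list of `length` words
        if sum(count_syllables(w) for w in phrase) == target_syllables:
            return " ".join(phrase)
    return None
```

Starting from the longest candidate phrase and shrinking gives the longer-phrase bias described above.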
Another important component was filtering the input for English. This was done simply by dropping any tweet containing ASCII values greater than 127 or any of the most common Spanish, Indonesian, and Portuguese words.
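A sketch of that filter; the word lists here are short illustrative samples, not the project’s actual stoplists:

```python
# Illustrative stoplists of very common words in each language;
# the real filter would use longer lists.
SPANISH = {"el", "la", "que", "de", "y", "en"}
INDONESIAN = {"yang", "dan", "di", "itu", "dengan"}
PORTUGUESE = {"que", "e", "do", "da", "em", "um"}
FOREIGN = SPANISH | INDONESIAN | PORTUGUESE

def looks_english(tweet):
    """Drop tweets with non-ASCII characters or common foreign words."""
    if any(ord(ch) > 127 for ch in tweet):
        return False
    return not any(w.lower() in FOREIGN for w in tweet.split())
```

This is a blunt heuristic, but cheap enough to run on every incoming tweet before it reaches the language model.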
There are several ways an observer can experience this project. We tweet our haikus back to Twitter, where they get lost in the sea of other mindless Twitter drivel; the Twitter user name is ‘rehaiku’ and can be viewed at any time. Additionally, we store a file of all the haikus generated. Here are some of the notable haikus generated:
u didnt use them So / getting me this high I love / much to get rid of
to write about This / My timeline is always full / won the world series
win the House tonight / jill scott to sing wen I walk / The Fuck Off My hope
do time my friend In / go to bed or turn it off / lot of work to do
go to bed soon I / of my body that isn’t sore / sometimes I can’t run
Lastly, we have an animation of the project running: the tweets being used to generate the haikus fade in and out in the background, while the generated haikus are centered and bolded to draw attention to them on the screen. Below you can see a screenshot.
Link to video (low framerate due to opengl):
V. References and Inspiration
Stack Overflow computer haiku
An article on computer generated poetry: http://muse.jhu.edu/journals/esc/summary/v034/34.4.emerson.html
Hartman, Charles O. Virtual Muse: Experiments in Computer Poetry. Wesleyan University Press, 1995.
Library used for animations: http://ekeneijeoma.com/processing/ijeomamotion/
Library used for natural language processing: http://www.rednoise.org/rita/