Project Demo


The idea

This project required us to make art out of data visualization. I wanted my data to be personal. I wanted my piece to comment on the ways that digital communication can sometimes fall short.

My initial instinct was to use my keystroke data over several months. I had a program installed on both my laptop and desktop computer that tracked the frequency of every key, as well as mouse clicks and locations.

I wanted to take this data and, using a 3D printer, generate a model of a keyboard in which each key's height represented how often I pressed it. While I liked this idea, it didn't feel very personal, so I looked for ways to diverge from it.

My next approach was to take my text message history and create something interesting with it.

The source code can be found here.

Artist statement

I’ve often thought about the intricacies inherent in textual communication and how easily meaning can be lost. Nonverbal cues and subtleties comprise a large and important component of communication. These subtleties most often come in tonal and visual forms. When everything but the literal content of a message is stripped away, it can become difficult to understand or easy to misinterpret.

For this project, I took all of the text messages sent from and received by my phone over the last few months and represented them in such a way that we see the opposite: a solely visual and tonal expression of their contents. Every text message creates a visual based on a number of factors: the frequency of each letter, the length of the message, the time of day it was sent, whether it was sent or received, and the name of the recipient or sender.

For each message, the background of the visual is coloured according to the way that the sky usually looks at the time of day it was sent. The first and last initials of the sender or recipient determine the tint colour that is used throughout the piece. This tint is used to colour two things: the frame around the visual, and the dynamic shapes that represent the frequency of letters used in each message. A tone is then generated for each message based on its most frequently used letter. Finally, after visually and tonally representing each message, there is a brief delay equal in milliseconds to its length in characters.

If you watch closely as the generated imagery flickers, some trends reveal themselves, but they are difficult to pin down with any certainty. When the content of a message is obfuscated by tonal and visual elements, we must simply guess at its original meaning. This creates an interesting contrast to the original problem, wherein we know what the message means through its content but have no visual or auditory cues: now we have visual and audio cues but no context for them.

With this piece I hoped to make the audience think about the multitude of ways that we connect with one another and consider the subtle differences between them.


Technical summary

I found an application on the Google Play Store that allowed me to export my entire text message log to an XML file. Using Excel, I converted this data into a CSV file. It was difficult to get the file to convert into any other format, and unfortunately the cells in the resulting CSV were laid out in a strange structure.

Every message occupied 5 vertically aligned cells, with nothing to separate one message from the next except the names of the cells. I read the CSV file into Processing and wrote a quick workaround for the formatting issue. I created an array for each category of value and assigned every fifth row to the next slot in that category's array. For the "Name" value (the sender or receiver of the message), for example, the first slot took the second row, the second slot took the seventh, the third took the twelfth, and so on.
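The every-fifth-row workaround can be sketched as follows. This is a minimal illustration in Python rather than the original Processing code. Only the position of the "Name" value (second of five) comes from the description above; the other cell contents and the helper name split_columns are placeholders.

```python
FIELDS_PER_MESSAGE = 5  # from the text: each message occupies 5 stacked cells
NAME_OFFSET = 1         # from the text: "Name" is the 2nd, 7th, 12th ... value


def split_columns(cells, fields=FIELDS_PER_MESSAGE):
    """Return one list per field by taking every `fields`-th cell."""
    # Ignore a trailing partial record so every list has the same length.
    usable = len(cells) - len(cells) % fields
    return [cells[offset:usable:fields] for offset in range(fields)]


# Placeholder data: only the second cell of each group is a known field.
cells = [
    "a0", "Daniel James", "c0", "d0", "e0",
    "a1", "Alex Smith", "c1", "d1", "e1",
    "a2", "Daniel James", "c2", "d2", "e2",
]

columns = split_columns(cells)
names = columns[NAME_OFFSET]
```

Slicing with a step of five does the same job as a loop that copies every fifth row, and it keeps the per-category lists aligned by index, so position i in each list belongs to the same message.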

Overcoming this hurdle made the rest of the program fairly straightforward to write. I decided that I wanted to create an abstract representation of each message, but without making everything completely random, as it would then lose all meaning.


I took a layered approach to the design of the final product. First, I chose a portrait orientation for the visualization, to echo the way a text messaging screen usually looks.

The background layer comprised the context of the message. I divided the day into different segments, and assigned each segment a colour to represent the way that the sky looked during that portion of the day. Then I took the time that each message was sent, and based on which segment it fell into, the background of the scene became that colour.
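A rough sketch of the time-of-day lookup, in Python rather than Processing. The segment boundaries and RGB values here are illustrative guesses, not the ones used in the piece, and sky_colour is a hypothetical helper name.

```python
# (start_hour, end_hour, (r, g, b)); boundaries and colours are assumptions.
SKY_SEGMENTS = [
    (0, 5, (10, 10, 40)),      # night
    (5, 8, (250, 170, 120)),   # sunrise
    (8, 17, (135, 206, 235)),  # daytime
    (17, 20, (240, 120, 80)),  # sunset
    (20, 24, (20, 20, 60)),    # late evening
]


def sky_colour(hour):
    """Return the background colour for the segment containing `hour` (0-23)."""
    for start, end, colour in SKY_SEGMENTS:
        if start <= hour < end:
            return colour
    raise ValueError(f"hour out of range: {hour}")
```

Keeping the segments in a table makes it easy to adjust the boundaries or add more segments without touching the lookup logic.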

Next, the name of each contact was split into its initials, so the name "Daniel James" became DJ, and an RGB value was derived from those initials. For the R value, the first initial's position in the alphabet (1 to 26) was scaled up to the colour range, so that 26 mapped to 255 (the maximum colour value). The same was done for the G value using the initial of the last name. For the B value, the positions of both initials were added together, and the resulting number, from 2 to 52, was scaled so that 52 mapped to 255. The resulting RGB value was used to paint the rest of the elements in the final product. Using this method, every contact is given a seemingly random colour based on their name, and that colour stays the same for every message to or from them.
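One way to express the initials-to-colour mapping, sketched in Python. It assumes a simple proportional scaling (position times 255 divided by the maximum), which is one reading of "basic math"; the original sketch may have rounded or offset the values differently. The names letter_index, scale, and tint_for are hypothetical.

```python
import string

ALPHABET = string.ascii_uppercase


def letter_index(ch):
    """Position in the alphabet: 1 for A through 26 for Z."""
    pos = ALPHABET.find(ch.upper())
    if pos < 0:
        raise ValueError(f"not a letter A-Z: {ch!r}")
    return pos + 1


def scale(value, in_max, out_max=255):
    """Proportionally scale `value` so that `in_max` maps to `out_max` (assumed)."""
    return round(value * out_max / in_max)


def tint_for(name):
    """Derive an (r, g, b) tint from the first and last initials of `name`."""
    parts = name.split()
    first = letter_index(parts[0][0])
    last = letter_index(parts[-1][0])
    r = scale(first, 26)         # first initial, 1-26
    g = scale(last, 26)          # last initial, 1-26
    b = scale(first + last, 52)  # sum of both initials, 2-52
    return (r, g, b)
```

Because the colour depends only on the initials, the same contact always gets the same tint, and two contacts who share initials will share a colour.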

The next layer of the design was the frame, which was painted with the RGB value from the last step. The frame was static, and created a nice visual balance for the rest of the piece. Lastly, the bars along the bottom and top were visual representations of every message sent or received, displayed in chronological order.

If a message was received by my phone, it generated bars along the top of the frame, and if a message was sent from my phone, it generated bars along the bottom. I made one bar for each letter, so there were 26 evenly spaced bars along both the top and bottom of the frame. Every bar was given a length representing that letter's frequency in the current message. Not only does this technique allow the audience to see which letters are used more frequently in everyday English, it also shows the general length of every message.
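The per-letter bars can be sketched like this in Python. The counts are left unnormalised so that longer messages produce longer bars, as described above. The pixels_per_letter value and the function names are assumptions for illustration.

```python
import string
from collections import Counter

LETTERS = string.ascii_lowercase


def letter_counts(message):
    """Return 26 counts, one per letter a-z, ignoring case and non-letters."""
    counts = Counter(ch for ch in message.lower() if ch in LETTERS)
    return [counts[letter] for letter in LETTERS]


def bar_lengths(message, pixels_per_letter=4):
    """Bar length per letter; the pixel scale is an arbitrary assumption."""
    return [count * pixels_per_letter for count in letter_counts(message)]


def bar_edge(sent_by_me):
    """Sent messages draw along the bottom, received messages along the top."""
    return "bottom" if sent_by_me else "top"
```

Punctuation, digits, and spaces are dropped here, which matches the decision to show only the 26 letter bars.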

I initially had 28 bars, with the two extra ones representing the frequency of punctuation and numbers. I decided to take them out because the punctuation bar was always higher than the rest and would have required too much normalization. I also liked the simplicity of only showing letter frequency.

Then, to add an auditory component to the final presentation, the most frequently used letter for each message was calculated and the message was assigned a tone to play based on this letter.
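A sketch of the tone selection in Python. The text does not say how letters map to pitches, so this assumes one semitone per letter starting from 220 Hz; that mapping, the alphabetical tie-break, and the function names are all assumptions.

```python
import string
from collections import Counter

LETTERS = string.ascii_lowercase
BASE_FREQUENCY_HZ = 220.0  # assumption: the letter 'a' plays at 220 Hz


def most_frequent_letter(message):
    """Return the most common letter, or None if the message has no letters."""
    counts = Counter(ch for ch in message.lower() if ch in LETTERS)
    if not counts:
        return None
    # Break ties alphabetically so the result is deterministic.
    return min(counts, key=lambda letter: (-counts[letter], letter))


def tone_for(message):
    """Frequency in Hz for the message's most common letter (assumed mapping)."""
    letter = most_frequent_letter(message)
    if letter is None:
        return None
    semitones = LETTERS.index(letter)  # 'a' -> 0, 'b' -> 1, ...
    return BASE_FREQUENCY_HZ * 2 ** (semitones / 12)
```

Messages made up only of digits or punctuation have no most frequent letter, so the sketch returns None and leaves it to the caller to decide whether to stay silent.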

The final step was to cycle through each message, and pause for as many milliseconds as there were characters in the message before jumping to the next one.
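The playback loop can be sketched as below, in Python. The show callback stands in for whatever draws the frame and plays the tone, and the sleep parameter is injectable so the timing can be checked without actually waiting. Both are hypothetical names.

```python
import time


def delay_seconds(message):
    """One millisecond of pause per character in the message."""
    return len(message) / 1000.0


def play(messages, show, sleep=time.sleep):
    """Show each message in order, pausing in proportion to its length."""
    for message in messages:
        show(message)
        sleep(delay_seconds(message))
```

With a delay this short, even a 160-character message is on screen for well under a fifth of a second, which is what produces the flickering effect described in the artist statement.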