Airplanes dance to Budapest airport flight data, along with other relevant data.
Or download a version in slightly better quality.
This plot contains about 20 variables.
I produced it for the data visualization challenge at SatRDay Budapest. Because the challenge focused on this particular dataset, I used the two sensory media to separate the given data from the other data I acquired. The given flight data are presented only in the video, along with a few other variables; the music includes none of the given flight data.
The time dimension is the key by which the video and music variables are joined. Time is divided into months from January 2007 to June 2012 (66 months). Each month corresponds to 16/9 (1.778) seconds, eight beats of music, and eight frames of video. The whole song is thus 117 seconds long.
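The timing arithmetic above can be checked with a few lines of R. This is only a sketch of the numbers stated in the text; the variable names and the `month_start` helper are illustrative and are not taken from the repository.

```r
# Timing constants taken from the text above.
months <- seq(as.Date("2007-01-01"), as.Date("2012-06-01"), by = "month")
n_months          <- length(months)  # 66
seconds_per_month <- 16 / 9          # about 1.778 seconds
beats_per_month   <- 8
frames_per_month  <- 8

total_seconds     <- n_months * seconds_per_month               # about 117.3
beats_per_minute  <- 60 * beats_per_month / seconds_per_month   # 270
frames_per_second <- frames_per_month / seconds_per_month       # 4.5

# Hypothetical helper: start time in seconds of the i-th month (1-based).
month_start <- function(i) (i - 1) * seconds_per_month
```

Because beats and frames share the same per-month count, the video and the music stay aligned at every month boundary.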
To compose the video I specify two sets of animations. The first set describes the outgoing trips for the present month, and the second set describes incoming trips. Each set is four frames long, so I get eight frames by playing them in series.
To compose the music, I generate four beats of music for each month. Then I play this twice, producing the eight beats per month. The music contains mostly year-level and quarter-level data, so adjacent months tend to sound similar to each other.
Airplanes are arranged in dancing circles, with dancing circles positioned on a scatterplot.
Each dancing circle corresponds to a city other than Budapest, to and from which flights travel.
- Circle ~ city
- One dancing circle is one city
- Airplanes ~ flights
- One ✈ = 10 flights
- Airplane orientation ~ Flight direction
- Airplanes face into the circle for incoming flights and away from the circle for outgoing flights.
- Circle radius ~ number of flights
- Larger circles mean more flights.
- Front-back jump distance ~ passengers per flight
- If there are more passengers per flight, the circle will move in and out more.
- Left-right jump distance ~ cargo per flight
- If there is more cargo in the flight (by mass), the planes in the circle move side-to-side more. Most routes do not convey any cargo, and these planes thus do not move side-to-side.
- Hue ~ Schengen
- Blue Plane = Flight within Schengen Zone
- Brightness ~ Schedule
- Airplanes in darker color correspond to 10 scheduled flights, and airplanes in lighter color correspond to 10 non-scheduled flights. Most flights are scheduled, and partial airplanes are not drawn, so I always round in favor of non-scheduled flights.
Any particular frame displays data from a particular direction of travel (incoming or outgoing) and month. Each circle thus corresponds to a route-month, and the number of planes in the circle corresponds to the number of flights on that route in that month.
In order to apply the jump aesthetics, I in fact made four frames for each route-month. The planes are all the way in on the first of these frames, all the way out on the third, and in the middle on the second and fourth.
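The four-frame cycle can be sketched as a small lookup. This is an assumed formulation rather than the code used to draw the plot: `jump_phase` and `jump_offset` are hypothetical names, and the displacement is measured from the middle position.

```r
# Relative position within the four-frame cycle:
# frame 1 = all the way in, frames 2 and 4 = middle, frame 3 = all the way out.
jump_phase <- c(-1, 0, 1, 0)

# Hypothetical helper: displacement of a plane from its middle position
# for a frame number (1 to 4) and a jump distance (the amplitude).
jump_offset <- function(frame, jump_distance) {
  jump_phase[frame] * jump_distance
}

jump_offset(1:4, 2)  # -2 0 2 0
```

The same lookup would serve both the front-back and the left-right jump, with the jump distance driven by passengers per flight or cargo per flight respectively.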
The circle is positioned based on country-level data for its city. Cities in the same country have the same country-level data and thus form concentric circles.
- x ~ distance
- Distance between the country and Hungary, measured as the haversine result for the two country centroids.
- y ~ log(population)
- Logarithm of population of the other country
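For reference, the haversine distance used for the x axis can be computed as follows. This is a minimal sketch, assuming a spherical Earth with a mean radius of 6371 km; the function name, the radius default, and the example coordinates are illustrative and not taken from the repository.

```r
# Great-circle distance in kilometres between two points given in degrees.
haversine_km <- function(lat1, lon1, lat2, lon2, radius_km = 6371) {
  to_rad  <- pi / 180
  phi1    <- lat1 * to_rad
  phi2    <- lat2 * to_rad
  dphi    <- (lat2 - lat1) * to_rad
  dlambda <- (lon2 - lon1) * to_rad
  a <- sin(dphi / 2)^2 + cos(phi1) * cos(phi2) * sin(dlambda / 2)^2
  # pmin guards against rounding slightly above 1 before asin.
  2 * radius_km * asin(pmin(1, sqrt(a)))
}

# Rough, illustrative centroids only (not the values used in the analysis).
hungary <- c(lat = 47.2, lon = 19.5)
other   <- c(lat = 47.5, lon = 14.5)
x <- haversine_km(hungary["lat"], hungary["lon"], other["lat"], other["lon"])
```

Since every city in a country shares that country's centroid and population, all of its circles receive the same (x, y) position.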
Recall that the same four beats of music are played twice per month. The main tune starts with a baseline note for one half-beat, and the remaining seven half-beats have pitches corresponding to proportional changes in statistics.
- Baseline note
- Proportional change in Hungary retail sales since last year
- Proportional change in Hungary imports since last year
- Proportional change in Hungary exports since last year
- Proportional change in Hungary hotel beds since last year
- Proportional change in Hungary other beds since last year
- Proportional change in Budapest population since last year
- Proportional change in Budapest dwellings since last year
Pitches are chosen along a just-tuned major seventh chord. For each 1% change, the note moves up or down by one step along the chord. (This linear approximation of percent change is appropriate only because the change each year is small.) For example, if imports go up 2% one year, the third note in the chord is played. If there is no change since the previous year, the percent change is 0%, and the baseline note is thus played.
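The mapping from percent change to pitch can be sketched as below. The frequency ratios 1, 5/4, 3/2 and 15/8 are the standard just-intonation ratios for a major seventh chord. The following parts are assumptions made for illustration, not details confirmed by the text: the 220 Hz baseline, rounding to the nearest whole percent, and continuing the chord into neighbouring octaves.

```r
# Just-tuned major seventh chord: root, major third, fifth, major seventh.
chord_ratios <- c(1, 5/4, 3/2, 15/8)

# Frequency of the note `step` chord steps above (or below) the baseline.
# Assumption: after the seventh, the chord repeats one octave higher.
chord_frequency <- function(step, base_hz = 220) {
  n      <- length(chord_ratios)
  octave <- step %/% n   # floor division, so negative steps go down an octave
  degree <- step %% n    # always in 0..(n - 1) in R
  base_hz * chord_ratios[degree + 1] * 2^octave
}

# Assumption: one chord step per whole percent, rounded to the nearest percent.
pitch_for_change <- function(percent_change, base_hz = 220) {
  chord_frequency(round(percent_change), base_hz)
}

pitch_for_change(c(0, 2, -1))  # 220.00 330.00 206.25
```

A 0% change yields the baseline note, and a 2% increase yields the third note of the chord (the fifth), matching the example in the text.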
- Percussion ~ Summer
- A low drum plays on every down beat. During the summer (high travel season), a high clap is added on every second up beat.
- Volume ~ Gross domestic product
- Volume of everything except the percussion is based on the change in GDP since the previous year (measured quarterly). It is high for large increases, low for large decreases, and in-between when there is little change.
- Number of grace notes ~ Unemployment rate
- Half-beat grace notes can be added to every beat. A grace note is added for a particular beat if there is a successful result of a Bernoulli trial with probability equal to the normalized unemployment rate for the particular month.
- Pitch of grace notes ~ Direction of change in unemployment rate
- If a grace note is added to a beat, the pitch is chosen to be one step up or down (along the aforementioned chord) from the main pitch of the beat. It is one step up if the unemployment rate is going down, one step down if the unemployment rate is going up.
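The grace-note rule can be sketched as one Bernoulli trial per beat. The function and argument names here are hypothetical, and pitches are expressed as steps along the chord rather than as frequencies; `NA` marks beats that receive no grace note.

```r
# main_steps: chord step of the main pitch for each beat
# unemployment_p: normalized unemployment rate for the month, in [0, 1]
# unemployment_rising: TRUE if the unemployment rate is going up
grace_notes <- function(main_steps, unemployment_p, unemployment_rising) {
  # One Bernoulli trial per beat, with success probability unemployment_p.
  add <- runif(length(main_steps)) < unemployment_p
  # One step down if unemployment is rising, one step up if it is falling.
  direction <- if (unemployment_rising) -1 else 1
  ifelse(add, main_steps + direction, NA_real_)
}

set.seed(1)  # only to make the random draws repeatable
grace_notes(c(0, 2, -1, 1), unemployment_p = 0.4, unemployment_rising = TRUE)
```

Months with higher unemployment therefore tend to carry more grace notes, while the direction of each grace note reflects whether unemployment is rising or falling.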
The given flight dataset
In the given flight dataset, each record is a route on a month. A route might have multiple flights per month. The variables are
- Other city
- Other country
- Incoming or outgoing
- Scheduled or not
- Number of passengers (total across all flights)
- Cargo weight (total cargo weight across all flights)
- Number of flights (flight on the day)
- Seat capacity (single airplane seat capacity times number of flights)
I used all of these variables in my analysis except for the seat capacity, as I thought it was unfair to cargo that there was a seat capacity but not a cargo capacity.
I suspect that each row in fact corresponds to an airport rather than to a city because large cities (with multiple airports) have multiple records for a particular month-scheduling-direction.
    # Count the records per city for one month, scheduling status, and direction.
    flights.sub <- subset(flights,
                          month == '2009-08-01' &
                          scheduled == 'Scheduled' &
                          direction == 'Outgoing')
    # Show the 20 cities with the most records.
    print(tail(sort(table(flights.sub$other.city.id)), 20))
    #  Timisoara    Tirana  Tirgu Mures  Treviso  Uzhgorod  Valetta
    #          1         1            1        1         1        1
    #      Varna    Venice       Warsaw     Wien    Zagreb   Zürich
    #          1         1            1        1         1        1
    #     Berlin  Göteborg        Milan   Moscow      Oslo    Paris
    #          2         2            2        2         2        2
    #  Stockholm    London
    #          2         3
One should be able to run the present analysis on a modern operating system with the following extra packages.
- ffmpeg (for converting audio file formats)
- R (main composition language)
- grid library
You will need the following packages to build the data
- wget (for downloading data files)
- Gnumeric (ssconvert for making CSVs)
You will need an audio or video player to play the audio or video outputs. I used sox (audio files) and mplayer (for video).
I have saved the built data files, so it is possible to run the analysis without the second set of packages. (Run make clean in the data directory to delete the built files.)
I ran the analysis on OpenBSD 5.9.
With the aforementioned dependencies, the present analysis should be fully reproducible from the files in the repository. Run make, and then open /tmp/ugros.mkv in a video player.