Hopping Cities

Airplanes dance to Budapest airport flight data, along with other relevant data.

Legend

This plot contains about 20 variables.

Because the challenge focused on this particular dataset, I used the different sensory media to distinguish between the given data and the other data I acquired. The given flight data are presented only in the video, alongside only a few other variables; the music includes none of the given flight data.

Time dimension

The time dimension is the key by which the video and music variables are joined. Time is divided into months from January 2007 to June 2012 (66 months). Each month corresponds to 16/9 (1.778) seconds, eight beats of music, and eight frames of video. The whole song is thus 117 seconds long.
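The timing arithmetic above can be checked with a few lines of R. This is only a sketch of the calculation; the variable names are made up for illustration and are not taken from the repository.

```r
# Month index from the text: January 2007 to June 2012
months <- seq(as.Date("2007-01-01"), as.Date("2012-06-01"), by = "month")
n.months <- length(months)                      # 66 months

seconds.per.month <- 16 / 9
beats.per.month <- 8
frames.per.month <- 8

total.seconds <- n.months * seconds.per.month   # about 117.3 seconds
frames.per.second <- frames.per.month / seconds.per.month       # 4.5
beats.per.minute <- 60 * beats.per.month / seconds.per.month    # 270
```

The frame rate and tempo are derived from the figures in the text rather than stated there.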

To compose the video I specify two sets of animations. The first set describes the outgoing trips for the present month, and the second set describes incoming trips. Each set is four frames long, so I get eight frames by playing them in series.

To compose the music, I generate four beats of music for each month. Then I play this twice, producing the eight beats per month. The music contains mostly year-level and quarter-level data, so adjacent months tend to sound similar to each other.
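As a rough illustration of how one month is assembled, the sketch below builds the eight frame labels and the eight beats. The functions month.frames and month.beats are hypothetical, and the labels merely stand in for rendered frames and generated notes.

```r
# Hypothetical helper: four outgoing frames followed by four incoming frames
month.frames <- function(month) {
  c(paste(month, "outgoing", 1:4),
    paste(month, "incoming", 1:4))
}

# Hypothetical helper: the same four beats played twice in a row
month.beats <- function(four.beats) {
  stopifnot(length(four.beats) == 4)
  rep(four.beats, times = 2)
}

frames <- month.frames("2009-08")
beats <- month.beats(c("a", "b", "c", "d"))
```

Both results have length eight, so one frame lines up with one beat.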

Video

Airplanes are arranged in dancing circles, with dancing circles positioned on a scatterplot.

Each dancing circle corresponds to a city other than Budapest, to which flights are going and coming.

Circle ~ city
One dancing circle is one city
Airplanes ~ flights
One ✈ = 10 flights
Airplane orientation ~ Flight direction
Airplanes face into the circle for incoming flights and away from the circle for outgoing flights.
Circle radius ~ number of flights
Larger circles mean more flights.
Front-back jump distance ~ passengers per flight
If there are more passengers per flight, the circle will move in and out more.
Left-right jump distance ~ cargo per flight
If there is more cargo in the flight (by mass), the planes in the circle move side-to-side more. Most routes do not convey any cargo, and these planes thus do not move side-to-side.
Hue ~ Schengen
Blue Plane = Flight within Schengen Zone
Brightness ~ Schedule
Airplanes in darker color correspond to 10 scheduled flights, and airplanes in lighter color correspond to 10 non-scheduled flights. Most flights are scheduled, and partial airplanes are not included, so I always round in favor of non-scheduled flights.
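One possible reading of "round in favor of non-scheduled flights" is sketched below. The text does not spell out the rule, so this is an assumption: the total number of planes is rounded to the nearest whole plane, the non-scheduled count is rounded up, and the scheduled planes take whatever remains. The function plane.counts is hypothetical.

```r
# Assumed rounding rule, not confirmed by the source
plane.counts <- function(scheduled.flights, nonscheduled.flights,
                         flights.per.plane = 10) {
  total <- round((scheduled.flights + nonscheduled.flights) / flights.per.plane)
  # Round non-scheduled planes up, but never beyond the total
  nonscheduled <- min(total, ceiling(nonscheduled.flights / flights.per.plane))
  c(scheduled = total - nonscheduled, nonscheduled = nonscheduled)
}

counts <- plane.counts(scheduled.flights = 95, nonscheduled.flights = 5)
```

Under this rule, a route with only five non-scheduled flights still gets one lighter airplane.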

Any particular frame displays data from a particular direction of travel (incoming or outgoing) and month. Each circle thus corresponds to a route-month, and the number of planes in the circle corresponds to the number of flights on that route in that month.

In order to apply the jump aesthetics, I in fact made four frames for each route-month. The planes are all the way in on the first of these frames, all the way out on the third, and in the middle on the second and fourth.
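The four-frame jump cycle could be sketched as follows for the front-back direction. The phase values, the even spacing of planes around the circle, and the function plane.positions are assumptions made for illustration; side-to-side motion is left out.

```r
# Radial phase for the four frames: in, middle, out, middle
jump.phase <- c(-1, 0, 1, 0)

# Hypothetical helper: plane coordinates for one circle on one frame
plane.positions <- function(n.planes, radius, jump, frame, centre = c(0, 0)) {
  theta <- 2 * pi * (seq_len(n.planes) - 1) / n.planes  # evenly spaced angles
  r <- radius + jump.phase[frame] * jump                # radius after the jump
  data.frame(x = centre[1] + r * cos(theta),
             y = centre[2] + r * sin(theta))
}

out.frame <- plane.positions(n.planes = 4, radius = 2, jump = 0.5, frame = 3)
```

Frames two and four give the same coordinates, which is what makes the cycle loop smoothly.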

The circle is positioned based on country-level data for its city. Cities in the same country have the same country-level data and thus form concentric circles.

x ~ distance
Distance between the country and Hungary, measured as the haversine result for the two country centroids.
y ~ log(population)
Logarithm of population of the other country
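For reference, the haversine distance between two centroids can be computed as in the sketch below. The Earth radius of 6371 km, the use of the natural logarithm, and the function names are assumptions; the source does not state which values or helpers were used.

```r
# Great-circle distance in km between points given in decimal degrees
haversine <- function(lat1, lon1, lat2, lon2, radius.km = 6371) {
  to.rad <- pi / 180
  dlat <- (lat2 - lat1) * to.rad
  dlon <- (lon2 - lon1) * to.rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to.rad) * cos(lat2 * to.rad) * sin(dlon / 2)^2
  2 * radius.km * asin(sqrt(pmin(1, a)))  # pmin guards against rounding above 1
}

# Hypothetical helper: scatterplot position of a country's circles
circle.position <- function(lat, lon, hungary.lat, hungary.lon, population) {
  c(x = haversine(lat, lon, hungary.lat, hungary.lon),
    y = log(population))
}
```

Because both coordinates are country-level, every city in a country gets the same position.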

Music

Recall that the same four beats of music are played twice per month. The main tune starts with a baseline note for one half-beat, and the remaining seven half-beats have pitches corresponding to proportional changes in statistics.

  1. Baseline note
  2. Proportional change in Hungary retail sales since last year
  3. Proportional change in Hungary imports since last year
  4. Proportional change in Hungary exports since last year
  5. Proportional change in Hungary hotel beds since last year
  6. Proportional change in Hungary other beds since last year
  7. Proportional change in Budapest population since last year
  8. Proportional change in Budapest dwellings since last year

Pitches are chosen along a just-tuned major seventh chord. For each 1% change, the note moves up or down by one step along the chord. (This linear approximation of percent change is appropriate only because the change each year is small.) For example, if imports go up 2% one year, the third note in the chord is played. If there is no change since the previous year, the percent change is 0%, and the baseline note is thus played.
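The mapping from percent change to pitch might be sketched like this. A just-tuned major seventh chord has the frequency ratios 1, 5/4, 3/2 and 15/8; the 220 Hz baseline, the rounding to whole steps, and the continuation of the chord into neighboring octaves are assumptions, and both function names are hypothetical.

```r
# Frequency ratios of a just-tuned major seventh chord
chord.ratios <- c(1, 5/4, 3/2, 15/8)

# Frequency of the note `step` chord steps from the baseline (step 0)
chord.pitch <- function(step, base.hz = 220) {
  octave <- step %/% length(chord.ratios)  # floor division, also for negatives
  degree <- step %% length(chord.ratios)   # always 0..3 in R
  base.hz * 2^octave * chord.ratios[degree + 1]
}

# One chord step per 1% of year-on-year change (rounding is an assumption)
change.to.step <- function(this.year, last.year) {
  round(100 * (this.year / last.year - 1))
}

imports.hz <- chord.pitch(change.to.step(102, 100))
```

A 2% rise lands on the third chord note, a perfect fifth above the baseline, and a 0% change returns the baseline itself.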

Percussion ~ Summer
A low drum plays on every down beat. During the summer (high travel season), a high clap is added on every second up beat.
Volume ~ Gross domestic product
Volume of everything except the percussion is based on the change in GDP since the previous year (measured quarterly). It is high for large increases, low for large decreases, and in-between when there is little change.
Number of grace notes ~ Unemployment rate
Half-beat grace notes can be added to every beat. A grace note is added for a particular beat if there is a successful result of a Bernoulli trial with probability equal to the normalized unemployment rate for the particular month.
Pitch of grace notes ~ Direction of change in unemployment rate
If a grace note is added to a beat, the pitch is chosen to be one step up or down (along the aforementioned chord) from the main pitch of the beat. It is one step up if the unemployment rate is going down, one step down if the unemployment rate is going up.
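The grace-note rule could be sketched as below. The text does not say how the unemployment rate is normalized, so the min-max scaling here is an assumption, and normalize and grace.steps are hypothetical names.

```r
# Assumed normalization: rescale a series to the range 0..1
normalize <- function(x) (x - min(x)) / (max(x) - min(x))

# For each beat, run a Bernoulli trial; on success return the chord step of
# the grace note, otherwise NA (no grace note on that beat)
grace.steps <- function(main.steps, p.grace, unemployment.rising) {
  add <- rbinom(length(main.steps), size = 1, prob = p.grace) == 1
  offset <- if (unemployment.rising) -1 else 1  # rising rate: one step down
  ifelse(add, main.steps + offset, NA)
}

set.seed(1)  # only to make the random draw repeatable
month.grace <- grace.steps(main.steps = c(0, 2, 1, 3), p.grace = 0.4,
                           unemployment.rising = TRUE)
```

Months with a higher normalized unemployment rate get more grace notes on average, though any single month is random.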

The given flight dataset

In the given flight dataset, each record is a route in a month. A route might have multiple flights per month. The variables are:

  • Other city
  • Other country
  • Incoming or outgoing
  • Scheduled or not
  • Month
  • Number of passengers (total across all flights)
  • Cargo weight (total cargo weight across all flights)
  • Number of flights (flight on the day)
  • Seat capacity (single airplane seat capacity times number of flights)

I used all of these variables in my analysis except for the seat capacity, as I thought it was unfair to cargo that there was a seat capacity but not a cargo capacity.

I suspect that each row in fact corresponds to an airport rather than to a city because large cities (with multiple airports) have multiple records for a particular month-scheduling-direction.

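# Scheduled outgoing records for one month
flights.sub <- subset(flights, month == '2009-08-01' &
                      scheduled == 'Scheduled' & direction == 'Outgoing')
# Count records per city and show the 20 cities with the most records
print(tail(sort(table(flights.sub$other.city.id)), 20))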
# Timisoara      Tirana Tirgu Mures     Treviso    Uzhgorod     Valetta
#         1           1           1           1           1           1
#     Varna      Venice      Warsaw        Wien      Zagreb      Zürich
#         1           1           1           1           1           1
#    Berlin    Göteborg       Milan      Moscow        Oslo       Paris
#         2           2           2           2           2           2
# Stockholm      London
#         2           3

Dependencies

One should be able to run the present analysis on a modern operating system with the following extra packages.

  • ffmpeg (for converting audio file formats)
  • R (main composition language)
    • grid library

You will need the following packages to build the data.

  • wget (for downloading data files)
  • unzip
  • Gnumeric (ssconvert for making CSVs)

You will need an audio or video player to play the audio or video outputs. I used sox (for audio) and mplayer (for video).

I have saved the built data files, so it is possible to run the analysis without the second set of packages. (Run make clean in the data directory to delete them.)

I ran the analysis on OpenBSD 5.9.

Data sources

Aside from the given flight data, most of the other data come from the Hungarian Central Statistical Office. Full details are here.

With the aforementioned dependencies, the present analysis should be fully reproducible from the files in the repository. Run make, and then open /tmp/ugros.mkv in a video player.