D3.js Step by Step: Loading External Data

Reading in data from an external CSV file

September 14th, 2014

UPDATE (July 18, 2016): The code and API links in these tutorials have been updated to target D3 v4, which was a complete rewrite. The D3 wiki contains a breakdown of the changes from v3.

TL;DR

This post is part of a series that explores some key concepts in D3.js by building up an example, step by step, from a bare-bones pie chart to an interactive, animated donut chart that loads external data. For the enough-with-the-jibber-jabber-show-me-the-code types out there, here's a breakdown of the steps we'll be covering:

Step 0: Intro
Step 1: A Basic Pie Chart (Code | Demo)
Step 2: A Basic Donut Chart (Code | Demo)
Step 3: Adding a Legend (Code | Demo)
Step 4: Loading External Data (Code | Demo) ← You Are Here
Step 5: Adding Tooltips (Code | Demo)
Step 6: Animating Interactivity (Code | Demo)

NOTE: Because we're building things up step by step, the source code contains NEW, UPDATED and REMOVED comments to annotate the lines that have been added, altered or deleted relative to the previous step.

Up until now our dataset has been, for lack of a better word, contrived. It's time we used some real data: the Toronto Parking Ticket dataset. In particular, we'll use the volume of parking tickets by day of the week in 2012.

The dataset does not directly provide the weekday numbers; I used the excellent pandas and an IPython notebook to extract that information. You can find the notebook in the repo for this post. The CSV file that resulted from that extraction looks like this:

label,count
Monday,379130
Tuesday,424923
Wednesday,430728
Thursday,432138
Friday,428295
Saturday,368239
Sunday,282701

The first row defines the column names and then comes the data itself. Rather than code this directly into our JavaScript, which would be infeasible for larger datasets, we'll use D3 to load the file. To do this we only need five new lines of code (and one of them doesn't really count):

// ...
// A bunch of code
// ...
var pie = d3.pie()
  .value(function(d) { return d.count; })
  .sort(null);

d3.csv('weekdays.csv', function(error, dataset) {  // NEW
  dataset.forEach(function(d) {                    // NEW
    d.count = +d.count;                            // NEW
  });                                              // NEW

  var path = svg.selectAll('path')
    .data(pie(dataset))
  // ...
  // A bunch of code
  // ...
  legend.append('text')
    .attr('x', legendRectSize + legendSpacing)
    .attr('y', legendRectSize - legendSpacing)
    .text(function(d) { return d; });

});                                                // NEW

First of all, we indent everything starting at the definition of path and wrap it in a callback to D3's csv() method, which takes a URL as the first parameter. In our case we're just pointing it at the weekdays.csv file that's in the same directory as the code itself. Loading an external file is an asynchronous operation, which is why we have to put all of the code that's dependent on the dataset inside a callback.
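To see why the dependent code has to live inside the callback, here is a minimal sketch that runs without D3. The fakeLoad function is a hypothetical stand-in for an asynchronous loader like csv(); it is not part of D3 and uses setTimeout only to simulate the delay of a real request.

```javascript
// Hypothetical stand-in for an asynchronous loader such as d3.csv():
// it hands the rows to the callback later, not immediately.
function fakeLoad(rows, callback) {
  setTimeout(function() {
    callback(null, rows);
  }, 0);
}

var events = [];
var loaded;

fakeLoad([{ label: 'Monday', count: 379130 }], function(error, dataset) {
  loaded = dataset;               // only available once the callback runs
  events.push('callback ran');
});

// This line runs before the callback, so `loaded` is still undefined here.
events.push('code after the call ran; loaded is ' + typeof loaded);
```

Anything placed after the call, rather than inside the callback, runs before the data exists. That is the situation our path and legend code would be in if we left it where it was.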

When D3 parses a CSV file, by default it assumes that the first row contains the column names, as is the case with our file. It uses those names as the keys in the key-value pairs on the object that it creates for each row. In other words, the dataset we end up with will have the same structure as the dummy dataset we defined in Step 1, except that it will have seven entries.
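The shape of the result is easier to picture with a small example. The sketch below is an illustration only: parseSimpleCsv is a hypothetical helper, not D3's parser, and it ignores quoting and escaping. It just shows how a header row can become the keys of one object per data row.

```javascript
// Illustration only: a tiny stand-in for header-row CSV parsing.
// It is NOT D3's implementation and ignores quoting and escaping.
function parseSimpleCsv(text) {
  var lines = text.trim().split('\n');
  var columns = lines[0].split(',');        // first row: column names

  return lines.slice(1).map(function(line) {
    var values = line.split(',');
    var row = {};
    columns.forEach(function(name, i) {
      row[name] = values[i];                // every value starts out as a string
    });
    return row;
  });
}

var sample = 'label,count\nMonday,379130\nTuesday,424923';
var rows = parseSimpleCsv(sample);
// rows[0] is { label: 'Monday', count: '379130' }
```

Note that count comes back as the string '379130', not the number 379130.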

One issue is that the values in our newly fetched dataset will all be strings initially. Because we need the counts to be numerical, in the first three lines after the CSV has loaded we iterate over our new dataset and convert each count field to a number. A concise way to turn a string into a number in JavaScript is to stick a + in front of it (the unary plus operator). Once that's done we can carry on with the rest of the code, remembering, of course, to close off the callback and the csv() call at the end.
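For readers who haven't met the + trick before, this short sketch shows what it does and why skipping it causes trouble. The variable names are just for illustration.

```javascript
var raw = '379130';          // what the parser hands us: a string

var asNumber = +raw;         // 379130, now a number
var concatenated = raw + 1;  // '3791301', string concatenation rather than addition
var added = +raw + 1;        // 379131

// Two edge cases worth knowing about:
var notNumeric = +'n/a';     // NaN, no error is thrown
var empty = +'';             // 0, an empty field silently becomes zero
```

Number(raw) gives the same results as +raw if you prefer something more explicit.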

One thing to note is that we don't need to alter any of the existing code (apart from sticking some of it inside a callback). The donut segments and the legend readjust accordingly. We could, in fact, load in any CSV that has label and count fields and it would work. The only issue would be if there were so many rows that the legend outgrew its donut hole.

Also worth mentioning is that the data file did not need to be a CSV. D3 also has methods for dealing with TSV files and files with arbitrary delimiters. Of course we aren't limited to D3's file parsers; they're handy but not required. As long as we have a dataset to feed D3 and can tell it how to access what it needs, we're golden.
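As a rough sketch of that last point, here is a dataset built from a JSON string without any of D3's parsers. The JSON payload is hypothetical and shortened to two days; the accessor has the same shape as the one we pass to the pie layout's value().

```javascript
// Hypothetical JSON payload; in practice it could come from any source.
var json = '[{"label":"Monday","count":379130},{"label":"Tuesday","count":424923}]';
var dataset = JSON.parse(json);

// The same accessor shape the pie layout uses in this step.
var getCount = function(d) { return d.count; };

// Because JSON carries real numbers, no string-to-number step is needed.
var total = dataset.reduce(function(sum, d) {
  return sum + getCount(d);
}, 0);
```

An array like this could be passed straight to pie(dataset), just like the one we get back from the CSV.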

We can now start adding some interactivity in the form of tooltips. But first you can peruse the full code for this step if you wish: