Drill-down visualizations can be a great way to present a lot of data in a digestible format. In this example, we’ll create a graph of median home values by U.S. state using R and the highcharter package.
Initial graph of median home values by state (highest and lowest 10 states). Data from Zillow.
Each state’s bar will be clickable (the drilldown) to see data by county.
After clicking the bar for Massachusetts, a user sees median home values by Massachusetts county. Data from Zillow.
There are three main steps to creating a drill-down graph with highcharter:
- Wrangle your data into the necessary format,
- Create a basic top-level graph, and
- Add the drill-down.
If you want to follow along, download state- and county-level data sets for the Zillow Home Value Index from Zillow at https://www.zillow.com/research/data/. I’m using the ZHVI Single-Family Homes series.
First, load the packages we’ll be using:
library(rio)
library(dplyr)
library(purrr)
library(highcharter)
library(scales)
library(stringr)
All can be installed from CRAN with install.packages() if you don’t already have them on your system.
Note that highcharter is an R wrapper for the Highcharts JavaScript library, and that library is only free for personal, non-commercial use (including testing it locally), or use by non-profits, universities, or public schools. For anything else, including government use, you need to buy a license.
Next, I import the state and county CSV files into R with the following code. (My CSV files are in a data subfolder of my project directory.)
states <- import("data/State_zhvi.csv")
counties <- import("data/County_zhvi.csv")
These files have hundreds of columns, one for each month starting in 1996. I want to graph the most recent data, so I look for the name of the last column with
names(states)[ncol(states)]
At the time I wrote this, that returned 2020-06-30, which I’ll use as my MedianValue column. I’d like to compare that value to the start of the century, so I’ll also include 2000-01-31 as a PriceIn2000 column.
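As a small illustration of that column lookup, here is a sketch using a made-up data frame. The toy_states object and its numbers are hypothetical stand-ins for the Zillow file, not real data.

```r
# Hypothetical stand-in for the Zillow state file: one column per month-end date.
toy_states <- data.frame(
  StateName = c("MA", "OH"),
  `2000-01-31` = c(200000, 110000),  # invented values
  `2020-06-30` = c(450000, 160000),  # invented values
  check.names = FALSE  # keep the date-style column names exactly as written
)

# Name of the last column, i.e. the most recent month in the file
last_col <- names(toy_states)[ncol(toy_states)]
print(last_col)

# The name starts with a digit, so refer to it with [[ ]] or backticks
latest_values <- toy_states[[last_col]]
print(latest_values)
```

Because these column names are not syntactic R names, they need backticks when used inside dplyr verbs.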
Data wrangling
Here’s my code for creating a latest_states data frame, which I’ll use as a base for the graph:
latest_states <- states %>%
  select(c(State = StateName, MedianValue = `2020-06-30`, PriceIn2000 = `2000-01-31`)) %>%
  arrange(desc(MedianValue)) %>%
  mutate(
    MedianValueFormatted = dollar(MedianValue),
    PriceIn2000Formatted = dollar(PriceIn2000),
    PctChangeFrom2000 = percent((MedianValue - PriceIn2000) / PriceIn2000, accuracy = 1)
  ) %>%
  slice(c(1:10, 42:51))
In addition to selecting and renaming columns, the code above arranges data by descending median value, adds three new columns, and chooses the 10 highest and lowest state rows by MedianValue with slice().
The new columns MedianValueFormatted and PriceIn2000Formatted just add dollar signs and commas to values. I’m only using those to display nicely formatted values in the graph tooltips. You can format values in highcharter using some JavaScript code instead, but I find it easier to do in R. Another new column calculates the percent change from 2000 to the most recent value.
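To see what the formatting and slicing steps produce, here is a minimal sketch on invented numbers. The toy data frame is hypothetical. The slice() call uses n() rather than hard-coded row numbers, which is an alternative to the fixed 42:51 range in the code above, not the approach used there.

```r
library(dplyr)
library(scales)

# Hypothetical values, not Zillow data
toy <- tibble(
  State = c("A", "B", "C", "D", "E", "F"),
  MedianValue = c(600000, 500000, 400000, 300000, 200000, 100000),
  PriceIn2000 = c(300000, 200000, 200000, 200000, 160000, 80000)
)

toy_formatted <- toy %>%
  arrange(desc(MedianValue)) %>%
  mutate(
    # Adds a dollar sign and thousands separators
    MedianValueFormatted = dollar(MedianValue),
    # Formats a proportion as a whole-number percentage
    PctChangeFrom2000 = percent((MedianValue - PriceIn2000) / PriceIn2000, accuracy = 1)
  )

# Top 2 and bottom 2 rows without hard-coding the total row count
top_and_bottom <- toy_formatted %>%
  slice(c(1:2, (n() - 1):n()))

print(top_and_bottom)
```

Using n() keeps the selection correct if the number of rows in the source file ever changes.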
I use mostly similar code to create a latest_counties data frame. Here I make sure to filter for rows that include states in my latest_states data. I also remove the word “County” from all the county names to save space on the graph. And, when I arrange by descending median value, I first arrange by state so counties are arranged within each state group.
latest_counties <- counties %>%
  select(c(County = RegionName, State = StateName, MedianValue = `2020-06-30`, PriceIn2000 = `2000-01-31`)) %>%
  filter(State %in% latest_states$State) %>%
  arrange(State, desc(MedianValue)) %>%
  group_by(State) %>%
  mutate(
    County = str_remove(County, " County"),
    MedianValueFormatted = dollar(MedianValue),
    PriceIn2000Formatted = dollar(PriceIn2000),
    PctChangeFrom2000 = percent((MedianValue - PriceIn2000) / PriceIn2000, accuracy = 1)
  ) %>%
  ungroup() %>%
  arrange(State, desc(MedianValue))
So far, this is typical R data wrangling. Next, though, I’ll create a special drill-down version of the county data specifically for highcharter.
Data frame for a graph’s drill-down
Take a look at the code below, and then I’ll break down what is happening.
county_drilldown <- latest_counties %>%
  group_nest(State) %>%
  mutate(
    id = State,
    type = "column",
    data = map(data, mutate, name = County, y = MedianValue),
    data = map(data, list_parse)
  )
First, I use dplyr’s group_nest() function to create a list column for each state’s row with just that state’s data. The result is a data frame with two columns: State and data. The data column has a data frame for each state’s data by county, such as the one below for Massachusetts:
Result of dplyr’s group_nest() function.
Each entry in the data column is a data frame with that state’s data by county.
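For readers without the screenshot, a minimal sketch of group_nest() on a tiny invented data frame shows the same structure. The toy_counties values are hypothetical.

```r
library(dplyr)

# Hypothetical county rows with invented values
toy_counties <- tibble(
  State = c("MA", "MA", "DC"),
  County = c("Nantucket", "Suffolk", "District of Columbia"),
  MedianValue = c(900000, 500000, 600000)
)

nested <- toy_counties %>% group_nest(State)

# The grouping column plus a list column named "data"
print(names(nested))

# One row per state
print(nrow(nested))

# The nested data frame for MA holds only that state's counties
ma_data <- nested$data[[which(nested$State == "MA")]]
print(ma_data)
```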
But highcharter needs a little more information and formatting to create a drill-down. The first two lines of code under mutate() in the code above add an id column (how the drilldown connects to the data one level up) and a type column, for the type of highcharter graph I want. The first map() line of code adds two columns to each data frame in the list column: name for the county name and y for the county value.
The last line in mutate() uses the highcharter function list_parse() to turn each data frame in the data column into a list with a specific format, such as the image below showing a list with data for the District of Columbia.
Result of the list_parse() function for District of Columbia data.
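To make that target shape concrete, here is a base R sketch that builds a similar list of lists by hand. It is only an approximation for illustration: rows_to_list() is a hypothetical helper, not part of highcharter, and list_parse() itself may differ in details.

```r
# Hypothetical helper that mimics the general shape described above:
# one named list per row of the data frame.
rows_to_list <- function(df) {
  lapply(seq_len(nrow(df)), function(i) as.list(df[i, , drop = FALSE]))
}

toy_dc <- data.frame(
  name = "District of Columbia",
  y = 600000,  # invented value
  stringsAsFactors = FALSE
)

parsed <- rows_to_list(toy_dc)

# Show the nested structure: a list with one element per row,
# each element a named list of that row's values
str(parsed)
```

Each inner list corresponds to one point in the drill-down series, with name and y entries for the category label and the bar height.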
Customize the tooltip
My next step is an optional one: formatting the tooltip. Highcharter’s tooltip_table() function lets you change the default format and add more columns than appear on your graph. This is one of the easiest ways I’ve seen to customize tooltips for an R HTML widget. You create one vector with the category text and another vector with the values, such as
tooltip_category_text <- c("Median Value: ",
                           "Price in 2000: ",
                           "Percent change from 2000: ")
tooltip_formatted_values <- c("{point.MedianValueFormatted}",
                              "{point.PriceIn2000Formatted}",
                              "{point.PctChangeFrom2000}")
my_tooltips <- tooltip_table(tooltip_category_text, tooltip_formatted_values)
That code will create a tooltip that looks something like the one shown in the image below.
Custom tooltip using highcharter’s tooltip_table() function. Data from Zillow.
Finally, we’re ready to make a graph!
Graph code
hchart() is a highcharter function to create a highchart object with a basic graph. In the code below, hchart() takes as its arguments the data frame, the chart type, and then a ggplot-like aesthetic function, hcaes(). Its options include the x column, the y column, and the drill-down column.
mygraph <- hchart(
  latest_states,
  "column",
  hcaes(x = State, y = MedianValue, name = State, drilldown = State),
  name = "Median Home Values",
  colorByPoint = TRUE
)
hc_drilldown() adds that drilldown.
mygraph <- mygraph %>%
  hc_drilldown(
    allowPointDrilldown = TRUE,
    series = list_parse(county_drilldown)
  )
Note in the code above that the series argument needs list_parse() again.
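Since the drill-down connects through matching keys (the drilldown value on each top-level bar and the id of each drill-down series), a quick check along these lines can catch mismatches before plotting. This is a sketch with hypothetical toy vectors, not part of the original workflow.

```r
# Hypothetical drilldown values from the top-level data
top_level_keys <- c("MA", "DC", "OH")

# Hypothetical id values from the drill-down series
drilldown_ids <- c("DC", "MA")

# Keys with no matching drill-down series; these bars would have nothing to drill into
missing_ids <- setdiff(top_level_keys, drilldown_ids)
print(missing_ids)
```

With the objects built in this article, the equivalent check would be setdiff(latest_states$State, county_drilldown$id), which should return an empty vector.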
hc_tooltip() uses the custom tooltip format I created earlier.
mygraph <- mygraph %>%
  hc_tooltip(
    pointFormat = my_tooltips,
    useHTML = TRUE
  ) %>%
  hc_yAxis(title = "") %>%
  hc_xAxis(title = "") %>%
  hc_title(text = "Median Home Values by State June 2020") %>%
  hc_subtitle(
    text = "Click bar to see by county. Source: Zillow",
    style = list(useHTML = TRUE)
  )
The useHTML = TRUE argument makes for a nicer tooltip format. After that, there’s code to remove the x and y axis labels, and to add a title and subtitle.
You can type the name of the graph object in your R console to see it displayed in RStudio, or add the graph to an R Markdown document or Shiny app. You can also use the htmlwidgets package’s saveWidget() function to save the single graph as a self-contained HTML file, such as
library(htmlwidgets)
saveWidget(mygraph, file = "my_drilldown_graph.html",
           selfcontained = TRUE)
Highcharter package author Joshua Kunst has a lot more information about what you can do with highcharter at the highcharter package website.
For more R tips, head to the Do More With R page at https://bit.ly/domorewithR or the Do More With R playlist on the IDG TECHtalk YouTube channel.
Copyright © 2020 IDG Communications, Inc.