Trulia Blog

Introducing choroplethr version 1.2.0

Introduction

Today I am pleased to announce that the latest version of choroplethr is now available on CRAN.  To install it you can type the following from an R console:


install.packages("choroplethr")

library(choroplethr)

You only need to install the package once, but you need to run the library command in each session in which you want to use choroplethr.

The most requested feature has been to give users more control over how maps are rendered.  Version 1.2 provides three functions (get_acs_df, bind_df_to_map and render_choropleth) to address this.  Here are four examples of using those functions to extract meaning from maps.

Example 1: Showing states with population over 1M

The choroplethr_acs function, available since version 1.0, makes it easy to create maps from ACS data.  Consider the example of creating a map of the population of US states:


# see ?choroplethr_acs for an explanation of the parameters

choroplethr_acs("B01003", "state")

Image 1

This map is very informative, but no map can tell you everything.  For example, in this map you cannot tell which states have a population above or below 1 million residents.  The new features in choroplethr make this easy:


# Use get_acs_df to get an ACS table as a data.frame

df = get_acs_df("B01003", "state")

# Use bind_df_to_map to bind the data to a map

df.map = bind_df_to_map(df, "state")

# change the population from a number to a factor

# which shows whether the value is above or below 1M

library(Hmisc) # for cut2

df.map$value = cut2(df.map$value, cuts=c(0,1000000,Inf))

# use render_choropleth to render the resulting object

render_choropleth(df.map, "state", "States with a population over 1M", "Population")

Image 2
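
To make the binning step concrete, here is a minimal sketch of what converting a number to a two-level factor looks like.  It assumes made-up population values rather than real ACS data, and it uses base R's cut as a stand-in for Hmisc's cut2 so that it runs without extra packages.  The variable names and labels are hypothetical.

```r
# Hypothetical populations (not real ACS values)
pop <- c(600000, 1000000, 38000000)

# Bin each value as below 1M or at least 1M.
# right = FALSE makes the intervals [0, 1M) and [1M, Inf),
# so a value of exactly 1M lands in the upper bin.
pop_bin <- cut(pop,
               breaks = c(0, 1000000, Inf),
               right  = FALSE,
               labels = c("under 1M", "1M or more"))

# Count how many values fall in each bin
print(table(pop_bin))
```

With only two levels in the factor, each region falls into one of two categories, which is what lets the map answer the above-or-below question directly.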

Example 2: Comparing States and Counties with Populations over 1M

These features open a new door for analysis.  As a small example, let’s create a map that compares states with populations over 1M with counties that have populations over 1M:


# States with greater than 1M residents

library(Hmisc) # for cut2

df = get_acs_df("B01003", "state") # population

df.map = bind_df_to_map(df, "state")

df.map$value = cut2(df.map$value, cuts=c(0, 1000000, Inf))

state.pop = render_choropleth(df.map, "state", "States with a population over 1M", "Population")

# Counties with greater than 1M residents

df = get_acs_df("B01003", "county") # population

df.map = bind_df_to_map(df, "county")

df.map$value = cut2(df.map$value, cuts=c(0, 1000000, Inf))

county.pop = render_choropleth(df.map, "county", "Counties with a population over 1M", "Population")

library(gridExtra)

grid.arrange(state.pop, county.pop, nrow=2, ncol=1)

Image 3

Example 3: “The 1%”

One of the most talked-about demographic features in the news today is “The 1%”.  This refers to individuals whose income is in the 99th percentile in a given year.  In this example we use the new features in choroplethr to highlight counties where the median family income is in the 99th percentile of all counties nationwide:


df = get_acs_df("B19113", "county") # median family income

df.map = bind_df_to_map(df, "county")

df.map$value = cut2(df.map$value, cuts=c(min(df$value), quantile(df$value, 0.99), max(df$value)))

render_choropleth(df.map, "county", "Counties with the Top 1% Median Family Income")

Image 4
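
The percentile cutoff used above can be explored on its own.  The following is a small sketch, assuming a made-up vector of incomes instead of the real ACS table; the variable names are hypothetical.  I pass na.rm = TRUE as a precaution, on the assumption that real data may contain missing values.

```r
# Hypothetical incomes: 1000 evenly spaced values (not real ACS data)
income <- seq(20000, by = 100, length.out = 1000)

# The 99th percentile is the cutoff for the "top 1%"
cutoff <- quantile(income, 0.99, na.rm = TRUE)

# Flag the values at or above the cutoff, treating NA as not flagged
top_one_percent <- !is.na(income) & income >= cutoff

# Number of flagged values
print(sum(top_one_percent))
```

Checking the count of flagged values like this is a quick sanity test before rendering: roughly 1% of the rows should end up in the top bin.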

Example 4: ZIPs in California

As a final example, let’s consider trying to identify ZIP codes in California where the median age is between 20 and 30.  (In this post I use the term ZIP code, which is not technically correct, because it’s more widely understood.  The correct term to use here is ZCTA; I explain why in this blog post.)  Note that we can simply remove the ZIPs we are not interested in from the data.frame before binding it to the map.


df = get_acs_df("B01002", "zip") # median age

df.map = bind_df_to_map(df, "zip")

ca_zips = render_choropleth(df.map, "zip", "CA ZIPs", "Median Age", states="CA")

df = df[df$value >= 20 & df$value <= 30, ]

df.map = bind_df_to_map(df, "zip")

ca_zips_20s = render_choropleth(df.map, "zip", "CA ZIPs with median age between 20 and 30", "Median Age", states="CA")

grid.arrange(ca_zips, ca_zips_20s, nrow=2, ncol=1)

Image 5
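
The filtering step above is ordinary data.frame subsetting.  Here is a small self-contained sketch using made-up rows; the region and value column names are an assumption about the layout of the data.frame, and the which() wrapper is an optional guard against missing values, not something the code above requires.

```r
# Hypothetical rows in a region/value layout (an assumption for illustration)
df <- data.frame(region = c("90001", "90002", "90003", "90004"),
                 value  = c(27.4, 41.0, NA, 20.0),
                 stringsAsFactors = FALSE)

# which() keeps only the rows where the comparison is TRUE,
# so a missing median age does not turn into an all-NA row
keep <- which(df$value >= 20 & df$value <= 30)
df_20s <- df[keep, ]

print(df_20s)
```

Without which(), a row whose value is NA would come back as a row of NAs, so the guard is worth considering if the table you pull has gaps.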

Conclusion

Hopefully these new features allow you to do more interesting work with visualizing and analyzing spatial information.  If you would like to share any of your work, please feel free to contact me on Twitter or post on the choroplethr forum.