At any fast-paced and successful company, there are areas of the codebase that are not-so-affectionately referred to as “legacy code”, and Trulia is no different. We’re always innovating and have shipped some amazing things under incredible deadlines.
As we grow, we continue to learn from our mistakes and have made some great advances in performance and scale in line with our engineering organization’s goals. However, we noticed that we were spending too much time on maintenance, and getting bogged down in legacy code when developing new features.
Let’s be honest: writing tests for legacy code is hard. It’s a familiar catch-22: in order to write tests for untestable code you have to refactor; but in order to refactor, you need tests.
Another obvious concern is that our code is quite heterogeneous: different people, in and across teams, use different patterns, styles and OO approaches when programming.
Stability and product execution are important to us: we needed to do something better.
At Trulia we love doing awesome things with awesome people, so we decided to bring in two well-known trainers from thephp.cc: Sebastian Bergmann (the creator of PHPUnit) and Stefan Priebsch (enterprise & architecture expert).
Looking at the size of our frontend engineering team (over 80 and growing!), it was clear that with the breadth of topics we wanted to cover with 2 trainers, we needed to create a custom, week-long program for the Trulia Engineering team.
For Trulia, it was worth setting aside our roadmap for a few days to make sure our team keeps doing what we do best: writing great code. In fact, this attitude is old hat for us—every quarter we spend an entire week innovating on new product & engineering features.
Together with Sebastian & Stefan, we constructed a program that consisted of a series of presentations, workshops, pair-programming, coaching and even team-sized focus groups per team. Loving iteration as much as we do, we modified the content and structure of the sessions daily as we learnt from our observations & feedback.
Now, we have an increased shared knowledge of testing, SOLID principles, patterns and other best practices that we can use for existing and new features. This not only helps us write new code, but allows us to know when & how to make our legacy code better (more testable and reliable).
One thing we learned about our legacy code is that it’s actually not as bad as you might think. Sure, there may be pieces in the wrong places, but with small changes to how we work going forward, it will become more testable and maintainable over time.
At Trulia, we believe that small is beautiful. This philosophy covers both how we design software (small methods, classes have a single responsibility) and how we work (small teams, refactoring the smallest part).
We learnt a great deal from this program and it’s inspired many of us. Here are some perspectives from the engineers in training:
“I learned a lot and at the same time [the training] gave more confidence into myself to not be afraid of doing the right thing.”
“I started writing tests for a class I’ve been working on, and I found several bugs.”
“Over time, as you build tests on top of tests we create more and more business value, by being able to implement new features faster, proclaim higher quality of code, and lower maintenance costs.”
“This code is probably run a billion times a day. Making it more efficient will have a huge impact.”
Of course, it doesn’t stop there. Investing in our team is something we take seriously, and it is an ongoing process. We will continue to invest in and grow our team with future trainings, coding clubs, engineering forums and other goodies. Having an architectural vision for our software is critical for a team of our size, as it provides a guiding light for our design & code. Last year, we saw firsthand the benefits of following a vision when we designed and launched Object-Oriented CSS (OOCSS). Nicole Sullivan talks about some performance & efficiency gains we achieved on her blog.
We are also building a wiki of patterns & solutions for common problems that are in line with our vision. A focus group now meets regularly to iterate on our architectural vision and refine our coding guidelines to help us all apply best practices.
Sound like something that interests you? We’re hiring. Check out current opportunities at trulia.com/jobs.
Today I am pleased to announce several significant changes to the choroplethr package for R. If you deal with visualizing geographic data in R, you might find these changes to be useful. You can get the latest version of the package by typing the following from the R console:
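The command itself did not survive formatting here; assuming the package is installed from CRAN as described in later posts, it would be the standard:

```r
# one-time install from CRAN, then load the package in the current session
install.packages("choroplethr")
library(choroplethr)
```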
Creating a choropleth map in R where Alaska and Hawaii appear as insets is a challenge. I have implemented a solution to this and it is now the default behavior when you call the functions ?choroplethr or ?choroplethr_acs with a data.frame with all 50 states. Here is an example:
library(choroplethr)
data(choroplethr)
choroplethr(df_pop_county, "county", title="2012 County Population Estimates")
Previous versions of choroplethr did not render Alaska or Hawaii at all. For those who are interested in the technical details: choroplethr first renders the continental US, Alaska and Hawaii as three separate images and then combines them. If you are interested in the code, please type ?choroplethr and look at the examples. Some additional bookkeeping is required to have all three maps use the same scale.
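As a sketch of that idea (hypothetical code, not choroplethr’s actual internals), you can render each region as its own ggplot2 plot with a shared fill scale, then combine the three images; the `us.map` data.frame and `render_region` helper here are assumptions for illustration:

```r
library(ggplot2)
library(gridExtra)

# us.map is assumed to be a fortified map data.frame with columns
# long, lat, group, region plus a bound "value" column
render_region = function(d, lims) {
  ggplot(d, aes(long, lat, group = group, fill = value)) +
    geom_polygon() +
    scale_fill_continuous(limits = lims)  # same scale in every panel
}

lims        = range(us.map$value)  # the "bookkeeping": shared scale limits
continental = render_region(us.map[!us.map$region %in% c("alaska", "hawaii"), ], lims)
alaska      = render_region(us.map[us.map$region == "alaska", ], lims)
hawaii      = render_region(us.map[us.map$region == "hawaii", ], lims)
grid.arrange(continental, alaska, hawaii)  # combine the three images
```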
The most requested feature for choroplethr has been support for country-level choropleths. Today I am happy to announce that choroplethr version 1.7 implements such support. This is a new direction for choroplethr and I hope to refine it as I get feedback from the community. Here is an example on how to use it.
data(country.names)
df = data.frame(region=country.names, value=sample(1:length(country.names)))
choroplethr(df, lod="world")
I originally attempted to simplify the creation of world choropleths by using the world map that ships with the maps package. However, this map is quite old, and even contains the USSR as a country. This means that modern data cannot be bound to the map, since modern data does not list the USSR as a region. To address this choroplethr now ships with a world map from Natural Earth Data. This map is not without its own problems, though: the resolution of the map seems to have made smaller countries such as Singapore disappear. For details type:
data(map.world)
?map.world
data(country.names)
?country.names
I hope to continue to improve choroplethr’s support for world maps.
These improvements to choroplethr required me to learn a great deal about mapmaking outside the context of R. In order to help other R programmers make similar contributions I created a page on the choroplethr wiki titled Mapmaking for R Programmers. If you are interested in using R for a customized mapmaking project, this article might be useful for you.
Today I am pleased to announce that the latest version of choroplethr is now available on CRAN. To install it you can type the following from an R console:
You only need to install the package once, but in each session in which you want to use choroplethr you need to load it with the library command.
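The two commands referenced above were lost in formatting; assuming a standard CRAN workflow they would be:

```r
install.packages("choroplethr")  # needed only once per machine
library(choroplethr)             # needed once per R session
```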
The most requested feature has been to give users more control over how maps are rendered. Version 1.2 provides three functions (get_acs_df, bind_df_to_map and render_choropleth) to address this. Here are four examples of using those functions to extract meaning from maps.
The choroplethr_acs function, available since version 1.0, makes it easy to create maps from ACS data. Consider the example of creating a map of the population of US states:
# see ?choroplethr_acs for an explanation of the parameters
choroplethr_acs("B01003", "state")
This map is very informative, but no map can tell you everything. For example, in this map you cannot tell which states have a population above or below 1 million residents. The new features in choroplethr make this easy:
# Use get_acs_df to get an ACS table as a data.frame
df = get_acs_df("B01003", "state")
# Use bind_df_to_map to bind the data to a map
df.map = bind_df_to_map(df, "state")
# change the population from a number to a factor
# which shows whether the value is above or below 1M
library(Hmisc) # for cut2
df.map$value = cut2(df.map$value, cuts=c(0, 1000000, Inf))
# use render_choropleth to render the resulting object
render_choropleth(df.map, "state", "States with a population over 1M", "Population")
These features open a new door for analysis. As a small example, let’s create a map that compares states with populations over 1M with counties that have populations over 1M:
# States with greater than 1M residents
library(Hmisc) # for cut2
df = get_acs_df("B01003", "state") # population
df.map = bind_df_to_map(df, "state")
df.map$value = cut2(df.map$value, cuts=c(0, 1000000, Inf))
state.pop = render_choropleth(df.map, "state", "States with a population over 1M", "Population")

# Counties with greater than 1M residents
df = get_acs_df("B01003", "county") # population
df.map = bind_df_to_map(df, "county")
df.map$value = cut2(df.map$value, cuts=c(0, 1000000, Inf))
county.pop = render_choropleth(df.map, "county", "Counties with a population over 1M", "Population")

library(gridExtra)
grid.arrange(state.pop, county.pop, nrow=2, ncol=1)
One of the most talked about demographic features on the news today is “The 1%”. This refers to individuals whose income is in the 99th percentile in a given year. In this example we use the new features in choroplethr to highlight counties where the median family income is in the 99th percentile of all counties nationwide:
df = get_acs_df("B19113", "county") # median family income
df.map = bind_df_to_map(df, "county")
library(Hmisc) # for cut2
df.map$value = cut2(df.map$value, cuts=c(min(df$value), quantile(df$value, 0.99), max(df$value)))
render_choropleth(df.map, "county", "Counties with the Top 1% Median Family Income")
As a final example, let’s consider trying to identify ZIP codes in California where the median age is between 20 and 30. (In this post I use the word ZIP code, which is not technically correct, because it’s more widely understood. The correct term to use here is ZCTA; I explain why in this blog post.) Note that we can simply remove ZIPs which we are not interested in.
df = get_acs_df("B01002", "zip") # median age
df.map = bind_df_to_map(df, "zip")
ca_zips = render_choropleth(df.map, "zip", "CA ZIPs", "Median Age", states="CA")
# keep only the ZIPs with a median age between 20 and 30
df = df[df$value >= 20 & df$value <= 30, ]
df.map = bind_df_to_map(df, "zip")
ca_zips_20s = render_choropleth(df.map, "zip", "CA ZIPs with median age between 20 and 30", "Median Age", states="CA")
library(gridExtra) # for grid.arrange
grid.arrange(ca_zips, ca_zips_20s, nrow=2, ncol=1)
Hopefully these new features allow you to do more interesting work visualizing and analyzing spatial information. If you would like to share any of your work, please feel free to contact me on Twitter or post on the choroplethr forum.
Hi, My name is Peter Black and I’m the lead geospatial engineer at Trulia. We’ve been making some interesting maps here at Trulia, displaying crime heatmaps, a commute tool that selects homes within a travel time polygon, and home value estimates down to the parcel level. Today, I’m writing to tell you the why’s and how’s of our most recent series on natural hazards.
When Hurricane (ok ok, it was an extratropical storm) Sandy slammed into the New Jersey shoreline on October 30th, I watched with horror and tried to stay in contact with my loved ones and friends in harm’s way. Seeing the awful damage that resulted cemented my feeling that I had to incorporate the natural-hazard maps I knew were readily available from various federal sources into the Trulia experience. Doing so would open up a new avenue for millions of people to better understand the natural world and the risks they face when deciding where to buy a home.
There are many types of natural hazards, of course, and we couldn’t possibly put all of them on Trulia. So we chose the five hazards that have caused the most damage in the past few decades: hurricanes, floods, earthquakes, tornadoes, and wildfires. Fortunately there is excellent data available for each hazard, mostly from federal sources. In compiling the data, I noticed some interesting things. For example, why was the Charleston, South Carolina area at risk for earthquakes? As it turns out, a magnitude 7.3 earthquake shook the area in 1886.
There have been tornadoes in my neck of the woods in northern California, and southern New Jersey (the Pine Barrens) is at risk for forest fires.
These revelations only made us work harder to bring this insightful information to a new audience. We noticed that there wasn’t any really good mashup of all the historical information around hurricanes and tornadoes. So for each, we took the historical track data along with its attributes and assigned it to an underlying nested hexagon grid. Once that was accomplished, we classified the data and created a really cool visualization of historical hotspots for each hazard. I stress the word historical intentionally, since we have no idea where the next hurricane or tornado will hit. Our intention is solely to show where storms have hit in the past 60 years or so, when this meteorological data became more reliable and sophisticated due to the advent of technologies like radar (in the 40s) and satellite imagery (in the 60s).
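As a rough illustration of the binning step (a hypothetical sketch using the hexbin package with made-up points, not our production pipeline):

```r
library(hexbin)

# track_points: one row per recorded storm-track observation, with
# longitude/latitude columns (random data here, purely for illustration)
track_points = data.frame(lon = runif(1000, -100, -70),
                          lat = runif(1000,   25,  45))

# assign each point to a hexagonal grid cell and count events per cell
bins = hexbin(track_points$lon, track_points$lat, xbins = 40)

# classify the per-cell counts into hotspot intensity classes
intensity = cut(bins@count, breaks = c(0, 5, 20, 50, Inf))
```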
I hope you enjoy the maps. They are pretty informative and provide an interesting tool that can help homebuyers make more informed choices. I’d like to thank my excellent team of engineers, whose talent and professionalism are truli-amazing, the awesome PR crew we have, as well as the senior management team at Trulia who supported this idea from its inception.
When we first implemented map markers in our Android apps we were just using an image to highlight each location. Over time we wanted to add the specific price as well. As I started working on an implementation of that feature, I was hard pressed to find any articles or examples of what we wanted to accomplish, so I had to start from scratch and figure out a method myself. The initial implementation is currently in use in our Trulia Rentals Android app. However, there are some deficiencies with that version and some of the code is sub-optimal. In this post I will detail the new method I am working on (which will be ported into all of our Android apps).
With the last major release of our Android app we added in Street View capability on property detail pages. The response we got from users was extremely positive but we couldn’t help but notice a few issues that really degraded a user’s experience.
First, we assumed (incorrectly) that the built-in Street View app would be available on all Android phones. And how did that work out? The stack traces in the Market crash reports told the whole story: some people didn’t have the app installed.
July 2011 marked the anniversary of Trulia’s Innovation Days. We are extremely excited about our innovation program and wanted to take this opportunity to share some details and insights into what these Innovation Days entail.
It all started with the recognition that innovation is very important when creating a winning company, and that it can really blossom when actively nurtured and developed. We truly believe innovation is free-spirited and should not be bound by strict rules! The program evolved following a simple principle: “Pave sidewalks where people walk.” This statement helps the program retain its grassroots, unbound and organic character and leverages the power of our teams without over-management.