Humphries lays out some of the thorny issues in sorting fact from fiction:
When you look at a home, does it have three bedrooms or four bedrooms? Two sources say four; the user says three. How do you resolve those? That's a lot of what we've done over the years, create the systems to do that. ... If you sell a house for $100,000, we'll get it from an MLS system, but we'll also get it from the public recorder from the deeds and reconcile those facts together. It's a huge effort around bringing in a lot of disparate data and creating a single representation.
Internally we call that our living database of all homes. We view that as a core competitive asset, in that right now we have the single best representation anywhere of real estate facts, because ours is a superset of every other source that's out there and it's a superset that's been resolved and reconciled.
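The bedroom conflict Humphries describes can be illustrated with a minimal sketch of source reconciliation by weighted voting. This is not Zillow's system, which the article does not detail; the source names, the trust weights, and the `reconcile` helper are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical trust weights per source; the values are illustrative assumptions.
SOURCE_WEIGHTS = {"mls": 1.0, "county_record": 1.0, "user": 0.8}


def reconcile(reports, weights=SOURCE_WEIGHTS):
    """Pick the value backed by the highest total source weight.

    `reports` is a list of (source, value) pairs for a single attribute.
    Ties go to the value that was reported first.
    """
    totals = defaultdict(float)
    for source, value in reports:
        # Sources missing from the weight table get an assumed default of 0.5.
        totals[value] += weights.get(source, 0.5)
    # max() keeps the first maximal key in insertion order, so earlier reports win ties.
    return max(totals, key=totals.get)


# Two sources say four bedrooms; the user says three.
bedrooms = reconcile([("mls", 4), ("county_record", 4), ("user", 3)])
print(bedrooms)  # 4
```

A production system would presumably weigh far more than a fixed table of trust scores, such as how recent each record is, but the sketch shows the basic shape of turning conflicting reports into a single representation.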
Zillow uses machine learning algorithms to look at those internal representations, clean them up, and make sophisticated inferences about which facts are outliers and which might be duplicates. The high quality of the result presents some interesting B-to-B possibilities for monetization, but for now, says Humphries, Zillow is "better served by freely distributing all of our data to whoever wants it versus charging folks." The company even offers free access to a home valuation API.
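To make the ideas of outliers and duplicates concrete, here is a minimal sketch that uses simple statistics and string normalization as a stand-in for machine learning. It is not Zillow's method: the function names, the 3.5 cutoff on the modified z-score, and the address normalization rules are assumptions made for illustration.

```python
from statistics import median


def flag_outliers(values, threshold=3.5):
    """Return indices of values whose modified z-score exceeds `threshold`.

    Uses the median absolute deviation (MAD), which is less sensitive
    to extreme values than the mean and standard deviation.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        return []  # no spread to measure against, so flag nothing
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]


def normalize_address(address):
    """Lowercase, drop periods and commas, and collapse whitespace."""
    cleaned = address.lower().replace(".", "").replace(",", "")
    return " ".join(cleaned.split())


def find_duplicates(addresses):
    """Return (first_index, later_index) pairs that share a normalized address."""
    seen = {}
    pairs = []
    for i, addr in enumerate(addresses):
        key = normalize_address(addr)
        if key in seen:
            pairs.append((seen[key], i))
        else:
            seen[key] = i
    return pairs


prices = [100000, 102000, 98000, 101000, 990000]
print(flag_outliers(prices))  # [4]
print(find_duplicates(["12 Main St.", "12 main st", "40 Oak Ave"]))  # [(0, 1)]
```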
Besides, Zillow mainly seems focused on delivering a positive user experience. Going forward, Thind says, the company wants to continue to improve its home value indices, rent Zestimates, and other home valuation models. Web clickstream data is analyzed on a regular basis to help UI designers optimize usability, and Zillow plans to add new personalization features for both buyers and sellers.
To me, Zillow is a prime example of the effort required to address one vertical slice of big data and produce a quality result. Not only is the Web far from being easily machine readable, but a great deal of accessible data is uneven, while much remains stuck in nondigital form. The "digital transformation" we keep hearing about will occur only when every domain gets the kind of intensive treatment Zillow has applied to U.S. homes.