We’ve already written a post about the online real estate market and how to develop an app like Zillow. This time, though, we’ll dig into the data and technology stack that the leading US real estate companies such as Zillow, Redfin, and Realtor.com are using. Perhaps this can help you figure out how things really work underneath the beautiful interfaces of real estate apps and websites. I hope this won’t scare you off real estate app development.
What’s the difference between Zillow, Realtor.com and Redfin?
The “Big Three” real estate traffic leaders are Zillow, Trulia, and Realtor.com. But since Zillow and Trulia have joined forces, we can consider them one entity and pay some attention to another big fish in the US, the residential real estate company Redfin.
Zillow was created in 2006 by Rich Barton and Lloyd Frink, former Microsoft executives and founders of the leading travel website Expedia.
Redfin was founded in 2004 by David Eraker, Michael Dougherty, and David Selinger. The company was a pioneer of map-based real estate search even before Google Maps appeared.
Founded back in 1995, Realtor.com is the grandfather of the whole online real estate industry. It’s the official site of the National Association of REALTORS® and is operated by Move, Inc., a subsidiary of News Corp, which belongs to Rupert Murdoch.
I don’t know if it’s a coincidence or fate, but Redfin and Zillow are both based in Seattle, while Trulia and Realtor.com are at home in the San Francisco Bay Area.
Unlike Zillow and Realtor.com, Redfin is an actual brokerage that pays its agents a base salary and makes money when users buy or sell homes with its real estate agents.
Realtor.com and Zillow (including Trulia) are real estate aggregators whose customers buy ads on their sites. In other words, they sell advertising back to the same realtors whose listings they took in the first place.
Accuracy of data
The listings on Realtor.com and Redfin are supposed to be accurate and up-to-date, because these websites are directly linked to the Multiple Listing Service (MLS) where agents list homes for sale. According to Redfin’s study “The Accuracy of Real Estate Websites,” homes from the MLS feeds appear on brokerage websites as soon as an agent lists them, and the listings get updated every 15 to 30 minutes.
The same study shows that it takes about seven days for a home listed on the MLS to appear on Zillow, because the company mostly relies on individual agents and brokers to repost their listings.
Where do Zillow, Realtor.com and Redfin get their data from?
Real estate data in the US comes from the individual real estate agents and brokers who are members of local associations of Realtors. This data can be accessed through the "Multiple Listing Service" (MLS) that each Realtor association publishes for its local community.
The good thing is that MLSs give real estate portals like Zillow, Trulia, Realtor.com and others permission to republish the MLS listing feeds on their media channels.
The bad thing is that MLSs are quite hard to access. There is no public API you can use to just pull the data from there, and they restrict membership and access to real estate brokers and their agents, which means that a person selling their own property (For Sale By Owner) cannot put a listing for the home directly into an MLS.
It’s important to note that publicly displayed property details must comply with rules established by the National Association of Realtors (NAR) under IDX (Internet Data Exchange), the standard that governs how MLS listings are shown on search sites.
Is MLS the only source of real estate data?
There are lots of sources that real estate aggregator companies can use to get property listings. You would only need to do three things to get access to those sources: get licensed in every state, join the MLSs, and integrate data that doesn't follow a consistent standard across the 900+ MLSs. Anyway, here is where Zillow goes to make a living:
- Agents and brokers
Zillow is doing fine attracting real estate agents and brokers to its website. After it welcomed Trulia into its online empire, rumor has it, the company controls more than 70% of online real estate searches. They call it disruption in the US.
- Local MLS
Trulia and Zillow get access to local MLS by negotiating data sharing/syndication agreements. Redfin, on the other hand, doesn’t have any of that headache. As a brokerage, it’s free to use data from its membership in the Realtors MLS.
In this regard, it seems like Zillow is taking the hard way. After all, it could’ve become a brokerage and pulled the data directly from an MLS. However, becoming a brokerage imposes a lot of legal obligations on a company: it would need to participate in commissions by representing buyers and sellers within each MLS, which basically means it would have to employ agents and manage them across the whole country.
- Listings syndication platforms
Because there are hundreds of MLSs all over the US, it’s extremely burdensome to negotiate an agreement with everybody, especially once you operate nationwide. That’s why the largest aggregator websites enter data sharing agreements with big national real estate companies to get their data directly.
But the largest aggregator of real estate listing data (for-sale listings) in the US is ListHub. The company belongs to Move, Inc. (which also owns Realtor.com). Last year Move acquired Point2, another large listing syndication platform.
It seems Move finally decided to take care of Realtor.com by fighting its competitors. ListHub is terminating its listing agreement with Zillow and Trulia, which means the whole ocean of real estate data will disappear forever from the merged Zillow-Trulia. Guess what’s next in this battle?
- Other types of listing data
Zillow also allows other companies and individuals that represent For Sale By Owner listings, new constructions, and rentals to post their real estate listings on the website for free.
- Make Me Move listings
The properties on Zillow don’t necessarily have a “for sale” label. The owner of a property can state the price they'd be willing to sell their home for, without actually putting it on the market. It’s a great way to measure interest from potential buyers and even get the right offers sooner.
All of the sources above ultimately feed from MLSs, except the last two. Potentially, FSBO is a great opportunity for emerging real estate businesses in the US.
How can I access real estate databases?
The biggest problem with MLS listings isn’t even the license. It’s that every MLS implements its listings database differently. In order to enable data exchange between an MLS database and your website you need to implement IDX (Internet Data Exchange). There are a couple of options to do it.
One is by using an old FTP server, which will force you to download the entire database of listings in bulk each time you want to update your records.
The other option is to use RETS (Real Estate Transaction Standard), a framework used to give brokers, agents and third parties access to listing and transaction data. It allows you to download listings in more manageable segments, so only the listings that have been recently added or changed need to be updated. RETS is XML-based, and there’s a wide variety of open source tools and libraries for working with RETS data. However, they still aren’t perfect.
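To make the difference between the two options concrete, here is a minimal Python sketch of the incremental idea: apply only the records that changed since the last sync instead of reloading everything. It is not a RETS client. The record fields (`listing_id`, `price`, `modified`), the function name, and the sample listings are all made up for illustration, and the feed is just a local list, whereas a real setup would ask the server for changed records only.

```python
from datetime import datetime


def incremental_sync(local, feed, last_sync):
    """Apply only feed records modified after last_sync.

    Returns the ids that were added or updated, plus the new checkpoint
    to pass as last_sync on the next run.
    """
    changed = []
    newest = last_sync
    for record in feed:
        modified = record["modified"]
        if modified > last_sync:
            # New or changed listing: insert or overwrite the local copy.
            local[record["listing_id"]] = record
            changed.append(record["listing_id"])
            newest = max(newest, modified)
    return changed, newest


# Hypothetical local copy of the listings (stand-in for a real database).
local_db = {
    "A1": {"listing_id": "A1", "price": 300000, "modified": datetime(2015, 1, 1)},
    "C3": {"listing_id": "C3", "price": 200000, "modified": datetime(2015, 1, 1)},
}

# Hypothetical feed: one unchanged listing, one new, one with a price change.
feed = [
    {"listing_id": "A1", "price": 300000, "modified": datetime(2015, 1, 1)},
    {"listing_id": "B2", "price": 450000, "modified": datetime(2015, 1, 3)},
    {"listing_id": "C3", "price": 195000, "modified": datetime(2015, 1, 4)},
]

changed, checkpoint = incremental_sync(local_db, feed, datetime(2015, 1, 2))
print(changed, checkpoint)
```

With the bulk FTP approach, every run would rewrite all three records. Here only the new and the changed listing are touched, and the returned checkpoint tells the next run where to start.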
There are third-party services whose APIs normalize data flows from MLSs into a standard model. You can check out Spark, SimplyRets, Optima Express IDX Plugin for WordPress sites, and iHomeFinder.
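To give a rough idea of what "normalizing into a standard model" means, here is a small Python sketch that maps two differently shaped feeds onto one set of field names. The feed names (`mls_a`, `mls_b`), their field names, and the target model are assumptions invented for this example. They are not taken from any real MLS or from the services listed above.

```python
# Hypothetical per-source mappings: source field name -> standard field name.
FIELD_MAPS = {
    "mls_a": {"ListingId": "listing_id", "ListPrice": "price", "BedroomsTotal": "beds"},
    "mls_b": {"id": "listing_id", "price_usd": "price", "br": "beds"},
}


def normalize(source, raw):
    """Rename the fields of a raw record to the standard model."""
    mapping = FIELD_MAPS[source]
    listing = {
        standard: raw[original]
        for original, standard in mapping.items()
        if original in raw  # skip fields the source did not send
    }
    listing["source"] = source  # remember where the record came from
    return listing


# Two made-up records describing listings in different source formats.
from_a = normalize("mls_a", {"ListingId": "A1", "ListPrice": 300000, "BedroomsTotal": 3})
from_b = normalize("mls_b", {"id": "B2", "price_usd": 450000, "br": 4})
print(from_a)
print(from_b)
```

Once every source goes through a mapping like this, the rest of the app only has to deal with one shape of listing, and supporting another MLS means adding one more entry to the mapping table.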
You will also need to build a robust backend to be able to properly manage the data. Redfin’s main development languages are Java and Python, and they use a PostgreSQL database on the backend. They also use the Eclipse integrated development environment, the Git version control system, Bamboo for continuous integration and build delivery, and JIRA for project management.
Judging from the job postings on Zillow’s website, the company uses the LAMP stack, as well as Node.js, Python, Java, and C++.
Property listings aren’t the only data that a real estate app needs
Other than properties, a real estate app or website generally has data about neighborhoods, schools, points of interest, demographics, past sales history and other info that improves the home buying and selling experience.
Zillow provides APIs for neighborhood information for their listings and a dataset that contains neighborhood boundaries. Trulia also provides an API derived from the aggregated data on real estate listings, trends and search activity. Here are other important APIs with geo and neighborhood information:
GeoNames has a very comprehensive dataset of over 10 million geographical names.
Google offers tools for real estate professionals.
The Data.gov site in the US provides an entire catalogue of over 8,000 datasets with anything from crimes to demographics to high school graduation rates.
What’s more, almost all cities in the US have open data portals with a variety of information that you can use in your app. For example, New York publishes open datasets, Chicago has a city data portal, and Los Angeles offers a data catalogue and developer resources.
Real estate app development is such a challenging thing to do! But now you know what makes it possible for huge companies like Zillow, Redfin and Realtor.com to pull real estate data into their apps and websites. This might be a lot easier for you to do, though. We hope our small research into the US real estate industry and technology was helpful!