I'm psyched to see companies offering 'data kits' to ease the data burden of modeling and ML tasks that drive real estate/property investments. But as someone who has spent years building models and testing various investment hypotheses, I'm disappointed by what's behind the glam of press releases. If you want to model the price of real estate assets and use those models to make investment decisions, you're definitely doing it wrong if you're overly focused on property-specific data.
To demonstrate why, let's start with a simple example. Imagine a home. Any home. Put that home in a nice neighborhood in California. Then copy that home and put it in an equally nice neighborhood in Nebraska. We all know the home in Nebraska is going to sell for a lot less, and we instinctively have a sense why: location, location, location. We can capture the impact of location on price by including a variable in our model that represents the state. When we do this, we will find that the variable does a lot of work in our model because it's a significant source of variation that's correlated with our outcome of interest. (For nerds: it's going to pull a lot of previously unexplained variation out of the error term.)
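For the nerds who want to see it, here is a minimal sketch of that idea in Python. Everything in it is made up for illustration: the data is synthetic, the price formula and its dollar amounts are assumptions, and the names (`is_california`, `r_squared`) are hypothetical. It just compares how much price variation a property-only model explains versus one that also knows the state.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic homes: same size distribution in both states (illustrative only).
sqft = rng.uniform(1000, 3000, size=n)
is_california = rng.integers(0, 2, size=n)  # 1 = California, 0 = Nebraska

# Assumed data-generating process; the coefficients are invented, not estimates.
price = (
    50_000
    + 150 * sqft
    + 400_000 * is_california
    + rng.normal(0, 30_000, size=n)
)

def r_squared(X, y):
    """Fit ordinary least squares with an intercept and return R^2."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return 1 - resid.var() / y.var()

# Model 1: property features only. Model 2: property features plus state.
r2_property_only = r_squared(sqft, price)
r2_with_state = r_squared(np.column_stack([sqft, is_california]), price)

print(f"R^2, property only: {r2_property_only:.2f}")
print(f"R^2, with state:    {r2_with_state:.2f}")
```

In this toy setup the state indicator soaks up most of the variation the property-only model leaves in the error term. Real data will be messier, but the direction of the effect is the point.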
Now let's take it one step further and imagine that, rather than comparing prices on the same home across states, we want to do it within a single state. Or even better, within a single neighborhood. Again, imagine a house and then make an exact copy of it. Then put each house in a different part of that neighborhood. We're controlling for state (both houses have the same value for the variable 'state'), along with all property features (the houses are replicas of one another). But we should not stop there unless we're convinced that the houses should be valued equally. Stopping there means being absolutely convinced that where a home sits within the neighborhood does not matter to pricing. Maybe one home has great access to highways, making it ideal for commuters. Maybe the other is near a cemetery. If you are not capturing these features and others like them, you are implicitly saying that location does not matter, which we all know no investor really believes.
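Here is a rough sketch of the replica-house thought experiment, again on synthetic data. The contextual features (`minutes_to_highway`, `near_cemetery`), their assumed price effects, and the helper names are all hypothetical. The point is only that a model with no contextual inputs has no choice but to price the two replicas identically.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 400

# Synthetic homes in one neighborhood (all values are illustrative assumptions).
sqft = rng.uniform(1000, 3000, size=n)
minutes_to_highway = rng.uniform(1, 20, size=n)
near_cemetery = rng.integers(0, 2, size=n)

# Assumed price process: the contextual effects below are invented for the example.
price = (
    300_000
    + 150 * sqft
    - 4_000 * minutes_to_highway
    - 25_000 * near_cemetery
    + rng.normal(0, 10_000, size=n)
)

def fit_ols(X, y):
    """Return OLS coefficients, intercept first."""
    X = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta

def predict(beta, X):
    """Predict prices for rows of X using coefficients from fit_ols."""
    X = np.atleast_2d(X)
    return np.column_stack([np.ones(len(X)), X]) @ beta

beta_property = fit_ols(sqft, price)
beta_context = fit_ols(
    np.column_stack([sqft, minutes_to_highway, near_cemetery]), price
)

# Two replica houses: same 2,000 sqft, different spots in the neighborhood.
replicas_property = np.array([[2000.0], [2000.0]])
replicas_context = np.array([
    [2000.0, 2.0, 0.0],   # house A: 2 minutes to the highway, no cemetery
    [2000.0, 15.0, 1.0],  # house B: 15 minutes to the highway, near a cemetery
])

property_preds = predict(beta_property, replicas_property)
context_preds = predict(beta_context, replicas_context)

print("Property-only model:", property_preds.round(0))
print("With contextual data:", context_preds.round(0))
```

The property-only model returns the same number for both houses by construction, while the contextual model separates them. How large that gap is in practice depends on the market and on which contextual features you can actually get.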
So what does this mean? If your data kit doesn't include contextual data about a location, you are going to miss a lot of the important factors that impact price. You could be leaving money on the table, or taking too much off. Either way, there's alpha to be had.
If you build SFR models and want to improve them, reach out and join some of the best in private equity real estate investing who already have :).
About Iggy
Without Iggy, building innovative user-facing products and tools with neighborhood and geographic data requires sourcing and buying fragmented and unwieldy datasets, hiring specialized geospatial analysts/data scientists to work with them, and engineers to bring what they build to prod. It’s complicated, expensive, and slow.
Iggy brings data about neighborhoods to your product development stack and lets you build innovative products and experiences in a fraction of the time. It eliminates the need to source, preprocess, analyze, and aggregate individual and incomplete spatial datasets, so you can do what you do best.