How does your CRE brokerage’s data quality score?

Data is the oil that the CRE industry runs on. Q: How can you be sure that your brokerage engine is running smoothly? A: You check the oil. So this is what we did as a part of a (confidential) brokerage’s data migration. And we reproduced the results for you to understand where the industry bar is – for this…

We all know that data is the oil that commercial real estate (CRE) runs on. We also know that high-quality data is immensely valuable. Data is one of the four reasons, and probably the most important one, why your customers talk to you: junk data = junk advice.

As the increasingly popular saying goes:

Data is the new gold

Your business is the provision of CRE advisory and consulting services, and the successful outcome is deals. Therefore, on top of your day-to-day deal-making activities, you are also required to manage CRE data. And not just any data: complex, inter-related property, unit, business and contact data, some of which may be living in separate silos or, worse, be duplicated across systems. Further, CRE data volumes are high, and your data changes frequently.

From deal makers down to the back-office team, this CRE data quality responsibility, at best, is a frustrating distraction. At worst, data is poorly handled. Bad data equals risk of bad advice, missed deals and embarrassment.

How bad is your data? How good is it?

CRE is one of the world’s most data-complex industries. Consequently, it is incredibly difficult for humans, without powerful data tools and specialist processes, to maintain typical CRE data volumes by hand.

So, up front, there is no shame in the findings below!

To arm you with industry data quality benchmarks, here are the results of a CRE data quality intervention we did for a (confidential) brokerage. We would regard this as a normal CRE brokerage:

  • Team comprises a mix of management, brokers and back-office personnel.
  • Data is processed by hand.
  • Business has a history of legacy data.
  • Data was stored in historically disconnected systems, resulting in multiple “version of truth” issues.

For ease of analysis, results are broken down by data asset family.

Properties data

Properties is where your lease, image, document, property attribute and unit data lives. This data family also stores businesses (tenants, owners or property managers) and contacts. High-quality property data is the cornerstone of any successful CRE brokerage.

This database comprised 8.2K properties, spread across South Africa. At take-on, the quality was unknown. Here are the findings:

CRE brokerage property data quality

Significant observation: 1.5K properties, with vacancies, were missing. This incompleteness of properties was creating major deal risks to the brokers, and affecting their quality of service to customers. Customers trust brokers to give them all options, and missed introductions can result in the ultimate fail: lost deals. Moreover, incomplete information on hand also causes deal delays while stock has to be gathered and checked.

Talking to the chart above, and unpacking issues around existing property data assets:

  • 54% of properties were incorrect – in parts or whole. For example: out of date property name, wrong category, wrong address, wrongly-captured suburb, wrong co-ordinates, wrong owner.
  • 25% of the data was junk – mostly due to duplicate data. I.e. the same property was captured twice.

At conclusion, the brokerage is holding a fraction under 8K properties. (The adding of extra properties was excluded from this phase of the project.)

For more info on the property interventions, please see the bottom of this page.

Vacant units data

Vacant units (aka listings or vacancies) are those parts of a property that are currently vacant, or due to become vacant, and are marketed for occupation by a tenant. To clarify, these are the space options that brokers present to a tenant who is looking for space.

Units are the “heart beat” of your leasing marketing processes and provide valuable market intel.

At take-on, the database had units, of unknown quality, totalling 10.7K. The findings here were also quite astonishing:

CRE property brokerage vacancies data quality

Of the 8.2K existing units identified as junk:

  • 99% were simply out of date (no longer vacant).
  • The balance were duplicates and miscaptures.

6.1K vacant units were simply missing. I.e. they would not be introduced to potential tenants, or returned in vacant search results. A large part of this is due to property funds whose vacancies were either not being received or captured by the brokerage.

1.4K existing units were corrected for issues ranging from incorrect GLA, to wrong asking rental, incorrect expenses, wrong availability date etc.

After processing, the end result is 8.5K units, at 100% data quality.

For more info, please see below.
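
The unit figures in this section can be cross-checked with simple arithmetic. The sketch below uses only the rounded figures quoted above (in thousands); the function name is ours, purely for illustration. Corrected units are fixed in place, so they do not change the count. Because the published inputs are rounded to one decimal place, the result comes out within rounding distance of the reported 8.5K rather than exactly on it.

```python
def reconcile_units(at_take_on, junk_removed, missing_added):
    """Return the expected closing unit count (all figures in thousands)."""
    return at_take_on - junk_removed + missing_added


# Rounded figures quoted in this section (thousands of units).
closing = reconcile_units(at_take_on=10.7, junk_removed=8.2, missing_added=6.1)
print(f"Expected closing units: {closing:.1f}K")
```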

Businesses data

CRE is a B2B industry. Businesses are either customers or partners. These businesses help you to earn money, and pay you money.

Businesses can be tenants, and/or owners (including legal entities), and/or property managers, and/or brokerage partners/competitors, and/or service providers.

This database had 5.4K businesses (organisations) at take-on. These results made for easier reading:

CRE brokerage data quality businesses

The 56% good score comprises those business records that were good plus those businesses that were missing information, and were fleshed out (with no corrections). It ignores data flagged for investigation.

Of the 3.6K businesses that saw data interventions:

  • 837 were corrected (outdated name, spelling error on capture)
  • 2,526 were fleshed out (i.e. were previously incomplete)
  • 1,070 were removed as junk (duplicate, obviously junk – e.g. contact (human) instead of business)
  • 123 businesses (relating to properties and contacts) were simply missing

This brokerage is now holding 4K high-quality business data assets.

Contacts data

CRE is a relationships game. Humans are the decision-makers for these businesses.

This database held 1.1K contacts at take-on. The findings follow:

CRE brokerage data quality contacts

The 55% good quality data score comprises those existing contacts that were good plus those existing contacts that were missing information, and were fleshed out (with no corrections).

Outcomes, translating into 1K reliable contacts:

  • 275 contacts were added – for properties with vacant units that were missing contacts.
  • 241 obvious duplicate contacts were removed.
  • 24 contacts were corrected (changed contact name, updated telephone and cellphone, corrected email and employer).

The CRE data quality good news

After the data quality intervention, all the brokerage’s data is now clean and can be trusted.

With data green-ticked, the client could turn on the Gmaven vacancies feed.

Consequently, this company’s vacancies data is now super-reliable, and updated within a maximum of 3 days of receipt.

Deal makers are now deal making with faster deal turnaround times, at lower risk of errors. Back office team members are focusing on their higher value tasks, freed from manually managing the volume and velocity of vacancies data.

This efficient brokerage is now running on data gold – with limited frustrations and business risk.



Approximately 1.3K records (or 3.9% of total), could not be processed (insufficient data clues for deletion, correcting or fleshing out). Such data was passed in Excel format to the client for investigation and resolution internally.


CRE data quality interventions fall into two categories:

  • At point of capture data quality (prevention is better than cure), taking place on a record-by-record basis.
  • Retrospective data refurbishments or overhauls, in batch.

As long as data quality controls at point of capture are not in place (see more below), “data debt” keeps growing uncontrollably.

After-the-fact data quality interventions (e.g. data overhauls), without data tools, algorithms, BPO and reference data assets, are near impossible to do successfully in-house.


With the brokerage now running Gmaven, users have to be very determined to capture poor data into their system.

Why? A lot of processes run under the bonnet to ensure CRE data quality:

  • Duplicates are proactively identified using both fuzzy matching and referencing master data assets at point of capture.
  • Green ticks identify data assets that correspond to unique identifiers.
  • Data changes are time-stamped and allocated to users, and live in well structured audit logs.
  • Field-level filters ensure that data is captured in certain formats, is standardised and satisfies definitions.
  • Data capturers are assigned ownership of the data assets, with data provenance recorded.
  • Drop downs and auto-completes eliminate finger trouble. Wizards and positive property completeness feedback gently steer and reward data completeness.
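
To make the first point in the list above more concrete, here is a minimal sketch of fuzzy duplicate detection at point of capture. It is not Gmaven's implementation. It simply compares a newly typed name against existing names using Python's standard-library `difflib`. The function names, the 0.85 threshold and the sample property names are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher


def normalise(name: str) -> str:
    """Lower-case, replace punctuation with spaces and collapse whitespace."""
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(kept.split())


def similarity(a: str, b: str) -> float:
    """Similarity ratio between 0.0 and 1.0 for two normalised names."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def likely_duplicates(new_name, existing_names, threshold=0.85):
    """Return (name, score) pairs at or above the threshold, best match first."""
    scored = [(name, similarity(new_name, name)) for name in existing_names]
    matches = [(name, score) for name, score in scored if score >= threshold]
    return sorted(matches, key=lambda pair: pair[1], reverse=True)


# Hypothetical property names, for illustration only.
existing = ["Sandton City Office Tower", "Rosebank Mall", "Century City Square"]
print(likely_duplicates("Sandton City Office Towers", existing))
```

In practice, the threshold would need tuning against real data, and a name match would typically be combined with other clues (address, co-ordinates, a unique identifier from master data) before a record is flagged as a duplicate.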


Once your data is uniquely identified, it is possible to enrich your data assets at relatively low cost. For example:

  • Property records: add legal ownership info, erf sizes, purchase prices and dates, debt funder, property size and attributes data.
  • Business records: add ownership info, contact info, industry classifications, websites, address info.
  • Contact records: add job title, LinkedIn, birthday, photo, directorships.
