Standardizing data is a formidable task. Data specialists around the globe have written at length about the complexity involved in standardizing a dataset.
The business-trade-name is the most important identifier of a point of interest (POI), so it needs to be standardized.
For example, our major vendors, such as Dun & Bradstreet and TomTom, represent the famous retail chain 7 ELEVEN in many different ways.
No matter which DBMS is being used, accounting for all of these variants would force end customers to load their SQL queries with wildcards and with the keywords in every possible form.
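To make that burden concrete, here is a minimal sketch of the kind of wildcard query a customer would otherwise have to maintain. The table name `poi`, the column `trade_name`, and every spelling in `rows` are hypothetical, invented for illustration; they are not actual vendor records. SQLite is used only because it ships with Python.

```python
import sqlite3

# Hypothetical spellings, invented for illustration (not real vendor data).
rows = [
    ("7 ELEVEN",),
    ("7-Eleven",),
    ("7ELEVEN STORE 1234",),
    ("Seven Eleven",),
    ("SEVENTEEN CAFE",),   # unrelated business that must not match
    ("ELEVEN MADISON",),   # unrelated business that must not match
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE poi (trade_name TEXT)")
conn.executemany("INSERT INTO poi VALUES (?)", rows)

# Every known spelling needs its own pattern. SQLite's LIKE is
# case-insensitive for ASCII letters by default; other DBMSs may differ.
query = """
SELECT trade_name FROM poi
WHERE trade_name LIKE '7%ELEVEN%'
   OR trade_name LIKE 'SEVEN%ELEVEN%'
ORDER BY trade_name
"""
matches = [r[0] for r in conn.execute(query)]
print(matches)
```

Each new spelling that turns up in the data means another `OR ... LIKE` clause, and a pattern loose enough to catch every variant risks pulling in unrelated businesses.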
With the help of current technologies such as Natural Language Processing, string-matching algorithms such as cosine similarity and Levenshtein distance, PySpark, and Jupyter Notebook, we introduced the BRANDNAME column, which provides the customer with a single standardized business-trade-name for a POI.
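As a rough illustration of how Levenshtein distance and cosine similarity can map a raw trade name onto a standardized brand, the following is a minimal pure-Python sketch. It is not the production pipeline: the function names, the character-bigram representation used for cosine similarity, the rule of taking the better of the two scores, and the 0.8 threshold are all assumptions made for this example.

```python
import re
from collections import Counter
from math import sqrt

def normalize(name):
    """Uppercase the name and drop everything except letters and digits."""
    return re.sub(r"[^A-Z0-9]", "", name.upper())

def levenshtein(a, b):
    """Edit distance via dynamic programming, keeping only two rows."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]

def cosine(a, b):
    """Cosine similarity between character-bigram count vectors."""
    va = Counter(a[i:i + 2] for i in range(len(a) - 1))
    vb = Counter(b[i:i + 2] for i in range(len(b) - 1))
    dot = sum(count * vb[gram] for gram, count in va.items())
    norm = (sqrt(sum(v * v for v in va.values()))
            * sqrt(sum(v * v for v in vb.values())))
    return dot / norm if norm else 0.0

def standardize(raw, brands, threshold=0.8):
    """Return the best-matching brand, or None if nothing scores high enough.

    The score is the better of edit similarity and bigram cosine similarity;
    both the scoring rule and the threshold are assumed values.
    """
    key = normalize(raw)
    best, best_score = None, 0.0
    for brand in brands:
        ref = normalize(brand)
        longest = max(len(key), len(ref)) or 1
        edit_sim = 1 - levenshtein(key, ref) / longest
        score = max(edit_sim, cosine(key, ref))
        if score > best_score:
            best, best_score = brand, score
    return best if best_score >= threshold else None

# Hypothetical brand list and raw names, for illustration only.
brands = ["7 ELEVEN", "STARBUCKS"]
for raw in ["7-Eleven", "7 ELEVN", "Joe's Diner"]:
    print(raw, "->", standardize(raw, brands))
```

Normalizing before comparison removes differences in case, spacing, and punctuation, so the similarity scores only have to absorb genuine spelling variation. At the scale of a full POI dataset the same comparison would normally run in a distributed engine such as PySpark rather than in a Python loop.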
Brandname standardization has been successfully applied across 24 major countries, covering 3,000 distinct brandnames.