Entity resolution is the process of determining which identity records refer to the same real-world entity and detecting relationships between those entities. Key benefit: the ability to build 360-degree profiles for all significant business elements, understand the current situation, and prepare to predict it in the next phase. The common entity resolution process, which is fully covered by our service team, includes four stages: recognize, resolve, relate, and score. These operations are usually performed over NoSQL stores or traditional databases.
- Recognize (Extraction)
At the first stage, the incoming identity data is recognized by collecting, validating, optimizing, and enhancing it. During this stage, data values are cleansed and standardized, and data quality checks are performed to protect the integrity of the entity database. This stage is usually completed during the Data Collection & Preparation phase.
- Resolve (Deduplication)
At the second stage, the process resolves identities into entities. It uses a variety of sophisticated algorithms to compare the data values in the incoming identity record against existing entities in the entity database and determine whether they belong to the same entity. If a matching entity or entities are found, they are enriched with the new record. Otherwise, the incoming identity forms a new entity in the database. The main challenges are naming ambiguity, data quality and heterogeneity, and the choice of clustering methods.
- Relate (Integration)
At the third stage, the relationship detection process is completed: it detects relationships between identities and entities and generates alerts for relationships of interest. Here, the hardest task is choosing the right rules and algorithms for defining an entity hierarchy.
- Scoring (Verification)
At the fourth stage, the system computes how closely the attributes for an incoming identity match the attributes of an existing entity. The results of this computational analysis are scores that the system uses to resolve identities into entities and detect relationships between entities.
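To make the stages concrete, here is a minimal Python sketch of how standardizing, scoring, and resolving could fit together. It is an illustration under stated assumptions, not the algorithm used by the service: the function names, the attribute weights, and the 0.85 threshold are all hypothetical, and the standard library's `difflib.SequenceMatcher` stands in for the more sophisticated matching algorithms mentioned above.

```python
from difflib import SequenceMatcher


def normalize(record):
    """Recognize: lowercase, trim, and collapse whitespace in every value."""
    return {
        key: " ".join(str(value).lower().split())
        for key, value in record.items()
        if value is not None
    }


def score(identity, entity, weights):
    """Scoring: weighted average string similarity over shared attributes.

    Only attributes listed in `weights` (a hypothetical configuration) and
    present in both records contribute to the result, which lies in [0, 1].
    """
    total = 0.0
    weight_sum = 0.0
    for attr, weight in weights.items():
        if attr in identity and attr in entity:
            similarity = SequenceMatcher(None, identity[attr], entity[attr]).ratio()
            total += weight * similarity
            weight_sum += weight
    return total / weight_sum if weight_sum else 0.0


def resolve(identity, entities, weights, threshold=0.85):
    """Resolve: enrich the best-matching entity, or create a new one.

    `entities` is a plain list of dicts that stands in for the entity
    database. Returns (index of the entity, best score found).
    """
    identity = normalize(identity)
    best_index, best_score = None, 0.0
    for index, entity in enumerate(entities):
        current = score(identity, entity, weights)
        if current > best_score:
            best_index, best_score = index, current

    if best_index is not None and best_score >= threshold:
        # Enrich: add only the attributes the existing entity does not have yet.
        for key, value in identity.items():
            entities[best_index].setdefault(key, value)
        return best_index, best_score

    # No sufficiently close match: the identity forms a new entity.
    entities.append(identity)
    return len(entities) - 1, best_score
```

Calling `resolve` repeatedly with incoming records and a weights dictionary such as `{"name": 0.5, "email": 0.5}` grows the `entities` list only when no existing entity scores at or above the threshold. In practice the threshold and weights would need tuning against labeled data, and the Relate stage would add a separate pass over the resolved entities.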
Entity resolution is one of the most sophisticated fields in Big Data.
Our company has broad and unique experience in this area and will be glad to answer your questions and help solve your business tasks.