
Disambiguation

Named Entity Disambiguation

AlchemyAPI uses an "entity disambiguation" mechanism to resolve each detected company, location, or person to a single, unique "instance."

Why Disambiguation?

Human language is not exact. Text referring to the city "Roanoke" can mean "Roanoke, Virginia" or "Roanoke, Texas," depending on the surrounding context. Organizations and companies often have multiple nicknames, name variations, or common misspellings. Famous persons ("Michael Jackson") often share a name with many non-famous individuals.

AlchemyAPI Entity Disambiguation

AlchemyAPI disambiguates text in much the same way as a human reader. When a potentially ambiguous entity is encountered, the surrounding text is examined for contextual cues. A series of statistical algorithms is combined with a large dataset describing the world's objects, individuals, and locations. Our entity disambiguation system is capable of resolving hundreds of entity types, more than any other commercially available system.

What are Contextual Cues?

These are "hints" that help disambiguate an entity. Our disambiguation engine employs tens of millions of hints describing traits of the world's objects, individuals, and locations, drawn from a variety of public and non-public datasets. Hints vary depending on the type of entity being disambiguated. For example, when disambiguating people, we use information on a person's career, where they are located, who they work for, and so on. For companies, hints include key executives, notable products, industry, and location.
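
To make the idea of contextual cues concrete, here is a toy sketch that picks between the two "Roanoke" candidates by counting cue words found in the surrounding text. It is not AlchemyAPI's algorithm: the `HINTS` table, its cue words, and the `disambiguate` function are all hypothetical, and simple word overlap stands in for the statistical methods described above.

```python
import re

# Hypothetical hint table: candidate entity -> cue words.
# Illustrative values only; this is not AlchemyAPI data.
HINTS = {
    "Roanoke, Virginia": {"virginia", "blue", "ridge", "appalachian"},
    "Roanoke, Texas": {"texas", "dallas", "fort", "worth", "denton"},
}


def disambiguate(context, hints=HINTS):
    """Return the candidate whose cue words overlap most with the context.

    Returns None when no cue word appears in the context at all.
    """
    words = set(re.findall(r"[a-z]+", context.lower()))
    scores = {name: len(cues & words) for name, cues in hints.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else None


print(disambiguate("Roanoke is a short drive north of Fort Worth and Dallas."))
```

A real system would weight cues rather than count them equally, but the shape is the same: candidates are scored against the surrounding text and the strongest match wins.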

How Do I Use It?

Disambiguation is enabled by default, so no action is needed to use it. To disable disambiguation, add the following HTTP parameter to your API calls:

    disambiguation=0

NOTE: Disambiguation is currently available for English-language content only. Support for disambiguating entities in other languages is in development.
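
As a minimal sketch of passing this parameter, the Python snippet below builds a request URL with disambiguation switched off. Only `disambiguation=0` comes from this page. The endpoint URL, the `apikey` and `text` parameter names, and the `build_entity_request_url` helper are assumptions made for illustration, so check them against the API reference before use.

```python
from urllib.parse import urlencode

# Assumed endpoint, shown for illustration only; confirm it in the API reference.
ENDPOINT = "http://access.alchemyapi.com/calls/text/TextGetRankedNamedEntities"


def build_entity_request_url(api_key, text, disambiguate=True):
    """Build a GET URL for an entity extraction call (hypothetical helper).

    'apikey' and 'text' are assumed parameter names; 'disambiguation=0'
    is the parameter documented on this page.
    """
    params = {"apikey": api_key, "text": text}
    if not disambiguate:
        # Disambiguation is on by default, so the parameter is sent only to disable it.
        params["disambiguation"] = "0"
    return ENDPOINT + "?" + urlencode(params)


print(build_entity_request_url("YOUR_API_KEY", "Ted Kennedy", disambiguate=False))
```

Leaving `disambiguate` at its default omits the parameter entirely, which matches the enabled-by-default behavior.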

What Happens Next?

Whenever an entity is successfully disambiguated, additional information is returned in API responses. This includes the fully resolved, disambiguated entity name and, if available, the entity's website and geographic coordinates. AlchemyAPI also returns subType information for many disambiguated named entities. SubTypes provide detailed ontological mappings for an entity, for instance identifying a Person as a Politician or Athlete. The following is an example disambiguation response (for "Ted Kennedy"):

        <entity>
            <type>Person</type>
            <count>1</count>
            <text>Ted Kennedy</text>
            <disambiguated>
                <name>Ted Kennedy</name>
                <subType>Politician</subType>
                <subType>Senator</subType>
                <website>http://kennedy.senate.gov/</website>
                <dbpedia>http://www.dbpedia.org/resource/Ted_Kennedy</dbpedia>
                <freebase>http://rdf.freebase.com/ns/guid.9202a8c04000641f800000000014938b</freebase>
                <opencyc>http://sw.opencyc.org/concept/Mx4rxSpKAsqREdaSDQACs6MxsQ</opencyc>
                <yago>http://mpii.de/yago/resource/Ted_Kennedy</yago>
            </disambiguated>
        </entity>
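
A response like this can be read with any XML parser. The sketch below uses Python's standard `xml.etree.ElementTree` and assumes a single `<entity>` element shaped like the example above. The `parse_disambiguated` helper and the shortened `SAMPLE` string are illustrative, not part of the API.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example response above.
SAMPLE = """
<entity>
    <type>Person</type>
    <count>1</count>
    <text>Ted Kennedy</text>
    <disambiguated>
        <name>Ted Kennedy</name>
        <subType>Politician</subType>
        <subType>Senator</subType>
        <website>http://kennedy.senate.gov/</website>
        <dbpedia>http://www.dbpedia.org/resource/Ted_Kennedy</dbpedia>
    </disambiguated>
</entity>
"""


def parse_disambiguated(entity_xml):
    """Return the disambiguation details of one <entity>, or None if absent."""
    entity = ET.fromstring(entity_xml)
    block = entity.find("disambiguated")
    if block is None:
        # The entity was detected but not disambiguated.
        return None
    result = {"name": block.findtext("name"), "subTypes": [], "links": {}}
    for child in block:
        value = (child.text or "").strip()
        if child.tag == "subType":
            # subType may repeat, so collect every occurrence.
            result["subTypes"].append(value)
        elif child.tag != "name":
            # Everything else (website, dbpedia, ...) is kept keyed by tag name.
            result["links"][child.tag] = value
    return result


info = parse_disambiguated(SAMPLE)
print(info["name"], info["subTypes"], info["links"]["website"])
```

Because `<subType>` can appear more than once, the sketch collects it into a list rather than reading a single value.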

The following is an example disambiguation response (for the city of "Morristown, New Jersey"):

        <entity>
            <type>City</type>
            <count>1</count>
            <text>Morristown</text>
            <disambiguated>
                <name>Morristown, New Jersey</name>
                <website>http://www.morristown-nj.org/</website>
                <geo>40.7989 -74.478526</geo>
                <dbpedia>http://www.dbpedia.org/resource/Morristown,_New_Jersey</dbpedia>
                <freebase>http://rdf.freebase.com/ns/guid.9202a8c04000641f80000000000e5c19</freebase>
                <census>http://www.rdfabout.com/rdf/usgov/geo/us/nj/counties/morris_county/morristown</census>
                <geonames>http://sws.geonames.org/5101427/</geonames>
            </disambiguated>
        </entity>
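
The `<geo>` value in the example above is a single space-separated string. The sketch below splits it into two numbers, assuming the order is latitude then longitude, which is consistent with Morristown's location. The `parse_geo` helper and the shortened `SAMPLE_CITY` string are illustrative only.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the example response above.
SAMPLE_CITY = """
<entity>
    <type>City</type>
    <count>1</count>
    <text>Morristown</text>
    <disambiguated>
        <name>Morristown, New Jersey</name>
        <geo>40.7989 -74.478526</geo>
    </disambiguated>
</entity>
"""


def parse_geo(entity_xml):
    """Return (latitude, longitude) as floats, or None if no usable <geo> exists.

    Assumes the value is 'latitude longitude', separated by whitespace.
    """
    entity = ET.fromstring(entity_xml)
    geo = entity.findtext("disambiguated/geo")
    if not geo:
        return None
    parts = geo.split()
    if len(parts) != 2:
        return None
    return float(parts[0]), float(parts[1])


print(parse_geo(SAMPLE_CITY))
```

Since coordinates are returned only when available, callers should be ready for a `None` result.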