Entity Extraction

Entity Extraction API

Named entities are specific things such as people, places and organizations. AlchemyAPI's named entity extraction identifies people, companies, organizations, cities, geographic features and other typed entities in your HTML, text or web-based content.

Entity extraction can add a wealth of semantic knowledge to your content to help you quickly understand the subject of the text. It is one of the most common starting points for using natural language processing techniques to enrich your content.

AlchemyAPI's named entity extraction is based on sophisticated statistical algorithms and natural language processing technology. It is unique in the industry with its combination of multilingual support, linked data, context-sensitive entity disambiguation, comprehensive type support and quotations extraction.

Named Entity Extraction Example:

One year ago, several hours before cities across the United States started their annual fireworks displays, a different type of fireworks were set off at the European Center for Nuclear Research (CERN) in Switzerland. At 9:00 a.m., physicists announced to the world that they had found something they had been searching for for nearly 50 years: the elusive Higgs boson. Today, on the anniversary of its discovery, are we any closer to figuring out what that particle's true identity is? The Higgs boson is popularly referred to as "the God particle," perhaps because of its role in giving other particles their mass. However, it's not the boson itself that gives mass. Back in 1964, Peter Higgs proposed a theory that described a universal field (similar to an electric or a magnetic field) that particles interacted with.

excerpt from: http://abcnews.go.com/Technology/god-particle-higgs-boson-year/story?id=19574423

Entity                                Relevance  Sentiment  Type          Linked Data
Peter Higgs                           0.98893    positive   Person        DBpedia | Yago
European Center for Nuclear Research  0.69407    neutral    Organization  Website | Lat:46.23,Lon:6.06 | DBpedia | Yago | OpenCyc | GeoNames
United States                         0.461032   neutral    Country       Website | DBpedia | Yago | OpenCyc | CIA Factbook
Switzerland                           0.445847   neutral    Country       Website | Lat:46.83,Lon:8.33 | DBpedia | Yago | OpenCyc | CIA Factbook
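
Relevance scores like the ones in the example above are typically used to rank or filter entities before further processing. The following is a minimal sketch of that step, not AlchemyAPI client code: the Entity class, the 0.5 threshold and both helper functions are hypothetical names chosen for illustration, and the rows are copied from the example table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entity:
    """One row of the example table (hypothetical container, not an API type)."""
    text: str
    relevance: float
    sentiment: str
    type: str


# Rows copied from the example table; linked data is omitted for brevity.
ENTITIES = [
    Entity("Peter Higgs", 0.98893, "positive", "Person"),
    Entity("European Center for Nuclear Research", 0.69407, "neutral", "Organization"),
    Entity("United States", 0.461032, "neutral", "Country"),
    Entity("Switzerland", 0.445847, "neutral", "Country"),
]


def most_relevant(entities, threshold=0.5):
    """Return entities at or above the threshold, highest relevance first."""
    kept = [e for e in entities if e.relevance >= threshold]
    return sorted(kept, key=lambda e: e.relevance, reverse=True)


def group_by_type(entities):
    """Map each entity type to the names of the entities of that type."""
    groups = {}
    for e in entities:
        groups.setdefault(e.type, []).append(e.text)
    return groups


top = most_relevant(ENTITIES)
by_type = group_by_type(ENTITIES)
```

With the assumed threshold of 0.5, only Peter Higgs and the European Center for Nuclear Research are kept; the right cutoff depends on the application.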

Features:

Sentiment Analysis

AlchemyAPI can extract entity-level sentiment, classifying statements about each entity as positive, negative or neutral. Sentiment analysis helps identify the content that refers to an entity in a positive or negative manner.

Coreference Resolution

The entity extraction API can resolve coreferences (e.g. he, she, the company) to the detected entities they refer to. AlchemyAPI's technology understands pronouns and the specific entities they link to.

Entity Types

An entity type describes what an entity is, such as a person, a city or a company. Many entities also have sub-types that provide further description for disambiguation. For example, an entity can be a person, but it can also be a musical artist or an author. AlchemyAPI can identify hundreds of entity types and sub-types; the full list is under Supported Entity Types.

Disambiguation

Content can be ambiguous because human language is not exact. Is it Michael Jackson the pop star? Or is it Michael Jackson the writer? When a potentially ambiguous entity is found, the surrounding text is examined for contextual cues, and statistical analysis over large amounts of data determines which candidate is most likely correct. AlchemyAPI identifies the correct Michael Jackson by looking for cues about his career, his location, notable achievements, etc.

Linked Data

The associated linked data for each disambiguated entity is included in the response. Use linked data to access additional semantic information and further enhance your content. Learn more about AlchemyAPI's linked data support.

Quotation Extraction

AlchemyAPI can identify quotations and link them to a specific entity. For instance, given the text: Bob said, "this is a quotation," Bob would be identified as an entity and the quotation would be linked to him. Note: quotation extraction is currently available for English and French only.

Response Formats

The entity extraction API can return data formatted as JSON, XML or RDF. The response formats are designed to be flexible to meet the needs of your application.
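
As an illustration of consuming the JSON format, the sketch below parses a response body with Python's standard json module. The payload shape and field names (entities, text, type, relevance, sentiment) are assumptions modeled on the columns of the example table, not the documented schema, so check the actual response before relying on them.

```python
import json

# Hypothetical response body; the structure and field names are assumptions.
SAMPLE_RESPONSE = """
{
  "entities": [
    {"text": "Peter Higgs", "type": "Person",
     "relevance": 0.98893, "sentiment": "positive"},
    {"text": "Switzerland", "type": "Country",
     "relevance": 0.445847, "sentiment": "neutral"}
  ]
}
"""


def parse_entities(body):
    """Return (text, type, relevance, sentiment) tuples from a JSON body."""
    payload = json.loads(body)
    rows = []
    for item in payload.get("entities", []):
        rows.append((
            item["text"],
            item["type"],
            float(item["relevance"]),   # tolerate numbers sent as strings
            item.get("sentiment"),      # None when no sentiment is present
        ))
    return rows


parsed = parse_entities(SAMPLE_RESPONSE)
```

An XML or RDF response would need a different parser, but the same tuple shape could be produced from either.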

Language Support

AlchemyAPI supports named entity extraction for content written in eight languages: English, French, German, Italian, Portuguese, Russian, Spanish and Swedish. These are the native languages of more than 1.3 billion people.
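
An application that handles mixed-language content may want to check these limits before submitting text. The sketch below encodes only what this page states (eight languages for entity extraction, English and French for quotation extraction); the function and constant names are hypothetical and not part of any AlchemyAPI client.

```python
# Languages listed on this page for named entity extraction.
ENTITY_LANGUAGES = frozenset({
    "English", "French", "German", "Italian",
    "Portuguese", "Russian", "Spanish", "Swedish",
})

# Quotation extraction is stated to be available for these two only.
QUOTATION_LANGUAGES = frozenset({"English", "French"})


def supported_features(language):
    """Return the features this page lists for a language name."""
    name = language.strip().title()  # accept "french", " French ", etc.
    features = []
    if name in ENTITY_LANGUAGES:
        features.append("entities")
    if name in QUOTATION_LANGUAGES:
        features.append("quotations")
    return features
```

For example, supported_features("German") reports entity extraction only, while an unlisted language yields an empty list.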