AlchemyAPI provides easy-to-use facilities for performing concept tagging on any posted web page.
Posted content is normalized/cleaned (removing ads, navigation links, and other unimportant content), the primary document language is detected, and concept tagging is performed automatically.
These API calls process posted (uploaded) web pages and other HTML content. If you are processing content hosted on a publicly accessible website, consider using our URL processing calls instead.
Description: The HTMLGetRankedConcepts call extracts a relevancy-ranked list of concept tags for a posted HTML document. AlchemyAPI extracts text from the posted HTML document structure (ignoring navigation links, advertisements, and other undesirable content) and performs concept tagging operations on it.
Endpoint: http://access.alchemyapi.com/calls/html/HTMLGetRankedConcepts
| http argument | parameter description |
|---|---|
| apikey | your private api key (required parameter) |
| html | HTML document content (must be uri-argument encoded) (required parameter) |
| url | HTML document URL (must be uri-argument encoded) (optional parameter, for response tracking purposes) |
| maxRetrieve | maximum number of concept tags to extract (default: 8) (optional parameter) |
| outputMode | desired API output format. Possible values: xml (default), json, rdf (optional parameter) |
| jsonp | desired JSONP callback (optional parameter, requires "outputMode" to be set to json) |
| linkedData | whether to include Linked Data content links with identified concept tags. Possible values: 1 - enabled (default), 0 - disabled (optional parameter) |
| showSourceText | whether to include the original 'source text' the concept tags were extracted from within the API response. Possible values: 1 - enabled, 0 - disabled (default) (optional parameter) |
| sourceText | where to obtain the text that will be processed by this API call. AlchemyAPI supports multiple modes of text extraction: web page cleaning (removes ads, navigation links, etc.), raw text extraction (processes all web page text, including ads / nav links), visual constraint queries, and XPath queries. Possible values: (optional parameter) |
| cquery | a visual constraints query to apply to the web page. Constraint queries enable API operations to be performed on a targeted area of a web page, such as a story title or product description. (optional parameter, used when sourceText is set to 'cquery'. must be uri-argument encoded) |
| xpath | an XPath query to apply to the web page. XPath queries enable API operations to be performed on a targeted area of a web page, such as a story title or product description. (optional parameter, used when sourceText is set to 'xpath'. must be uri-argument encoded) |
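
As an illustration of how the arguments above fit together, the following Python sketch builds (but does not send) a request for this call. It assumes the endpoint accepts a form-encoded POST body, which the "uri-argument encoded" wording suggests but does not confirm; the helper name `build_request` and its keyword arguments are hypothetical.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://access.alchemyapi.com/calls/html/HTMLGetRankedConcepts"


def build_request(apikey, html, url=None, max_retrieve=8, output_mode="json"):
    """Build a POST request for HTMLGetRankedConcepts (hypothetical helper).

    Assumption: arguments are sent as an application/x-www-form-urlencoded
    body. urlencode() performs the URI-argument encoding of each value.
    """
    params = {
        "apikey": apikey,
        "html": html,
        "maxRetrieve": str(max_retrieve),
        "outputMode": output_mode,
    }
    if url is not None:
        # Optional: only used for response tracking purposes.
        params["url"] = url
    body = urlencode(params).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    # Supplying a body makes urllib issue a POST rather than a GET.
    return Request(ENDPOINT, data=body, headers=headers)


request = build_request("YOUR_API_KEY", "<html><body>a & b</body></html>")
```

Passing the resulting object to `urllib.request.urlopen` would send it; that step is left out here so the sketch runs without network access or a valid key.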
<results>
<status>REQUEST_STATUS</status>
<url>DOCUMENT_URL</url>
<language>DOCUMENT_LANGUAGE</language>
<text>DOCUMENT_TEXT</text>
<concepts>
<concept>
<text>DETECTED_CONCEPT</text>
<relevance>DETECTED_RELEVANCE</relevance>
<website>WEBSITE</website>
<geo>LATITUDE LONGITUDE</geo>
<dbpedia>LINKED_DATA_DBPEDIA</dbpedia>
<yago>LINKED_DATA_YAGO</yago>
<opencyc>LINKED_DATA_OPENCYC</opencyc>
<freebase>LINKED_DATA_FREEBASE</freebase>
<ciaFactbook>LINKED_DATA_FACTBOOK</ciaFactbook>
<census>LINKED_DATA_CENSUS</census>
<geonames>LINKED_DATA_GEONAMES</geonames>
<crunchbase>CRUNCHBASE_WEB_LINK</crunchbase>
</concept>
</concepts>
</results>
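
A response in the XML format above could be read with Python's standard `xml.etree.ElementTree` module, as in this minimal sketch. The function name `parse_concepts_xml` and the sample values are illustrative only and are not part of the API.

```python
import xml.etree.ElementTree as ET


def parse_concepts_xml(xml_text):
    """Return (status, concepts) from an XML-format response.

    Each concept becomes a dict keyed by element name (text, relevance,
    dbpedia, ...). Only elements present in the response appear as keys.
    """
    root = ET.fromstring(xml_text)
    status = root.findtext("status")
    concepts = []
    for node in root.findall("./concepts/concept"):
        concept = {child.tag: (child.text or "") for child in node}
        if "relevance" in concept:
            # Relevance is documented as a score between 0.0 and 1.0.
            concept["relevance"] = float(concept["relevance"])
        concepts.append(concept)
    return status, concepts


# Hypothetical sample values, shaped like the response format above.
sample = """<results>
<status>OK</status>
<concepts>
<concept>
<text>Example</text>
<relevance>0.9</relevance>
<dbpedia>http://dbpedia.org/resource/Example</dbpedia>
</concept>
</concepts>
</results>"""

status, concepts = parse_concepts_xml(sample)
```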
{
"status": "REQUEST_STATUS",
"url": "DOCUMENT_URL",
"language": "DOCUMENT_LANGUAGE",
"text": "DOCUMENT_TEXT",
"concepts": [
{
"text": "DETECTED_CONCEPT",
"relevance": "DETECTED_RELEVANCE",
"website": "WEBSITE",
"geo": "LATITUDE LONGITUDE",
"dbpedia": "LINKED_DATA_DBPEDIA",
"yago": "LINKED_DATA_YAGO",
"opencyc": "LINKED_DATA_OPENCYC",
"freebase": "LINKED_DATA_FREEBASE",
"ciaFactbook": "LINKED_DATA_FACTBOOK",
"census": "LINKED_DATA_CENSUS",
"geonames": "LINKED_DATA_GEONAMES",
"musicBrainz": "LINKED_DATA_MUSICBRAINZ",
"crunchbase": "CRUNCHBASE_WEB_LINK"
}
]
}
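
In the JSON format above, the relevance score is delivered as a string, so a client typically converts it before comparing values. The sketch below filters and sorts concepts by relevance and gathers whichever linked data fields are present. The function name `top_concepts`, the `min_relevance` threshold, and the sample values are assumptions made for illustration.

```python
import json

# Linked data field names as they appear in the JSON response format.
LINKED_DATA_KEYS = (
    "dbpedia", "yago", "opencyc", "freebase", "ciaFactbook",
    "census", "geonames", "musicBrainz", "crunchbase",
)


def top_concepts(response_text, min_relevance=0.5):
    """Return (text, relevance, links) tuples, most relevant first."""
    data = json.loads(response_text)
    results = []
    for concept in data.get("concepts", []):
        relevance = float(concept["relevance"])  # sent as a string
        if relevance >= min_relevance:
            links = {k: concept[k] for k in LINKED_DATA_KEYS if k in concept}
            results.append((concept["text"], relevance, links))
    return sorted(results, key=lambda item: item[1], reverse=True)


# Hypothetical sample values, shaped like the response format above.
sample = json.dumps({
    "status": "OK",
    "concepts": [
        {"text": "Minor topic", "relevance": "0.3"},
        {"text": "Second topic", "relevance": "0.7"},
        {"text": "Main topic", "relevance": "0.95",
         "dbpedia": "http://dbpedia.org/resource/Main_topic"},
    ],
})

ranked = top_concepts(sample)
```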
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:aapi="http://rdf.alchemyapi.com/rdf/v1/s/aapi-schema#"
xmlns:owl="http://www.w3.org/2002/07/owl#"
xml:base="http://rdf.alchemyapi.com/rdf/v1/r/response.rdf">
<rdf:Description rdf:ID="DOCUMENT_HASH">
<rdf:type rdf:resource="http://rdf.alchemyapi.com/rdf/v1/s/aapi-schema#DocInfo"/>
<aapi:ResultStatus>REQUEST_STATUS</aapi:ResultStatus>
<aapi:Language>DOCUMENT_LANGUAGE</aapi:Language>
<aapi:URL>DOCUMENT_URL</aapi:URL>
<aapi:DocText>DOCUMENT_TEXT</aapi:DocText>
</rdf:Description>
<rdf:Description rdf:ID="DOCUMENT_HASH-CONCEPT_NUM">
<rdf:type rdf:resource="http://rdf.alchemyapi.com/rdf/v1/s/aapi-schema#ConceptOccurrence"/>
<aapi:Doc>DOCUMENT_HASH</aapi:Doc>
<aapi:Relevance>DETECTED_RELEVANCE</aapi:Relevance>
<aapi:Name>DETECTED_CONCEPT</aapi:Name>
<aapi:URL>WEBSITE</aapi:URL>
<aapi:Geo>LATITUDE LONGITUDE</aapi:Geo>
<owl:sameAs rdf:resource="LINKED_DATA_DBPEDIA"/>
<owl:sameAs rdf:resource="LINKED_DATA_YAGO"/>
<owl:sameAs rdf:resource="LINKED_DATA_OPENCYC"/>
<owl:sameAs rdf:resource="LINKED_DATA_FREEBASE"/>
<owl:sameAs rdf:resource="LINKED_DATA_FACTBOOK"/>
<owl:sameAs rdf:resource="LINKED_DATA_CENSUS"/>
<owl:sameAs rdf:resource="LINKED_DATA_GEONAMES"/>
<owl:sameAs rdf:resource="LINKED_DATA_MUSICBRAINZ"/>
<owl:sameAs rdf:resource="LINKED_DATA_CRUNCHBASE"/>
</rdf:Description>
</rdf:RDF>
| field name | field description |
|---|---|
| status | success / failure status indicating whether the request was processed. Possible values: OK, ERROR |
| language | the detected language that the source text was written in. |
| url | the HTTP URL that information was requested for. |
| relevance | relevance score for a detected concept tag. Possible values: 0.0 - 1.0 (1.0 = most relevant) |
| text | the detected concept tag. |
| linked data | linked data for the detected concept tag (sent only if linkedData is enabled) |
| statusInfo | failure status information (sent only if "status" == "ERROR"). Possible values: invalid-api-key, page-is-not-html |
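
Because `statusInfo` is sent only when `status` is `ERROR`, a client can check `status` before reading any concepts. The following sketch shows one way to do that for an already-decoded JSON response. The exception class `AlchemyAPIError` and the function `check_status` are hypothetical names, not part of any official client.

```python
class AlchemyAPIError(Exception):
    """Raised when a response reports status ERROR (hypothetical class)."""


def check_status(response):
    """Return the response unchanged if status is OK, else raise.

    `response` is a dict decoded from a JSON-format API response.
    """
    if response.get("status") == "OK":
        return response
    # statusInfo is only present on failure, e.g. "invalid-api-key"
    # or "page-is-not-html".
    raise AlchemyAPIError(response.get("statusInfo", "unknown error"))


try:
    check_status({"status": "ERROR", "statusInfo": "invalid-api-key"})
    error_message = None
except AlchemyAPIError as exc:
    error_message = str(exc)
```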