AlchemyAPI provides easy-to-use facilities for extracting structured data from web pages. The calls described here process posted (uploaded) web pages and other HTML content directly, extracting the desired structured data. If you are processing content hosted on a publicly accessible website, consider using our URL processing calls instead.
Description: The HTMLGetConstraintQuery call extracts structured data from a posted HTML document. AlchemyAPI analyzes the structure of the posted document and executes the supplied constraint query against it, returning the matching text and URLs.
Endpoint: http://access.alchemyapi.com/calls/html/HTMLGetConstraintQuery
| http argument | parameter description |
|---|---|
| apikey | your private API key (required parameter) |
| html | HTML document content; must be URI-argument encoded (required parameter) |
| url | HTML document URL; must be URI-argument encoded (optional parameter, for response tracking purposes) |
| cquery | the constraint query to execute (required parameter) |
| outputMode | desired API output format. Possible values: xml (default), json, rdf (optional parameter) |
| jsonp | desired JSONP callback (optional parameter, requires "outputMode" to be set to json) |
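
The parameters above could be assembled into a request roughly as follows. This is a minimal sketch, not an official client: it assumes the arguments are sent as a form-encoded POST body (the page only says the document is "posted" and must be URI-argument encoded), and the function name `build_constraint_query_request` is hypothetical.

```python
from urllib.parse import urlencode

ENDPOINT = "http://access.alchemyapi.com/calls/html/HTMLGetConstraintQuery"


def build_constraint_query_request(api_key, html, cquery, url=None, output_mode="json"):
    """Return (endpoint, body, headers) for a form-encoded POST.

    Hypothetical helper; assumes application/x-www-form-urlencoded is accepted.
    """
    params = {
        "apikey": api_key,          # required
        "html": html,               # required, encoded by urlencode below
        "cquery": cquery,           # required
        "outputMode": output_mode,  # optional: xml (default), json, rdf
    }
    if url is not None:
        params["url"] = url         # optional, for response tracking
    body = urlencode(params).encode("utf-8")
    headers = {"Content-Type": "application/x-www-form-urlencoded"}
    return ENDPOINT, body, headers
```

The returned tuple can be handed to any HTTP client, for example `urllib.request.Request(endpoint, data=body, headers=headers)`. Building the request separately from sending it keeps the encoding step easy to inspect.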

Response format (XML):
<results>
  <status>REQUEST_STATUS</status>
  <url>DOCUMENT_URL</url>
  <queryResults>
    <queryResult>
      <resultText>DETECTED_TEXT</resultText>
      <resultURL>DETECTED_URL</resultURL>
    </queryResult>
  </queryResults>
</results>
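
An XML response of the shape shown above might be read as in the sketch below. It uses only the Python standard library; the helper name `parse_xml_results` is hypothetical, and the element names are taken from the sample response.

```python
import xml.etree.ElementTree as ET


def parse_xml_results(xml_text):
    """Return (status, results) from an XML response body.

    Each result is a dict with the keys resultText and resultURL.
    """
    root = ET.fromstring(xml_text)
    results = []
    for node in root.findall("./queryResults/queryResult"):
        results.append({
            "resultText": node.findtext("resultText", default=""),
            "resultURL": node.findtext("resultURL", default=""),
        })
    return root.findtext("status"), results
```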

Response format (JSON):
{
  "status": "REQUEST_STATUS",
  "url": "DOCUMENT_URL",
  "queryResults": [
    {
      "resultText": "DETECTED_TEXT",
      "resultURL": "DETECTED_URL"
    }
  ]
}
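
When "outputMode" is set to json, the response can be decoded with a standard JSON parser. The sketch below is illustrative only; `parse_json_results` is a hypothetical helper, and missing fields are defaulted to empty strings as an assumption about how a caller might want to treat them.

```python
import json


def parse_json_results(json_text):
    """Return (status, pairs), where pairs is a list of (resultText, resultURL)."""
    payload = json.loads(json_text)
    pairs = [
        (item.get("resultText", ""), item.get("resultURL", ""))
        for item in payload.get("queryResults", [])
    ]
    return payload.get("status"), pairs
```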

Response format (RDF):
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:aapi="http://rdf.alchemyapi.com/rdf/v1/s/aapi-schema#"
         xml:base="http://rdf.alchemyapi.com/rdf/v1/r/response.rdf">
  <rdf:Description rdf:ID="DOCUMENT_HASH">
    <rdf:type rdf:resource="http://rdf.alchemyapi.com/rdf/v1/s/aapi-schema#DocInfo"/>
    <aapi:ResultStatus>REQUEST_STATUS</aapi:ResultStatus>
    <aapi:URL>DOCUMENT_URL</aapi:URL>
    <aapi:CQueryResultText>DETECTED_TEXT</aapi:CQueryResultText>
    <aapi:CQueryResultURL>DETECTED_URL</aapi:CQueryResultURL>
  </rdf:Description>
</rdf:RDF>

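
The RDF response uses XML namespaces, so element lookups need the namespace URIs from the sample above. The following is a sketch that treats the RDF as plain namespaced XML rather than using an RDF library; `parse_rdf_results` is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Namespace URIs copied from the sample RDF response.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "aapi": "http://rdf.alchemyapi.com/rdf/v1/s/aapi-schema#",
}


def parse_rdf_results(rdf_text):
    """Return a dict of status, url, texts, and urls, or None if no description exists."""
    root = ET.fromstring(rdf_text)
    desc = root.find("rdf:Description", NS)
    if desc is None:
        return None
    return {
        "status": desc.findtext("aapi:ResultStatus", namespaces=NS),
        "url": desc.findtext("aapi:URL", namespaces=NS),
        "texts": [e.text for e in desc.findall("aapi:CQueryResultText", NS)],
        "urls": [e.text for e in desc.findall("aapi:CQueryResultURL", NS)],
    }
```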
| field name | field description |
|---|---|
| status | success / failure status indicating whether the request was processed. Possible values: OK, ERROR |
| url | the HTTP URL for which information was requested. |
| resultText | extracted text (structured data). |
| resultURL | an extracted URL (structured data). |
| statusInfo | failure status information (sent only if "status" == "ERROR"). Possible values: invalid-api-key, page-is-not-html |
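
A caller would typically check "status" before reading any results. The sketch below shows one way to do that for a decoded JSON response; the exception class `AlchemyRequestError` and the function `ensure_ok` are hypothetical names, and treating any status other than OK as a failure is an assumption.

```python
class AlchemyRequestError(Exception):
    """Raised when a response reports a status other than OK (hypothetical)."""


def ensure_ok(payload):
    """Return payload unchanged if status is OK, otherwise raise.

    payload is the response already decoded into a dict.
    """
    if payload.get("status") == "OK":
        return payload
    # statusInfo is sent only on errors, e.g. invalid-api-key or page-is-not-html.
    raise AlchemyRequestError(payload.get("statusInfo", "unknown error"))
```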