Text API

Text API: Language Detection

AlchemyAPI provides easy-to-use facilities for classifying posted (uploaded) text content by language. Post any textual data directly to our service for analysis.

Posted content is normalized and cleaned (removing ads, navigation links, and other unimportant content), and the primary language of the contained text is identified. Use these API calls to process posted textual content. If your content is HTML-formatted or is hosted on a publicly accessible website, consider using our URL processing calls or HTML calls instead.

API Call: TextGetLanguage

Description: The TextGetLanguage call detects the language used within a posted text document.

Endpoint: http://access.alchemyapi.com/calls/text/TextGetLanguage

Parameters:

http argument    parameter description
apikey           Your private API key.
                 (required parameter)
text             Text document content (must be URI-argument encoded).
                 (required parameter)
url              Text document URL.
                 (optional parameter, for response tracking purposes; must be URI-argument encoded)
outputMode       Desired API output format.
                 Possible values: xml (default), json, rdf
                 (optional parameter)
jsonp            Desired JSONP callback.
                 (optional parameter; requires "outputMode" to be set to json)
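
As an illustration of the URI-argument encoding required above, the following is a minimal Python sketch that assembles the form-encoded argument string. The helper name `build_form_body` and the sample text are assumptions made for this example and are not part of the AlchemyAPI service; only the argument names come from the table above.

```python
from urllib.parse import urlencode

VALID_OUTPUT_MODES = ("xml", "json", "rdf")


def build_form_body(apikey, text, url=None, output_mode="xml", jsonp=None):
    """Return a URI-argument-encoded string of TextGetLanguage parameters.

    Hypothetical helper: only the argument names come from the parameter table.
    """
    if output_mode not in VALID_OUTPUT_MODES:
        raise ValueError("outputMode must be one of: xml, json, rdf")
    if jsonp is not None and output_mode != "json":
        raise ValueError('jsonp requires outputMode to be set to "json"')

    # Required arguments first, then the optional ones that were supplied.
    params = {"apikey": apikey, "text": text, "outputMode": output_mode}
    if url is not None:
        params["url"] = url
    if jsonp is not None:
        params["jsonp"] = jsonp

    # urlencode percent-encodes reserved and non-ASCII characters (as UTF-8)
    # and turns spaces into "+".
    return urlencode(params)


body = build_form_body(
    "YOUR_API_KEY",
    "Bonjour tout le monde, comment ça va ?",
    output_mode="json",
)
print(body)
```

The sketch rejects a `jsonp` value when `outputMode` is not json, mirroring the dependency noted in the table; whether the service itself rejects or ignores that combination is not stated here.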

Response Format (XML):

<results>
    <status>REQUEST_STATUS</status>
    <url>DOCUMENT_URL</url>
    <language>DETECTED_LANGUAGE</language>
    <iso-639-1>ISO_639_1_CODE</iso-639-1>
    <iso-639-2>ISO_639_2_CODE</iso-639-2>
    <iso-639-3>ISO_639_3_CODE</iso-639-3>
    <ethnologue>ETHNOLOGUE_URL</ethnologue>
    <native-speakers>NUM_NATIVE_SPEAKERS</native-speakers>
    <wikipedia>WIKIPEDIA_URL</wikipedia>
</results>
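
A response in this XML layout could be read with Python's standard library roughly as follows. This is a sketch only: the function name `parse_xml_response` is an assumption, and the sample values are invented for illustration rather than captured from the service.

```python
import xml.etree.ElementTree as ET

# Invented sample that follows the XML template above (not real service output).
SAMPLE_XML = """<results>
    <status>OK</status>
    <url></url>
    <language>french</language>
    <iso-639-1>fr</iso-639-1>
    <iso-639-2>fra</iso-639-2>
    <iso-639-3>fra</iso-639-3>
    <wikipedia>http://en.wikipedia.org/wiki/French_language</wikipedia>
</results>"""


def parse_xml_response(xml_text):
    """Map each child element of <results> to its stripped text content."""
    root = ET.fromstring(xml_text)
    return {child.tag: (child.text or "").strip() for child in root}


fields = parse_xml_response(SAMPLE_XML)
print(fields["status"], fields["language"], fields["iso-639-1"])
```

Returning a plain dictionary keeps the hyphenated element names (such as `iso-639-1`) usable as keys without renaming them.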

Response Format (JSON):

{
    "status": "REQUEST_STATUS",
    "url": "DOCUMENT_URL",
    "language": "DETECTED_LANGUAGE",
    "iso-639-1": "ISO_639_1_CODE",
    "iso-639-2": "ISO_639_2_CODE",
    "iso-639-3": "ISO_639_3_CODE",
    "ethnologue": "ETHNOLOGUE_URL",
    "native-speakers": "NUM_NATIVE_SPEAKERS",
    "wikipedia": "WIKIPEDIA_URL"
}
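
The JSON form can be handled in a similar way. The sketch below, which assumes a hypothetical `LanguageResult` container and invented sample values, copies the hyphenated keys into attribute-friendly names, since keys such as `iso-639-1` are not valid Python identifiers.

```python
import json
from dataclasses import dataclass

# Invented sample that follows the JSON template above (not real service output).
SAMPLE_JSON = """{
    "status": "OK",
    "url": "",
    "language": "french",
    "iso-639-1": "fr",
    "iso-639-2": "fra",
    "iso-639-3": "fra",
    "wikipedia": "http://en.wikipedia.org/wiki/French_language"
}"""


@dataclass
class LanguageResult:
    """Hypothetical container for a subset of the response fields."""
    status: str
    language: str
    iso_639_1: str
    iso_639_2: str
    iso_639_3: str


def parse_json_response(raw):
    """Decode a JSON response body; absent fields become empty strings."""
    data = json.loads(raw)
    return LanguageResult(
        status=data.get("status", ""),
        language=data.get("language", ""),
        iso_639_1=data.get("iso-639-1", ""),
        iso_639_2=data.get("iso-639-2", ""),
        iso_639_3=data.get("iso-639-3", ""),
    )


result = parse_json_response(SAMPLE_JSON)
print(result.language, result.iso_639_1)
```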

Response Format (RDF):

<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                 xmlns:aapi="http://rdf.alchemyapi.com/rdf/v1/s/aapi-schema#"
                 xml:base="http://rdf.alchemyapi.com/rdf/v1/r/response.rdf">
    <rdf:Description rdf:ID="DOCUMENT_HASH">
        <rdf:type rdf:resource="http://rdf.alchemyapi.com/rdf/v1/s/aapi-schema#DocInfo"/>
        <aapi:ResultStatus>REQUEST_STATUS</aapi:ResultStatus>
        <aapi:URL>DOCUMENT_URL</aapi:URL>
        <aapi:Language>DOCUMENT_LANGUAGE</aapi:Language>
        <aapi:ISO-639-1>ISO_639_1_CODE</aapi:ISO-639-1>
        <aapi:ISO-639-2>ISO_639_2_CODE</aapi:ISO-639-2>
        <aapi:ISO-639-3>ISO_639_3_CODE</aapi:ISO-639-3>
        <aapi:Ethnologue>ETHNOLOGUE_URL</aapi:Ethnologue>
        <aapi:NativeSpeakers>NUM_NATIVE_SPEAKERS</aapi:NativeSpeakers>
        <aapi:Wikipedia>WIKIPEDIA_URL</aapi:Wikipedia>
    </rdf:Description>
</rdf:RDF>

Response Fields:

field name         field description
status             Success / failure status indicating whether the request was processed.
                   Possible values: OK, ERROR
url                HTTP URL for which information was requested.
language           Detected language of the posted text document.
                   More than 90 languages are detected.
iso-639-1          ISO-639-1 code for the detected language.
iso-639-2          ISO-639-2 code for the detected language.
iso-639-3          ISO-639-3 code for the detected language.
ethnologue         Link to the Ethnologue page containing information on the detected language.
native-speakers    Number of persons who natively speak the detected language.
                   Language statistics courtesy of Wikipedia.
wikipedia          Link to the Wikipedia page for the detected language.
statusInfo         Failure status information (sent only if "status" == "ERROR").
                   Possible values: invalid-api-key, content-exceeds-size-limit
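
A caller would typically check `status` before reading any other field. The following sketch assumes the response has already been decoded into a dictionary; the exception class `LanguageDetectionError` and the function `check_status` are hypothetical names introduced for this example.

```python
class LanguageDetectionError(Exception):
    """Hypothetical exception carrying the statusInfo value of a failed call."""

    def __init__(self, status_info):
        super().__init__("TextGetLanguage failed: " + status_info)
        self.status_info = status_info


def check_status(response):
    """Return the decoded response if status is OK, otherwise raise."""
    if response.get("status") == "OK":
        return response
    # statusInfo is sent only when status is ERROR; fall back to a
    # placeholder label (an assumption of this sketch) if it is absent.
    raise LanguageDetectionError(response.get("statusInfo", "unknown-error"))


ok = check_status({"status": "OK", "language": "french"})
print(ok["language"])

try:
    check_status({"status": "ERROR", "statusInfo": "invalid-api-key"})
except LanguageDetectionError as exc:
    print(exc.status_info)
```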

API Notes:

  1. Calls to TextGetLanguage should be made using HTTP POST.
  2. HTTP POST calls should include the Content-Type header: application/x-www-form-urlencoded
  3. Posted text documents can be a maximum of 150 kilobytes. Larger documents will result in a "content-exceeds-size-limit" error response.
  4. A minimum of 15 characters of text must exist within the posted document to perform language detection.
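
The notes above can be combined into a small client-side check before a request is sent. The sketch below is one possible arrangement, not a definitive client: the function name `prepare_request` is hypothetical, "150 kilobytes" is assumed to mean 150 * 1024 bytes of UTF-8 text, and the 15-character minimum is approximated by counting characters after trimming whitespace (the service applies its own cleaning first, so its count may differ). The request object is only built, not sent, so the example runs offline.

```python
from urllib.parse import urlencode
from urllib.request import Request

ENDPOINT = "http://access.alchemyapi.com/calls/text/TextGetLanguage"
MAX_TEXT_BYTES = 150 * 1024  # assumption: 1 kilobyte = 1024 bytes
MIN_TEXT_CHARS = 15


def prepare_request(apikey, text):
    """Validate the text locally and build an HTTP POST request object."""
    if len(text.strip()) < MIN_TEXT_CHARS:
        raise ValueError("at least 15 characters of text are required")
    if len(text.encode("utf-8")) > MAX_TEXT_BYTES:
        raise ValueError("text exceeds the 150 kilobyte limit")

    # Form-encode the arguments; urlencode output is plain ASCII.
    body = urlencode(
        {"apikey": apikey, "text": text, "outputMode": "json"}
    ).encode("ascii")

    return Request(
        ENDPOINT,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


req = prepare_request("YOUR_API_KEY", "Dies ist ein kurzer deutscher Beispielsatz.")
print(req.get_method(), req.full_url)
```

Passing `req` to `urllib.request.urlopen` would send it; the local checks simply avoid a round trip for documents the service would reject with "content-exceeds-size-limit".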