

Text Extraction API

AlchemyAPI provides easy-to-use mechanisms to extract page text and title information from any web page.

An HTML page cleaning facility normalizes and cleans HTML content by removing ads, navigation links, and other unimportant content, so that only the important article text is extracted.

API endpoints are provided for performing text/title extraction on Internet-accessible URLs and posted HTML files.
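
As a rough illustration of the two call styles described above (a public URL versus posted HTML), the following sketch only builds the requests and sends nothing. The base URL, endpoint paths, and parameter names (`apikey`, `url`, `html`, `outputMode`) are hypothetical placeholders chosen for this example, not names taken from this page; consult the endpoint reference for the real ones.

```python
from urllib.parse import urlencode

# Hypothetical placeholders: the base URL, endpoint paths, and parameter
# names below are assumptions for illustration, not the documented API.
BASE_URL = "https://api.example.com/calls"


def build_url_request(api_key, page_url, output_mode="json"):
    """Build a GET request URL asking for text extraction on a public web page."""
    query = urlencode(
        {"apikey": api_key, "url": page_url, "outputMode": output_mode}
    )
    return f"{BASE_URL}/url/GetText?{query}"


def build_html_request(api_key, html, output_mode="json"):
    """Return (endpoint, form-encoded body) for POSTing an HTML document."""
    body = urlencode(
        {"apikey": api_key, "html": html, "outputMode": output_mode}
    )
    return f"{BASE_URL}/html/GetText", body.encode("utf-8")


if __name__ == "__main__":
    # Show what each request would look like without contacting any server.
    print(build_url_request("YOUR_KEY", "http://www.example.com/article.html"))
    endpoint, body = build_html_request("YOUR_KEY", "<html><p>Hello</p></html>")
    print(endpoint, body)
```

Posting the HTML in the request body rather than the query string keeps large documents out of the URL, which is why the second helper returns the endpoint and body separately.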

Extracted metadata may be returned in XML, JSON, or RDF format. See the text extraction API response format documentation for more information.