Text analysis can be applied to a wide range of applications and industries. Two of the most common and powerful are social media monitoring and social media analytics. These applications are about gathering information generated on social media outlets, then analyzing the data for insights and trends.
Since many of our customers have created applications to monitor social media and create analytics, we’re fortunate to have learned a lot about this space. To share some of that knowledge, we released a social media monitoring solution page in our resources section, and the first tutorial in our developers section.
Text Analysis for Social Media Monitoring and Social Media Analytics provides a high-level overview of how social media monitoring works, and addresses two challenges of social media monitoring: finding all the data and figuring out what it means.
We’re excited to unveil our first developer tutorial, which walks through all the steps you’ll need to create an application that analyzes Twitter sentiment, including the source code for an example Python application on GitHub. While the solution page is a high-level discussion, the Analyzing Twitter Sentiment Tutorial shows exactly what steps a developer needs to take to create the basic scaffolding for a social media analytics application.
In the ongoing effort to make our services easier for our developers to use, we’ve created four new Getting Started with AlchemyAPI guides. These guides are intended for developers, and walk through the required steps to get up and running with AlchemyAPI. Areas covered include getting a key, downloading an SDK, configuring the SDK, and running example code.
Since all of our text analysis functions are accessed via a web REST API, we’re pretty agnostic to your programming language choice. So that’s why we’ve created several guides. Each guide is focused on getting started with a particular programming language, with the four guides covering Python, PHP, Ruby and Node.js. Look for more guides in the future as we create these for the other languages that we support with an SDK.
You can access the new guides here:
The International Biosecurity Intelligence System, or IBIS, is a project that aims to detect and monitor emerging biosecurity issues. The results are primarily used by life scientists studying biosecurity for governments, but can also be used by farmers and other industry professionals to keep updated on biosecurity developments.
The first phase of the system, which focuses on aquatic animal health, is already live, and you can check it out at: http://aquatic.animalhealth.org/. Since news about threats to aquatic animal health can come from any source, anywhere in the world, the system relies heavily on automation. Every day it scours the internet for information about emerging aquatic animal diseases, using a combination of RSS feeds, search engines, industry journals and Twitter. Then, using AlchemyAPI, it extracts the title, text, author and locations from the content, which are fed into its decision model to determine if the content is relevant to aquatic animal health. The end result is a list of articles pertaining to aquatic animal biosecurity issues, as shown in the screenshot below.
Emerging Aquatic Animal Health Threats via http://aquatic.animalhealth.org/
This is a great example of how the complex task of analyzing an incredible amount of data can be automated with the help of AlchemyAPI. This project is managed by the University of Melbourne, and as at any resource-constrained institution, manually reviewing all the data would be impossible. So all of us here at AlchemyAPI are proud that our services can help such an important project get off the ground.
GigaOM recently published an article titled The Gigaom guide to deep learning: Who’s doing it, and why it matters. It’s a great article that gives an overview of what deep learning is all about, some common applications, the startup companies and industry giants that are using it, and the possibilities for future developments.
In deep learning, computers learn by creating hierarchical representations using stacked neural network layers. Each new layer creates an abstraction of the layers below to understand more complex ideas. For example, in networks designed to perform facial image recognition, the first layers may detect lines or edges, while later layers are responsible for more abstract reconstructions of shapes, such as eyes or noses. Finally, the last layers put these abstract shapes together to form the face.
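To make the layering idea a bit more concrete, here is a toy sketch of a forward pass through a small stack of layers. It is only an illustration, not how our production systems work: the layer sizes and weights are made up by hand, nothing is trained, and the function names are hypothetical.

```python
import math


def dense_layer(inputs, weights, biases):
    """Apply one fully connected layer followed by a sigmoid activation."""
    outputs = []
    for neuron_weights, bias in zip(weights, biases):
        total = sum(w * x for w, x in zip(neuron_weights, inputs)) + bias
        outputs.append(1.0 / (1.0 + math.exp(-total)))
    return outputs


def forward(inputs, layers):
    """Feed the input through each layer in turn, keeping every intermediate result."""
    activations = [inputs]
    for weights, biases in layers:
        # Each layer only sees the output of the layer directly below it.
        activations.append(dense_layer(activations[-1], weights, biases))
    return activations


# Hypothetical hand-picked weights: 4 inputs -> 3 units -> 2 units -> 1 output.
layers = [
    ([[0.5, -0.2, 0.1, 0.4], [0.3, 0.8, -0.5, 0.0], [-0.6, 0.1, 0.7, 0.2]], [0.0, 0.1, -0.1]),
    ([[0.9, -0.4, 0.3], [-0.2, 0.6, 0.5]], [0.05, -0.05]),
    ([[1.2, -0.7]], [0.0]),
]

activations = forward([1.0, 0.0, 1.0, 0.0], layers)
for depth, values in enumerate(activations):
    print(f"layer {depth}: {len(values)} values")
```

Each entry in activations is the representation produced at one depth of the stack. In a real network the weights would be learned from data rather than written by hand.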
At AlchemyAPI, we use these deep learning techniques to power our text analysis and image recognition capabilities. By using massive amounts of data, GPU hardware and deep neural networks, we are able to increase both the number of things we understand and the accuracy of our systems. As the GigaOM article points out, we are in good company: Facebook, Google, IBM, Microsoft and Yahoo, and a growing number of startups including AlchemyAPI, are spending considerable R&D resources on developing deep learning technology.
Back in June of this year, we wrote a blog post about Article Optimizer, a tool created by Zack Proser. Article Optimizer analyzes your content for keyword density and suggests trending keywords and royalty-free images to help you produce more engaging content and get more traffic. A lot goes into the process of writing good content, and this tool aims to help make it easier.
You can check out Article Optimizer for yourself at: http://www.article-optimize.com/
We recently sat down with Zack (virtually of course) and asked him a few questions about Article Optimizer. Zack has been creating content for the web for several years now, and was initially overwhelmed by the number of “checkboxes” you had to hit to create successful content. Things like search demand, keywords, links, additional semantic info and which kinds of images and video to include. However, over time he came to the realization that the most successful content is informative, useful and fun to read.
So Article Optimizer was created as a tool that lets the author focus on writing good content while it makes sure all those boxes get checked. In the background, Article Optimizer handles things like keyword density, figuring out the trending keywords, showing you competitive content, etc. Here’s a portion of the report that was created for this blog post, and it even creates a shareable link of the report.
Since Article Optimizer does some heavy duty text analysis, Zack went with AlchemyAPI to help power the underlying technology for his app. He had used our API on a few past projects so he knew it would be a great fit for Article Optimizer. In fact, Zack “feels that Alchemy is a great choice whenever you want to imbue your program with a higher level of intelligence regarding the data it is working with.”
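For readers curious about what a keyword density check involves, here is a minimal sketch. It assumes density means a word's occurrences divided by the total word count, and it uses a tiny hand-picked stopword list. Both are our assumptions for illustration, not a description of how Article Optimizer or AlchemyAPI actually computes it.

```python
import re
from collections import Counter

# A deliberately tiny stopword list, chosen only for this illustration.
STOPWORDS = {"a", "an", "and", "the", "is", "of", "to", "in", "it", "for"}


def keyword_density(text, top_n=5):
    """Return the top_n non-stopword terms with their share of all words in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    if total == 0:
        return []
    counts = Counter(word for word in words if word not in STOPWORDS)
    return [(word, count / total) for word, count in counts.most_common(top_n)]


sample = "Good content is good. Content wins."
for word, density in keyword_density(sample, top_n=3):
    print(f"{word}: {density:.0%}")
```

A production tool would likely also handle multi-word phrases and stemming, which this sketch leaves out.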
One place where AlchemyAPI is especially helpful is finding related images. Zack uses our text categorization API to figure out the high level category of the content, and then uses that to find related images. On why this approach is better than just using keywords, Zack said, “I find the text categorization feature to be very accurate at determining what the entire article is about, so using this to search for images returns much better results than when I tried searching on the top keywords parsed from the text.”
For the future, Zack is working on more tools to help writers. On the horizon is a fully featured content development environment to make writing high quality content even easier. We’re looking forward to using that tool too!
Writing headlines is a sacred journalistic art form. A headline must all at once grab the reader’s attention and convey what the article is about without giving away too much, all in an effort to get the content read. What tricks do journalists use? What sells, good news or bad news? Do negative headlines capture more attention than positive headlines?
It may be that the latter question has been at least partially answered. According to a recent project created by a team of hackers from Dev Boot Camp - Chicago, negative headlines are much more commonly used than positive ones. The team analyzed over 140,000 headlines over several years from major news sites, including: CNN, Fox, MSNBC, HuffPost, Yahoo News and The Onion. They used AlchemyAPI’s sentiment analysis API to calculate how positive or negative the headlines were from each source, then made a great visualization tool. Below is a screenshot of their analysis.
The headline sentiment of Fox, HuffPost and Aggregate
From this plot, nearly all of the headlines are on the negative side of the graph, with just a couple of short spikes into the positive range. You can also get some insight into how different news organizations write their headlines. For example, in the screenshot I chose to show two very different news sources, Fox News and the Huffington Post. While both organizations write headlines that are similar to the aggregate, i.e. typically negative, Fox is generally a little bit more negative than the aggregate, while HuffPost is generally a little bit more positive than the aggregate. While this doesn’t convey anything about the quality or objectivity of the writing, it possibly sheds at least a little bit of light on the style in which the headlines are written.
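The comparison between each source and the aggregate boils down to averaging sentiment scores per source. Here is a minimal sketch of that step, assuming each headline has already been scored on a scale from -1 (negative) to 1 (positive). The scores below are made up for illustration and are not the team's data, and the function name is hypothetical.

```python
from collections import defaultdict
from statistics import mean


def average_sentiment(scored_headlines):
    """Average the sentiment score per source, plus an aggregate across all sources."""
    by_source = defaultdict(list)
    for source, score in scored_headlines:
        by_source[source].append(score)
    averages = {source: mean(scores) for source, scores in by_source.items()}
    averages["Aggregate"] = mean(score for _, score in scored_headlines)
    return averages


# Made-up (source, score) pairs standing in for real sentiment analysis results.
scored_headlines = [
    ("Fox", -0.4),
    ("Fox", -0.2),
    ("HuffPost", -0.1),
    ("HuffPost", 0.1),
]

averages = average_sentiment(scored_headlines)
for source, score in sorted(averages.items(), key=lambda item: item[1]):
    print(f"{source}: {score:+.2f}")
```

To reproduce a plot over time, the same averaging would be done per source and per date instead of per source alone.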
This is a very interesting use of sentiment analysis, and likely a much more in-depth analysis could be done on this data than we have space for here. An interesting addition to this data set would be to correlate this data with keywords or entities to see how each news site writes headlines for specific events or people over time.
If you’d like to view the data yourself, you can check out the project here: http://headlines-and-data.herokuapp.com/
Note taking in school has largely remained the same for hundreds of years. You listen to the lecture and write down the important stuff. Whether that’s on paper or on your laptop, it’s pretty much the same. Enter "Know Your Notes," a hack created over the weekend at the recent hackMIT hackathon. It’s a note taking application that utilizes AlchemyAPI to extract semantic information about your notes. It reads your notes, identifies the concepts, entities and keywords, and links to related Wikipedia articles where you can add to your understanding of the topics. Here’s a screenshot of the application:
This is a great example of how you can use the power of natural language processing to get a more comprehensive understanding of topics you’re interested in. Additionally, Know Your Notes has another great trick up its sleeve. It will read your notes, and use the relations extraction API to pull out the Subject -> Action -> Object relations. From these relations, it automatically generates flash cards where it hides one of the parts, as shown below.
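The flash card step is easy to picture in code. Here is a minimal sketch, assuming the relations have already been extracted into simple dictionaries with subject, action and object keys. That shape, the example relations and the function name are all hypothetical stand-ins, not the actual API response or the team's implementation.

```python
import random


def make_flash_card(relation, rng=random):
    """Hide one part of a subject/action/object relation to form a question."""
    parts = ["subject", "action", "object"]
    hidden = rng.choice(parts)
    prompt = " ".join("_____" if part == hidden else relation[part] for part in parts)
    return {"prompt": prompt, "answer": relation[hidden]}


# Hand-written relations standing in for what a relation extraction call might return.
relations = [
    {"subject": "Mitochondria", "action": "produce", "object": "ATP"},
    {"subject": "Newton", "action": "formulated", "object": "the laws of motion"},
]

rng = random.Random(0)  # Seeded so the same cards come out on every run.
cards = [make_flash_card(relation, rng) for relation in relations]
for card in cards:
    print(card["prompt"], "->", card["answer"])
```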
This hack was an excellent example of how to utilize AlchemyAPI to enhance your content. Nice work to the team that created this application!
The relatively recent availability of personal data tracking products such as FitBit or Jawbone, plus a smartphone in nearly everyone’s pocket, is making the quantified self a reality. The quantified self is a lifestyle where practically everything you do is recorded and analyzed, like calorie intake, steps walked, hours slept, miles run, etc. At the recent hackMIT hackathon, a team created a quantified self tracking tool that’s based on free-form text input. The user simply enters “I ran 7 miles today” or “I lifted 212 pounds 4 times,” and LogYourDay uses AlchemyAPI’s keyword extraction API to understand the text. Here’s a screenshot of the user interface:
LogYourDay - a quantified self tool based on free form text
Once you’ve accumulated some data, and AlchemyAPI has extracted the keywords, it’s time for the other half of quantified self: analysis. The team created a simple plotting interface where you can grab the keywords you’re interested in and plot them over time. This makes it easy to see how many miles you ran each day last week, or the average number of hours you sleep each night.
An example of a possible analysis
Pretty impressive, especially considering the team had just 24 hours to create this application. This hack is a great example of how you can take unstructured, free form text and, with just a simple API call, turn it into useful data suitable for visualizations. What other ways can you think of to take random strings of text and turn them into helpful graphics?
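To give a feel for the text-to-data step, here is a rough sketch that pulls quantities out of log entries and builds a series you could plot. Note that it swaps in a plain regular expression in place of AlchemyAPI's keyword extraction, so it is far less capable than what the team built. The dates, entries and function names are made up for illustration.

```python
import re
from collections import defaultdict

# Matches a number followed by a word, e.g. "7 miles" or "212 pounds".
QUANTITY = re.compile(r"(\d+(?:\.\d+)?)\s+([a-z]+)", re.IGNORECASE)


def parse_entry(text):
    """Return (value, unit) pairs found in a free-form log entry."""
    return [(float(value), unit.lower()) for value, unit in QUANTITY.findall(text)]


def series_for(entries, unit):
    """Build a date -> total mapping for one unit, sorted by date for plotting."""
    totals = defaultdict(float)
    for day, text in entries:
        for value, found_unit in parse_entry(text):
            if found_unit == unit:
                totals[day] += value
    return dict(sorted(totals.items()))


# Made-up log entries keyed by made-up ISO dates.
entries = [
    ("2013-10-07", "I ran 7 miles today"),
    ("2013-10-08", "I lifted 212 pounds 4 times"),
    ("2013-10-09", "I ran 3 miles and slept 8 hours"),
]

print(series_for(entries, "miles"))
```

The resulting mapping can be handed to any plotting library to chart a metric over time.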
At the recent hackathon at the University of Michigan, MHacks, there were dozens of applications that were built with the help of AlchemyAPI. One that stood out was Pyre, a website generation app that very quickly builds a basic website. The user simply types in the type of site they would like to build in a free form text box, and Pyre generates the template. Examples include, “build me a marketing website” or “I’d like to start a blog.” The user clicks go and automatically has a template designed for their use case.
The Pyre Homepage (Pyre.co)
Pyre is able to translate this unstructured, free form text into an actionable command using AlchemyAPI’s keyword extraction API. From the keywords in the text box, Pyre is able to understand what the user is asking for, and can automatically generate the desired site. This is an excellent example of how important information is often locked inside unstructured text, and how it becomes easy to take action as soon as some structure is added.
Note that Pyre is still a work in progress, and it obviously can’t account for every type of website that a user may desire. For example, when we asked Pyre to build “a social media website for dogs,” it unfortunately couldn’t build that, so we just got a default template instead. Alas, maybe nobody could build that anyway.
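Here is a minimal sketch of how extracted keywords might be mapped to a site template, with a fallback to a default when nothing matches. The template names, the lookup table and the function are hypothetical. We don't know how Pyre actually does this, and the keyword lists below are written by hand rather than returned by the keyword extraction API.

```python
# Hypothetical mapping from a trigger word to a template name.
TEMPLATES = {
    "marketing": "marketing-site",
    "blog": "blog",
}
DEFAULT_TEMPLATE = "default"


def choose_template(keywords):
    """Pick the first template whose trigger word appears in any extracted keyword."""
    for keyword in keywords:
        for word in keyword.lower().split():
            if word in TEMPLATES:
                return TEMPLATES[word]
    # Nothing recognized, so fall back to the default template.
    return DEFAULT_TEMPLATE


print(choose_template(["marketing website"]))
print(choose_template(["blog"]))
print(choose_template(["social media website", "dogs"]))
```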
At the recent hackMIT hackathon, a team created an interesting mashup of data to help determine if people become more negative on Twitter if it’s raining outside. With the help of weather data from WeatherUnderground, Tweets from across the country via the Twitter API, and AlchemyAPI’s sentiment analysis API, the team was able to create this diagram:
Interesting visualization from http://cond.in/hackmit/
This visualization shows that while a little bit of rain has a mixed effect on Twitter sentiment, lots of rain quickly turns the average Tweet more negative. That makes sense. Here in Colorado where it’s very dry we generally welcome the rain. But you can definitely have too much of a good thing, and it does get old if it drags on for more than a couple of days.
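The heart of a mashup like this is joining each Tweet's sentiment score with the rainfall where it was posted, then averaging by rainfall level. Here is a minimal sketch of that step. The rainfall thresholds, the sample numbers and the function names are our own assumptions for illustration, not the team's method or data.

```python
from collections import defaultdict
from statistics import mean


def rain_bucket(rain_mm):
    """Group a rainfall amount into a coarse bucket (thresholds are arbitrary)."""
    if rain_mm == 0:
        return "none"
    if rain_mm < 5:
        return "light"
    return "heavy"


def sentiment_by_rain(observations):
    """Average the sentiment score of Tweets within each rainfall bucket."""
    buckets = defaultdict(list)
    for rain_mm, score in observations:
        buckets[rain_bucket(rain_mm)].append(score)
    return {bucket: mean(scores) for bucket, scores in buckets.items()}


# Made-up (rainfall in mm, sentiment score) pairs, one per Tweet.
observations = [
    (0, 0.2),
    (0, 0.0),
    (2, 0.1),
    (3, -0.1),
    (12, -0.4),
    (20, -0.6),
]

for bucket, score in sentiment_by_rain(observations).items():
    print(f"{bucket}: {score:+.2f}")
```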
This is an example of the interesting analysis possibilities when you mash up quality data sources with AlchemyAPI’s powerful text analysis capabilities. With more and more data becoming available through a multitude of quality APIs, what other interesting insights are waiting to be discovered?
An interactive version of the above plot can be found at: http://cond.in/hackmit/