You want to hone your technical skills by participating in hackathons, but you don’t necessarily want to pull an all-nighter or travel to compete. Have you heard of ChallengePost? It's an online community that brings hackers together to team up on projects and inspire one another.
We’re amazed by the creative apps being built by programmers using API mashups. And, we're humbled that Alchemists around the world have shared more than 55 projects with the ChallengePost community. From helping publishers enhance multimedia content with product recommendations (YFly) to developing an app that quickly skims an entire article to pull related information from Wikipedia (Skimmer), these smart applications use natural language processing and computer vision in entirely new ways.
Here are a few examples from the Alchemy board to awaken your inner hacker.
When looking for a new job, it can be pretty difficult to get noticed. The application Jobify increases a job seeker's chances of landing the elusive 'dream job' by suggesting ways to optimize their LinkedIn profile based on the desired job's description. Using AlchemyAPI, the app extracts relevant keywords from the job description, compares them to the applicant's profile, provides an overall profile fit score and recommends keywords to add.
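For the curious, here is a rough sketch of the matching idea in Python. The `extract_keywords` helper is a toy stand-in for a keyword-extraction service like AlchemyAPI's (the real API returns ranked keywords), and the scoring logic is our own illustration, not Jobify's actual code.

```python
import re

# Toy stand-in for a keyword-extraction call; a service like
# AlchemyAPI's Keyword Extraction would return ranked keywords instead.
def extract_keywords(text):
    stopwords = {"a", "an", "and", "the", "with", "of", "in", "for", "to"}
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in stopwords}

def profile_fit(job_description, profile):
    """Score a profile against a job description by keyword overlap."""
    job_kw = extract_keywords(job_description)
    profile_kw = extract_keywords(profile)
    score = len(job_kw & profile_kw) / len(job_kw) if job_kw else 0.0
    missing = sorted(job_kw - profile_kw)  # keywords worth adding to the profile
    return score, missing

score, missing = profile_fit(
    "Python developer with NLP and machine learning experience",
    "Software developer experienced in Python and machine learning",
)
print(round(score, 2), missing)
```

The fit score here is simply the fraction of job-description keywords found in the profile; the real app presumably weights keywords by relevance.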
Ever wish you could predict the future? Civic Sentiment is an app that does just that by using social media sentiment analysis to forecast political outcomes with increased precision.
Young voters can greatly influence political outcomes with their votes, so it is clear why campaign managers want to predict their actions before all of the ballots are in. To do so, it is essential to engage voters in the channels they prefer, such as Twitter. With 60% of Twitter users between the ages of 18 and 35 (Pew Research), it is a goldmine of data that helps portray the emotions and intentions of young voters. By tracking hashtag sentiment on social media, Civic Sentiment can help predict how young voters feel about political candidates, issues and events without implementing time-consuming surveys and focus groups.
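To make "tracking hashtag sentiment" concrete, here is a toy sketch of per-hashtag aggregation. The tweets and their scores are invented; in a real pipeline, each tweet's score would come from a sentiment-analysis API rather than being hand-labeled.

```python
import re
from collections import defaultdict

# (tweet text, sentiment score in [-1, 1]) pairs; in practice the score
# for each tweet would come from a sentiment-analysis service.
tweets = [
    ("Loving the new education plan #CandidateA", 0.8),
    ("#CandidateA dodged every question tonight", -0.6),
    ("Solid debate performance #CandidateB", 0.5),
    ("#CandidateA actually answered on student debt", 0.4),
]

def hashtag_sentiment(tweets):
    """Average sentiment score per hashtag (case-insensitive)."""
    buckets = defaultdict(list)
    for text, score in tweets:
        for tag in re.findall(r"#\w+", text):
            buckets[tag.lower()].append(score)
    return {tag: sum(scores) / len(scores) for tag, scores in buckets.items()}

print(hashtag_sentiment(tweets))
```

Rolling averages like this, computed over a live stream, are what let an app chart how feelings about a candidate shift around debates and news events.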
Imagine that you just completed the dreaded spring cleaning. Now, you can relax... Not so fast. What will you do with all of the junk you collected? The good news is that there are plenty of places willing to take your old stuff. The bad news is that many of them only accept certain items.
Finding the right junk graveyard can be time-consuming and frustrating, but now there's an app for that. Unload helps you ‘take the load off’ by using natural language processing, state-of-the-art mapping tools and other technologies to point users to the nearest and most suitable waste management facility.
It’s your turn! Visit ChallengePost to learn about upcoming hackathon challenges or to join a hack team. Then, register for a free API Key and fuel your innovation with natural language processing and computer vision.
In just 24 hours, the world’s 3 billion internet users performed 2.8 billion Google searches, viewed more than 5.5 billion YouTube videos and sent nearly 500 million Tweets (stats courtesy of Internet Live Stats). That’s not just ‘big data’, that’s massive data. And the vast majority of it is unstructured -- data (such as emails, chats, articles, etc.) intended for human consumption and not designed for computers to process.
Internet Live Stats on December 9, 2014 show the activity of the world's internet users.
Over the past several years, businesses dealing with tremendous amounts of data have shifted their focus. Time that was once dedicated to poring over charts, tables, and spreadsheets is now spent seeking intelligent ways to automate data analysis and connect the dots between what consumers are saying across all channels.
This shift in the unstructured data approach is happening because of the wide availability of technologies that are faster, more adaptable and more accurate. Services built on deep learning and artificial intelligence, like AlchemyAPI, AT&T Speech and others, are moving from research labs to enterprise organizations that want to increase their agility and better serve their customers.
Businesses of all sizes are already using deep learning to transform real-time data analysis. Big players like Netflix, Google News and Amazon employ deep learning to understand users’ activities and preferences to then recommend movies, articles and products that they might like. There are also success stories from companies like AdTheorent, a real-time bidding advertising platform that uses deep learning to power a predictive modeling engine that significantly improves click-through rates.
Recently, we partnered with Janet Wagner (@webcodepro), a data journalist, full stack developer and contributor on ProgrammableWeb, to gather six practical deep learning use cases that demonstrate how these technologies integrate into businesses of all shapes and sizes.
In this new ebook, we discuss each of these use cases in depth.
If you want to create smarter applications that make sense of your data, download Deep Learning: 6 Real World Use Cases. You’ll get ideas and inspiration for solving your unstructured data challenges from businesses that have been in your shoes.
Do you ever think about embracing your inner geek, channeling creative energy into the vision you have for the next killer app? One neophyte developer, Phil Han, has taken on that challenge.
Phil’s day job is in marketing for a technology company. When he was younger, he was interested in programming but didn’t pursue it. Ultimately, Phil decided to learn to code because he was inspired to build an application that visualizes the top news stories around the world and wanted to better grasp the technologies present in an app developer’s life. The result is Headlyne.me and, in his words, his appreciation for the work that app devs take on has “escalated exponentially.”
I found Phil on Medium, a website where you can share “little stories that make your day better and manifestos that change the world.” In the spirit of full disclosure, one reason for writing about Phil’s post is that his app (and professional skills) development includes using AlchemyAPI. On top of that, I’m surrounded by developers and often wonder what it would be like to walk a few miles in their coding shoes. I contacted Phil to get his thoughts on coding, the value of marketers learning to code and the effort involved with bringing his first app to life. Here’s what we discussed:
What sparks my interest is the “birth of an app developer” story you tell, and the fact that you also work in marketing. What are your thoughts on strengthening the bond between marketers and IT/app development?
In tech, marketing sometimes gets sidelined. The process can be top-down, i.e. the features are developed and then use cases are made. Most of the job of a marketer is translating those features into tangible benefits for the end user and showing them how it can make their life easier or better. In other words, when I interact with technical folks, it’s my job to understand what a feature does, so the business decision maker can understand its benefits to their company. To build on this, I felt the necessity to learn coding, and the easiest way to do that is through expressing my ideas as an app.
When you first thought about building an app that would enable visualizing top news around the world, did you think of it as unstructured data analysis?
At first, this wasn’t obvious. I assumed that each article title had some sort of metadata associated with it that would carry an indication of location. But it didn’t. That’s when I discovered AlchemyAPI, which would automatically take entities and organize them for me, using the Keyword Extraction API. All I had to do was filter for location-related entities (e.g. countries, states, etc.), cherry-pick the most relevant ones and then compile those into a list that could be run through the Google GeoCoding API.
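(For readers who want to picture that pipeline, here is a simplified sketch. The entity records and the geocoding stub below are illustrative stand-ins; Phil's app worked from AlchemyAPI's actual responses and called the Google GeoCoding API for coordinates.)

```python
# Simplified entity records; a real entity-extraction response carries
# more fields, and the geocode() stub stands in for a real geocoding
# service such as the Google GeoCoding API.
LOCATION_TYPES = {"Country", "StateOrCounty", "City"}

def best_location(entities):
    """Pick the most relevant location-type entity, if any."""
    locations = [e for e in entities if e["type"] in LOCATION_TYPES]
    if not locations:
        return None
    return max(locations, key=lambda e: e["relevance"])["text"]

def geocode(place):
    # Stub with hard-coded coordinates; a real implementation would
    # query a geocoding API and parse the lat/lng from its response.
    coords = {"France": (46.2, 2.2), "Paris": (48.86, 2.35)}
    return coords.get(place)

entities = [
    {"text": "NATO", "type": "Organization", "relevance": 0.9},
    {"text": "France", "type": "Country", "relevance": 0.8},
    {"text": "Paris", "type": "City", "relevance": 0.6},
]
print(geocode(best_location(entities)))
```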
What are your tips for others who are just getting started with analyzing data?
Have a clear idea of what kind of data you want. For me, this was easy – I needed coordinates. But the problem I encountered with AlchemyAPI was that it would only provide coordinates if the entity was disambiguated. That meant I had to devise a workaround (i.e. feeding the most relevant location to the Google GeoCoding API) so I could render the markers on the globe.
When I think about my experience in terms of marketing, I’ve taken valuable lessons in error identification. For example, if something’s not working in my automated social programs, if I’m not getting the response rate I want, I try to identify the root cause – is it the content that’s not interesting? Am I using the wrong keywords or targeting incorrect people? On a more macro level, I associate more with the Kanban method where I try to keep up with multiple marketing objectives simultaneously and then attend to the more mission-critical collateral or program that needs to be completed.
What surprised you most about your project?
That I was able to get it to work at all, after seven weeks of intense coding and a phenomenal amount of help from a couple of my friends. Also, that I needed to learn so many frameworks and languages (Python, Flask, the AlchemyAPI platform, BeautifulSoup, WebGL Earth, Git, Heroku, and Jinja2 in particular) just to bring this project to fruition.
At one point in your post on Medium, you write: “After annoying their customer service representative numerous times…” How would you characterize your interaction with AlchemyAPI?
Overall, extremely helpful. They provided minimum viable code for the AlchemyAPI query and helped me learn the syntax to actually extract the information I needed. My support contact was able to identify a way of extracting article titles by querying for a specific metadata tag associated with each article title, which helped me get the first version running. (Props to Mr. Josh Holmgren!)
Any additional information you’d like to add?
Full files and source code are available at @mrbriskly. Also, as I wrote on Medium, “I’ve learned now how hackers/web devs work: find solutions to one problem, one step at a time, rather than try to plan out an entire project in advance. I don’t know how you guys pull it off, web devs. But my appreciation for your work has escalated exponentially.”
How much of a geek do you need to be to develop apps? I’m not sure I’ve found an answer to that question. But I do know, after communicating with Phil, that I’m in a great position should I decide to take on app development. After all, I have a great team of app developers immediately available to consult with, free access to our APIs and constant inspiration from the great work our users do.
As programmers, developers and hackers, we're all seeking solutions for our big data challenges. Whether our data comes in the form of emails, tweets, phone calls, blog posts, news articles or the elusive carrier pigeon, we want to understand what’s being said and implied – and we want to do it quickly.
There are lots of technologies to try and it can be time-consuming to spin up a new one “just to see” if it works. It’s always better if we can borrow a bit from our community and tweak it to fit our own needs. Sometimes we just need a little inspiration...
(Insert drumroll here) We’re excited to share a sample email analysis application using email-as-a-service provider SendGrid and AlchemyAPI. This example was created by Kunal Batra (@kunal732), a Developer Evangelist at SendGrid. Recently, he began a Code Challenge where he committed to learning 15 new technologies in 15 days. Yikes!
On Day 4 of his challenge, Kunal explores natural language processing and showcases how you can discover the really interesting parts of your incoming email. There are many ways a business can use this idea, from automatically responding to messages (hello, customer support!) to categorizing and classifying them, and more.
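To give a flavor of the "categorize incoming email" idea, here is a toy keyword-based router. The categories and keyword lists are invented for illustration; Kunal's actual example parses inbound mail with SendGrid and extracts the interesting terms with AlchemyAPI rather than using hard-coded lists.

```python
# Toy router: in a real pipeline the interesting terms would come from
# an NLP keyword-extraction call on the parsed inbound email body.
CATEGORIES = {
    "support": {"error", "crash", "refund", "broken"},
    "sales": {"pricing", "quote", "demo", "upgrade"},
}

def route_email(body):
    """Route an email to the category whose keywords it best matches."""
    words = set(body.lower().split())
    scores = {cat: len(words & keywords) for cat, keywords in CATEGORIES.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "inbox"  # no match: leave in inbox

print(route_email("The app keeps showing an error and then a crash"))
print(route_email("Could we schedule a demo and discuss pricing"))
```

An auto-responder then only needs a template per category — the routing decision is where the NLP does its work.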
Kunal’s post lists the items you need to get started.
Check out Kunal’s full post to see the example in its entirety and get instructions on how to test his application yourself.
When we read articles or web pages, we are able to quickly infer and make connections between concepts and topics found within the text. For example, if you read a blog discussing the differences between “Android” and “iPhone,” you immediately recognize that the author is writing about popular “smartphones” without an explicit reference in the article.
But, how do we make connections on a massive scale when our businesses have so much data (we’re talking gigabytes, terabytes, and petabytes) in a variety of formats? At AlchemyAPI, we start with a well-formed ontology, also known as a knowledge graph. These databases show relationships between different people, places and things and “connect the dots” between them.
A knowledge graph used in conjunction with machine learning can help you associate topics to answer common business questions like:
Advertising: How do I align advertising with on-page content to increase click-through rates and purchases for my clients?
Data Management: How do I organize and tag all of the internal documents produced by my organization?
Content Recommendation: How do I serve up content that my readers are interested in and get them to read another article and increase their time on my site?
What sets a good knowledge graph apart from a great knowledge graph is how they are trained. A great knowledge graph should be trained programmatically, not curated by humans.
Through training, a knowledge graph learns how to associate terms from a variety of common patterns in text (known as Hearst Patterns). Then, several machine learning steps are used to understand words with multiple meanings. The structure and redundancy of the system allows it to self-correct.
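Hearst patterns are simple enough to sketch. The single-pattern matcher below ("X such as Y and Z") is a drastically simplified illustration; a production knowledge graph combines many such patterns with statistical filtering and word-sense disambiguation.

```python
import re

# One classic Hearst pattern: "<hypernym> such as <hyponym> (and <hyponym>)".
# Real systems use many patterns and filter out spurious matches.
PATTERN = re.compile(r"(\w+) such as (\w+(?:, \w+)*(?: and \w+)?)")

def hearst_pairs(text):
    """Extract (hyponym, hypernym) pairs from a single Hearst pattern."""
    pairs = []
    for hypernym, group in PATTERN.findall(text):
        for hyponym in re.split(r",? and |, ", group):
            pairs.append((hyponym, hypernym))
    return pairs

print(hearst_pairs("smartphones such as Android and iPhone sell well"))
```

Run over billions of sentences, matches like these are exactly how a graph can learn that Android and iPhone fall under "smartphones" without a human ever labeling the relationship.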
Superior knowledge graphs train on a variety of syntactic patterns and understand how humans naturally relate topics within text.
When combined with text and image analysis services such as Image Tagging, Entity Extraction, Keyword Extraction or Concept Tagging, a knowledge graph acts as the foundation that exposes hierarchies most relevant to your content so that you can explore and select the correct parent/child terms to enhance your results. Then, take those results and use them to deliver highly relevant advertising, group and tag internal documents or recommend content that your readers are searching for.
In the first post of our “Dispelling Common NLP Myths” series, we revealed the truth about the common belief that the human labeling of data is the only way to teach and train complex algorithms. In this post, you'll learn some key points to remember when evaluating which tools and services are right for your business. Be sure to ask the tough questions to all possible partners and test each solution thoroughly to make the best decision for your particular use case.
According to Forrester Research*, 76% of developers utilize open source and freemium technology, and for obvious reasons: they are easily accessible and economically feasible. While free tools are sufficient for certain projects, there are a few key things to consider before moving forward with these types of NLP services.
The truth is freemium NLP tools are readily available and can be effective for simple applications. At AlchemyAPI, we provide a free tier of our web services to give developers the power to test, play and get a feel for NLP. However, when it comes to processing data on the scale of petabytes and terabytes, free systems are limited. Several of the companies we partner with experienced this struggle first-hand. “We extensively explored free tools and developing our own NLP applications. But the out-of-the-box tools only give us about 70% of the solution,” explains Jonathan Morgan of CrisisNet, a firehose of global crisis data.
Most companies lack the computational horsepower and network infrastructure to reach the level of sophistication required to close the gap. And rightfully so, they are laser-focused on their business, not building a system from the ground up! Unfortunately, many companies spend too much time and money before they realize that these solutions don’t solve their data challenges.
Aaron Chavez, VP of Engineering at AlchemyAPI, provides four reasons why free tools may not provide a competitive advantage for businesses:
1) Unmaintained systems and their associated libraries are not sophisticated enough to keep up with high volumes of data.
“NLP is an extremely active area of research. There are always new components to add to the system. A good system requires regularly incorporating recent advancements and performing maintenance to keep the system up-to-date, even for what is publicly known,” Aaron says.
2) Many free systems are trained on and require clean data to function properly.
“When all of the data that you’re benchmarking against is written in proper English, like research papers and political news, you’re not truly training your system on text that businesses want to analyze and that may be written poorly. For example, in social media channels such as Twitter, businesses find a lot of short, poorly written text. A system won't automatically understand hashtags or text-speak," states Aaron.
Aman Naimat, co-founder of Spiderbook, experienced this when experimenting with various NLP systems. “The problem I ran into was that most NLP and named entity recognition algorithms were developed using pristine datasets, hand-curated for test suites. Those algorithms cannot accurately analyze the content you find on the Web, which is not perfectly written articles, blog posts, or tweets.”
3) Many widely available free services are typically academic in nature.
“Academic goals are often much different than business goals. In order to use these systems for business intelligence, fundamental changes must be made to the way business problems are approached,” Aaron remarks.
4) Many free systems are generally designed to run on a variety of platforms and intended for a broad audience.
“One great thing about SaaS is that infrastructure can be standardized. When you choose a program that selects its hardware and writes the software to take advantage of it, you can build systems that handle much heavier processing,” Aaron explains.
He concludes, “Despite their limitations, sometimes they [free services] can be a good fit for businesses. Finding the right APIs and integrating them can be time-consuming, but ultimately you should do thorough testing and decide what system works best for your business.”
This series will continue with another post debunking the third NLP myth: NLP systems are too expensive to customize for my specific industry and use case.
*Download the full Forrester Research report here
Customer feedback, competitor information, legal filings, press releases and other data intended for human consumption contain valuable information for an organization. But, it is nearly impossible to employ enough humans to read, comprehend and share all of it with economic practicality, speed or accuracy.
Perhaps that’s why Gartner estimates that more than 85% of Fortune 500 companies will fail to effectively utilize unstructured data to their competitive advantage by 2015. And if the world’s largest organizations struggle with this problem, we’re sure that everyone else does too. But, the good news is that there’s a solution – natural language processing.
Natural language processing (NLP) is a field of computer science in which algorithms, often neural networks, digest raw text and extract critical knowledge for businesses to use for competitive advantage. In plain English, it’s a really smart way to make sense of all of the email, chat, social and other text inundating your business. A wide range of industries already experience its benefits (from advertising to social media monitoring and public relations to sales intelligence), and more explore it every day.
Gone are the days when natural language processing and machine learning were reserved for computer scientists and big-budget organizations. With the rise of easy-to-use cloud services, they’re now economical and practical.
As with any game-changing technology, misconceptions abound. In this series, we’ll expose and destroy four of the most pervasive myths surrounding NLP.
For NLP to work, algorithms are developed that recognize the who, what, when, and where in unstructured content. Traditionally, these systems have been trained through painstaking human annotation. This method is extremely tedious and often inefficient. A misconception exists that this manual, human labeling is the sole way to teach and train algorithms. The truth is that this arduous task is eliminated with unsupervised learning. Instead, advanced systems are automatically trained based on models that become increasingly knowledgeable with more use.
“The data that was once intended exclusively for human consumption can now be understood by machines that have been taught to process that information,” shares Aaron Chavez, Chief Scientist at AlchemyAPI.
It can be incredibly exciting to watch a computer mimic human behavior and return results from data that would take a human weeks or months to read. Leading companies realize myths for what they are and choose cloud NLP solutions that help better target consumers and predict their behavior.
Stay tuned for our second post in the series where we'll debunk the myth: “I can use open source systems for my business.”
What were a cow, Charlie Chaplin, and an old-time bartender doing at AlchemyAPI? They were gathered to celebrate Halloween, of course! We said “boo” to all the critics that claim that this holiday is “just for kids,” and got our fill of sugary fun. Decked out in costumes, our team proves that scientists, engineers, sales reps and marketers can enjoy Halloween with the best of ‘em.
With our Q3 release and Face Recognition API launch behind us, we figured it was time for a little celebration. Before we knew it, our “Boos and Brews” Halloween bash was born. What costumes would we see? The secrecy was killing us!
On October 31, everyone gathered around the company tennis table dressed from head-to-toe in creative costumes. Mario, Walter White, Bruce Willis and the Crocodile Hunter (with a cute puppy croc in tow) all made appearances for our costume contest. Only after a hilarious catwalk tiebreaker were prizes awarded for the scariest costume (Josh impersonating another Alchemist, Devin), funniest costume (Jake as William Wallace from Braveheart) and the best overall costume (Nicholas as Walter from The Big Lebowski).
While our team remains focused on creating the best platform possible, we can always find some time to let loose and have a little fun. With the holiday season upon us, let the festivities begin!
AlchemyAPI's Team of Ghouls and Goblins on Halloween
Top-performing sales people spend a lot of time gathering information to get to know their prospects and their prospects’ businesses. They carry out background research - on LinkedIn, Twitter, community forums, company websites, news articles and the list goes on - to understand the company, the department, and the people they hope to build a relationship with. Many use CRM (customer relationship management) tools to handle the routine tasks associated with the sales process.
Unfortunately, while CRM solutions are good for tracking the progress of a sale, they are inept when it comes to actually helping close the deal. Even if a sales rep can adequately manage all of their tasks, there is still too much content for one person to digest and use. But, what if they had a system that automatically processed all of the deal-closing business intelligence and served it up in an easy-to-use interface?
Spiderbook, a start-up headquartered in San Francisco, was founded by Aman Naimat and Alan Fletcher to solve those problems. If the adoption rate for their service is any indication, all signs point to a rousing success.
Aman and his team of three fellow NLP developers built SpiderGraph, which uses AlchemyAPI’s Keyword Extraction, Entity Extraction and Language Detection REST APIs to forge business intelligence based on everything from the public-facing records like press releases, websites, blogs, PR and digital marketing content to private business profiles accessed through partnerships with data services providers.
“We go beyond traditional CRM by using natural language processing and named entity recognition to understand businesses,” Aman explains. “We are curious to know how they partner, details on acquisitions, the products they sell, branding, SEC listings and even the types of resources that they look for in job posts."
Spiderbook’s story describes how the team at Spiderbook is seeking to change the way sales people “connect the dots among companies, people, partners, products and documents.”
A lot of buzzwords pass through everyday business conversations. You may even call “buzzword” a buzzword. From “synergy” to “move the needle” to “bleeding edge,” the shelf life of most buzzwords is fairly short. But they exist for a reason. Even after jargon fizzles out and new terms take over, the meaning endures. A present-day example that we see often is “actionable data,” one that I hear frequently from our users as well as my colleagues. It came about around the same time that “Big Data” became pervasive.
What makes data actionable? Our CEO, Elliot Turner, would say that it's the algorithms developers use to translate data and build smarter applications. Signal Noise, an information design agency in London that specializes in allowing “people to make sense of an increasingly data-driven world,” has a similar position. Recently, they challenged people to create “static, motion and interactive visualizations which explain how algorithms work and reveal the hidden systems and processes that make the modern world tick.”
Signal Noise’s campaign caught the eye of Matt Turnbull, a UX/UI designer from London. In addition to being part of the team that designed the Nokia Series 40 Full Touch phone interface, he has used his design skills and self-proclaimed hobbyist-level programming capabilities to develop a generative music visualizer. His visualizer creates an identical representation each time it "hears" a given song, with the colors reflecting the emotional context, as you can see in the example to the right.
Matt shared how he learned about text mining and AlchemyAPI. “Initially, Signal Noise’s introduction to their exhibition had me intrigued: ‘Almost every part of our lives, from medicine to music, is now shaped, informed or controlled in some way by algorithms. They have become one of the most powerful forces shaping the 21st century, but remain invisible and impenetrable to all but a few.’ After a bit of research, I discovered text mining algorithms. They’re the unsung heroes of the Internet. What particularly fascinated me is discovering how these algorithms actually read, an act that’s typically very human.” A contact at Signal Noise recommended Alchemy's REST APIs, and Matt was soon equipped with an API Key and the algorithms he needed to build his app.
Matt’s interactive application goes beyond showing how algorithms read by applying natural language processing to reflect sentiment and meaning while also returning named entities (people, places, etc.). His data visualization application employs these services…
...to analyze some well-known speeches from Martin Luther King, Jr., John F. Kennedy and Travis Bickle (a character from the movie Taxi Driver). In one example, Matt feeds Dr. King’s I Have a Dream speech into his application, which results in:
This shows a visualization of a fragment of Martin Luther King Jr.'s I Have a Dream speech.
The application also generates a sentiment graph, which shows both positive and negative context:
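As a rough illustration of how such a graph could be produced, here is a sentence-level scorer built on a tiny hand-made lexicon. The lexicon is invented and ignores negation entirely; Matt's app relies on a full sentiment-analysis service rather than anything this naive.

```python
import re

# Tiny illustrative polarity lexicon; a real app would call a
# sentiment-analysis API instead of scoring words by hand.
LEXICON = {"dream": 1, "free": 1, "justice": 1, "hope": 1,
           "injustice": -1, "despair": -1, "hell": -1}

def sentiment_series(text):
    """One score per sentence: the sum of word polarities (negation ignored)."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    series = []
    for sentence in sentences:
        words = re.findall(r"[a-z]+", sentence.lower())
        series.append(sum(LEXICON.get(w, 0) for w in words))
    return series

speech = ("I have a dream today. We will not wallow in despair. "
          "Let freedom ring!")
print(sentiment_series(speech))
```

Plotting a series like this over the course of a speech is what yields the positive/negative curve shown in the graph.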
This chart displays the sentiment results of this portion of Dr. King's I Have a Dream speech.
JFK’s speech announcing the U.S.’s commitment to send astronauts to the moon reveals even more positive sentiment, shown both in the representation of his face and the associated graph:
This displays the portion of JFK's speech that was analyzed.
A visual representation of the output from JFK's speech.
He also chose to analyze a very emotionally charged chunk of dialogue from the movie Taxi Driver, the famous scene in which Bickle says, “You talkin’ to me? You talkin’ to me? Then who the hell else?” That analysis gives us:
The above chart displays Travis Bickle's memorable speech in the movie Taxi Driver.
The above chart displays the sentiment of Travis Bickle's speech.
“Once I had the AlchemyAPI Language SDK in the right place, connecting to it was blissfully easy,” Matt remarks. I asked him what surprised him most about his application. “The penny dropped when I sent Alchemy’s API a news article and it returned all of the people and places within the article before I’d finished the first sentence! In a world where roughly 80 percent of information is stored as text, the speed and effectiveness of Alchemy Language could be put to some amazing uses.”
One of those uses was to fascinate attendees at the 2014 London Design Festival. You see, it was there that Matt’s app, called Crawling for Context, was included among other projects showcasing how algorithms worked. Of course, he was curious to find out how people would react. “A lot of people stood there seemingly dumbfounded,” he remembers. “In hindsight, I could have done more to explain exactly what was going on! From those that did get it, the feedback was great. Thanks to Alchemy’s quick response time, the final exhibition piece allowed people to tweet to a hashtag and have their tweets analyzed on screen live and in front of them. When people realized this, the play factor became great. Tweets ranged from news article excerpts to ‘happy birthday so-and-so’ and included ‘David Hasselhoff’ and ‘get me a beer’.”
Turnbull's app in action at the 2014 London Design Festival.
A big thanks to Matt for letting us tell his story and for showing that there's a human quality to the technology of algorithms. In fact, we'll discuss that idea in future blogs featuring Elliot Turner. Stay tuned!