Application of Text Classification and Clustering of Twitter Data for Business Analytics in Python

Abstract:

In recent years, social networks have gained unprecedented popularity in business because of their potential to drive growth. Companies can learn more about consumers' sentiments towards their products and services, and use this knowledge to better understand the market and improve their brand. As a result, companies regularly reinvent their marketing strategies and campaigns to fit consumers' preferences. Social analysis harnesses the vast volume of data in social networks to mine information critical for strategic decision making. It uses machine learning techniques and tools to identify patterns and trends and to gain actionable insights. This paper selects a popular food brand and evaluates a stream of customer comments about it on Twitter. Several classification and clustering metrics are used for the analysis. The Twitter API is used to collect a corpus of tweets, which is fed to a Binary Tree classifier that determines the polarity of English tweets, whether positive or negative. A k-means clustering technique is used to group together similar words in tweets in order to uncover business value. This paper discusses the technical and business perspectives of text mining analysis of Twitter data and recommends future opportunities for developing this emerging field.
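
As a rough illustration of the clustering step mentioned in the abstract, the following minimal sketch groups a handful of short texts with k-means over bag-of-words counts. It is not the paper's implementation: the example tweets are invented, the helper names (tokenize, vectorize, kmeans, top_terms) are hypothetical, and the initial centroids are fixed by hand so that the run is deterministic. A real pipeline would more likely rely on an established library and on tweets collected through the Twitter API.

```python
from collections import Counter

# Invented example tweets (assumption: not taken from the paper's data).
tweets = [
    "love the new burger",
    "the burger tastes great",
    "delivery was late again",
    "late delivery and cold food",
]

def tokenize(text):
    """Lowercase the text and keep purely alphabetic tokens."""
    return [w for w in text.lower().split() if w.isalpha()]

def vectorize(docs):
    """Build a sorted vocabulary and one term-count vector per document."""
    vocab = sorted({w for d in docs for w in tokenize(d)})
    vectors = []
    for d in docs:
        counts = Counter(tokenize(d))
        vectors.append([float(counts[w]) for w in vocab])
    return vocab, vectors

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(vectors, k, init_indices, max_iter=100):
    """Plain k-means; initial centroids are copies of the chosen vectors."""
    centroids = [list(vectors[i]) for i in init_indices]
    labels = None
    for _ in range(max_iter):
        # Assignment step: each vector goes to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda c: sq_dist(v, centroids[c]))
            for v in vectors
        ]
        if new_labels == labels:
            break  # assignments are stable, so the algorithm has converged
        labels = new_labels
        # Update step: each centroid becomes the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

def top_terms(centroid, vocab, n):
    """Return the n highest-weighted words of a centroid (ties by alphabet)."""
    ranked = sorted(zip(vocab, centroid), key=lambda p: (-p[1], p[0]))
    return [word for word, _ in ranked[:n]]

vocab, vectors = vectorize(tweets)
labels, centroids = kmeans(vectors, k=2, init_indices=(0, 2))
for c in range(2):
    print(c, top_terms(centroids[c], vocab, 2))
```

The highest-weighted words of each centroid give a quick reading of what a cluster is about, which is the kind of signal a business analyst would inspect. The fixed initialization is a simplification; random or k-means++ seeding with several restarts is the more common choice.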
