Analyzing Twitter Data With Storm In HDInsight

Nischal S shows how to configure an HDInsight cluster to process tweets, followed by loading them into a Power BI dashboard:

When we need to process streams of real-time data, Storm is a great contender. Examples of streaming data are the number of consumer clicks and navigations on a website, IIS or user logs, IoT data, and social network information. In all these scenarios, we use real-time data processing. Apache Storm can process real-time unbounded streams of data.

The term “unbounded” defines streams of data with no start or end. Here, the processing of data is continuous and in real-time. Twitter is a good example. Twitter data is continuous, has no start or end time, and is provided in real-time by millions of Twitter users around the world.

Storm wouldn’t rank in my top three technologies for doing this, but it certainly does the job.

Related Posts

Azure SQL Database and Extended Events

Dave Bland shows how to set up and read an extended event file on Azure SQL Database: This first step when using T-SQL to read Extended Files that are stored in an Azure Storage Account is to create a database credential.  Of course the credential will provide essential security information to connect to the Azure […]

Read More

Overriding Spark Dependencies

Landon Robinson shows how to override a Spark dependency located on the classpath: This doesn’t draw the line exactly where the method changed from private to public, but generally speaking:– gson-2.2.4.jar: the method is private, and therefore too old for use here– gson-2.6.1: the method is public, and works fine.– Somewhere between the two, the […]

Read More

Categories

August 2017
MTWTFSS
« Jul Sep »
 123456
78910111213
14151617181920
21222324252627
28293031