Streaming Twitter API, Big Data and AC Milan vs FC Barcelona
These days we are working on a project that monitors tweets through the Twitter Streaming API. This API lets you open a connection to Twitter and receive tweets that meet your search criteria as they are published; in our case, tweets containing certain keywords. With it we can get all the published tweets that match those criteria. The standard Search API, by contrast, does not return all tweets, is rate limited, and is not truly real time.
At this stage of development we need to run several tests, mainly to assess whether the system we’re designing can process a large number of tweets per second.
The Champions League match AC Milan vs FC Barcelona looked like a good opportunity to monitor keywords associated with the game. We opened a streaming connection to the Twitter API from 30 minutes before the match until 30 minutes after it, and read all the tweets containing these keywords:
milanbarça, milanbarcelona, forçabarça, forzamilan, milan-barça, milan-barcelona, milan-barca, milan-fcb, milan vs barça, milan vs barcelona, milan vs barca, messi, ibrahimovic
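As a rough illustration of how such a keyword list is used, here is a minimal Python sketch (not the C# application described in this post). It assumes the documented behaviour of the streaming filter's track parameter, where phrases are separated by commas (any phrase may match) and the terms inside a phrase are separated by spaces (all terms must be present, in any order, ignoring case). The function names build_track_parameter and matches are hypothetical, and the local matcher is a simplification based on substrings, not Twitter's actual tokenization.

```python
# The keywords monitored during the match, copied from the list above.
KEYWORDS = [
    "milanbarça", "milanbarcelona", "forçabarça", "forzamilan",
    "milan-barça", "milan-barcelona", "milan-barca", "milan-fcb",
    "milan vs barça", "milan vs barcelona", "milan vs barca",
    "messi", "ibrahimovic",
]


def build_track_parameter(keywords):
    """Join the phrases with commas to form a track-style filter value."""
    return ",".join(keywords)


def matches(text, keywords=KEYWORDS):
    """Simplified local check (an assumption, not Twitter's matcher):
    a phrase matches when every one of its space-separated terms
    appears somewhere in the lowercased text."""
    lowered = text.lower()
    return any(
        all(term in lowered for term in phrase.split())
        for phrase in keywords
    )
```

A check like matches("Great goal by Messi!") returns True, which can be handy for sanity-testing stored tweets against the keyword list after the fact.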
To perform this monitoring, we developed a small console application written in C#, based on code written by Shannon Whitley (@swhitley). We stored the tweets in a Microsoft SQL Server database. For this test we decided not to use MSMQ queues.
The result was great: we stored 83,582 tweets in our database during the 172 minutes that the connection to the Twitter API was open, an average of about 8.1 tweets per second.
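The average can be checked with a few lines of Python. This is only a sketch of the arithmetic, using the two figures reported above; the names TWEETS_STORED, CONNECTION_MINUTES and average_rate_per_second are chosen here for illustration.

```python
# Figures reported in this post.
TWEETS_STORED = 83_582
CONNECTION_MINUTES = 172


def average_rate_per_second(tweet_count, minutes):
    """Average number of tweets per second over the whole connection."""
    return tweet_count / (minutes * 60)


rate = average_rate_per_second(TWEETS_STORED, CONNECTION_MINUTES)
print(f"{rate:.2f} tweets per second")
```

Keep in mind that this is an average over the whole connection. The rate at any given moment of the match may have been higher or lower, so it says little about the peak load the system had to absorb.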
March 29, 2012

