Collecting and analysing web data in real time

Hadoop, Elasticsearch, high-performance storage: the Cloud for my Business Intelligence

The context

Belogik is a SaaS solution that collects and analyses any kind of machine data from the web in real time.
The solution is based on Elasticsearch and the Python programming language. The whole application is designed to use Amazon APIs.

The problem

The Belogik application requires a large quantity of resources to centralise and analyse all of the log lines collected from clients’ IT services. Analysing data feeds in real time requires a compute platform that can automatically adapt to unpredictable client activity (buzz, events, etc.).
The other problems with this activity are related to storage: the volume grows exponentially, and the I/O performance requirements of real-time activity are considerable.
The upfront investment required to build its own platform was not part of Belogik’s business model.

Outscale’s answer

Initially, the Belogik application was developed on the Amazon Web Services platform. For data ownership reasons, Belogik wanted its data to be located in France.

Migration was made easier by Outscale’s AWS-compatible APIs on the one hand, and by similar pricing on the other.
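To illustrate why API compatibility eases such a migration, here is a minimal sketch, not Belogik's actual code. It assumes the application builds its cloud client from a set of keyword arguments, so that moving to a compatible provider mostly means overriding the endpoint. The function name `client_kwargs`, the endpoint URL and the region name `example-region` are hypothetical placeholders; real values would come from the provider's documentation.

```python
from typing import Optional

# Hypothetical endpoint, for illustration only (not a real provider URL).
PLACEHOLDER_ENDPOINT = "https://api.example-provider.invalid"


def client_kwargs(service: str, region: str,
                  endpoint_url: Optional[str] = None) -> dict:
    """Build keyword arguments for an AWS-style SDK client.

    With endpoint_url left as None, an SDK such as boto3 would use its
    default AWS endpoint. With a URL set, the same API calls would be
    sent to the compatible provider instead.
    """
    kwargs = {"service_name": service, "region_name": region}
    if endpoint_url is not None:
        kwargs["endpoint_url"] = endpoint_url
    return kwargs


# Configuration before the migration: default AWS endpoint.
aws_cfg = client_kwargs("ec2", "eu-west-1")

# Configuration after the migration: same service, different endpoint.
# "example-region" is a placeholder region name.
migrated_cfg = client_kwargs("ec2", "example-region", PLACEHOLDER_ENDPOINT)

# Assuming boto3 is installed, either dict could be passed on as
# boto3.client(**aws_cfg) or boto3.client(**migrated_cfg).
```

Under these assumptions the application code that calls the client stays the same; only the configuration changes.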

Supplied capacity

  • 50 cores
  • 200 GiB of RAM
  • 15 TB of NetApp high-performance storage

 


Discover how Intel® technologies accelerate Big Data: intel.com/bigdata

More information on intel.com/xeone5