What is Data Streaming?

A data stream is a continuous, unbounded flow of data with no defined beginning or end. It can be used and acted upon as it arrives, without first downloading it in full. Data streams are generated by all kinds of sources, in various formats and volumes: applications, server logs, network devices, GPS, bank transactions, website crawling activity, and so on.
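To make the idea of acting on data without downloading it more concrete, here is a minimal Python sketch. It is an illustration only: the sensor_readings source is hypothetical and simulated with random numbers, and running_mean is just one example of a computation that can be updated one event at a time.

```python
import itertools
import random
from typing import Iterator


def sensor_readings(seed: int = 0) -> Iterator[float]:
    """Hypothetical unbounded source: yields one simulated reading at a time, forever."""
    rng = random.Random(seed)
    while True:
        yield 20.0 + rng.uniform(-1.0, 1.0)


def running_mean(stream: Iterator[float]) -> Iterator[float]:
    """Yield the mean of all values seen so far, updated on each new event."""
    count = 0
    total = 0.0
    for value in stream:
        count += 1
        total += value
        yield total / count


# The stream never ends, so take only the first five results for the demo.
first_five = list(itertools.islice(running_mean(sensor_readings()), 5))
```

Only the running count and total are kept in memory, so the full stream never has to be stored.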


Data Stream Life Cycle

Data streams play a key part in big data by enabling real-time analysis, data integration, and data ingestion. The life cycle of a data stream has three components.

Let us see each component in detail.

    1. Create

    This component matters because it is where the stream originates. Data is produced by multiple sources: logs from the servers that host applications, behavioral data such as clickstreams and page views from business applications, social data from social media sites, and IoT sensors reporting various parameters. Many other sources also produce data at very high speed.

    2. Collect

    This component collects the data and makes it available for further processing. Apache Kafka is commonly used for this; other options include ActiveMQ and HornetQ.

    3. Process

    This component processes the collected stream data to produce meaningful insights, which in turn support better decisions.


    Figure: data stream life cycle
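The three components can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not a production design: an in-process queue.Queue stands in for a collector such as Apache Kafka, the page-view events are made up, and the SENTINEL marker exists only so the demo can stop (a real stream would not end).

```python
import queue
import threading
from collections import Counter

SENTINEL = None  # marks the end of the demo; a real stream would not end


def create(out: queue.Queue, events: list[str]) -> None:
    """Create: a source emits events (here, hypothetical page-view paths)."""
    for event in events:
        out.put(event)
    out.put(SENTINEL)


def process(inp: queue.Queue) -> Counter:
    """Process: consume events as they arrive and build an insight (view counts)."""
    counts: Counter = Counter()
    while True:
        event = inp.get()
        if event is SENTINEL:
            break
        counts[event] += 1
    return counts


# Collect: the queue buffers events between the producer and the consumer.
buffer: queue.Queue = queue.Queue()
producer = threading.Thread(target=create, args=(buffer, ["/home", "/cart", "/home"]))
producer.start()
page_views = process(buffer)
producer.join()
```

Putting a buffer between the create and process stages lets each side run at its own pace, which is the role the collect component plays at larger scale.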

Data Stream Use Cases

Every industry has use cases for data streaming, and the ability to integrate, analyze, troubleshoot, and/or predict data in real time, at massive scale, creates new ones. An organization can draw valuable insights from data in motion alongside batch processing.

Let us see some use cases of data streams.

  • Location data.
  • Fraud detection.
  • Real-time stock trades.
  • Marketing, sales, and business analytics.
  • Customer/user activity.
  • Monitoring and reporting on internal IT systems.
  • Log Monitoring: Troubleshooting systems, servers, devices.
  • Security Information and Event Management: analyzing logs and real-time event data for monitoring, metrics, and threat detection.
  • Machine learning and A.I.: Analyzing past and present data together opens up new possibilities.
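
As a rough illustration of one use case from the list above, fraud detection, the following Python sketch flags an account that makes too many transactions within a short time window. The event schema, the window length, the limit, and the function name flag_bursts are all assumptions chosen for the example, and events are assumed to arrive in timestamp order. Real fraud detection relies on far richer signals.

```python
from collections import defaultdict, deque
from typing import Iterable, Iterator, Tuple

Event = Tuple[float, str]  # (timestamp in seconds, account id); a hypothetical schema


def flag_bursts(
    events: Iterable[Event], window_s: float = 60.0, limit: int = 3
) -> Iterator[Event]:
    """Yield events that push an account past `limit` transactions within `window_s` seconds."""
    recent = defaultdict(deque)  # account id -> timestamps still inside the window
    for ts, account in events:
        times = recent[account]
        times.append(ts)
        # Evict timestamps that have fallen out of the sliding window.
        while times and ts - times[0] > window_s:
            times.popleft()
        if len(times) > limit:
            yield ts, account


sample = [(0, "a"), (10, "a"), (15, "b"), (20, "a"), (30, "a"), (200, "a")]
alerts = list(flag_bursts(sample))  # → [(30, "a")]
```

Each event is checked the moment it arrives, and only the timestamps inside the current window are kept per account.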