Stream Computing Introduction

Chanming
Aug 10, 2020

Stream Computing arises when we perform computational tasks at edge nodes, or when the incoming stream is too fast and too large to store locally. In this setting we cannot buffer the incoming stream elements and process them in memory later. Stream Computing is relevant when we have very limited space to store stream elements, or when we do not intend to store them in the first place. The main task in analyzing a stream algorithm is to examine its performance and accuracy, and to prove them formally.

Background

In the field of Algorithm Design, we have seen a basic principle for improving an algorithm (a programmatic procedure that solves a collection of problems): improvement usually involves a trade-off between space complexity and time complexity. That is to say, if we wish to improve an algorithm by reducing its running time, most of the time we have to sacrifice space. Of course, this may not be the case if there exists a better abstraction of the problem that improves the algorithm without any such sacrifice. However, if we are inspecting a specific algorithm, or no better algorithm is known in the field, we have to make choices.

Stream Computing also lives in the context of trade-offs, but with another quantity at stake: the accuracy of the algorithm. It arises when stream elements arrive as a sequence and we wish to answer questions such as: did element ‘k’ appear before? How many times did it appear? Given some limit on answer accuracy, say a bound on the error rate or the false positive rate, Stream Computing studies the data structures and computational procedures that can solve these problems in the stream setting. When we do not have enough space to store all the stream elements, how do we design an algorithm that processes them quickly and responds with a good enough answer? This is the topic we discuss in Stream Computing.
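
To make the two questions above concrete, here is a minimal Python sketch. It is an illustration under assumptions, not a method taken from this post: it uses a Bloom filter for "did the element appear before?" and a Count-Min Sketch for "how many times did it appear?", two well-known structures chosen here as examples. The class names, the sizes (num_bits, num_hashes, width, depth) and the salted SHA-256 hashing are all hypothetical choices made for readability rather than speed.

```python
import hashlib


class BloomFilter:
    """Approximate membership: no false negatives, some false positives."""

    def __init__(self, num_bits=1024, num_hashes=3):
        # Sizes are illustrative assumptions, not tuned values.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for readability

    def _positions(self, item):
        # Derive several positions by salting one hash function with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item):
        # True may be a false positive; False is always correct.
        return all(self.bits[pos] for pos in self._positions(item))


class CountMinSketch:
    """Approximate counts: never underestimates, may overestimate."""

    def __init__(self, width=256, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, item):
        # One counter per row, chosen by a row-salted hash.
        for row in range(self.depth):
            digest = hashlib.sha256(f"{row}:{item}".encode()).digest()
            yield row, int.from_bytes(digest[:8], "big") % self.width

    def add(self, item):
        for row, col in self._cells(item):
            self.table[row][col] += 1

    def estimate(self, item):
        # Collisions only inflate counters, so the minimum is the best guess.
        return min(self.table[row][col] for row, col in self._cells(item))


# Process a small stream one element at a time, without storing it.
stream = ["a", "b", "a", "c", "a", "b"]
seen = BloomFilter()
counts = CountMinSketch()
for element in stream:
    seen.add(element)
    counts.add(element)
```

Both structures use a fixed amount of memory no matter how long the stream is. The price is the kind of bounded inaccuracy described above: might_contain can report an element that never appeared, and estimate can return a count larger than the true one, but neither errs in the other direction.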

So in this case.