The growing number of data-intensive applications also drives major improvements in data storage and processing methods. Managing such volumes of data is not possible with traditional methods, because their capacity, performance, and storage are insufficient for the needs of present-day applications. For the tasks in this subject area (e.g. transmission, data aggregation, data transformation, data storage, data retrieval), many new applications take a modern approach and can be evaluated against a variety of criteria to find an optimal service. There are many alternatives, each with its own advantages and disadvantages, and each offering an optimal solution only for a particular problem. A typical challenge in this area is processing a staggering amount of data offline and rapidly aggregating an extract of it so that online applications can make it searchable.
My goal is to provide an efficient framework for processing a data stream and making it searchable. This task carries data-processing resource needs (memory, CPU) that are not obvious up front.
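To illustrate the stated goal, the following is a minimal, hypothetical sketch of its two halves: consuming elements of a data stream one at a time and building an inverted index so the records become searchable. The class and method names are my own illustration, not part of any existing framework; a production system would replace both halves with dedicated technologies of the kind compared later.

```python
from collections import defaultdict

class SearchableStream:
    """Hypothetical sketch: ingest a stream of text records and
    keep an in-memory inverted index over their terms."""

    def __init__(self):
        self.records = []              # stored records; id = arrival position
        self.index = defaultdict(set)  # term -> set of record ids

    def ingest(self, record: str) -> None:
        """Process one element of the incoming data stream."""
        rid = len(self.records)
        self.records.append(record)
        for term in record.lower().split():
            self.index[term].add(rid)

    def search(self, term: str) -> list:
        """Return every stored record containing the term."""
        return [self.records[i] for i in sorted(self.index[term.lower()])]

stream = SearchableStream()
for line in ["sensor A reports 20C", "sensor B reports 21C"]:
    stream.ingest(line)

print(stream.search("sensor"))   # both records match
```

Even this toy version makes the resource question visible: the index grows with the number of distinct terms, so memory and CPU costs depend heavily on the data source.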
First, I summarize the problems and characterize a list of candidate technologies. Measurements should support this characterization, and it is important that the measurements provide a level playing field. Knowing the test results, it becomes easier to design a definitive architecture for a specific data source, subject to the capacity limits of each technology.