EARLY MATERIALIZATION: A METHOD OF ACCESSING A DATA WAREHOUSE WITH MAPREDUCE TECHNOLOGY
A new method of accessing a star-schema data warehouse in a MapReduce environment has been developed. It consists of three sequentially executed tasks: the first builds the dimension masks for the fact table, the second intersects these masks, and the third performs grouping and aggregation and produces the final result. The advantages of this method over existing approaches were identified. A model of query execution time against the data warehouse is proposed. As an example, the execution time of query Q3 of the TPC-H benchmark was estimated, which confirmed the effectiveness of the method.
Keywords: MapReduce technology, data warehouse, early materialization, query execution time
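
The three tasks named in the abstract can be illustrated with a minimal in-memory sketch. It is not the authors' implementation and does not run on a MapReduce cluster. The abstract does not define the mask, so the sketch assumes it is one boolean per fact row. The toy tables (CUSTOMERS, DATES, FACTS), the filter predicates, and all function names are hypothetical and were chosen only to make the data flow visible.

```python
from collections import defaultdict

# Hypothetical toy star schema: two dimension tables and one fact table.
CUSTOMERS = {1: "BUILDING", 2: "AUTOMOBILE", 3: "BUILDING"}  # cust_id -> segment
DATES = {10: 1994, 11: 1995, 12: 1996}                       # date_id -> year
FACTS = [  # (cust_id, date_id, revenue)
    (1, 10, 100.0),
    (2, 10, 50.0),
    (3, 11, 70.0),
    (1, 12, 30.0),
    (3, 10, 20.0),
]


def build_mask(facts, key_index, dimension, predicate):
    """Task 1 (assumed form of the mask): one boolean per fact row, true
    when the referenced dimension row satisfies the filter predicate."""
    return [predicate(dimension[row[key_index]]) for row in facts]


def intersect_masks(masks):
    """Task 2: a fact row survives only if every dimension mask accepts it."""
    return [all(bits) for bits in zip(*masks)]


def group_and_aggregate(facts, mask, key_index, value_index):
    """Task 3: sum the measure per grouping key over the surviving rows."""
    totals = defaultdict(float)
    for row, keep in zip(facts, mask):
        if keep:
            totals[row[key_index]] += row[value_index]
    return dict(totals)


def run_query():
    """Chain the three tasks sequentially, as the abstract describes."""
    masks = [
        build_mask(FACTS, 0, CUSTOMERS, lambda segment: segment == "BUILDING"),
        build_mask(FACTS, 1, DATES, lambda year: year < 1996),
    ]
    combined = intersect_masks(masks)
    return group_and_aggregate(FACTS, combined, 0, 2)


if __name__ == "__main__":
    print(run_query())
```

In a real deployment each of the three functions would presumably correspond to a separate MapReduce job over distributed data. The sketch only shows how the output of one task feeds the next.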