TIME ESTIMATION METHOD OF THETA-JOIN OF DATABASE TABLES BY MAPREDUCE
This article investigates two strategies for copying (duplicating) tuples in a multi-table theta-join performed with MapReduce: a strategy based on Hilbert curves and the interval strategy. The primary use cases of each strategy are identified. Formulae are derived for estimating the running time of a theta-join, taking into account the processor, disk, and plex components. A practical example is given, and the behavior of processor time when combining patches of tables at the database sites is considered.
Keywords: time estimation, theta-join, MapReduce technology, Hilbert curve, interval strategy