In the era of big data, can database systems efficiently handle ever-increasing amounts of information? This study explores algorithms for computing the equijoin of two relations in systems with large main memories, aiming to optimize performance in data-intensive environments. The research proposes and evaluates new algorithms designed to exploit increased memory capacity for faster join processing. The paper presents a hybrid hash-based algorithm that outperforms traditional methods such as sort-merge, particularly when the available memory is a significant fraction of the size of the relations being joined. Even in virtual memory environments, the hybrid algorithm demonstrates superior performance. The researchers also describe how filters, Babb arrays, and semijoins, popular tools for improving join efficiency, can be integrated into their algorithms, providing flexibility for diverse applications. These advances contribute to the ongoing effort to optimize database systems for the challenges of modern data management.
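To make the hash-based approach concrete, the following is a minimal Python sketch of an in-memory hash equijoin. The relation names, key extractors, and the hash_join function are illustrative assumptions rather than the paper's actual algorithm, which additionally partitions both relations so that only part of the build input needs to stay memory-resident.

```python
def hash_join(build_rel, probe_rel, build_key, probe_key):
    """Equijoin of two relations using an in-memory hash table.

    Illustrative sketch only: the hybrid algorithm described in the paper
    also partitions both inputs, keeping one partition of the build
    relation in memory while the remaining partitions spill to disk.
    """
    # Build phase: hash the (smaller) build relation on its join key.
    table = {}
    for row in build_rel:
        table.setdefault(build_key(row), []).append(row)

    # Probe phase: stream the probe relation and emit matching pairs.
    for row in probe_rel:
        for match in table.get(probe_key(row), []):
            yield match, row


# Hypothetical example relations: customers (id, name) and orders (customer_id, amount).
customers = [(1, "Ada"), (2, "Grace"), (3, "Edsger")]
orders = [(1, 250), (3, 75), (3, 120), (4, 99)]

for cust, order in hash_join(customers, orders,
                             build_key=lambda c: c[0],
                             probe_key=lambda o: o[0]):
    print(cust[1], order[1])
```

Building the hash table on the smaller relation keeps the memory footprint low, which is why such algorithms benefit most when that relation fits, or nearly fits, in main memory.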
Published in ACM Transactions on Database Systems, this study aligns directly with the journal's focus on database management and optimization. By investigating algorithms for join processing in large-memory systems, the paper addresses a fundamental challenge in database performance and contributes to the ongoing development of efficient data processing techniques, a key area of interest for the journal's audience.