Does a join in a relational database always return the result one expects? This paper develops the theory of joins in relational databases, where the join is a fundamental operation for answering queries that combine data from multiple relations. It addresses the problem that the result of a join may not be what one intuitively expects. Efficient algorithms are presented to determine whether the join of several relations has the intuitively expected value (is lossless) and whether a set of relations has a subset with a lossy join. These algorithms assume that all data dependencies are functional. The techniques are then extended to the case where data dependencies are multivalued. By characterizing the conditions under which a join is lossless, the work makes it possible to check that relations can be joined without introducing spurious tuples. This has direct implications for database design and query optimization, since it gives designers a way to verify that joins produce correct and meaningful results.
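To make the lossless-join question concrete, the following is a minimal sketch of a tableau-style (chase) test for the case where all dependencies are functional. It is a textbook-style illustration under stated assumptions, not necessarily the efficient algorithm the paper presents: the function name `is_lossless`, the symbol encoding, and the sample schemes are all hypothetical, and the fixed-point loop is written for clarity rather than speed.

```python
from itertools import combinations


def is_lossless(attributes, schemes, fds):
    """Tableau (chase) test for a lossless join under functional dependencies.

    attributes: list of attribute names of the universal relation
    schemes:    list of sets of attributes, one per relation being joined
    fds:        list of (lhs, rhs) pairs, each a set of attributes
    All three names and this encoding are assumptions made for illustration.
    """
    # One row per scheme. Column A holds the distinguished symbol ("a", A)
    # if the scheme contains A, otherwise a row-specific symbol ("b", i, A).
    rows = [
        {A: ("a", A) if A in scheme else ("b", i, A) for A in attributes}
        for i, scheme in enumerate(schemes)
    ]

    changed = True
    while changed:  # terminates: every change merges two distinct symbols
        changed = False
        for lhs, rhs in fds:
            for r1, r2 in combinations(rows, 2):
                if all(r1[A] == r2[A] for A in lhs):
                    for A in rhs:
                        if r1[A] != r2[A]:
                            # Tuple ordering puts ("a", A) before any
                            # ("b", i, A), so the distinguished symbol wins.
                            keep, drop = sorted((r1[A], r2[A]))
                            for row in rows:
                                if row[A] == drop:
                                    row[A] = keep
                            changed = True

    # The join is lossless iff some row consists only of distinguished symbols.
    return any(all(row[A] == ("a", A) for A in attributes) for row in rows)


if __name__ == "__main__":
    attrs = ["A", "B", "C"]
    parts = [{"A", "B"}, {"B", "C"}]
    print(is_lossless(attrs, parts, [({"B"}, {"C"})]))
    print(is_lossless(attrs, parts, [({"A"}, {"B"})]))
```

In this sketch, joining the schemes {A, B} and {B, C} is reported lossless when B determines C, and lossy when the only dependency is that A determines B, because in the second case no dependency ever equates symbols across the two rows.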
Published in ACM Transactions on Database Systems, this article fits the journal's focus on database theory and systems. Its treatment of lossless joins supports the journal’s emphasis on accuracy and reliability in database operations, and it gives database professionals a precise criterion for deciding when the result of a join can be trusted.