+ Reply to Thread
Page 1 of 3 123 LastLast
Results 1 to 10 of 22

Thread: Using Set Theory to Understand SQL Joins

  1. #1
    Join Date
    Jul 2006
    Posts
    16,491
    Blog Entries
    75
    Rep Power
    143

    Using Set Theory to Understand SQL Joins

    One of the areas of SQL that causes some people issues is understanding how joins work in queries. To help make this clear, I'm going to start with the underlying concept of all of SQL: set theory.

    A set is simply a collection of 0 or more objects. A set with 0 objects is called the empty set, and is represented as {}. The set of integers would be {..., -3, -2, -1, 0, 1, 2, 3, ...}. The set of furniture in my office is {desk, chair, bookshelf, file cabinet}. The objects listed in the set are called elements of the set (so bookshelf is an element of the set of furniture in my office).

    In SQL, a table corresponds with a set, and a record corresponds with an element. So if table1 has three records, we can think of table1 as a set and each record as one of the three elements in it.

    Sets have a lot of different properties and relations. Two important ones that relate to SQL are subset and cross product. A subset is a (possibly) smaller set whose elements can only come from the elements of the original. If A = {1, 2, 3, 4}, then the following are all subsets of A: {}, {1}, {2, 3}, {1, 2, 3, 4}. The empty set is a subset of ALL sets. Every set is a subset of itself.

    The cross product of two sets is a new set that contains ordered pairs made of the elements of the originals. The ordered pairs' elements have to be in the same order as the order of the sets. For example: if A = {1, 2, 3} and B = {4, 5, 6}, then AxB = {(1,4), (1,5), (1,6), (2,4), (2,5), (2,6), (3,4), (3,5), (3,6)}. Note that while a set doesn't care about order ( {1,2,3} = {1,3,2} ), an ordered pair DOES, so (1,4) <> (4,1).

    So what does this have to do with SQL? EVERYTHING! Let's look at two sample tables:
    Code:
    table1
    t1_id, t1_subid, t1_data
    ------------------------
    1      56        first
    2      23        second
    3      72        third
    
    table2
    t2_id, t2_subid, t2_data
    ------------------------
    1      23        fourth
    2      56        second
    3      72        fifth
    Sometimes we want data from both tables. In that case we have two options, 1: do a bunch of queries to get the necessary data, 2: use a join. In the raw, a join simply acts like a cross product on the records.

    SELECT * FROM TABLE1 JOIN TABLE2 would return the following records:
    Code:
    table1 join table2
    t1_id, t1_subid, t1_data, t2_id, t2_subid, t2_data
    --------------------------------------------------
    1      56        first    1      23        fourth
    1      56        first    2      56        second
    1      56        first    3      72        fifth
    2      23        second   1      23        fourth
    2      23        second   2      56        second
    2      23        second   3      72        fifth
    3      72        third    1      23        fourth
    3      72        third    2      56        second
    3      72        third    3      72        fifth
    Notice this is the same as doing a cross product. The problem is that this may be returning more than you wanted. There are two ways to restrict the results: with an on clause and a where clause. Let's say we wanted to only return the results where table1.t1_id = table2.t2_id. There are two ways we could do this:
    SELECT * FROM TABLE1 JOIN TABLE2 ON TABLE1.T1_ID = TABLE2.T2_ID
    or
    SELECT * FROM TABLE1 JOIN TABLE2 WHERE TABLE1.T1_ID = TABLE2.T2_ID
    In either case, the result would be:
    Code:
    table1 join table2
    t1_id, t1_subid, t1_data, t2_id, t2_subid, t2_data
    --------------------------------------------------
    1      56        first    1      23        fourth
    2      23        second   2      56        second
    3      72        third    3      72        fifth
    If we wanted t1_data to match t2_data, however, things get interesting. A where clause will simply look at ALL the results and select those that match. So
    SELECT * FROM TABLE1 JOIN TABLE2 WHERE TABLE1.T1_DATA = TABLE2.T2_DATA
    will return:
    Code:
    table1 join table2
    t1_id, t1_subid, t1_data, t2_id, t2_subid, t2_data
    --------------------------------------------------
    2      23        second   2      56        second
    A join, however, has four options. A plain join is sometimes called an "inner join". There is also a "left outer join", a "right outer join", and a "full outer join". The issue is that you may need some of the table from one of the tables, even if there isn't a match! For example, if table1 represents customers and table2 represents products on order, you may want a list of all your customers, regardless of whether there's a product on order or not. Here is what each join means, along with an example result set:
    inner join: returns all the matches
    left outer join: returns all the matches, left-hand records that don't have a match are matched with NULL
    right outer join: returns all the matches, right-hand records that don't have a match are matched with NULL
    full outer join: returns all the matches, any record from either side that doesn't have a match is matched with NULL
    SELECT * FROM TABLE1 INNER JOIN TABLE2 ON TABLE1.T1_DATA = TABLE2.T2_DATA
    Code:
    table1 join table2
    t1_id, t1_subid, t1_data, t2_id, t2_subid, t2_data
    --------------------------------------------------
    2      23        second   2      56        second
    SELECT * FROM TABLE1 LEFT OUTER JOIN TABLE2 ON TABLE1.T1_DATA = TABLE2.T2_DATA
    Code:
    table1 join table2
    t1_id, t1_subid, t1_data, t2_id, t2_subid, t2_data
    --------------------------------------------------
    1      56        first    NULL   NULL      NULL
    2      23        second   2      56        second
    3      72        third    NULL   NULL      NULL
    SELECT * FROM TABLE1 RIGHT OUTER JOIN TABLE2 ON TABLE1.T1_DATA = TABLE2.T2_DATA
    Code:
    table1 join table2
    t1_id, t1_subid, t1_data, t2_id, t2_subid, t2_data
    --------------------------------------------------
    NULL   NULL      NULL     1      23        fourth
    2      23        second   2      56        second
    NULL   NULL      NULL     3      72        fifth
    SELECT * FROM TABLE1 FULL OUTER JOIN TABLE2 ON TABLE1.T1_DATA = TABLE2.T2_DATA
    Code:
    table1 join table2
    t1_id, t1_subid, t1_data, t2_id, t2_subid, t2_data
    --------------------------------------------------
    1      56        first    NULL   NULL      NULL
    NULL   NULL      NULL     1      23        fourth
    2      23        second   2      56        second
    3      72        third    NULL   NULL      NULL
    NULL   NULL      NULL     3      72        fifth
    Finally, you can combine the results of a join clause and a where clause to do some interesting things. Let's say that every record in TABLE1 and TABLE2 is supposed to have at least one match, but our database got out of wack when a script crashed. To find the records that do NOT have a match, we can do this:
    SELECT * FROM TABLE1 FULL OUTER JOIN TABLE2 ON TABLE1.T1_DATA = TABLE2.T2_DATA WHERE TABLE1.T1.ID IS NULL OR TABLE2.T2_ID IS NULL
    This will create the matches above, and then look for where a match didn't really happen:
    Code:
    table1 join table2
    t1_id, t1_subid, t1_data, t2_id, t2_subid, t2_data
    --------------------------------------------------
    1      56        first    NULL   NULL      NULL
    NULL   NULL      NULL     1      23        fourth
    3      72        third    NULL   NULL      NULL
    NULL   NULL      NULL     3      72        fifth
    Programming is a branch of mathematics.
    My CodeCall Blog | My Personal Blog

  2. CODECALL Circuit advertisement
    Join Date
    Always
    Posts
    Many

     
  3. #2
    Jordan Guest

    Re: Using Set Theory to Understand SQL Joins

    Wow, impressive and very comprehensive. I know a lot of people have trouble with joins when learning SQL. This should help them a lot!

  4. #3
    Join Date
    Sep 2008
    Location
    Australia
    Posts
    4,834
    Blog Entries
    10
    Rep Power
    51

    Re: Using Set Theory to Understand SQL Joins

    Seems to be quite long, I will read it when I can. But from Jordan's comment and quick look over the tutorial. It looks really good.
    jQuery Selectors Tutorial - jQuery Striped Table tutorial - jQuery Events - jQuery Validation
    Sorry if I don't post as often as I did, I'll try to get here as much as possible! I'm working my bum off to get this scholarship and other stuff!

  5. #4
    Join Date
    Jul 2006
    Posts
    16,491
    Blog Entries
    75
    Rep Power
    143

    Re: Using Set Theory to Understand SQL Joins

    I could probably add some Venn Diagrams to further explain the join process... but I forgot to when I was typing it up.
    Programming is a branch of mathematics.
    My CodeCall Blog | My Personal Blog

  6. #5
    Join Date
    Sep 2008
    Location
    Australia
    Posts
    4,834
    Blog Entries
    10
    Rep Power
    51

    Re: Using Set Theory to Understand SQL Joins

    Why not create a part 2?
    jQuery Selectors Tutorial - jQuery Striped Table tutorial - jQuery Events - jQuery Validation
    Sorry if I don't post as often as I did, I'll try to get here as much as possible! I'm working my bum off to get this scholarship and other stuff!

  7. #6
    Join Date
    Jul 2006
    Posts
    16,491
    Blog Entries
    75
    Rep Power
    143

    Re: Using Set Theory to Understand SQL Joins

    To be honest, I think what I've got here is probably clearer. Unfortunately, ON clauses and WHERE clauses, in particular, can be confusing when dealing with joins.
    Programming is a branch of mathematics.
    My CodeCall Blog | My Personal Blog

  8. #7
    Join Date
    Sep 2008
    Location
    Australia
    Posts
    4,834
    Blog Entries
    10
    Rep Power
    51

    Re: Using Set Theory to Understand SQL Joins

    I just finished school so I am not doing nothing, so I read your tutorial.

    Great tutorial mate, no matter where I went and tried to learn SQL Joins it just made no sense. This tutorial on the other hand, was the most clearest joins tutorial ever! All I can say is well done. Even though it doesn't matter, I will +rep you
    jQuery Selectors Tutorial - jQuery Striped Table tutorial - jQuery Events - jQuery Validation
    Sorry if I don't post as often as I did, I'll try to get here as much as possible! I'm working my bum off to get this scholarship and other stuff!

  9. #8
    Join Date
    Jul 2006
    Posts
    16,491
    Blog Entries
    75
    Rep Power
    143

    Re: Using Set Theory to Understand SQL Joins

    I wrote it because I've seen a lot of intelligent people get scrambled by joins.
    Programming is a branch of mathematics.
    My CodeCall Blog | My Personal Blog

  10. #9
    Join Date
    Mar 2008
    Location
    The North Pole
    Posts
    13,174
    Blog Entries
    13
    Rep Power
    114

    Re: Using Set Theory to Understand SQL Joins

    You have probably seen a lot of stupid people get scrambled by joins too.

    Quote Originally Posted by Jordan View Post
    Good members, like yourself, stick around and post for ages to come!
    Mr. Xav | Blog | Forums

  11. #10
    Join Date
    Sep 2008
    Location
    Australia
    Posts
    4,834
    Blog Entries
    10
    Rep Power
    51

    Re: Using Set Theory to Understand SQL Joins

    Of course, joins are a breeze now.
    jQuery Selectors Tutorial - jQuery Striped Table tutorial - jQuery Events - jQuery Validation
    Sorry if I don't post as often as I did, I'll try to get here as much as possible! I'm working my bum off to get this scholarship and other stuff!

+ Reply to Thread
Page 1 of 3 123 LastLast

Thread Information

Users Browsing this Thread

There are currently 1 users browsing this thread. (0 members and 1 guests)

Similar Threads

  1. A new moderator joins the team
    By John in forum Announcements
    Replies: 16
    Last Post: 10-25-2010, 09:28 PM
  2. SQL joins
    By chili5 in forum Database & Database Programming
    Replies: 4
    Last Post: 05-18-2009, 07:55 AM
  3. Joins vs. Separate Queries
    By BASHERS33 in forum Database & Database Programming
    Replies: 18
    Last Post: 05-07-2009, 08:32 AM
  4. Joins
    By NeedHelp in forum Database & Database Programming
    Replies: 2
    Last Post: 08-02-2006, 03:35 PM

Tags for this Thread

Bookmarks

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts