Welcome to the Treehouse Community

Want to collaborate on code errors? Have bugs you need feedback on? Looking for an extra set of eyes on your latest project? Get support with fellow developers, designers, and programmers of all backgrounds and skill levels here with the Treehouse Community! While you're at it, check out some resources Treehouse students have shared here.

Looking to learn something new?

Treehouse offers a seven day free trial for new students. Get access to thousands of hours of content and join thousands of Treehouse students and alumni in the community today.

Start your free trial

PHP

Stephen Printup
seal-mask
.a{fill-rule:evenodd;}techdegree
Stephen Printup
UX Design Techdegree Student 45,252 Points

What argument goes between JOIN and ON?

Hello- I passed the challenge, but I am still curious about the arguments name between JOIN and ON? Such as

JOIN <?> ON <joiningTable.relationshipID> = <table.relationshipID>

I don't see a pattern in Media_Genres and Genres, so it would make it difficult to recreate my own query.

Thanks.

The original question was:

We will be writing ONLY the SQL query for this challenge.

The library database contains a Media table with the columns media_id, title, img, format, year and category. It also contains a Genres table with the columns genre_id and genre. To join these tables, there is a Media_Genres table that contains the column media_id and genre_id

Add to the following SELECT statement to JOIN the Media table and the Genres table using the joining table Media_Genres.

SELECT * FROM Media WHERE media_id=3;

NOTE: You will need to add the table to the WHERE clause so that the media_id column is not ambiguous.

My answer was:

SELECT * 
FROM Media
JOIN Media_Genres ON Media_Genres.media_id = Media.media_id
JOIN Genres ON Media_Genres.genre_id = Genres.genre_id
WHERE Media_Genres.media_id=3;

5 Answers

K Cleveland
K Cleveland
21,839 Points

Ah, I understand your question. To answer it, the <?> is a table. The reason why you use JOIN is to get rows back that shows columns from multiple tables. In this case, you want to see columns from two tables -- Media and Genres. But there is nothing that connects these two tables.

Enter the joining table: Media_Genres. This table has a media_id which corresponds to the media_id in the Media table. Media_Genres also has a genre_id column that corresponds to the genre_id column in the Genres table.

SELECT * 
FROM Media
JOIN Media_Genres ON Media_Genres.media_id = Media.media_id
WHERE Media_Genres.media_id=3;

If you run the query like this, you'll get back rows from the Media table where media_id = 3. In those rows, you'll not only see all the columns from the Media table, but you'll also see all the columns from The Media_Genres table including the genre_id column.

You will not see any columns or information from the Genres table.

SELECT * 
FROM Media
JOIN Media_Genres ON Media_Genres.media_id = Media.media_id
JOIN Genres ON Media_Genres.genre_id = Genres.genre_id --now adding in this line to join the Genres table
WHERE Media_Genres.media_id=3;

When you run this query, you'll get back rows that have columns from the Media, Media_Genres, and Genres table. This is because when media_id = 3 in the Media table, Media_Genres has a matching media_id. It also has a column genre_id. This genre_id column is also in the Genres table.

Think about this query:

SELECT Genres.genre_id FROM Media
JOIN Media_Genres ON Media.media_id = Media_Genres.media_id 
JOIN Genres ON Media_Genres.genre_id = Genres.genre_id --added this line
WHERE Media_Genres.media_id=3;

What do you think you'll see? Will you see information from the Media or Media_Genres table? Or will you only see information from the Genres table?

I hope this helped and apologies in advance for any typos, haha. Good luck! :)

Stephen Printup
seal-mask
.a{fill-rule:evenodd;}techdegree
Stephen Printup
UX Design Techdegree Student 45,252 Points

Ohhh, the Genres table is inside the Media table. We get the Media table out of the box in the FROM Media statement.

So it's like a pointer to a table where columns are that you want to compare. You need all the columns that you want to compare, from tables without pointers (Media_Genres, Genres), so that you can get back rows with matching relationshipIDs.

Since the Genres table only has a genres_id and not a media_id, we have to put the Media.media_id JOIN above the Genres.genre_id JOIN. This is because the initial JOIN (Media_Genres) returns the rows with the initial matching relationshipIDs (Media_Genres.media_id = Media.media_id), with the columns from the Media table and rows where media_id = 3, which include the .genre_id. Returning the results of the Genre table at the top would not include a .media_id.

If we put the JOIN Genres statement first such as

JOIN Genres ON Media_Genres.genre_id = Genres.genre_id

we get back an SQL Error: no such column: Media_Genres.media_id

However, if we put the JOIN Genres first such as

JOIN Genres, Media_Genres ON Media_Genres.genre_id = Genres.genre_id

we get back results, but no error.

We get the error because we don't have the pointer to the Media_Genres table yet. Having the JOIN Media_Genres statement first, we do have access to the Media table, because of the pointer in the FROM statement.

In summary, each table needs a pointer and the sequence of them depends on the columns they return. Also, in this case:

  1. The second JOIN depends on the first JOINs .genre_id column. The pointer toward Genres gives access to the Genres table .genre_id to compare relationshipIDs from the returned rows of the first JOIN.
  2. The Media table pointer is in FROM statement, so it doesn't need a pointer in the first JOIN.

Interesting Code that does not work, but theoretically could. This is why we need 2 JOIN statements to begin with:

SELECT * FROM Media
JOIN Media_Genres, Genres ON Media_Genres.media_id = Media.media_id, Media_Genres.genre_id = Genres.genre_id
WHERE Media_Genres.media_id=3;
K Cleveland
K Cleveland
21,839 Points

I'm not sure if I understand your question correctly. Would you mind linking to the actual challenge? From what I can tell from what you've written, the <?> represents the table you'd like to join on.

--SELECT * FROM Table1 
SELECT * FROM Media
--JOIN Table2 
JOIN Media_Genres
--ON Table1.ColumnName = Table2.ColumnName
ON Media.media_id = Media_Genres.media_id 
--JOIN Table3 
JOIN Genres 
--ON Table2.ColumnName = Table3.ColumnName
ON Media_Genres.genre_id = Genres.genre_id
Stephen Printup
seal-mask
.a{fill-rule:evenodd;}techdegree
Stephen Printup
UX Design Techdegree Student 45,252 Points

Thanks, yes it is a table, but the Media_Genres table is a joining table and the Genres table is just a table, so I am confused why they are both used in the query in the same place. The way I'm interpreting it, it sound something like 'compare the things to the right of ON to the rules on the left.' But that is not the case with Genres, because it is not a joining table or something. Obviously, what I just said was wrong, in which lies the question: How do you describe what you put where I placed the question mark?

Thanks again.

K Cleveland
K Cleveland
21,839 Points

Ohhh, the Genres table is inside the Media table. We get the Media table out of the box in the FROM Media statement.

The Genres table is a table. The Media table is also a table,. These two tables are independent of each other. One is not inside the other. The Media_Genres table is a special type of table, called the joining table. But, it's still just a table.

So it's like a pointer to a table where columns are that you want to compare.*

Sure, I think you can consider them like pointers, but I think it's much more useful to think of them in terms of the relationships between tables.

Since the Genres table only has a genres_id and not a media_id, we have to put the Media.media_id JOIN above the Genres.genre_id JOIN.*

Yes. You're linking the Media and Genres table via the Media_Genres joining table.

In summary, each table needs a pointer and the sequence of them depends on the columns they return.*

In a general sense, yes. But also, look at this:

SELECT * FROM Genres
JOIN Media_Genres 
ON Genres.genre_id = Media_Genres.genre_id
JOIN Media 
ON Media_Genres.media_id = Media.media_id
WHERE media_id  = 3; 

The only difference between the original query and this one is that you'll see the results from the Genres table first, then Media_Genres, then Media. But I could also write the query like this:

SELECT Media.* FROM Genres
JOIN Media_Genres 
ON Genres.genre_id = Media_Genres.genre_id
JOIN Media 
ON Media_Genres.media_id = Media.media_id
WHERE media_id  = 3; 

And I would only see the results from the Media table. That's why I like thinking of this more in terms of relationships between tables, and not just pointers. Thanks and again excuse the typos. :)

Stephen Printup
seal-mask
.a{fill-rule:evenodd;}techdegree
Stephen Printup
UX Design Techdegree Student 45,252 Points

Thanks for you help. It has been enlightening.

I want to point out, so I don't miss anything, that the following code produces the error: SQL Error: ambiguous column name: media_id

SELECT * FROM Genres --pointer to Genres
JOIN Media_Genres --pointer to Media_Genres
ON Genres.genre_id = Media_Genres.genre_id --no reference to media_id, so error can't be here
JOIN Media --pointer to media
ON Media_Genres.media_id = Media.media_id --ahh here is the error. The ref to Media_Genres was in the previous JOIN and thus does not apply in this join. 
WHERE media_id  = 3; 

The video had this similar schema:

I now see that Genres is not inside Media, but the relationship exists in a joining table not pictured name Media_Genres

Also, in the following code produces the same error, but for different reasons: SQL Error: ambiguous column name: media_id

SELECT Media.* FROM Genres --pointer to Genres
JOIN Media_Genres --pointer to Media_Genres
ON Genres.genre_id = Media_Genres.genre_id --no media_id
JOIN Media --pointer to Media, BUT since ^this^ JOIN doesn't have a media_id column, it doesn't return the media_id column from <this< Media pointer
ON Media_Genres.media_id = Media.media_id -- this is where the error is
WHERE media_id  = 3; 

Was there something that I'm missing from when you said(?):

That's why I like thinking of this more in terms of relationships between tables, and not just pointers.

K Cleveland
K Cleveland
21,839 Points
SELECT Media.* FROM Genres --pointer to Genres
JOIN Media_Genres --pointer to Media_Genres
ON Genres.genre_id = Media_Genres.genre_id --no media_id
JOIN Media --pointer to Media, BUT since ^this^ JOIN doesn't have a media_id column, it doesn't return the media_id column from <this< Media pointer
ON Media_Genres.media_id = Media.media_id -- this is where the error is
WHERE Media.media_id  = 3; --changed media_id to Media.media_id. 

In both spots, it's ambiguous because I didn't specify which table I'm using when I query for WHERE media_id = 3. That's what copying and pasting will get you, haha.

That's why I like thinking of this more in terms of relationships between tables, and not just pointers.

No, not really. What I'm trying to point out is that while you can think of JOINs as "pointing" to a table, you're also relating the tables together based on column and then creating some set from that. I know the video doesn't talk about it, but JOIN(INNER JOIN) is one of many types of JOINs:

SELECT * 
FROM Media
JOIN Media_Genres 
ON Media_Genres.media_id = Media.media_id

SELECT * 
FROM Media
LEFT JOIN Media_Genres 
ON Media_Genres.media_id = Media.media_id

The first query will give you back rows only if that row has a media_id column that matches the media_id column in Media_Genres. The second query will give you back everything from the Media table, including WHERE Media_Genre.media_id IS NULL (ie there is no matching media_id value in the Media_Genres table).

Run those two queries and see what happens. :)

From that diagram, actually, you can see that the Media table and the Genres table actually do have a relationship. Media has a column called genre_id. This is called a foreign key. The Genres table has a column called genre_id. That's the primary key and it uniquely identifies that record. Since the genre_id foreign key is on the Media table, we know that these two tables are related, and can be joined on genre_id.

Hope I've explained myself well. Most of the SQL I know I learned directly from TreeHouse, if you're interested. And again, apologies for all the typos/errors if you see some.