Sometimes SQL code gets really complicated! We can’t always do everything we want in one neat statement. Before we delve into deleting data, we’re going to take a break and look at one of the trickier SQL concepts that we’ll use on a regular basis.

What Is a Subquery?

A subquery is essentially an imaginary table. We can create a SQL query that logically groups data and then use it as part of another query as if it were a table. Subqueries can get kind of difficult to read, but they are completely essential to certain processes. For instance, our next obvious goal is learning how to delete data from SQL. Without knowing how to use subqueries, deleting data can get very confusing. Let’s take a look at some of the ways we use subqueries and look at some SQL examples.

Subqueries In Place of a Table

Subqueries

Here’s our data model for the Class database we’ve been using. Let’s say we need to look at what teachers have classes with what students. Figuring this out involves every table in our data model. We could simply start joining them all together, but we might make a mistake creating so many joins. Instead, let’s break them up into two separate subqueries. We’ll start with making a query that gives us the Class Name, Class ID and teacher’s first and last name.

SELECT ClassID, ClassName, Teachers.FirstName, Teachers.LastName
FROM Classes
INNER JOIN ClassInstance
ON Classes.ClassID = ClassInstance.ClassID
INNER JOIN Teachers
ON ClassInstance.TeacherID = Teachers.TeacherID

This query will show us what teachers teach what classes. We also have the ClassID, which will be handy for comparing with the list of what students are in each class. We’ll start by most basic subquery:

SELECT * 
FROM ( SELECT ClassID, ClassInstanceID, ClassName, Teachers.FirstName, Teachers.LastName
FROM Classes
INNER JOIN ClassInstance
ON Classes.ClassID = ClassInstance.ClassID
INNER JOIN Teachers
ON ClassInstance.TeacherID = Teachers.TeacherID
) TeachersClasses
 

This query returns the exact same thing as the one above. As you can see, instead of supplying a table in our FROM clause, we put an entire query in its place! We also gave our query a name TeacherClasses. We can now use this subquery the same way we’d use a real table. You might be asking why anyone would want to do this? We’re about to see a great reason in the next section.

Subqueries With Joins

The subquery comes in very handy when you want to combine two queries as if they were tables. In our example above, we now have half the equation. We have a query that shows us what Teachers teach what Classes. Now, we need to figure out how to join students up to the right teachers.

Let’s begin with a separate query. Once we know we have two working subqueries, we can combine them and move forward:

SELECT StudentClasses.ClassInstanceID, Students.FirstName, Students.LastName
FROM StudentClasses
INNER JOIN Students
ON StudentClasses.StudentID = Students.StudentID

This query gives us the ClassInstanceID and the name of the student. We don’t really need to join to Class since we already have the class name in our other query. Since we’re now sure our two queries are returning the right data, we can join them together.

SELECT TeachersClasses.ClassName, TeachersClasses.FirstName, TeachersClasses.Lastname, StudentClasses.FirstName, StudentClasses.LastName
FROM ( SELECT ClassID, ClassInstanceID, ClassName, Teachers.FirstName, Teachers.LastName
FROM Classes
INNER JOIN ClassInstance
ON Classes.ClassID = ClassInstance.ClassID
INNER JOIN Teachers
ON ClassInstance.TeacherID = Teachers.TeacherID
) TeachersClasses
INNER JOIN ( SELECT StudentClasses.ClassInstanceID, Students.FirstName, Students.LastName
FROM StudentClasses
INNER JOIN Students
ON StudentClasses.StudentID = Students.StudentID) StudentClasses
ON TeachersClasses.ClassInstanceID = StudentClasses.ClassInsanceID
ORDER BY TeacherClasses.ClassName, TeachersClasses.LastName, StudentClasses.LastName

It might look like a lot, but really the bold parts are the only thing we’ve added. Let’s analyze it piece by piece.

First off, the SELECT list has aliases that treat our subqueries as if they were tables. We can no longer call the ClassName Class.ClassName. This is because SQL treats everything in the parenthesis as a new table called TeachersClasses. Since some of the columns have the same name, we need to add the subquery name to the front of it so SQL knows if we mean student’s first name or teacher’s first name.

Next, notice how we have an INNER JOIN that is sort of wedged between the two tables. If these were actual tables, the statement would look like this:

SELECT TeachersClasses.ClassName, TeachersClasses.FirstName, TeachersClasses.Lastname, StudentClasses.FirstName, StudentClasses.LastName
FROM TeachersClasses
INNER JOIN StudentClasses
ON TeachersCalsses.ClassInstanceID = StudentClasses.ClassInstanceID

That’s all! Later, we’ll learn how to make permanent subset aliases that we can use just like this. But for now, we’ll keep using the subqueries. As you can see, subqueries are both powerful and complex. We’re going to learn one more little trick with subqueries. Don’t worry! We are through the toughest part of it.

WHERE IN Clause

A subquery can also be used in a WHERE clause. This might sound odd, but we’re going to look at a special way you can use a list of values in the WHERE clause. Let’s take a look at the long way of doing this first. It’ll help us see why it is better to use this handy shortcut.

SELECT * 
FROM Classes
WHERE ClassID = 1
OR ClassID = 2
OR ClassID = 3
OR ClassID = 10
OR ClassID = 11

This statement is completely valid. You are basically supplying a list of ClassIDs that you’d like to look at. However, it is a bit confusing and hard to read. The WHERE IN clause lets us solve this problem in a much easier way.

SELECT *
FROM Classes
WHERE ClassID IN (1,2,3,10,11)

This returns the exact same result without having that ugly list of OR clauses. This is great for running a quick query to see a list of information. However, what happens if we want to use a query to determine what we’d like to see? The WHERE IN clause will also accept a SQL subquery as long as it only has one field in the SELECT statement.

SELECT *
FROM Classes
WHERE ClassID IN ( SELECT ClassID FROM Classes
WHERE ClassID < 4 and ClassID > 9)

Now we have a dynamic way to return data based on a query. We can even use more complex statements as long as they only return a single column. For example, let’s grab our StudentClasses subquery from above. We can use it to look for students who aren’t in any classes.

SELECT *
FROM Students
WHERE StudentID NOT IN (SELECT StudentID
FROM StudentClasses
INNER JOIN Students
ON StudentClasses.StudentID = Students.StudentID)

By supplying the NOT keyword, we’ve made this query work in reverse. Now it tells us what StudentIDs are not in the returned list of StudentIDs.

The Power Of Subqueries

Here are a few quick things to remember about subqueries:

  • Make sure each subquery works before joining them together.
  • Use the alias name when referencing fields from a subquery.
  • Subqueries used with WHERE IN or WHERE NOT IN must only have one column returned.

Now that we’ve learned about subqueries, we can move on to deleting data. Congratulations! You’ve mastered one of the more challenging SQL topics today!



author's photo

Author : SQL Tutor

Tainyan staff who teach SQL.

COMMENTS
blog comments powered by Disqus
Contributors
Categories