Education on Joins in SQL Allows for Easier and More Efficient Combining of Data
SQL or Structured Query Language is a “domain-specific language used in programming and designed for managing data held in a relational database management system (RDBMS), or for stream processing in a relational data stream management system (RDSMS)” (definition from https://en.wikipedia.org/wiki/SQL). SQL is a language utilized today in data science to uncover relationships among database entities. It is a helpful device for the effective accessing of many database records with a single command. SQL is a tool any data scientist should have in their toolbox to query databases in order to locate specific items of interest within a database, and more specifically, to join unique database tables quickly and efficiently to more easily manipulate and understand data. Today I am going to provide some challenges I completed on a Khan Academy website involving Structure Query Language (SQL). The purpose of these exercises is to expose people to join commands in SQL and enhance knowledge of joining various tables using Structured Query Language. I plan to provide two SQL challenges I completed on Khan Academy online with problems/requests and solutions to allow people to view and better comprehend joining data with Structured Query Language.
Note: The problems/requests below are not of my own creation, but came from the Khan Academy “Relational Queries in SQL” course. The solutions, however, are inspired by me from the problems/requests in the SQL challenges. For each problem/request and solution I will write out explanations on how I attained my SQL query results to enhance SQL coding comprehension. The problems/requests and solutions are shown below divided up by SQL challenge.
Challenge 1: Bobby’s Hobbies
Bobby’s Hobbies Step 1:
We’ve created a database of people and hobbies, and rows in the
hobbies table are related to rows in the
persons table via the
person_id column. Select the 2 tables with a join so that you can see each person’s name next to his/her hobby.
Bobby’s Hobbies Solution 1:
Bobby’s Hobbies Solution 1 Query Results:
To start to query a person’s name next to their hobby for my results, I wrote a “SELECT persons.name, hobbies.name FROM persons” statement to begin searching for the answers to my query utilizing the persons table. Next, I wrote a “JOIN hobbies” statement to join the query to include the hobbies table as well. Both tables have a similar id column, so I wrote an “ON persons.id = hobbies.persons_id;” (the persons.id column is invisible on my code displays, but goes in numeric ascending order e.g. 1,2,3, etc. for each row of the persons table) to combine the persons table and hobbies table on a similar column. Then, I achieved my resulting query. Notice that Fluffy Sparkles was not included in the resulting query because Fluffy Sparkles did not have a hobby in the hobbies table. The “JOIN” statement I used is also known as an inner join statement and only includes rows in both tables with matching id values. There was no matching id value for Fluffy Sparkles in the hobbies table (Fluffy Sparkles would to have had a person_id of 5 in the hobbies table to be in the queried results) and Fluffy Sparkles was left out of the result query.
Bobby’s Hobbies Step 2:
Add an additional query that shows only the name and hobbies of ‘Bobby McBobbyFace’, using
JOIN combined with
Bobby’s Hobbies Solution 2:
Bobby’s Hobbies Solution 2 Query Results:
Once again I used the same “SELECT persons.name, hobbies.name FROM persons” and “JOIN hobbies” statements to query the combination of both the persons table and the hobbies table with similar items for results. The “WHERE hobbies.person_id = 1 AND persons. name = “Bobby McBobbyFace;” statement selects just the hobbies that fall under the hobbies.person_id = 1 or persons.id = 1 category if the name of the corresponding person with that hobby is Bobby McBobbyFace from the persons.name column of the persons table. The query results display Bobby McBobbyFace with his hobbies of coding and drawing.
Challenge 2: Customer’s Orders
Customer’s Orders Step 1:
We’ve created a database for customers and their orders. Not all of the customers have made orders, however. Come up with a query that lists the
price of orders they've made. Use a
LEFT OUTER JOIN so that a customer is listed even if they've made no orders, and don't add any
Customer’s Orders Solution 1:
Customer’s Orders Solution 1 Query Results:
For the solution to the first part of the Customer’s Orders problem, I started by writing a “SELECT customers.name, customers.email, orders.item, orders.price FROM customers” statement to initialize searching for the selected column entries in the customers table. Then I joined the customers table to include items from the orders table involving a “LEFT OUTER JOIN orders” statement. Both tables have a similar customers id column so I wrote an “ON customers.id = orders.customer_id;” (the customers.id column is invisible on my code displays, but goes in numeric ascending order e.g. 1,2,3, etc. for each row of the customers table) to combine the customers table and orders table with the similar column. In this solution, a left outer join was included instead of a regular inner join. The left outer join is different from an inner join in that the left outer join includes all rows from the left table (the customers table) and joins them with the right table (the orders table) rows if there is a corresponding common row as well as including all the items in the left table in the queried results that don’t correspond to items in the right table (labeled with null values for the right table columns). I did a left outer join so all customers would be included in the queried results even if they did not make an order. Captain Awesome was included with null values for the orders.item and orders.price columns since he has yet to make an order.
Customer’s Orders Step 2:
Create another query that will result in one row per each customer, with their
Customer’s Orders Solution 2:
Customer’s Orders Solution 2 Query Results:
In order to get the total money spent by a customer on orders, I began by using a SUM() aggregate function on the orders.price column to sum the individual price values for orders for a customer. Then, I created a “SELECT customers.name, customers.email, SUM(orders.price) AS total_money_spent FROM customers) statement to establish finding the selected columns in the customers table. I scripted a “LEFT OUTER JOIN orders” statement to join the customers table to the orders table to include query results utilizing both tables. The join was done with the common id field in both tables with an “ON customers.id = orders.customers_id” statement. The left outer join statement for this problem allows for the inclusion of columns in the left table (the customers table) in the results even if there is not a matching value in the orders table. The next “GROUP BY orders.customer_id” statement groups the customers with the orders they made. The “ORDER BY total_money_spent DESC” orders the customers by the amount of money they spent on orders from the highest total money value spent on orders descending to the lowest money value spent on orders.
SQL Challenges Solved and Explained
Structured Query Language or SQL is an instrument in data science that enables a data scientist to simply and rapidly access and assess relationships among complex database entities. SQL joins, specifically, support the combinations of unique data tables in databases in a speedy manner to illicit interesting and insightful information that would otherwise be too complicated, disjointed, or disorganized to evaluate. Today I provided two SQL challenges I completed from Khan Academy online with solutions to provide any person the opportunity to explore and enhance their learning on the important data science computer program known as SQL. The challenges were meant to expand knowledge of SQL and make comprehension of dense data more simplistic. I hope the challenges were helpful and informative. Thank you for tuning in to my blog and good luck with future SQL querying endeavors!
Khan Academy | Free Online Courses, Lessons & Practice
Learn for free about math, art, computer programming, economics, physics, chemistry, biology, medicine, finance…
SQL - Wikipedia
SQL ( S-Q-L, "sequel"; Structured Query Language) is a domain-specific language used in programming and designed for…