Databases Pedagogy SQL

Trust Data, Not Lore: SQL Joins Visualized

Sound advice from Star Trek The Next Generation.

Quoting a very cool YouTube video by Kurazgesagt, “When it comes down to it, information is nothing tangible. It’s typically understood as the property of the arrangement of particles.”

What this means is, a bunch of carbon atoms organized one way can create coal. The very same carbon atoms arranged another way can create diamonds. Zoomed in, the atoms are the same. What changes is the information. Our mental representation of physical things (e.g. a diamond or coal) is also information, and the process by which these thoughts manifest in the first place is also, in the end, different neuronal activity in the brain, which, once again, are just the arrangement of particles. By our definition, also information.

In a way, the whole universe is just a collection of information, and how we perceive ourselves in it is a product of our ability to see the context, or interrelatedness, of said information. In this frame of reference, zoomed out, all of the ideas, philosophies, technology, connections, and concepts that exist in this blip of space-time on planet Earth, and even Earth itself, can be seen as just different information, interrelated, arranged, and interpreted, in different ways (The carbon particles in any configuration are objectively just that, carbon particles, but to be conscious is to know that we, as humans, tend to burn one configuration and wear the other in a ring).

“I just wondered how things related to one another.” – Claude Shannon, father of information theory

One philosophical view is that it’s all just information out there, and SQL (Structured Query Language) gives us a way to organize and see connections between different tables or groupings of information. Information that is orderly and is able to be referenced and analyzed in relation to each other is useful, and can be referred to as data. Our job as developers is to be masters of information and data. We create context and map relationships out of information in order to bring function and order to our software applications. This is why SQL queries are important: they’re commands that get back specific information based on your needs. One particular set of queries, SQL joins, make sense of how information relates to each other, retrieving information in relation to other information in a database based on commonalities between the information.

Many novice programmers, including myself starting out, get confused when first learning SQL and how exactly to query specific information from tables in a database. Tables represent sets of information about specific domains, and the results of ‘join queries’ in SQL represent the interrelatedness of elements between these domains, but these can be hard to conceptualize.

From Ligaya Turmelle’s great blog post, we have our example universe of information: two data sets visualized in the form of circles in a venn diagram, each circle representing a table (Table 1 (T1) and Table 2 (T2)) and all the records within:

id name       id  name
-- ----       --  ----
1  Coal     1   Picard
2  Spot     2   Coal
3  Diamond  3   Neuron
4  Jordi    4   Diamond

The following are the three frequent ‘join queries’ one can use in SQL to query different matching records from each table, visually explained.

An inner join query only returns records with matches in both tables.

SELECT * FROM TableA
INNER JOIN TableB
ON TableA.name = TableB.name
id  name       id   name
-- ----        --  ----
1   Coal       2    Coal

3   Diamond    4    Diamond

An outer join query returns all the records from T1 and T2, with matching records from both sides where available.

SELECT * FROM TableA
FULL OUTER JOIN TableB
ON TableA.name = TableB.name
id    name       id    name
--    ----       --    ----
1     Coal       2     Coal

2     Spot       null  null

3     Diamond    4     Diamond

4     Jordi      null  null

null  null       1     Picard

null  null       3     Neuron

A left join query returns all of the records in the first (left) table whether or not they have a match or not, but if they do have a match, returns the matching record on the right table as well.

SELECT * FROM TableA
LEFT OUTER JOIN TableB
ON TableA.name = TableB.name

id  name       id    name
--  ----       --    ----
1   Coal       2     Coal
2   Spot       null  null
3   Diamond    4     Diamond
4   Jordi      null  null

Repeatedly visualizing each SQL join I had to make helped clarify them for me when I was learning.

When organized properly, we can trust data, and information’s relationship to other data gives us context and a narrative.

Learning SQL and organizing data helps us make sense of a complicated universe, whether in a computer program or in reality, which is, perhaps in and of itself, in the end, all just information.

Trust Data, Not Lore: SQL Joins Visualized
3 thoughts on “Trust Data, Not Lore: SQL Joins Visualized
  1. indir says:

    I was able to find good information from your blog posts. Jewell Vasily Jayne

  2. mp3 says:

    This page appears to get a good ammount of visitors. How do you advertise it? It offers a nice individual twist on things. I guess having something authentic or substantial to talk about is the most important thing. I would appreciate if you could visit my site and give me some tips. Harlie Darnell Porcia

  3. dizi says:

    Pretty! This was an incredibly wonderful article. Thanks for providing these details. Anjela Titos Christean

Leave a Reply

Your email address will not be published.