2-Introduction-to-Relational-Databases-in-SQL
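Table of Contents
1. Your First Database
- 1.1 Introduction to Relational Databases (video)
- 1.2 Attributes of Relational Databases
- 1.3 Query information_schema with SELECT
- 1.4 Tables: At the Core of Every Database (video)
- 1.5 CREATE Your First Few TABLEs
- 1.6 ADD a COLUMN with ALTER TABLE
- 1.7 Update Your Database as the Structure Changes (video)
- 1.8 RENAME and DROP COLUMNs in Affiliations
- 1.9 Migrate Data with INSERT INTO SELECT DISTINCT
- 1.10 Delete Tables with DROP TABLE
2. Enforce Data Consistency with Attribute Constraints
- 2.1 Better Data Quality with Constraints (video)
- 2.2 Types of Database Constraints
- 2.3 Conforming with Data Types
- 2.4 Type CASTs
- 2.5 Working with Data Types (video)
- 2.6 Change Types with ALTER COLUMN
- 2.7 Convert Types USING a Function
- 2.8 The Not-Null and Unique Constraints (video)
- 2.9 Disallow NULL Values with SET NOT NULL
- 2.10 What Happens If You Try to Enter NULLs?
- 2.11 Make Your Columns UNIQUE with ADD CONSTRAINT
3. Uniquely Identify Records with Key Constraints
- 3.1 Keys and Superkeys (video)
- 3.2 Get to Know SELECT COUNT DISTINCT
- 3.3 Identify Keys with SELECT COUNT DISTINCT
- 3.4 Primary Keys (video)
- 3.5 Identify the Primary Key
- 3.6 ADD Key CONSTRAINTs to the Tables
- 3.7 Surrogate Keys (video)
- 3.8 ADD A SERIAL Surrogate Key
- 3.9 CONCATenate Columns to A Surrogate Key
- 3.10 Test Your Knowledge before Advancing
4. Glue Together Tables with Foreign Keys
- 4.1 Model 1:N Relationships with Foreign Keys (video)
- 4.2 REFERENCE A Table with A FOREIGN KEY
- 4.3 Explore Foreign Key Constraints
- 4.4 JOIN Tables Linked by A Foreign Key
- 4.5 Model More Complex Relationships (video)
- 4.6 Add Foreign Keys to the "Affiliations" Table
- 4.7 Populate the "professor_id" Column
- 4.8 Drop "firstname" and "lastname"
- 4.9 Referential Integrity (video)
- 4.10 Referential Integrity Violations
- 4.11 Change the Referential Integrity Behavior of A Key
- 4.12 Roundup (video)
- 4.13 Count Affiliations Per University
- 4.14 Join All the Tables Together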
1. Your First Database
1.1 Introduction to Relational Databases (video)
1.2 Attributes of Relational Databases
1.3 Query information_schema with SELECT
information_schema is a meta-database that holds information about your current database. It has multiple tables that you can query with the familiar SELECT * FROM syntax:
- tables: information about all tables in your current database
- columns: information about all columns in all of the tables in your current database
- …
In this exercise, you only need information from the 'public' schema, which is specified in the table_schema column of the tables and columns tables. The 'public' schema holds information about user-defined tables and databases. The other values of table_schema hold system information, which is out of scope for this course because you only work with user-defined objects.
Instruction 1: Get information on all table names in the current database, limiting your query to the 'public' table_schema.
-- Query the right table in information_schema
SELECT table_name
FROM information_schema.tables
-- Specify the correct table_schema value
WHERE table_schema = 'public';
Instruction 2: Now have a look at the columns in university_professors by selecting all entries in information_schema.columns that correspond to that table.
-- Query the right table in information_schema to get columns
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_name = 'university_professors' AND table_schema = 'public';
Instruction 3: Finally, print the first five rows of the university_professors table.
-- Query the first five rows of our table
SELECT *
FROM university_professors
LIMIT 5;
1.4 Tables: At the Core of Every Database (video)
1.5 CREATE Your First Few TABLEs
You will now start implementing a better database model. For this, you will create tables for the professors and universities entity types. The other tables will be created for you.
The syntax for creating simple tables is as follows:
CREATE TABLE table_name (
column_a data_type,
column_b data_type,
column_c data_type
);
Attention: Table names, column names, and data types do not need to be surrounded by quotation marks.
Instruction 1: Create a table professors with two text columns: firstname and lastname.
-- Create a table for the professors entity type
CREATE TABLE professors (
firstname text,
lastname text
);
-- Print the contents of this table
SELECT *
FROM professors;
Instruction 2: Create a table universities with three text columns: university_shortname, university, and university_city.
-- Create a table for the universities entity type
CREATE TABLE universities (
university_shortname text,
university text,
university_city text
);
-- Print the contents of this table
SELECT *
FROM universities;
1.6 ADD a COLUMN with ALTER TABLE
Oops! We forgot to add the university_shortname column to the professors table. You may have already noticed:
[Figure: the professors entity type with the attributes firstname, lastname, and university_shortname]
In chapter 4 of this course, you will need this column to connect the professors table with the universities table.
Adding columns to existing tables is easy, especially if the tables are still empty.
To add columns you can use the following SQL query:
ALTER TABLE table_name
ADD COLUMN column_name data_type;
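To try the ADD COLUMN syntax outside the course environment, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for PostgreSQL. That substitution is an assumption made for convenience; the course itself runs on PostgreSQL, and the PRAGMA table_info call used to list the columns is SQLite-specific.

```python
import sqlite3

# In-memory SQLite database as a stand-in for the course's PostgreSQL database (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE professors (firstname text, lastname text)")

# Add the missing column to the still-empty table.
conn.execute("ALTER TABLE professors ADD COLUMN university_shortname text")

# List the column names; PRAGMA table_info is SQLite-specific, and row[1] is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(professors)")]
print(columns)
```

In PostgreSQL, the equivalent check would be a query against information_schema.columns, as in section 1.3.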
Instruction: Alter professors to add the text column university_shortname.
-- Add the university_shortname column
ALTER TABLE professors
ADD COLUMN university_shortname text;
-- Print the contents of this table
SELECT *
FROM professors;
1.7 Update Your Database as the Structure Changes (video)
1.8 RENAME and DROP COLUMNs in Affiliations
As mentioned in the video, the still-empty affiliations table has some flaws. You will fix them in this exercise.
You’ll use the following queries:
- To rename columns:
ALTER TABLE table_name
RENAME COLUMN old_name TO new_name;
- To delete columns:
ALTER TABLE table_name
DROP COLUMN column_name;
Instruction 1: Rename the organisation column to organization in affiliations.
-- Rename the organisation column
ALTER TABLE affiliations
RENAME COLUMN organisation TO organization;
Instruction 2:
Delete the university_shortname column in affiliations.
-- Delete the university_shortname column
ALTER TABLE affiliations
DROP COLUMN university_shortname;
1.9 Migrate Data with INSERT INTO SELECT DISTINCT
Now it is finally time to migrate the data into the new tables. You will use the following pattern:
INSERT INTO ...
SELECT DISTINCT ...
FROM ...;
It can be broken up into two parts:
First part:
SELECT DISTINCT column_name1, column_name2, ...
FROM table_a;
This selects all distinct values in table table_a – nothing new for you.
Second part:
INSERT INTO table_b ...;
Add this part in front of the first one, and the selected distinct rows are inserted into table_b.
One last point: once you have filled in all the blanks, make sure to run all of the code at the same time.
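The two-part pattern above can be tried on a small scale. The following is a sketch under stated assumptions: it uses Python's built-in sqlite3 module instead of PostgreSQL, a reduced set of columns, and made-up sample rows in which one professor appears twice.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE university_professors (firstname text, lastname text, organization text);
    CREATE TABLE professors (firstname text, lastname text);
    -- Hypothetical sample rows: one professor appears twice (two affiliations).
    INSERT INTO university_professors VALUES
        ('Anna', 'Meier', 'Org A'),
        ('Anna', 'Meier', 'Org B'),
        ('Ben', 'Huber', 'Org C');
""")

# INSERT INTO ... SELECT DISTINCT ...: only the distinct rows reach the target table.
conn.execute("""
    INSERT INTO professors
    SELECT DISTINCT firstname, lastname
    FROM university_professors
""")

rows = conn.execute("SELECT * FROM professors ORDER BY lastname").fetchall()
print(rows)
```

Three source rows become two rows in professors, because the duplicate name pair is collapsed by DISTINCT.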
Instruction 1:
- Insert a list of unique professors from university_professors into the professors table.
- Output every record from the professors table.
-- Insert unique professors into the new table
INSERT INTO professors
SELECT DISTINCT firstname, lastname, university_shortname
FROM university_professors;
-- Doublecheck the contents of professors
SELECT *
FROM professors;
Instruction 2: Insert all DISTINCT affiliations from university_professors into affiliations.
-- Insert unique affiliations into the new table
INSERT INTO affiliations
SELECT DISTINCT firstname, lastname, function, organization
FROM university_professors;
-- Doublecheck the contents of affiliations
SELECT *
FROM affiliations;
1.10 Delete tables with DROP TABLE
The university_professors table is now no longer needed and can be safely removed.
For table deletion, you can use the simple command:
DROP TABLE table_name;
Instruction:
Delete the university_professors table.
-- Delete the university_professors table
DROP TABLE university_professors;
2. Enforce Data Consistency with Attribute Constraints
2.1 Better Data Quality with Constraints (video)
2.2 Types of Database Constraints
2.3 Conforming with Data Types
For demonstration purposes, I created a fictional database table with three columns of the types date, integer, and text, respectively.
CREATE TABLE transactions (
transaction_date date,
amount integer,
fee text
);
Have a look at the contents of the transactions table.
The transaction_date column accepts date values. According to the PostgreSQL documentation, it accepts values in formats such as YYYY-MM-DD or DD/MM/YY.
Both amount and fee appear to be numeric, but fee is modeled as text. You will deal with this in the next exercise.
Instruction:
- Run the provided sample code.
- As it doesn't work, have a look at the error message, correct the statement accordingly, and run it again.
-- Let's add a record to the table
INSERT INTO transactions (transaction_date, amount, fee)
VALUES ('2018-09-24', 5454, '30');
-- Doublecheck the contents
SELECT *
FROM transactions;
2.4 Type CASTs
In the video, you saw that type casts are a possible solution for data type issues. If you know that a certain column stores numbers as text, you can cast the column to a numeric type, for example integer.
SELECT CAST(some_column AS integer)
FROM table;
Now, some_column is temporarily represented as integer instead of text, which means you can perform numeric calculations on the column.
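As a small illustration of what the cast does, here is a sketch using Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption). Note that SQLite is more lenient than PostgreSQL and would coerce the text value implicitly, so the error from the exercise does not reproduce here; the typeof() function, which is SQLite-specific, makes the effect of the cast visible instead.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE transactions (transaction_date date, amount integer, fee text)")
conn.execute("INSERT INTO transactions VALUES ('2018-09-24', 5454, '30')")

# The stored fee is text; CAST represents it as an integer for this query only.
fee_type, cast_type = conn.execute(
    "SELECT typeof(fee), typeof(CAST(fee AS integer)) FROM transactions"
).fetchone()

# With the cast in place, the fee can take part in arithmetic.
net_amount = conn.execute(
    "SELECT amount + CAST(fee AS integer) FROM transactions"
).fetchone()[0]
print(fee_type, cast_type, net_amount)
```

The column itself stays text; only the query result is affected, which is why the cast is described as temporary.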
Instruction:
Run the provided sample code. As it doesn't work, add an integer type cast at the right place and run it again.
-- Calculate the net amount as amount + fee
SELECT transaction_date, amount + CAST(fee AS integer) AS net_amount
FROM transactions;
2.5 Working with Data Types (video)
2.6 Change Types with ALTER COLUMN
Altering the data type of a column is straightforward. The following code changes the data type of column_name in table_name to varchar(10):
ALTER TABLE table_name
ALTER COLUMN column_name
TYPE varchar(10);
Now it’s time to start adding constraints to your database.
Instruction 1: Examine all unique university_shortname values in the professors table, noting the length of each string.
-- Select the university_shortname column
SELECT DISTINCT(university_shortname)
FROM professors;
Instruction 2: Now specify a fixed-length character type with the correct length for university_shortname.
-- Specify the correct fixed-length character type
ALTER TABLE professors
ALTER COLUMN university_shortname
TYPE char(3);
Instruction 3:
Change the type of the firstname column to varchar(64).
-- Change the type of firstname
ALTER TABLE professors
ALTER COLUMN firstname
TYPE varchar(64);
2.7 Convert Types USING a Function
If you prefer not to allocate excessive storage space for a specific varchar column, it is possible to truncate its values prior to altering its data type.
For this, you can use the following syntax:
ALTER TABLE table_name
ALTER COLUMN column_name
TYPE varchar(x)
USING SUBSTRING(column_name FROM 1 FOR x);
Read it like this: because you want to reserve only x characters for column_name, you have to retain a SUBSTRING of every value, namely its first x characters, and throw away the rest. This way, all values fit the varchar(x) requirement.
Instruction:
- Run the sample code as is and take note of the error.
- Now use SUBSTRING() to reduce firstname to 16 characters so that its type can be altered to varchar(16).
-- Convert the values in firstname to a max. of 16 characters
ALTER TABLE professors
ALTER COLUMN firstname
TYPE varchar(16)
USING SUBSTRING(firstname FROM 1 FOR 16);
2.8 The Not-Null and Unique Constraints (video)
2.9 Disallow NULL values with SET NOT NULL
The professors table is almost ready now. However, it still allows NULLs to be entered. Although some information might be missing about some professors, there are certainly columns that always need to be specified.
Instruction 1:
Add a not-null constraint for the firstname column.
-- Disallow NULL values in firstname
ALTER TABLE professors
ALTER COLUMN firstname SET NOT NULL;
Instruction 2:
Add a not-null constraint for the lastname column.
-- Disallow NULL values in lastname
ALTER TABLE professors
ALTER COLUMN lastname SET NOT NULL;
2.10 What Happens If You Try to Enter NULLs?
Execute the following statement:
INSERT INTO professors (firstname, lastname, university_shortname)
VALUES (NULL, 'Miller', 'ETH');
Why does this throw an error?
The statement violates the not-null constraint on firstname that you have just specified.
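The rejection can be reproduced in a few lines. This is a sketch under assumptions: it uses Python's built-in sqlite3 module instead of PostgreSQL, and because SQLite has no ALTER COLUMN ... SET NOT NULL, the constraints are declared when the table is created.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE professors (
        firstname text NOT NULL,
        lastname text NOT NULL,
        university_shortname text
    )
""")

try:
    # firstname is NULL, which the not-null constraint forbids.
    conn.execute(
        "INSERT INTO professors (firstname, lastname, university_shortname) "
        "VALUES (NULL, 'Miller', 'ETH')"
    )
    rejected = False
except sqlite3.IntegrityError as err:
    rejected = True
    print(err)

row_count = conn.execute("SELECT COUNT(*) FROM professors").fetchone()[0]
print(rejected, row_count)
```

The insert fails as a whole, so no partial row ends up in the table.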
2.11 Make Your Columns UNIQUE with ADD CONSTRAINT
As seen in the video, you add the UNIQUE keyword after the column that should be unique. This only works for new tables:
CREATE TABLE table_name (
column_name data_type UNIQUE
);
If you want to add a unique constraint to an existing table, you do it like this:
ALTER TABLE table_name
ADD CONSTRAINT some_name UNIQUE(column_name);
Note that this is different from the ALTER COLUMN syntax for the not-null constraint. Also, you have to give the constraint a name, here some_name.
Instruction 1: Add a unique constraint to the university_shortname column in universities. Give it the name university_shortname_unq.
-- Make universities.university_shortname unique
ALTER TABLE universities
ADD CONSTRAINT university_shortname_unq UNIQUE(university_shortname);
Instruction 2:
Add a unique constraint to the organization column in organizations. Give it the name organization_unq.
-- Make organizations.organization unique
ALTER TABLE organizations
ADD CONSTRAINT organization_unq UNIQUE(organization);
3. Uniquely Identify Records with Key Constraints
3.1 Keys and Superkeys (video)
3.2 Get to Know SELECT COUNT DISTINCT
Your database doesn't have any defined keys so far, and you don't know which columns or combinations of columns are suited as keys.
There is a simple way of finding out whether a certain column (or a combination of columns) contains only unique values and thus identifies the records in the table.
You already know the SELECT DISTINCT query from the first chapter. Now you just have to wrap everything within the COUNT() function, and PostgreSQL will return the number of unique rows for the given columns:
SELECT COUNT(DISTINCT(column_a, column_b, ...))
FROM table;
Instruction 1:
First, find out the number of rows in universities.
-- Count the number of rows in universities
SELECT COUNT(*)
FROM universities;
Instruction 2:
Then, find out how many unique values there are in the university_city column.
-- Count the number of distinct values in the university_city column
SELECT COUNT(DISTINCT(university_city))
FROM universities;
3.3 Identify Keys with SELECT COUNT DISTINCT
There is a very basic way of finding out what qualifies as a key in an existing, populated table:
First, count the distinct records for all possible combinations of columns. If the resulting number x equals the number of all rows in the table for a combination, you have discovered a superkey.
Then, remove one column after another until you can no longer remove a column without seeing the number x decrease. At that point, you have discovered a (candidate) key.
The professors table has 551 rows. It has only one possible candidate key, which is a combination of two attributes. You may want to try different combinations using the "Run code" button. Once you have found the solution, submit your answer.
Instruction:
-- Try out different combinations
SELECT COUNT(DISTINCT(firstname, lastname))
FROM professors;
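The search described in this section can also be automated. Below is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in for PostgreSQL and four made-up rows instead of the real 551. Because SQLite does not accept COUNT(DISTINCT(a, b)), the sketch counts the rows of a SELECT DISTINCT subquery instead; the helper name is_superkey is hypothetical.

```python
import sqlite3
from itertools import combinations

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE professors (firstname text, lastname text, university_shortname text);
    -- Hypothetical sample rows.
    INSERT INTO professors VALUES
        ('Anna', 'Meier', 'ETH'),
        ('Anna', 'Huber', 'ETH'),
        ('Ben',  'Meier', 'ETH'),
        ('Ben',  'Huber', 'UZH');
""")

columns = ["firstname", "lastname", "university_shortname"]
total = conn.execute("SELECT COUNT(*) FROM professors").fetchone()[0]

def is_superkey(cols):
    # A column set is a superkey if its number of distinct rows equals the total row count.
    query = f"SELECT COUNT(*) FROM (SELECT DISTINCT {', '.join(cols)} FROM professors)"
    return conn.execute(query).fetchone()[0] == total

# Check all combinations, smallest first. A superkey is kept only if it does not
# contain an already found key, which makes every kept key minimal.
candidate_keys = []
for size in range(1, len(columns) + 1):
    for cols in combinations(columns, size):
        if is_superkey(cols) and not any(set(k) <= set(cols) for k in candidate_keys):
            candidate_keys.append(cols)

print(candidate_keys)
```

Going from small to large combinations is the reverse of the manual procedure (start with all columns and remove them one by one), but it arrives at the same minimal keys.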
3.4 Primary Keys (video)
3.5 Identify the Primary Key
Examine the sample table from the earlier video. As a database designer, you must choose wisely which column will be designated as the primary key.
| license_no | serial_no | make | model | year |
|---|---|---|---|---|
| Texas ABC-739 | A69352 | Ford | Mustang | 2 |
| Florida TVP-347 | B43696 | Oldsmobile | Cutlass | 5 |
| New York MPO-22 | X83554 | Oldsmobile | Delta | 1 |
| California 432-TFY | C43742 | Mercedes | 190-D | 99 |
| California RSK-629 | Y82935 | Toyota | Camry | 4 |
| Texas RSK-629 | U028365 | Jaguar | XJS | 4 |
Among the following columns and/or column combinations, which ones can most effectively serve as a primary key?
PK = {license_no}
3.6 ADD Key CONSTRAINTs to the Tables
Two of the tables in your database already have well-suited candidate keys consisting of one column each: organizations and universities, with the organization and university_shortname columns, respectively.
In this exercise, you will rename these columns to id using the RENAME COLUMN command and then specify primary key constraints for them. This is as straightforward as adding unique constraints (see the last exercise of chapter 2):
ALTER TABLE table_name
ADD CONSTRAINT some_name PRIMARY KEY (column_name);
Note that you can also specify more than one column in the brackets.
Instruction 1:
- Rename the organization column to id in organizations.
- Make id a primary key and name it organization_pk.
-- Rename the organization column to id
ALTER TABLE organizations
RENAME COLUMN organization TO id;
-- Make id a primary key
ALTER TABLE organizations
ADD CONSTRAINT organization_pk PRIMARY KEY (id);
Instruction 2:
- Rename the university_shortname column to id in universities.
- Make id a primary key and name it university_pk.
-- Rename the university_shortname column to id
ALTER TABLE universities
RENAME COLUMN university_shortname TO id;
-- Make id a primary key
ALTER TABLE universities
ADD CONSTRAINT university_pk PRIMARY KEY (id);
3.7 Surrogate Keys (video)
3.8 ADD A SERIAL Surrogate Key
Since there is no single-column candidate key in professors (only a composite candidate key consisting of firstname and lastname), you will add a new column id to that table.
This column has the special data type serial, which turns the column into an auto-incrementing number. Whenever you add a new professor to the table, it is automatically assigned an id that does not yet exist in the table: a perfect primary key!
Instruction 1: Add a new column id with the data type serial to the professors table.
-- Add the new column to the table
ALTER TABLE professors
ADD COLUMN id serial;
Instruction 2:
Make id a primary key and name it professors_pkey.
-- Make id a primary key
ALTER TABLE professors
ADD CONSTRAINT professors_pkey PRIMARY KEY (id);
Instruction 3: Write a query that returns all columns and the first 10 rows of professors.
-- Have a look at the first 10 rows of professors
SELECT *
FROM professors
LIMIT 10;
3.9 CONCATenate Columns to A Surrogate Key
Another strategy for adding a surrogate key to an existing table is to concatenate existing columns with the CONCAT() function.
Let’s think of the following example table:
CREATE TABLE cars (
make varchar(64) NOT NULL,
model varchar(64) NOT NULL,
mpg integer NOT NULL
);
The table is populated with 10 rows of completely fictional data.
Unfortunately, the table doesn't have a primary key yet. None of its columns consists of only unique values, so some columns have to be combined to form a key.
In the course of the following exercises, you will combine make and model into such a surrogate key.
Instruction 1: Count the number of distinct rows with a combination of the make and model columns.
-- Count the number of distinct rows with columns make, model
SELECT COUNT(DISTINCT(make, model))
FROM cars;
Instruction 2:
Add a new column id with the data type varchar(128).
-- Add the id column
ALTER TABLE cars
ADD COLUMN id varchar(128);
Instruction 3: Concatenate make and model into id using an UPDATE query and the CONCAT() function.
-- Update id with make + model
UPDATE cars
SET id = CONCAT(make, model);
Instruction 4:
Make id a primary key and name it id_pk.
-- Make id a primary key
ALTER TABLE cars
ADD CONSTRAINT id_pk PRIMARY KEY(id);
-- Have a look at the table
SELECT * FROM cars;
3.10 Test Your Knowledge before Advancing
Before moving on to the next chapter, let's review what you have learned so far about attributes and key constraints. If you are uncertain about the answer, please review chapters 2 and 3 respectively.
Let’s think of an entity type “student”. A student has:
- A family name that is no longer than 128 characters in length, which cannot have any missing values,
- A Social Security identifier that is exactly 9 digits in length, consisting solely of numeric characters,
- A phone number of fixed length 12, consisting of numbers and other characters (but some students don't have one).
Instruction:
- Given the above description of a student entity, create a table students with the correct column types.
- Add a primary key for the social security number ssn.
Note that there is no formal length requirement for the integer column. The application would have to make sure it is a correct 9-digit ssn!
-- Create the table
CREATE TABLE students (
last_name varchar(128) NOT NULL,
ssn integer PRIMARY KEY,
phone_no char(12)
);
4. Glue Together Tables with Foreign Keys
4.1 Model 1:N Relationships with Foreign Keys (video)
4.2 REFERENCE A Table with A FOREIGN KEY
In your database, you want the professors table to reference the universities table. You can do that by specifying a column in professors that references a column in universities.
As just shown in the video, the syntax for that looks like this:
ALTER TABLE a
ADD CONSTRAINT a_fkey FOREIGN KEY (b_id) REFERENCES b (id);
Table a now refers to table b via b_id, which points to id. As usual, a_fkey is a constraint name you can choose on your own.
Pay attention to the naming convention employed here: usually, a foreign key that references a primary key named id is named x_id, where x is the singular name of the referenced table.
Instruction 1: Rename the university_shortname column to university_id in professors.
-- Rename the university_shortname column
ALTER TABLE professors
RENAME COLUMN university_shortname TO university_id;
Instruction 2:
- Add a foreign key on university_id in professors that references the id column in universities.
- Name this foreign key professors_fkey.
-- Add a foreign key on professors referencing universities
ALTER TABLE professors
ADD CONSTRAINT professors_fkey FOREIGN KEY (university_id) REFERENCES universities (id);
4.3 Explore Foreign Key Constraints
Foreign key constraints help you keep order in your database mini-world. In your database, for instance, only professors belonging to Swiss universities should be allowed, because only Swiss universities are part of the universities table.
The foreign key on professors referencing universities that you just created makes sure that only existing universities can be specified when inserting new data. Let's test this!
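The behavior can be checked with a small experiment. The following is a sketch under assumptions: Python's built-in sqlite3 module stands in for PostgreSQL, the foreign key is declared at table creation (SQLite cannot add it afterwards with ALTER TABLE), and 'MIT' is a made-up example of an id that is missing from universities. SQLite enforces foreign keys only after PRAGMA foreign_keys = ON.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable foreign key enforcement
conn.executescript("""
    CREATE TABLE universities (id text PRIMARY KEY, university text);
    CREATE TABLE professors (
        firstname text,
        lastname text,
        university_id text REFERENCES universities (id)
    );
    INSERT INTO universities VALUES ('UZH', 'University of Zurich');
""")

# 'MIT' (hypothetical) does not exist in universities, so this insert is rejected.
try:
    conn.execute("INSERT INTO professors VALUES ('Albert', 'Einstein', 'MIT')")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# 'UZH' exists in universities, so this insert succeeds.
conn.execute("INSERT INTO professors VALUES ('Albert', 'Einstein', 'UZH')")
count = conn.execute("SELECT COUNT(*) FROM professors").fetchone()[0]
print(rejected, count)
```

Only the row with an existing university_id is stored, which is exactly the guarantee the constraint is meant to give.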
Instruction:
- Run the sample code and have a look at the error message.
- What's wrong? Correct the university_id so that it reflects where Albert Einstein wrote his dissertation: the University of Zurich (UZH).
-- Try to insert a new professor
INSERT INTO professors (firstname, lastname, university_id)
VALUES ('Albert', 'Einstein', 'UZH');
4.4 JOIN Tables Linked by A Foreign Key
Let’s join these two tables to analyze the data further!
You might already know how SQL joins work from the last exercise of the Intro to SQL for Data Science course or from the Joining Data in PostgreSQL course.
Here’s a quick recap on how joins generally work:
SELECT ...
FROM table_a
JOIN table_b
ON ...
WHERE ...
While foreign and primary keys are not strictly necessary for join queries, they help by telling you what to expect. For instance, you can be sure that records referenced from table A are always present in table B, so a join from table A will always find a match in table B. If it did not, the foreign key constraint would be violated.
Instruction:
- JOIN professors with universities on professors.university_id = universities.id, i.e., retain all records where the foreign key of professors is equal to the primary key of universities.
- Filter for university_city = 'Zurich'.
-- Select all professors working for universities in the city of Zurich
SELECT professors.lastname, universities.id, universities.university_city
FROM professors
JOIN universities
ON professors.university_id = universities.id
WHERE universities.university_city = 'Zurich';
4.5 Model More Complex Relationships (video)
4.6 Add Foreign Keys to the “Affiliations” Table
At the moment, the affiliations table has the columns firstname, lastname, function, and organization, as you can see in the table preview.
You will now redesign the affiliations table in place, i.e., without creating a temporary table to cache your intermediate results.
Instruction 1: Add a professor_id column with the integer data type to affiliations, and declare it to be a foreign key that references the id column in professors.
-- Add a professor_id column
ALTER TABLE affiliations
ADD COLUMN professor_id integer REFERENCES professors (id);
Instruction 2: Rename the organization column in affiliations to organization_id.
-- Rename the organization column to organization_id
ALTER TABLE affiliations
RENAME organization TO organization_id;
Instruction 3: Add a foreign key constraint on organization_id so that it references the id column in organizations.
-- Add a foreign key on organization_id
ALTER TABLE affiliations
ADD CONSTRAINT affiliations_organization_fkey FOREIGN KEY (organization_id) REFERENCES organizations (id);
4.7 Populate the “professor_id” Column
Now it is time to also populate professor_id. You will take the ID directly from professors.
Here’s a way to update columns of a table based on values in another table:
UPDATE table_a
SET column_to_update = table_b.column_to_update_from
FROM table_b
WHERE condition1 AND condition2 AND ...;
This query does the following:
- For each row in table_a, find the corresponding row in table_b where condition1, condition2, etc., are met.
- Set the value of column_to_update to the value of column_to_update_from (from that corresponding row).
The conditions usually compare other columns of both tables, e.g. table_a.some_column = table_b.some_column. Of course, this query only makes sense if there is only one matching row in table_b.
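To see such a cross-table update at work on a few rows, here is a sketch using Python's built-in sqlite3 module as a stand-in for PostgreSQL (an assumption). Older SQLite versions do not support UPDATE ... FROM, so the sketch swaps in a correlated subquery, which looks up the same matching row; unlike UPDATE ... FROM, it sets the column to NULL for rows without a match. The sample rows are made up.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE professors (id integer PRIMARY KEY, firstname text, lastname text);
    CREATE TABLE affiliations (
        firstname text, lastname text, organization text, professor_id integer
    );
    -- Hypothetical sample rows.
    INSERT INTO professors VALUES (1, 'Anna', 'Meier'), (2, 'Ben', 'Huber');
    INSERT INTO affiliations (firstname, lastname, organization) VALUES
        ('Anna', 'Meier', 'Org A'),
        ('Anna', 'Meier', 'Org B'),
        ('Ben', 'Huber', 'Org A');
""")

# For each affiliation, look up the id of the professor with the same name.
# Correlated subquery used in place of PostgreSQL's UPDATE ... FROM.
conn.execute("""
    UPDATE affiliations
    SET professor_id = (
        SELECT professors.id
        FROM professors
        WHERE professors.firstname = affiliations.firstname
          AND professors.lastname = affiliations.lastname
    )
""")

rows = conn.execute(
    "SELECT organization, professor_id FROM affiliations "
    "ORDER BY professor_id, organization"
).fetchall()
print(rows)
```

Both affiliations of the first professor receive the same professor_id, which only works because the name pair matches exactly one row in professors.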
Instruction 1: First, have a look at the current state of affiliations by fetching 10 rows and all columns.
-- Have a look at the 10 first rows of affiliations
SELECT *
FROM affiliations
LIMIT 10;
Instruction 2: Update the professor_id column with the corresponding value of the id column in professors. "Corresponding" means rows in professors where firstname and lastname are identical to the ones in affiliations.
-- Set professor_id to professors.id where firstname, lastname correspond to rows in professors
UPDATE affiliations
SET professor_id = professors.id
FROM professors
WHERE affiliations.firstname = professors.firstname AND affiliations.lastname = professors.lastname;
Instruction 3: Check out the first 10 rows and all columns of affiliations again. Have the professor_ids been correctly matched?
-- Have a look at the 10 first rows of affiliations again
SELECT *
FROM affiliations
LIMIT 10;
4.8 Drop “firstname” and “lastname”
The firstname and lastname columns of affiliations were used to establish a link to the professors table in the last exercise, so that the appropriate professor IDs could be copied over. This only worked because there is exactly one corresponding professor for each row in affiliations. In other words: {firstname, lastname} is a candidate key of professors, a unique combination of columns.
It isn't one for affiliations, though, because, as mentioned in the video, professors can have more than one affiliation.
Because professors are now referenced by professor_id, the firstname and lastname columns are no longer needed, so it is time to drop them. After all, one of the goals of a database is to reduce redundancy where possible.
Instruction: Drop the firstname and lastname columns from the affiliations table.
-- Drop the firstname column
ALTER TABLE affiliations
DROP COLUMN firstname;
-- Drop the lastname column
ALTER TABLE affiliations
DROP COLUMN lastname;
4.9 Referential Integrity (video)
4.10 Referential Integrity Violations
4.11 Change the Referential Integrity Behavior of A Key
So far, you implemented three foreign key constraints:
professors.university_id to universities.id
affiliations.organization_id to organizations.id
affiliations.professor_id to professors.id
These foreign keys currently have the behavior ON DELETE NO ACTION. Here, you’re going to change that behavior for the column referencing organizations from affiliations. If an organization is deleted, all its affiliations (by any professor) should also be deleted.
Altering a key constraint doesn't work with ALTER COLUMN. Instead, you have to delete the existing foreign key constraint and then add a new one with a different ON DELETE behavior.
For deleting constraints, though, you need to know their names. This information is also stored in information_schema.
Instruction 1: Have a look at the existing foreign key constraints by querying table_constraints in information_schema.
-- Identify the correct constraint name
SELECT constraint_name, table_name, constraint_type
FROM information_schema.table_constraints
WHERE constraint_type = 'FOREIGN KEY';
Instruction 2: Delete the affiliations_organization_id_fkey foreign key constraint in affiliations.
-- Drop the right foreign key constraint
ALTER TABLE affiliations
DROP CONSTRAINT affiliations_organization_id_fkey;
Instruction 3: Add a new foreign key to affiliations that cascades deletion if a referenced record is deleted from organizations. Name it affiliations_organization_id_fkey.
-- Add a new foreign key constraint from affiliations to organizations which cascades deletion
ALTER TABLE affiliations
ADD CONSTRAINT affiliations_organization_id_fkey FOREIGN KEY (organization_id) REFERENCES organizations (id) ON DELETE CASCADE;
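Instruction 4: Run the DELETE and SELECT queries to double-check that the deletion cascade actually works.
-- Delete an organization
DELETE FROM organizations
WHERE id = 'CUREM';
-- Check that no more affiliations with this organization exist
SELECT * FROM affiliations
WHERE organization_id = 'CUREM';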
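The cascade can be reproduced on a toy scale. This is a sketch under assumptions: Python's built-in sqlite3 module stands in for PostgreSQL, the ON DELETE CASCADE clause is declared at table creation (SQLite cannot drop and re-add constraints with ALTER TABLE), and the rows other than the id 'CUREM' are made up.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable foreign key enforcement
conn.executescript("""
    CREATE TABLE organizations (id text PRIMARY KEY);
    CREATE TABLE affiliations (
        professor_id integer,
        organization_id text REFERENCES organizations (id) ON DELETE CASCADE
    );
    -- Hypothetical sample rows.
    INSERT INTO organizations VALUES ('CUREM'), ('Org B');
    INSERT INTO affiliations VALUES (1, 'CUREM'), (2, 'CUREM'), (3, 'Org B');
""")

# Deleting the organization also deletes every affiliation that references it.
conn.execute("DELETE FROM organizations WHERE id = 'CUREM'")

remaining = conn.execute(
    "SELECT professor_id, organization_id FROM affiliations"
).fetchall()
print(remaining)
```

With the default ON DELETE NO ACTION, the same DELETE would be rejected instead, because affiliations would still reference the organization.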
4.12 Roundup (video)
4.13 Count Affiliations Per University
Now that your data is ready for analysis, let's run some example SQL queries on the database. You will use concepts you already know, such as grouping by columns and joining tables.
In this exercise, you will find out which university has the most affiliations (through its professors). For that, you need both the affiliations and professors tables, because the latter also holds the university_id.
As a quick reminder, joins have the following structure:
SELECT table_a.column1, table_a.column2, table_b.column1, ...
FROM table_a
JOIN table_b
ON table_a.column = table_b.column;
This results in a combination of table_a and table_b, but only with rows where table_a.column is equal to table_b.column.
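A join combined with grouping can be tried on a handful of rows. The following is a sketch assuming Python's built-in sqlite3 module as a stand-in for PostgreSQL, reduced tables, and made-up sample rows; the alias n_affiliations is a hypothetical name chosen for the count column.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE professors (id integer PRIMARY KEY, university_id text);
    CREATE TABLE affiliations (professor_id integer, organization_id text);
    -- Hypothetical sample rows.
    INSERT INTO professors VALUES (1, 'ETH'), (2, 'ETH'), (3, 'UZH');
    INSERT INTO affiliations VALUES
        (1, 'Org A'), (1, 'Org B'), (2, 'Org A'), (3, 'Org C');
""")

# Join each affiliation to its professor, then count the affiliations per university.
rows = conn.execute("""
    SELECT COUNT(*) AS n_affiliations, professors.university_id
    FROM affiliations
    JOIN professors
    ON affiliations.professor_id = professors.id
    GROUP BY professors.university_id
    ORDER BY n_affiliations DESC
""").fetchall()
print(rows)
```

The count is per university rather than per professor because the grouping column comes from the joined professors table.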
Instruction:
- Calculate each university's total affiliation count.
- Sort the results based on this count in descending order.
-- Count the total number of affiliations per university
SELECT COUNT(*), professors.university_id
FROM affiliations
JOIN professors
ON affiliations.professor_id = professors.id
-- Group by the ids of professors
GROUP BY professors.university_id
ORDER BY count DESC;
4.14 Join All the Tables Together
In this last exercise, your task is to find the university city of the professor with the most affiliations in the sector "Media & communication".
For this, you need to join all the tables, group by some columns, and then use selection criteria to get only the rows in the correct sector.
Instruction 1: Join all tables in the database (starting with affiliations, then professors, organizations, and universities) and look at the result.
-- Join all tables
SELECT *
FROM affiliations
JOIN professors
ON affiliations.professor_id = professors.id
JOIN organizations
ON affiliations.organization_id = organizations.id
JOIN universities
ON professors.university_id = universities.id;
Instruction 2:
- Now group the result by organization sector, professor, and university city.
- Count the resulting number of rows.
SELECT COUNT(*), organizations.organization_sector,
professors.id, universities.university_city
FROM affiliations
JOIN professors
ON affiliations.professor_id = professors.id
JOIN organizations
ON affiliations.organization_id = organizations.id
JOIN universities
ON professors.university_id = universities.id
GROUP BY organizations.organization_sector,
professors.id, universities.university_city;
Instruction 3: Only retain rows with "Media & communication" as the organization sector, and sort the table by count in descending order.
SELECT COUNT(*), organizations.organization_sector,
professors.id, universities.university_city
FROM affiliations
JOIN professors
ON affiliations.professor_id = professors.id
JOIN organizations
ON affiliations.organization_id = organizations.id
JOIN universities
ON professors.university_id = universities.id
WHERE organizations.organization_sector = 'Media & communication'
GROUP BY organizations.organization_sector,
professors.id, universities.university_city
ORDER BY count DESC;
