SQL Relational Database

 information_schema is a meta-database that holds information about your current database. information_schema has multiple tables you can query with the known SELECT * FROM syntax:

  • tables: information about all tables in your current database
  • columns: information about all columns in all of the tables in your current database

In this exercise, you'll only need information from the schema, which is specified as the column of the and tables. The 'public' schema holds information about user-defined tables and databases. The other types of hold system information – for this course, you're only interested in user-defined stuff.


-- Query the right table in information_schema to get columns

SELECT column_name, data_type 

FROM information_schema.columns

WHERE table_name = 'university_professors' AND table_schema = 'public';


Migrate data with INSERT INTO SELECT DISTINCT

-- Insert unique affiliations into the new table
INSERT INTO affiliations 
SELECT DISTINCT firstname, lastname, function, organization 
FROM university_professors;

-- Doublecheck the contents of affiliations
SELECT * 
FROM affiliations;

CONCATenate columns to a surrogate key

Another strategy to add a surrogate key to an existing table is to concatenate existing columns with the CONCAT() function.

Let's think of the following example table:

CREATE TABLE cars (
 make varchar(64) NOT NULL,
 model varchar(64) NOT NULL,
 mpg integer NOT NULL
)

The table is populated with 10 rows of completely fictional data.

Unfortunately, the table doesn't have a primary key yet. None of the columns consists of only unique values, so some columns can be combined to form a key.

In the course of the following exercises, you will combine make and model into such a surrogate key.


SELECT COUNT(DISTINCT(make, model)) 

FROM cars;


-- Add the id column

ALTER TABLE cars

ADD COLUMN id varchar(128);


-- Update id with make + model

UPDATE cars

SET id = CONCAT(make,model);



REFERENCE a table with a FOREIGN KEY

In your database, you want the professors table to reference the universities table. You can do that by specifying a column in professors table that references a column in the universities table.

As just shown in the video, the syntax for that looks like this:

ALTER TABLE a 
ADD CONSTRAINT a_fkey FOREIGN KEY (b_id) REFERENCES b (id);

Table a should now refer to table b, via b_id, which points to ida_fkey is, as usual, a constraint name you can choose on your own.

Pay attention to the naming convention employed here: Usually, a foreign key referencing another primary key with name id is named x_id, where x is the name of the referencing table in the singular form.



-- Add a professor_id column

ALTER TABLE affiliations

ADD COLUMN professor_id integer REFERENCES professors (id);


-- Rename the organization column to organization_id

ALTER TABLE affiliations

RENAME organization TO organization_id;


-- Add a foreign key on organization_id

ALTER TABLE affiliations

ADD CONSTRAINT affiliations_organization_fkey FOREIGN KEY(organization_id) REFERENCES organizations (id);



UPDATE affiliations

SET professor_id = professors.id

FROM professors

WHERE affiliations.firstname = professors.firstname AND affiliations.lastname = professors.lastname;


-- Have a look at the 10 first rows of affiliations again

SELECT *

FROM affiliations

LIMIT 10;



Comments

Popular posts from this blog

Binomial Test in Python

Slicing and Indexing in Python Pandas

Python Syntax and Functions Part2 (Summary Statistics)