Lecture 4: PostgreSQL part 1#
Learning objectives#
By the end of this lecture, students should be able to apply and practice:
Understand different data types in SQL
Filtering rows based on specific conditions using WHERE
Create derived columns
Conditional logic with CASE
Built-in functions & operators in SQL
Connect to database#
Make sure you edit the
credentials.json
file with appropriate username and password as in worksheet1
# load the ipython-sql extension
%load_ext sql
import json
import urllib.parse
with open('data/credentials.json') as f:
login = json.load(f)
username = login['user']
password = urllib.parse.quote(login['password'])
host = login['host']
port = login['port']
Let’s connect to your dvdrental database
%sql postgresql://{username}:{password}@{host}:{port}/dvdrental
'Connected: postgres@dvdrental'
Let’s try to print out the list of tables in this dvdrental
database
%%sql
SELECT
*
FROM
information_schema.tables
WHERE
table_schema = 'public'
AND
table_type = 'BASE TABLE'
;
* postgresql://postgres:***@localhost:5432/dvdrental
15 rows affected.
table_catalog | table_schema | table_name | table_type | self_referencing_column_name | reference_generation | user_defined_type_catalog | user_defined_type_schema | user_defined_type_name | is_insertable_into | is_typed | commit_action |
---|---|---|---|---|---|---|---|---|---|---|---|
dvdrental | public | actor | BASE TABLE | None | None | None | None | None | YES | NO | None |
dvdrental | public | store | BASE TABLE | None | None | None | None | None | YES | NO | None |
dvdrental | public | address | BASE TABLE | None | None | None | None | None | YES | NO | None |
dvdrental | public | category | BASE TABLE | None | None | None | None | None | YES | NO | None |
dvdrental | public | city | BASE TABLE | None | None | None | None | None | YES | NO | None |
dvdrental | public | country | BASE TABLE | None | None | None | None | None | YES | NO | None |
dvdrental | public | customer | BASE TABLE | None | None | None | None | None | YES | NO | None |
dvdrental | public | film_actor | BASE TABLE | None | None | None | None | None | YES | NO | None |
dvdrental | public | film_category | BASE TABLE | None | None | None | None | None | YES | NO | None |
dvdrental | public | inventory | BASE TABLE | None | None | None | None | None | YES | NO | None |
dvdrental | public | language | BASE TABLE | None | None | None | None | None | YES | NO | None |
dvdrental | public | rental | BASE TABLE | None | None | None | None | None | YES | NO | None |
dvdrental | public | staff | BASE TABLE | None | None | None | None | None | YES | NO | None |
dvdrental | public | payment | BASE TABLE | None | None | None | None | None | YES | NO | None |
dvdrental | public | film | BASE TABLE | None | None | None | None | None | YES | NO | None |
Let’s print out the schema of table actor
%%sql
SELECT column_name, data_type, is_nullable, column_default
FROM information_schema.columns
WHERE table_name = 'actor';
* postgresql://postgres:***@localhost:5432/dvdrental
4 rows affected.
column_name | data_type | is_nullable | column_default |
---|---|---|---|
actor_id | integer | NO | nextval('actor_actor_id_seq'::regclass) |
first_name | character varying | NO | None |
last_name | character varying | NO | None |
last_update | timestamp without time zone | NO | now() |
Print out the first few rows of table actor
:
%%sql
SELECT * FROM actor LIMIT 5;
* postgresql://postgres:***@localhost:5432/dvdrental
5 rows affected.
actor_id | first_name | last_name | last_update |
---|---|---|---|
1 | Penelope | Guiness | 2013-05-26 14:47:57.620000 |
2 | Nick | Wahlberg | 2013-05-26 14:47:57.620000 |
3 | Ed | Chase | 2013-05-26 14:47:57.620000 |
4 | Jennifer | Davis | 2013-05-26 14:47:57.620000 |
5 | Johnny | Lollobrigida | 2013-05-26 14:47:57.620000 |
Data Types in SQL#
Introduction#
Data types in SQL define the type of data that can be stored in a column. Choosing the correct data type is crucial for optimizing storage, ensuring data integrity, and improving query performance.
Convert between different data types#
To help you better understand the different data types in SQL. Let’s first learn how to convert between different data type using CAST
function
To use the CAST
function in PostgreSQL, you can follow the syntax below:
SELECT CAST(expression AS target_data_type) AS alias_name;
Here’s an example of how to use CAST
in PostgreSQL:
%sql SELECT CAST(12.312121 AS INT) AS converted_value;
* postgresql://postgres:***@localhost:5432/dvdrental
1 rows affected.
converted_value |
---|
12 |
You can use the ::
shortcut to cast values to different data types in PostgreSQL
<column>::<data_type>
%sql SELECT 12.312121::INT AS converted_value;
* postgresql://postgres:***@localhost:5432/dvdrental
1 rows affected.
converted_value |
---|
12 |
Numeric Data Types#
1. INT#
Description: Integer data type. Used for whole numbers.
Example:
Data Type |
Range |
Description |
Storage Size |
---|---|---|---|
TINYINT |
-128 to 127 (signed) or 0 to 255 (unsigned) |
Very small integer |
1 byte |
SMALLINT |
-32,768 to 32,767 (signed) or 0 to 65,535 (unsigned) |
Small integer |
2 bytes |
MEDIUMINT |
-8,388,608 to 8,388,607 (signed) or 0 to 16,777,215 (unsigned) |
Medium-sized integer |
3 bytes |
INT |
-2,147,483,648 to 2,147,483,647 (signed) or 0 to 4,294,967,295 (unsigned) |
Standard integer |
4 bytes |
BIGINT |
-9,223,372,036,854,775,808 to 9,223,372,036,854,775,807 (signed) or 0 to 18,446,744,073,709,551,615 (unsigned) |
Large integer |
8 bytes |
SERIAL |
1 to 2,147,483,647 (signed) or 0 to 4,294,967,295 (unsigned) |
Auto-incrementing integer |
4 bytes |
Note: The range and storage size may vary slightly depending on the specific database system you are using.
%%sql
SELECT CAST(44.7874 AS SMALLINT);
* postgresql://postgres:***@localhost:5432/dvdrental
1 rows affected.
int2 |
---|
45 |
%%sql
SELECT column_name, data_type, udt_name, is_nullable, column_default
FROM information_schema.columns
WHERE table_name = 'products';
* postgresql://postgres:***@localhost:5432/dvdrental
3 rows affected.
column_name | data_type | udt_name | is_nullable | column_default |
---|---|---|---|---|
product_id | integer | int4 | NO | None |
product_name | character varying | varchar | YES | None |
quantity | integer | int4 | YES | None |
2. FLOAT#
Description: Floating-point number. Used for approximate numeric values.
Example:
Data Type |
Description |
Range |
Storage Size |
---|---|---|---|
FLOAT |
Floating-point number |
-1.79E+308 to 1.79E+308 |
4 or 8 bytes |
REAL |
Single-precision floating-point number |
-3.40E+38 to 3.40E+38 |
4 bytes |
DOUBLE |
Double-precision floating-point number |
-1.79E+308 to 1.79E+308 |
8 bytes |
DECIMAL |
Exact numeric value with a specified precision and scale |
Variable |
Variable |
%sql SELECT CAST(4.54021223948E-8 AS FLOAT);
* postgresql://postgres:***@localhost:5432/dvdrental
1 rows affected.
float8 |
---|
4.54021223948e-08 |
3. DECIMAL#
Description: Fixed-point number. Used for exact numeric values with a specified precision and scale.
Example: DECIMAL(10,10) specifies that the number can have up to 10 digits in total, with 10 digits after the decimal point (scale)
%sql SELECT CAST(4.54021223948E-8 AS DECIMAL(10,10));
* postgresql://postgres:***@localhost:5432/dvdrental
1 rows affected.
numeric |
---|
4.54E-8 |
Character Data Types#
Data Type |
Description |
Example |
---|---|---|
CHAR(n) |
Fixed-length character string with a maximum length of n characters. |
CHAR(10) |
VARCHAR(n) |
Variable-length character string with a maximum length of n characters. |
VARCHAR(255) |
TEXT |
Variable-length character string with no maximum length. |
TEXT |
4. CHAR#
Description: Fixed-length character string. Used for strings of a specified length.
Example:
5. VARCHAR#
Description: Variable-length character string. Used for strings of varying length.
Example:
6. TEXT#
Description: Variable-length character string. Used for large text data.
Example:
%%sql
SELECT
'banana'::CHAR(50),
'banana'::VARCHAR(2),
'banana'::TEXT
;
* postgresql://postgres:***@localhost:5432/dvdrental
1 rows affected.
bpchar | varchar | text |
---|---|---|
banana | ba | banana |
Note that you can’t see the space-padding for CHAR(50) ‘banana’ in the Jupyter notebook, but if you run the same statement in psql, you will see ‘banana’ + 44 spaces in the output.
Date and Time Data Types#
7. DATE#
Description: Date value. Used for storing dates without time.
Example:
PostgreSQL does a good job of parsing an input to datetime even if we don’t enter it exactly according to ISO way
%%sql
SELECT
'Oct 23, 2021'::DATE,
'23 October 2021'::DATE,
'2021 10 23'::DATE,
'10/23/2021'::DATE,
'today'::DATE,
'tomorrow'::DATE
;
* postgresql://postgres:***@localhost:5432/dvdrental
1 rows affected.
date | date_1 | date_2 | date_3 | date_4 | date_5 |
---|---|---|---|---|---|
2021-10-23 | 2021-10-23 | 2021-10-23 | 2021-10-23 | 2024-09-16 | 2024-09-17 |
8. TIME#
Description: Time value. Used for storing time without date.
Example:
%%sql
SELECT
'14:24:00'::TIME,
'2:24pm'::TIME,
'2:24 PM PST'::TIME WITH TIME ZONE,
'now'::TIME,
'now'::TIME WITH TIME ZONE
;
* postgresql://postgres:***@localhost:5432/dvdrental
1 rows affected.
time | time_1 | timetz | time_2 | timetz_1 |
---|---|---|---|---|
14:24:00 | 14:24:00 | 14:24:00-08:00 | 16:57:27.492849 | 16:57:27.492849-07:00 |
9. TIMESTAMP#
Description: Date and time value with time zone. Used for storing both date and time with time zone information.
Example:
%sql SELECT '2021-11-18 8:30:00'::TIMESTAMP;
* postgresql://postgres:***@localhost:5432/dvdrental
1 rows affected.
timestamp |
---|
2021-11-18 08:30:00 |
%sql SELECT '2021-11-18 8:30:00 EST'::TIMESTAMPTZ;
* postgresql://postgres:***@localhost:5432/dvdrental
1 rows affected.
timestamptz |
---|
2021-11-18 05:30:00-08:00 |
When you insert a timestamp without specifying a time zone, PostgreSQL assumes the timestamp is in the server’s current time zone.
%sql SHOW TIMEZONE;
* postgresql://postgres:***@localhost:5432/dvdrental
1 rows affected.
TimeZone |
---|
America/Vancouver |
Boolean Data Type#
10. BOOLEAN#
Description: Boolean value. Used for storing true or false values.
Example:
Valid values are
‘TRUE’, 1 (or any other positive integer), ‘YES’, ‘Y’, ‘T’
‘FALSE’, ‘0’, ‘NO’, ‘N’, ‘F’
%%sql
SELECT
'TRUE'::BOOLEAN,
'true'::BOOLEAN,
't'::BOOLEAN,
'1'::BOOLEAN,
'YES'::BOOLEAN
;
* postgresql://postgres:***@localhost:5432/dvdrental
1 rows affected.
bool | bool_1 | bool_2 | bool_3 | bool_4 |
---|---|---|---|---|
True | True | True | True | True |
%%sql
SELECT
'FALSE'::BOOLEAN,
'false'::BOOLEAN,
'f'::BOOLEAN,
'F'::BOOLEAN,
'n'::BOOLEAN,
'N'::BOOLEAN,
'no'::BOOLEAN
;
* postgresql://postgres:***@localhost:5432/dvdrental
1 rows affected.
bool | bool_1 | bool_2 | bool_3 | bool_4 | bool_5 | bool_6 |
---|---|---|---|---|---|---|
False | False | False | False | False | False | False |
Filtering Using WHERE
Clause in SQL#
Introduction#
The WHERE
clause in SQL is used to filter records that meet certain conditions. It is commonly used in SELECT
, UPDATE
, DELETE
, and INSERT
statements to specify which records should be affected by the query.
Basic Syntax#
SELECT column1, column2, ...
FROM table_name
WHERE condition;
Common Operators#
Comparison Operators:
=
,!=
,>
,<
,>=
,<=
Logical Operators:
AND
,OR
,NOT
Pattern Matching:
LIKE
,ILIKE
(case-insensitive in PostgreSQL)Range:
BETWEEN
Set Membership:
IN
Null Check:
IS NULL
,IS NOT NULL
Examples Using dvdrental
Database#
Example 1: Basic Filtering#
%%sql
SELECT * FROM customer
WHERE store_id = 2
LIMIT 10;
UsageError: Cell magic `%%sql` not found.
This query selects all columns from the customer
table where the store_id
is 2
Example 2: Using AND
and OR
Operators#
%%sql
SELECT * FROM customer
WHERE active = 1 AND store_id = 2
LIMIT 10;
* postgresql://postgres:***@localhost:5432/dvdrental
10 rows affected.
customer_id | store_id | first_name | last_name | address_id | activebool | create_date | last_update | active | |
---|---|---|---|---|---|---|---|---|---|
4 | 2 | Barbara | Jones | barbara.jones@sakilacustomer.org | 8 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
6 | 2 | Jennifer | Davis | jennifer.davis@sakilacustomer.org | 10 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
8 | 2 | Susan | Wilson | susan.wilson@sakilacustomer.org | 12 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
9 | 2 | Margaret | Moore | margaret.moore@sakilacustomer.org | 13 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
11 | 2 | Lisa | Anderson | lisa.anderson@sakilacustomer.org | 15 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
13 | 2 | Karen | Jackson | karen.jackson@sakilacustomer.org | 17 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
14 | 2 | Betty | White | betty.white@sakilacustomer.org | 18 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
18 | 2 | Carol | Garcia | carol.garcia@sakilacustomer.org | 22 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
20 | 2 | Sharon | Robinson | sharon.robinson@sakilacustomer.org | 24 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
23 | 2 | Sarah | Lewis | sarah.lewis@sakilacustomer.org | 27 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
This query selects all columns from the customer
table where the active
status is 1 and the store_id
is 1.
%%sql
SELECT * FROM customer
WHERE active = 1 OR store_id = 1
LIMIT 10;
* postgresql://postgres:***@localhost:5432/dvdrental
10 rows affected.
customer_id | store_id | first_name | last_name | address_id | activebool | create_date | last_update | active | |
---|---|---|---|---|---|---|---|---|---|
524 | 1 | Jared | Ely | jared.ely@sakilacustomer.org | 530 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
1 | 1 | Mary | Smith | mary.smith@sakilacustomer.org | 5 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
2 | 1 | Patricia | Johnson | patricia.johnson@sakilacustomer.org | 6 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
3 | 1 | Linda | Williams | linda.williams@sakilacustomer.org | 7 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
4 | 2 | Barbara | Jones | barbara.jones@sakilacustomer.org | 8 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
5 | 1 | Elizabeth | Brown | elizabeth.brown@sakilacustomer.org | 9 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
6 | 2 | Jennifer | Davis | jennifer.davis@sakilacustomer.org | 10 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
7 | 1 | Maria | Miller | maria.miller@sakilacustomer.org | 11 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
8 | 2 | Susan | Wilson | susan.wilson@sakilacustomer.org | 12 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
9 | 2 | Margaret | Moore | margaret.moore@sakilacustomer.org | 13 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
This query selects all columns from the customer
table where the active
status is 1 or the store_id
is 1.
Example 3: Using LIKE
for Pattern Matching#
%%sql
SELECT * FROM customer
WHERE first_name LIKE 'V%';
* postgresql://postgres:***@localhost:5432/dvdrental
16 rows affected.
customer_id | store_id | first_name | last_name | address_id | activebool | create_date | last_update | active | |
---|---|---|---|---|---|---|---|---|---|
35 | 2 | Virginia | Green | virginia.green@sakilacustomer.org | 39 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
116 | 1 | Victoria | Gibson | victoria.gibson@sakilacustomer.org | 120 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
149 | 1 | Valerie | Black | valerie.black@sakilacustomer.org | 153 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
158 | 1 | Veronica | Stone | veronica.stone@sakilacustomer.org | 162 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
184 | 1 | Vivian | Ruiz | vivian.ruiz@sakilacustomer.org | 188 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
195 | 1 | Vanessa | Sims | vanessa.sims@sakilacustomer.org | 199 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
201 | 1 | Vicki | Fields | vicki.fields@sakilacustomer.org | 205 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
218 | 1 | Vera | Mccoy | vera.mccoy@sakilacustomer.org | 222 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
244 | 2 | Viola | Hanson | viola.hanson@sakilacustomer.org | 248 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
251 | 2 | Vickie | Brewer | vickie.brewer@sakilacustomer.org | 255 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
286 | 1 | Velma | Lucas | velma.lucas@sakilacustomer.org | 291 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
289 | 1 | Violet | Rodriquez | violet.rodriquez@sakilacustomer.org | 294 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
382 | 2 | Victor | Barkley | victor.barkley@sakilacustomer.org | 387 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
414 | 1 | Vincent | Ralston | vincent.ralston@sakilacustomer.org | 419 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
483 | 2 | Vernon | Chapa | vernon.chapa@sakilacustomer.org | 488 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
581 | 1 | Virgil | Wofford | virgil.wofford@sakilacustomer.org | 587 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
%%sql
SELECT * FROM customer
WHERE first_name LIKE '____i';
* postgresql://postgres:***@localhost:5432/dvdrental
4 rows affected.
customer_id | store_id | first_name | last_name | address_id | activebool | create_date | last_update | active | |
---|---|---|---|---|---|---|---|---|---|
201 | 1 | Vicki | Fields | vicki.fields@sakilacustomer.org | 205 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
206 | 1 | Terri | Vasquez | terri.vasquez@sakilacustomer.org | 210 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
241 | 2 | Heidi | Larson | heidi.larson@sakilacustomer.org | 245 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 0 |
274 | 1 | Naomi | Jennings | naomi.jennings@sakilacustomer.org | 279 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
This query selects all columns from the customer
table where the first_name
who is 5 character long and ends with ‘i’.
Example 4: Using BETWEEN
for Range Filtering#
%%sql
SELECT * FROM payment
WHERE payment_date BETWEEN '2007-03-02 02:30:00' AND '2007-03-02 03:00:00';
* postgresql://postgres:***@localhost:5432/dvdrental
19 rows affected.
payment_id | customer_id | staff_id | rental_id | amount | payment_date |
---|---|---|---|---|---|
19842 | 300 | 1 | 10977 | 4.99 | 2007-03-02 02:40:43.996577 |
19887 | 305 | 1 | 10981 | 2.99 | 2007-03-02 02:46:19.996577 |
20056 | 326 | 2 | 10976 | 4.99 | 2007-03-02 02:40:14.996577 |
20240 | 345 | 2 | 10982 | 4.99 | 2007-03-02 02:47:37.996577 |
20267 | 348 | 2 | 10972 | 2.99 | 2007-03-02 02:36:51.996577 |
20383 | 361 | 2 | 10975 | 0.99 | 2007-03-02 02:39:51.996577 |
20577 | 380 | 1 | 10983 | 3.99 | 2007-03-02 02:52:49.996577 |
20610 | 383 | 1 | 10971 | 4.99 | 2007-03-02 02:36:43.996577 |
21248 | 450 | 1 | 10984 | 3.99 | 2007-03-02 02:58:28.996577 |
21619 | 487 | 1 | 10978 | 9.99 | 2007-03-02 02:40:53.996577 |
21652 | 491 | 2 | 10974 | 6.99 | 2007-03-02 02:39:18.996577 |
22161 | 547 | 1 | 10980 | 4.99 | 2007-03-02 02:45:58.996577 |
23861 | 127 | 1 | 10973 | 7.99 | 2007-03-02 02:38:08.996577 |
24111 | 149 | 2 | 10967 | 0.99 | 2007-03-02 02:30:42.996577 |
24222 | 160 | 2 | 10979 | 5.99 | 2007-03-02 02:45:03.996577 |
24465 | 186 | 1 | 10985 | 0.99 | 2007-03-02 02:58:45.996577 |
24838 | 234 | 2 | 10968 | 7.99 | 2007-03-02 02:31:39.996577 |
24936 | 244 | 2 | 10969 | 2.99 | 2007-03-02 02:32:58.996577 |
25086 | 260 | 2 | 10970 | 8.99 | 2007-03-02 02:35:12.996577 |
This query selects all columns from the payment
table where the payment_date
is between ‘2007-03-02 02:30:00’ and ‘2007-03-02 03:00:00’
Example 5: Using IN
for Set Membership#
%%sql
SELECT * FROM customer
WHERE last_name IN ('Smith', 'Johnson', 'Williams');
* postgresql://postgres:***@localhost:5432/dvdrental
3 rows affected.
customer_id | store_id | first_name | last_name | address_id | activebool | create_date | last_update | active | |
---|---|---|---|---|---|---|---|---|---|
1 | 1 | Mary | Smith | mary.smith@sakilacustomer.org | 5 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
2 | 1 | Patricia | Johnson | patricia.johnson@sakilacustomer.org | 6 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
3 | 1 | Linda | Williams | linda.williams@sakilacustomer.org | 7 | True | 2006-02-14 | 2013-05-26 14:49:45.738000 | 1 |
This query selects all columns from the customer
table where the last_name
is either ‘Smith’, ‘Johnson’, or ‘Williams’
Example 6: Checking for NULL
Values#
%%sql
SELECT * FROM customer
WHERE email IS NULL;
* postgresql://postgres:***@localhost:5432/dvdrental
0 rows affected.
customer_id | store_id | first_name | last_name | address_id | activebool | create_date | last_update | active |
---|
This query selects all columns from the customer
table where the email
is NULL
.
Derived Columns in SQL#
Introduction#
Derived columns are columns that are not stored in the database but are created dynamically using expressions or functions in a SELECT
statement. They are useful for performing calculations, formatting data, or combining multiple columns into one.
Basic Syntax#
SELECT column1, column2, (expression) AS derived_column_name
FROM table_name;
Examples Using dvdrental
Database#
Example 1: Concatenating Columns#
Concatenate the first_name
and last_name
columns to create a full name.
%%sql
SELECT first_name, last_name,
CONCAT(first_name, ' ',last_name) AS full_name
FROM customer
LIMIT 10;
* postgresql://postgres:***@localhost:5432/dvdrental
10 rows affected.
first_name | last_name | full_name |
---|---|---|
Jared | Ely | Jared Ely |
Mary | Smith | Mary Smith |
Patricia | Johnson | Patricia Johnson |
Linda | Williams | Linda Williams |
Barbara | Jones | Barbara Jones |
Elizabeth | Brown | Elizabeth Brown |
Jennifer | Davis | Jennifer Davis |
Maria | Miller | Maria Miller |
Susan | Wilson | Susan Wilson |
Margaret | Moore | Margaret Moore |
Example 2: Calculating Age from Date of Birth#
Assuming there is a date_of_birth
column, calculate the age of customers.
%%sql
SELECT first_name, last_name,
last_update,
now()-last_update AS rental_duration
FROM customer
LIMIT 10;
* postgresql://postgres:***@localhost:5432/dvdrental
10 rows affected.
first_name | last_name | last_update | rental_duration |
---|---|---|---|
Jared | Ely | 2013-05-26 14:49:45.738000 | 4131 days, 2:41:43.614507 |
Mary | Smith | 2013-05-26 14:49:45.738000 | 4131 days, 2:41:43.614507 |
Patricia | Johnson | 2013-05-26 14:49:45.738000 | 4131 days, 2:41:43.614507 |
Linda | Williams | 2013-05-26 14:49:45.738000 | 4131 days, 2:41:43.614507 |
Barbara | Jones | 2013-05-26 14:49:45.738000 | 4131 days, 2:41:43.614507 |
Elizabeth | Brown | 2013-05-26 14:49:45.738000 | 4131 days, 2:41:43.614507 |
Jennifer | Davis | 2013-05-26 14:49:45.738000 | 4131 days, 2:41:43.614507 |
Maria | Miller | 2013-05-26 14:49:45.738000 | 4131 days, 2:41:43.614507 |
Susan | Wilson | 2013-05-26 14:49:45.738000 | 4131 days, 2:41:43.614507 |
Margaret | Moore | 2013-05-26 14:49:45.738000 | 4131 days, 2:41:43.614507 |
Example 3: Formatting Dates#
Format the rental_date
column to display only the year and month.
%%sql
SELECT rental_id,
rental_date,
TO_CHAR(rental_date, 'YYYY-MM') AS rental_year_month
FROM rental
LIMIT 10;
* postgresql://postgres:***@localhost:5432/dvdrental
10 rows affected.
rental_id | rental_date | rental_year_month |
---|---|---|
2 | 2005-05-24 22:54:33 | 2005-05 |
3 | 2005-05-24 23:03:39 | 2005-05 |
4 | 2005-05-24 23:04:41 | 2005-05 |
5 | 2005-05-24 23:05:21 | 2005-05 |
6 | 2005-05-24 23:08:07 | 2005-05 |
7 | 2005-05-24 23:11:53 | 2005-05 |
8 | 2005-05-24 23:31:46 | 2005-05 |
9 | 2005-05-25 00:00:40 | 2005-05 |
10 | 2005-05-25 00:02:21 | 2005-05 |
11 | 2005-05-25 00:09:02 | 2005-05 |
Example 4: Conditional Derived Column#
Create a derived column to categorize customers based on their active status.
%%sql
SELECT first_name, last_name,
active,
CASE
WHEN active = 1 THEN 'Active'
ELSE 'Inactive'
END AS status
FROM customer
LIMIT 10;
* postgresql://postgres:***@localhost:5432/dvdrental
10 rows affected.
first_name | last_name | active | status |
---|---|---|---|
Jared | Ely | 1 | Active |
Mary | Smith | 1 | Active |
Patricia | Johnson | 1 | Active |
Linda | Williams | 1 | Active |
Barbara | Jones | 1 | Active |
Elizabeth | Brown | 1 | Active |
Jennifer | Davis | 1 | Active |
Maria | Miller | 1 | Active |
Susan | Wilson | 1 | Active |
Margaret | Moore | 1 | Active |
Example 5: Arithmetic Operations#
Calculate the total payment amount including tax (assuming a tax rate of 10%).
tax_rate = 1.10
%%sql
SELECT payment_id,
amount,
amount * :tax_rate AS total_amount_with_tax
FROM payment
LIMIT 10;
* postgresql://postgres:***@localhost:5432/dvdrental
10 rows affected.
payment_id | amount | total_amount_with_tax |
---|---|---|
17503 | 7.99 | 8.789 |
17504 | 1.99 | 2.189 |
17505 | 7.99 | 8.789 |
17506 | 2.99 | 3.289 |
17507 | 7.99 | 8.789 |
17508 | 5.99 | 6.589 |
17509 | 5.99 | 6.589 |
17510 | 5.99 | 6.589 |
17511 | 2.99 | 3.289 |
17512 | 4.99 | 5.489 |