New “Sales History” Sample Schema in Oracle 23c
This post was originally published on this site
The newest version of the Oracle sample schema “Sales History” (SH) finally contains current data again and is easier to install than the previous versions. If you work with star schemas, the SH schema is a must.
Last week, I had the pleasure to teach a group of students about the basics of dimensional modeling and how to write SQL queries on a typical star schema. For the exercises we used livesql.oracle.com and the sample schema “Sales History” (SH) on Oracle 19c. Although it is quite old, this schema is still useful to test and show many features and examples based on dimension and fact tables. The only disadvantage of this sample schema is the outdated data. Of course, to explain how a GROUP BY works it doesn’t matter when you select sales data of the years 1998 to 2001, but at least for younger students it is confusing if they see data that is older than they are. And products such as “DVD-R Discs, 4.7GB, Pack of 5” or “64MB Memory Card” are not the best-selling electronic products nowadays. This was a bit different when this version of the SH schema was introduced – I think it was with Oracle 10g. By the way: the first version of the SH schema was a fashion shop with 10000 products. I still remember the different colors of shorts in size XXXL.
Updated Version of SH Schema
The new version of “Sales History” contains the same tables and table structure. The data distribution is the same as before, but some of the data was “renovated”:
- We still have the same 55500 customers. None of them moved to another address, but all customers became 11 years younger (column CUST_YEAR_OF_BIRTH was updated).
- The shop now sells sports items, not electronics anymore. Instead of photo cameras, DVDs and computer games, the sold products are tennis rackets, soccer balls and other sports items.
- The calendar dimension TIMES and the two fact tables SALES and COSTS contain data from the years 2019 to 2022. The rows are exactly the same as before, but column TIME_ID was updated.
Simplified Installation Scripts
What I like even more than the updated data is the easier installation of the SH schema. Unlike all other sample schemas, e.g. “Human Resources” (HR) or “Customer Orders” (CO), the SH installation scripts need more than just a few INSERT statements. Because of the high number of rows, data was loaded with SQL*Loader. The installation scripts were a bit tricky to understand. I never liked them and usually used Data Pump export/import to install the SH schema on a new database.
The new installation scripts are much easier. The small tables are populated with INSERT statements, and the larger tables use the LOAD command of SQLcl to load the data from CSV files. For example, the CUSTOMERS dimension is loaded with one command:
LOAD customers customers.csv
The installation scripts for SH (and all the other sample schemas of Oracle) can be downloaded from GitHub:
You need at least Oracle database version 19c. To run the installation script for SH, you need SQLcl (Oracle SQL Developer Command Line), because SQL*Plus does not support the LOAD command.
I had only one small issue that was easy to fix: The date format must be ‘YYYY-MM-DD’, but unfortunately this is not included in the script. I added the following line in the main script sh_install.sql:
ALTER SESSION SET NLS_DATE_FORMAT='YYYY-MM-DD’;
fter this change, the installation script ran through in just over a minute without errors.
From now on, I will use the new version of the SH scripts for my tests, live demos and blog post example.