Getting the most out of Oracle Database Express Edition (XE) 18c
Express Edition is a nice starter database which comes for free and does not need a license. Well, you do not get support for it either. There where XE databases in 10g and in 11g, but in this blog post I look into current XE which is 18c. I will concentrate on how to get the most out of it. This mainly goes to its space limitation. After discussion with clients and other consultants often there comes the concern of limited space, which is documented to be 12 GB in 18c. So the question here is how to install an XE as lightweight as possible without any unneeded overhead. And how to store the real user data as compact as possible.
After installation of XE 18c and following the default deployment you will have a CDB with one PDB.
This is what is called by the “/etc/init.d/oracle-xe-18c configure” under the hood:
(echo 'YourChosenPassword'; echo 'YourChosenPassword'; echo 'YourChosenPassword') | dbca -silent -createDatabase -gdbName XE -templateName XE_Database.dbc -characterSet AL32UTF8 -createAsContainerDatabase true -numberOfPDBs 1 -pdbName XEPDB1 -sid XE -emConfiguration DBEXPRESS -emExpressPort 5500 -J-Doracle.assistants.dbca.validate.DBCredentials=false -sampleSchema true -customScripts /opt/oracle/product/18c/dbhomeXE/assistants/dbca/postdb_creation.sql -initParams sga_target=1146M,pga_aggregate_target=382M
CDB is called XE, PDB is called XEPDB1. DB has loaded all possible database components, which we can see here:
SELECT comp_name, COUNT(*) FROM cdb_registry GROUP BY comp_name ORDER BY comp_name; COMP_NAME COUNT(*) ---------------------------------- ---------- JServer JAVA Virtual Machine 2 OLAP Analytic Workspace 2 Oracle Database Catalog Views 2 Oracle Database Java Packages 2 Oracle Database Packages and Types 2 Oracle Database Vault 2 Oracle Label Security 2 Oracle Multimedia 2 Oracle OLAP API 2 Oracle Real Application Clusters 2 Oracle Text 2 Oracle Workspace Manager 2 Oracle XDK 2 Oracle XML Database 2 Spatial 2 15 rows selected.
The registry components are visible here twice as it is loaded into CDB$ROOT and PDB XEPDB1. Actually it is also there a third time – in PDB$SEED. This PDB is excluded from the most CDB_-views by default and so missing in the output above. So having OLAP, Multimedia and so on in this CDB multiple times seems not to be that lightweight any more. We waste space for probably unneeded components.
Anyway, now I go on with it and load data. I use Dominic Giles’ data generator tool for this, which can be downloaded from https://www.dominicgiles.com.
Dominic is also the author of the great swingbench tool, but here I make use of his test data generator.
I will create a test user in the PDB:
ALTER SESSION SET CONTAINER=XEPDB1; CREATE USER testdata IDENTIFIED BY testdata; ALTER USER testdata QUOTA UNLIMITED ON users; GRANT CREATE TABLE, CREATE SESSION, ALTER SESSION TO testdata; GRANT CONNECT, RESOURCE TO testdata;
And now let us see how much data we can load until I get the famous error: ORA-12954: The request exceeds the maximum allowed database size of 12 GB.
This is how I configure test data generator to load data until full:
Start data generator, then file –> open –> soe.xml to choose the popular schema with the customers, orders and order_items, then
click tools –> sizing wizard and try to insert 15,1 GB of data which is much more than allowed, click finish.
Click run –> start data generation to database, provide db credentials and possibly increase jdbc batch size, then click next.
Now wait until the db is full. At my lab it took a while until the error ORA-12954 was raised. Actually all the data was loaded but the indexes where missing. It stopped when creating index for primary key of orders table as it could not be created any more.
this was the DDL which failed:
ALTER TABLE orders ADD CONSTRAINT orders_pk PRIMARY KEY(order_id);
So how many user data we have now in the segments:
ALTER SESSION SET CONTAINER=XEPDB1; SELECT round(SUM(bytes)/1024/1024/1024,3) GB FROM dba_segments WHERE owner='TESTDATA'; GB ---------- 12.357
Wow, a little more than 12 GB. Thank you Oracle.
Well, now let us use another approach: create a lightweight database. And see how many data we can load. First we stop the current XE and drop it.
Okay and now we create – surprise, surprise – a non-CDB, and we do it very manual. I will name it different: XE18 and I will use a proper OFA.
mkdir -p /u00/app/oracle/admin/XE18/ cd /u00/app/oracle/admin/XE18/ mkdir adump dpdump pfile scripts cd scripts # create an init.ora vi init.ora audit_file_dest="/u00/app/oracle/admin/XE18/adump" audit_trail=none compatible=18.0.0 control_file_record_keep_time=90 control_files='/u01/oradata/XE18/ctl1XE18.dbf','/u02/oradata/XE18/ctl2XE18.dbf' db_block_size=8192 db_domain='' db_name='XE18' db_recovery_file_dest_size=4G db_recovery_file_dest='/u02/fast_recovery_area' db_unique_name='XE18_SITE1' diagnostic_dest=/u00/app/oracle filesystemio_options='setall' log_archive_dest_1='location="USE_DB_RECOVERY_FILE_DEST"','valid_for=(ONLINE_LOGFILES,ALL_ROLES)' log_checkpoints_to_alert=true log_file_name_convert='/u','/u' max_dump_file_size='10M' open_cursors=300 pga_aggregate_target=400m processes=300 remote_login_passwordfile=EXCLUSIVE sga_target=600m undo_tablespace=UNDOTBS
Typically I set audit_trail=db,extended, but here I want to save space, so I set it to none.
The other parameters are Trivadis best practice (TBP). Sizing of SGA, FRA etc. is quite minimal.
Create an oratab entry in /etc/oratab: XE18:/opt/oracle/product/18c/dbhomeXE:Y, then create a password file and shared parameter file:
orapwd file=/u00/app/oracle/admin/XE18/pfile/orapwXE18 format=12 password=manager CREATE SPFILE='/u00/app/oracle/admin/XE18/pfile/spfileXE18.ora' FROM PFILE='u00/app/oracle/admin/XE18/scripts/init.ora';
and link them to Oracle Home – dbs:
cd $ORACLE_HOME/dbs ln -s /u00/app/oracle/admin/XE18/pfile/spfileXE18.ora ln -s /u00/app/oracle/admin/XE18/pfile/orapwXE18
mkdir -p /u01/oradata/XE18/ /u02/oradata/XE18/
And now the creation of database.
STARTUP NOMOUNT CREATE DATABASE XE18 USER SYS IDENTIFIED BY manager USER SYSTEM IDENTIFIED BY manager -- yes we can and should multiplex our control and redo log files LOGFILE GROUP 1 ('/u01/oradata/XE18/redog1m1XE18.dbf','/u02/oradata/XE18/redog1m2XE18.dbf') SIZE 10M, GROUP 2 ('/u01/oradata/XE18/redog2m1XE18.dbf','/u02/oradata/XE18/redog2m2XE18.dbf') SIZE 10M, GROUP 3 ('/u01/oradata/XE18/redog3m1XE18.dbf','/u02/oradata/XE18/redog3m2XE18.dbf') SIZE 10M MAXLOGHISTORY 1 MAXLOGFILES 16 MAXLOGMEMBERS 3 MAXDATAFILES 1024 -- probably we could also experiment with other character sets here, but I like to have databases in unicode CHARACTER SET AL32UTF8 NATIONAL CHARACTER SET AL16UTF16 EXTENT MANAGEMENT LOCAL -- I am too stingy, so restrict autoextend for SYSTEM and SYSAUX -- 450M for SYSTEM and 400M for SYSAUX should be enough as a start -- later on you should monitor the growth of the tablespaces well and find out how you can stop it from further growing DATAFILE '/u01/oradata/XE18/system01XE18.dbf' SIZE 300M AUTOEXTEND ON NEXT 10M MAXSIZE 450M SYSAUX DATAFILE '/u01/oradata/XE18/sysaux01XE18.dbf' SIZE 300M AUTOEXTEND ON NEXT 10M MAXSIZE 400M -- this tablespace is for our real user data DEFAULT TABLESPACE users DATAFILE '/u01/oradata/XE18/users01XE18.dbf' SIZE 10M AUTOEXTEND ON NEXT 10M MAXSIZE UNLIMITED -- TEMP and UNDO do not count, so make them as huge as you need them DEFAULT TEMPORARY TABLESPACE temp TEMPFILE '/u01/oradata/XE18/temp01XE18.dbf' SIZE 10M AUTOEXTEND ON NEXT 10M MAXSIZE UNLIMITED -- I like undo tablespaces with fixed size datafiles without autoextend UNDO TABLESPACE undotbs DATAFILE '/u01/oradata/XE18/undotbs01XE18.dbf' SIZE 1G ; -- load the components @?/rdbms/admin/catalog.sql @?/rdbms/admin/catproc.sql @?/rdbms/admin/utlrp.sql -- the sqlplus support CONNECT system/manager @?/sqlplus/admin/pupbld.sql @?/sqlplus/admin/help/hlpbld.sql helpus.sql -- sometimes I use user system for data pump, let us create its master tables -- not in system tablespace, as I want to limit its size strictly. ALTER USER system DEFAULT TABLESPACE users;
Now the database is lean:
SELECT comp_name, status, version FROM dba_registry ORDER BY comp_name; COMP_NAME STATUS VERSION ----------------------------------- ----------- ----------- Oracle Database Catalog Views VALID 220.127.116.11.0 Oracle Database Packages and Types VALID 18.104.22.168.0 Oracle Real Application Clusters OPTION OFF 22.214.171.124.0 Oracle XML Database VALID 126.96.36.199.0
So now we create the same testdata user here and try to load same data as before.
This time I got the error later at another constraint:
ALTER TABLE order_items ADD CONSTRAINT order_items_pk PRIMARY KEY(order_id,line_item_id);
This time the orders_pk index was created successfully before. So we have more segments now. Let us check the data and its space consumption again:
SELECT round(SUM(bytes)/1024/1024/1024,3) GB FROM dba_segments WHERE owner='TESTDATA'; GB ---------- 13.482
More wow, a little more than 13 GB. 1.125 GB more than on our first try with the CDB/PDB. But still some indexes are missing.
Can we improve that?
I repeat the data load with a very small set (1% or so) to get the last missing index definitions. And then I do the following:
-- compact the regular tables BEGIN FOR i IN (SELECT owner, table_name FROM dba_tables WHERE owner = 'TESTDATA' AND partitioned = 'NO' AND iot_type IS NULL AND external = 'NO') LOOP dbms_output.put_line('Changing: "' || i.owner || '"."' || i.table_name || '"'); EXECUTE IMMEDIATE 'ALTER TABLE "' || i.owner || '"."' || i.table_name || '" PCTFREE 0 ROW STORE COMPRESS ADVANCED'; END LOOP; END; / -- compact the regular indexes BEGIN FOR i IN (SELECT owner, index_name FROM dba_indexes WHERE owner = 'TESTDATA' AND partitioned = 'NO' AND index_type = 'NORMAL' -- exclude secondary indexes on IOTs AND pct_direct_access IS NULL ) LOOP dbms_output.put_line('Changing: "' || i.owner || '"."' || i.index_name || '"'); EXECUTE IMMEDIATE 'ALTER INDEX "' || i.owner || '"."' || i.index_name || '" REBUILD PCTFREE 0 COMPRESS ADVANCED'; END LOOP; END; /
In this schema I only have “regular” tables and indexes. The modification would be different for index-organized tables, for bitmap indexes and also for partitioned tables and indexes. Yes, by the way, we have bitmap indexes and partitioning in XE 18c, and we even have Advanced Compression and more things that are missing in Standard Edition 2. Cool, right!?
Have a look at https://docs.oracle.com/en/database/oracle/oracle-database/18/xelic/licensing-information.html for details.
A little check now shows us, that the definition really changed to compressed tables. “Row store compress advanced” was introduced as OLTP table compression in 11g. It is only the current name for the same feature, but in Enterprise Edition it is part of Advanced Compression option. PCT_FREE is set to lowest possible value 0 to make data even more compact.
SELECT owner, pct_free, compression, compress_for, COUNT(*) FROM dba_tables WHERE owner = 'TESTDATA' AND partitioned = 'NO' GROUP BY owner, pct_free, compression, compress_for; OWNER PCT_FREE COMPRESS COMPRESS_FOR COUNT(*) ---------- ---------- -------- ------------ ---------- TESTDATA 0 ENABLED ADVANCED 8
We do it similar for the indexes, here we use Advanced Index Compression, another feature of Advanced Compression.
SELECT owner, pct_free, compression, COUNT(*) FROM dba_indexes WHERE owner = 'TESTDATA' AND partitioned = 'NO' GROUP BY owner, pct_free, compression; OWNER PCT_FREE COMPRESSION COUNT(*) ---------- ---------- ------------- ---------- TESTDATA 0 ADVANCED HIGH 7
Yep, everything is defined to be compressed. And now let us load the data again. Same settings as before with the desired 15.1 GB. In the last dialog we choose to keep the tables as they are (now compressed) and only to truncate them before the load.
Hours later we could load the data completely into the database. Thanks to compression it fitted. The tables and indexes only use 12.398 GB in the segments, but we are complete.
SELECT round(SUM(bytes)/1024/1024/1024,3) GB FROM dba_segments WHERE owner='TESTDATA'; GB ---------- 12.398
But we also paid a price for this. The load was much slower as before. In my lab hours instead of minutes. This is mainly because of enabled indexes during the load but also because the data is much more compact. You should keep in mind never to compress tables which are updated frequently, as this would lead to chained or migrated rows. And compressing data and indexes can also lead to more contention in case you need to access the data from a lot of concurrent sessions.
Let us analyze the data a little further and get very accurate statistics now for that,
well this cannot be done parallel in XE, so takes quite a while:
EXEC dbms_stats.gather_schema_stats('TESTDATA', estimate_percent => NULL);
We have exactly the data which was promised by the load generator as you see in the first picture above. And you can see also the space consumption per table here:
SELECT t.table_name, t.num_rows, t.blocks used_blocks, s.blocks segment_blocks, s.bytes/1024/1024 segment_mb FROM dba_tables t, dba_segments s WHERE t.owner = s.owner AND t.table_name = s.segment_name AND t.owner='TESTDATA' ORDER BY t.table_name; TABLE_NAME NUM_ROWS USED_BLOCKS SEGMENT_BLOCKS SEGMENT_MB -------------------- ---------- ----------- -------------- ---------- CUSTOMERS 53958516 449636 450560 3520 INVENTORIES 899476 1882 1920 15 LOGON 13489629 20297 20480 160 ORDERS 60703330 359100 359936 2812 ORDER_ITEMS 182491582 441452 442368 3456 PRODUCT_DESCRIPTIONS 1000 28 32 .25 PRODUCT_INFORMATION 1000 28 32 .25 WAREHOUSES 1000 5 8 .0625 8 rows selected.
And this is the information about the 7 indexes, we only have PK indexes in the example schema.
SELECT i.index_name, i.num_rows, i.leaf_blocks, s.blocks segment_blocks, s.bytes/1024/1024 segment_mb FROM dba_indexes i, dba_segments s WHERE i.owner = s.owner AND i.index_name = s.segment_name AND i.owner='TESTDATA' ORDER BY i.table_name; INDEX_NAME NUM_ROWS LEAF_BLOCKS SEGMENT_BLOCKS SEGMENT_MB ------------------------ ---------- ----------- -------------- ---------- CUSTOMERS_PK 53958516 56571 57728 451 INVENTORIES_PK 899476 999 1152 9 ORDERS_PK 60703330 62962 63488 496 ORDER_ITEMS_PK 182491582 218870 227328 1776 PRODUCT_DESCRIPTIONS_PK 1000 2 8 .0625 PRODUCT_INFORMATION_PK 1000 1 8 .0625 WAREHOUSES_PK 1000 1 8 .0625 7 rows selected.
Can we beat that?
Table compression in the heap tables is dependent on redundant data per block and also works better if loaded with direct path loads. So I will do now the following for the 3 biggest tables. export the data and sort it at the same time. The sort criteria was chosen by me when looking at the column statistics. I favored columns with very redundant data, so with a little number of distinct values, to be first criteria in order by list. But also the average column length was considered. The longer the column contents the more relevant its sorting is. You might want to experiment here with other order by clauses as well.
This is how I reorganized the data:
# expdp system/manager TABLES=testdata.customers CONTENT=data_only DUMPFILE=customers.dmp REUSE_DUMPFILES=YES LOGFILE=exp_customers.log QUERY='"ORDER BY cust_first_name, cust_last_name, nls_language, nls_territory, cust_email, account_mgr_id, credit_limit, customer_id"' ALTER TABLE testdata.customers DROP CONSTRAINT customers_pk; TRUNCATE TABLE testdata.customers; # impdp system/manager DUMPFILE=customers.dmp LOGFILE=imp_customers.log ALTER TABLE testdata.customers ADD CONSTRAINT customers_pk PRIMARY KEY (customer_id) USING INDEX PCTFREE 0 COMPRESS ADVANCED HIGH;
# expdp system/manager TABLES=testdata.orders CONTENT=data_only DUMPFILE=orders.dmp REUSE_DUMPFILES=YES LOGFILE=exp_orders.log QUERY='"ORDER BY order_mode, order_status, sales_rep_id, warehouse_id, promotion_id, order_total, customer_id, order_date, order_id"' ALTER TABLE testdata.orders DROP CONSTRAINT orders_pk; TRUNCATE TABLE testdata.orders; # impdp system/manager DUMPFILE=orders.dmp LOGFILE=imp_orders.log ALTER TABLE testdata.orders ADD CONSTRAINT orders_pk PRIMARY KEY (order_id) USING INDEX PCTFREE 0 COMPRESS ADVANCED HIGH;
# expdp system/manager TABLES=testdata.order_items CONTENT=data_only DUMPFILE=order_items.dmp REUSE_DUMPFILES=YES LOGFILE=exp_order_items.log QUERY='"ORDER BY line_item_id, quantity, product_id, unit_price, order_id"' ALTER TABLE testdata.order_items DROP CONSTRAINT order_items_pk; TRUNCATE TABLE testdata.order_items; # impdp system/manager DUMPFILE=order_items.dmp LOGFILE=imp_order_items.log ALTER TABLE testdata.order_items ADD CONSTRAINT order_items_pk PRIMARY KEY (order_id, line_item_id) USING INDEX PCTFREE 0 COMPRESS ADVANCED HIGH;
Again I updated the statistics and queried the same as above:
TABLE_NAME NUM_ROWS USED_BLOCKS SEGMENT_BLOCKS SEGMENT_MB -------------------- ---------- ----------- -------------- ---------- CUSTOMERS 53958516 105447 105472 824 INVENTORIES 899476 1882 1920 15 LOGON 13489629 20297 20480 160 ORDERS 60703330 192882 196608 1536 ORDER_ITEMS 182491582 303003 303616 2372 PRODUCT_DESCRIPTIONS 1000 28 32 .25 PRODUCT_INFORMATION 1000 28 32 .25 WAREHOUSES 1000 5 8 .0625 8 rows selected. INDEX_NAME NUM_ROWS LEAF_BLOCKS SEGMENT_BLOCKS SEGMENT_MB ------------------------ ---------- ----------- -------------- ---------- CUSTOMERS_PK 53958516 69023 69632 544 INVENTORIES_PK 899476 999 1152 9 ORDERS_PK 60703330 78875 79872 624 ORDER_ITEMS_PK 182491582 250648 253952 1984 PRODUCT_DESCRIPTIONS_PK 1000 2 8 .0625 PRODUCT_INFORMATION_PK 1000 1 8 .0625 WAREHOUSES_PK 1000 1 8 .0625 7 rows selected.
Awesome we saved a lot of space, but interestingly the indexes grew. They seem to be dependent on the sort order of the data in the heap segment.
The overall used GB in the segments is now:
SELECT round(SUM(bytes)/1024/1024/1024,3) GB FROM dba_segments WHERE owner='TESTDATA'; GB ---------- 7.88
We could furthermore experiment with tablespaces with uniform extent sizes using the minimum extent sizes of 40k for 8k block size tablespaces to save the space which is left free in potentially larger extents. Of course larger block sizes could also improve compression ratios. And attribute clustering might help for special cases, another feature which is available in XE but not in Standard Edition 2. And you
could move static tables without indexes to external tables.
Well, all that stuff is left as an exercise for you. Thanks for reading to the end.