
Express Edition is a nice starter database which comes for free and does not need a license. Well, you do not get support for it either. There were XE databases in 10g and in 11g, but in this blog post I look at the current XE, which is 18c. I will concentrate on how to get the most out of it, which mainly means dealing with its space limitation. In discussions with clients and other consultants the concern of limited space often comes up; it is documented to be 12 GB in 18c. So the question here is how to install an XE as lightweight as possible, without any unneeded overhead, and how to store the real user data as compactly as possible.

After installing XE 18c and following the default deployment you will have a CDB with one PDB.
This is what “/etc/init.d/oracle-xe-18c configure” calls under the hood:

(echo 'YourChosenPassword'; echo 'YourChosenPassword'; echo 'YourChosenPassword') \
| dbca -silent -createDatabase -gdbName XE \
-templateName XE_Database.dbc -characterSet AL32UTF8 \
-createAsContainerDatabase true \
-numberOfPDBs 1 -pdbName XEPDB1 -sid XE \
-emConfiguration DBEXPRESS -emExpressPort 5500 \
-J-Doracle.assistants.dbca.validate.DBCredentials=false \
-sampleSchema true \
-customScripts /opt/oracle/product/18c/dbhomeXE/assistants/dbca/postdb_creation.sql \
-initParams sga_target=1146M,pga_aggregate_target=382M

The CDB is called XE, the PDB is called XEPDB1. The database has loaded all possible components, which we can see here:

SELECT comp_name, COUNT(*)
FROM cdb_registry
GROUP BY comp_name
ORDER BY comp_name;

COMP_NAME                            COUNT(*)
---------------------------------- ----------
JServer JAVA Virtual Machine                2
OLAP Analytic Workspace                     2
Oracle Database Catalog Views               2
Oracle Database Java Packages               2
Oracle Database Packages and Types          2
Oracle Database Vault                       2
Oracle Label Security                       2
Oracle Multimedia                           2
Oracle OLAP API                             2
Oracle Real Application Clusters            2
Oracle Text                                 2
Oracle Workspace Manager                    2
Oracle XDK                                  2
Oracle XML Database                         2
Spatial                                     2

15 rows selected.

The registry components show up twice here because they are loaded into CDB$ROOT and into the PDB XEPDB1. Actually they are there a third time – in PDB$SEED. This PDB is excluded from most CDB_ views by default and is therefore missing in the output above. So having OLAP, Multimedia and so on in this CDB multiple times does not seem that lightweight any more. We waste space on probably unneeded components.
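To see what this overhead costs in terms of space per container, you can sum up the data files from the v$ views, which – unlike the CDB_ views – also show PDB$SEED. A quick check, run from CDB$ROOT; your numbers will differ:

SELECT c.name, round(SUM(d.bytes)/1024/1024) mb
FROM v$containers c JOIN v$datafile d ON d.con_id = c.con_id
GROUP BY c.name
ORDER BY c.name;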

Anyway, now I go on with it and load data. I use Dominic Giles’ data generator tool for this, which can be downloaded from https://www.dominicgiles.com.
Dominic is also the author of the great swingbench tool, but here I make use of his test data generator.

I will create a test user in the PDB:

ALTER SESSION SET CONTAINER=XEPDB1;
CREATE USER testdata IDENTIFIED BY testdata;
ALTER USER testdata QUOTA UNLIMITED ON users;
GRANT CREATE TABLE, CREATE SESSION, ALTER SESSION TO testdata;
GRANT CONNECT, RESOURCE TO testdata;

And now let us see how much data we can load until I get the famous error: ORA-12954: The request exceeds the maximum allowed database size of 12 GB.
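By the way, to keep an eye on how close you are getting to that limit while loading, a simple check of the segment space per tablespace helps. What exactly Oracle counts towards the 12 GB is not spelled out in detail, so take this only as a rough indicator:

SELECT tablespace_name, round(SUM(bytes)/1024/1024/1024,3) gb
FROM dba_segments
GROUP BY tablespace_name
ORDER BY tablespace_name;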

This is how I configure the data generator to load data until the database is full:
Start the data generator, then file -> open -> soe.xml to choose the popular schema with the customers, orders and order_items tables, then
click tools -> sizing wizard and try to insert 15.1 GB of data, which is much more than allowed, and click finish.

Click run -> start data generation to database, provide the database credentials and possibly increase the JDBC batch size, then click next.

Now wait until the database is full. In my lab it took a while until the error ORA-12954 was raised. Actually all the data was loaded, but the indexes were missing. It stopped while creating the index for the primary key of the orders table, which could not be created any more.
This was the DDL which failed:

ALTER TABLE orders ADD CONSTRAINT orders_pk PRIMARY KEY(order_id);

So how much user data do we have now in the segments:

ALTER SESSION SET CONTAINER=XEPDB1;
SELECT round(SUM(bytes)/1024/1024/1024,3) GB
FROM dba_segments
WHERE owner='TESTDATA';

        GB
----------
    12.357

Wow, a little more than 12 GB. Thank you Oracle.

Well, now let us use another approach: create a lightweight database and see how much data we can load. First we stop the current XE and drop it.
Okay, and now we create – surprise, surprise – a non-CDB, and we do it very manually. I will name it differently: XE18, and I will use a proper OFA directory structure.

mkdir -p /u00/app/oracle/admin/XE18/
cd /u00/app/oracle/admin/XE18/
mkdir adump dpdump pfile scripts
cd scripts
# create an init.ora
vi init.ora
 audit_file_dest="/u00/app/oracle/admin/XE18/adump"
 audit_trail=none
 compatible=18.0.0
 control_file_record_keep_time=90
 control_files='/u01/oradata/XE18/ctl1XE18.dbf','/u02/oradata/XE18/ctl2XE18.dbf'
 db_block_size=8192
 db_domain=''
 db_name='XE18'
 db_recovery_file_dest_size=4G
 db_recovery_file_dest='/u02/fast_recovery_area'
 db_unique_name='XE18_SITE1'
 diagnostic_dest=/u00/app/oracle
 filesystemio_options='setall'
 log_archive_dest_1='location="USE_DB_RECOVERY_FILE_DEST"','valid_for=(ONLINE_LOGFILES,ALL_ROLES)'
 log_checkpoints_to_alert=true
 log_file_name_convert='/u','/u'
 max_dump_file_size='10M'
 open_cursors=300
 pga_aggregate_target=400m
 processes=300
 remote_login_passwordfile=EXCLUSIVE
 sga_target=600m
 undo_tablespace=UNDOTBS

Typically I set audit_trail=db,extended, but here I want to save space, so I set it to none.
The other parameters are Trivadis best practice (TBP). Sizing of SGA, FRA etc. is quite minimal.
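If you prefer to keep audit_trail=db,extended even on XE, make sure to purge the audit trail regularly so it does not eat into your space budget. A sketch with DBMS_AUDIT_MGMT for the standard audit trail; adjust the trail type and retention to your needs:

-- one-time initialization of the cleanup infrastructure, if not done yet
BEGIN
  dbms_audit_mgmt.init_cleanup(
    audit_trail_type         => dbms_audit_mgmt.audit_trail_aud_std,
    default_cleanup_interval => 24);
END;
/
-- purge all standard audit trail records (no archive timestamp considered here)
BEGIN
  dbms_audit_mgmt.clean_audit_trail(
    audit_trail_type        => dbms_audit_mgmt.audit_trail_aud_std,
    use_last_arch_timestamp => FALSE);
END;
/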

Create an oratab entry in /etc/oratab: XE18:/opt/oracle/product/18c/dbhomeXE:Y, then create a password file and shared parameter file:

orapwd file=/u00/app/oracle/admin/XE18/pfile/orapwXE18 format=12 password=manager
CREATE SPFILE='/u00/app/oracle/admin/XE18/pfile/spfileXE18.ora'
FROM PFILE='/u00/app/oracle/admin/XE18/scripts/init.ora';

and link them into the dbs directory of the Oracle Home:

cd $ORACLE_HOME/dbs
ln -s /u00/app/oracle/admin/XE18/pfile/spfileXE18.ora
ln -s /u00/app/oracle/admin/XE18/pfile/orapwXE18 

Provide storage:

mkdir -p /u01/oradata/XE18/ /u02/oradata/XE18/

And now the creation of database.

STARTUP NOMOUNT

CREATE DATABASE XE18
USER SYS IDENTIFIED BY manager
USER SYSTEM IDENTIFIED BY manager
-- yes we can and should multiplex our control and redo log files
LOGFILE GROUP 1 ('/u01/oradata/XE18/redog1m1XE18.dbf','/u02/oradata/XE18/redog1m2XE18.dbf') SIZE 10M,
        GROUP 2 ('/u01/oradata/XE18/redog2m1XE18.dbf','/u02/oradata/XE18/redog2m2XE18.dbf') SIZE 10M,
        GROUP 3 ('/u01/oradata/XE18/redog3m1XE18.dbf','/u02/oradata/XE18/redog3m2XE18.dbf') SIZE 10M
MAXLOGHISTORY 1
MAXLOGFILES 16
MAXLOGMEMBERS 3
MAXDATAFILES 1024
-- probably we could also experiment with other character sets here, but I like to have databases in unicode
CHARACTER SET AL32UTF8
NATIONAL CHARACTER SET AL16UTF16
EXTENT MANAGEMENT LOCAL
-- I am too stingy, so restrict autoextend for SYSTEM and SYSAUX
-- 450M for SYSTEM and 400M for SYSAUX should be enough as a start
-- later on you should monitor the growth of the tablespaces well and find out how you can stop it from further growing
DATAFILE '/u01/oradata/XE18/system01XE18.dbf' SIZE 300M AUTOEXTEND ON NEXT 10M MAXSIZE 450M
SYSAUX DATAFILE '/u01/oradata/XE18/sysaux01XE18.dbf' SIZE 300M AUTOEXTEND ON NEXT 10M MAXSIZE 400M
-- this tablespace is for our real user data
DEFAULT TABLESPACE users DATAFILE '/u01/oradata/XE18/users01XE18.dbf' SIZE 10M AUTOEXTEND ON NEXT 10M MAXSIZE UNLIMITED
-- TEMP and UNDO do not count, so make them as huge as you need them
DEFAULT TEMPORARY TABLESPACE temp TEMPFILE '/u01/oradata/XE18/temp01XE18.dbf' SIZE 10M AUTOEXTEND ON NEXT 10M MAXSIZE UNLIMITED
-- I like undo tablespaces with fixed size datafiles without autoextend
UNDO TABLESPACE undotbs DATAFILE '/u01/oradata/XE18/undotbs01XE18.dbf' SIZE 1G 
;

-- load the components
@?/rdbms/admin/catalog.sql
@?/rdbms/admin/catproc.sql
@?/rdbms/admin/utlrp.sql
-- the sqlplus support
CONNECT system/manager
@?/sqlplus/admin/pupbld.sql
@?/sqlplus/admin/help/hlpbld.sql helpus.sql
-- sometimes I use user system for data pump, let us create its master tables
-- not in system tablespace, as I want to limit its size strictly.
ALTER USER system DEFAULT TABLESPACE users;

Now the database is lean:

SELECT comp_name, status, version
FROM dba_registry
ORDER BY comp_name;

COMP_NAME                           STATUS      VERSION    
----------------------------------- ----------- -----------
Oracle Database Catalog Views       VALID       18.0.0.0.0
Oracle Database Packages and Types  VALID       18.0.0.0.0
Oracle Real Application Clusters    OPTION OFF  18.0.0.0.0
Oracle XML Database                 VALID       18.0.0.0.0

So now we create the same testdata user here and try to load the same data as before.
This time I got the error later, at another constraint:

ALTER TABLE order_items ADD CONSTRAINT order_items_pk PRIMARY KEY(order_id,line_item_id);

This time the orders_pk index had been created successfully before the error, so we have more segments now. Let us check the data and its space consumption again:

SELECT round(SUM(bytes)/1024/1024/1024,3) GB
FROM dba_segments
WHERE owner='TESTDATA';

        GB
----------
    13.482

More wow, a little more than 13 GB. 1.125 GB more than on our first try with the CDB/PDB. But still some indexes are missing.

Can we improve that?

I repeat the data load with a very small set (1% or so) to get the last missing index definitions.
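To capture those index definitions from the small test schema, DBMS_METADATA comes in handy – a quick sketch:

SET LONG 100000 PAGESIZE 0
SELECT dbms_metadata.get_ddl('INDEX', index_name, owner)
FROM dba_indexes
WHERE owner = 'TESTDATA';

And then I do the following to define everything as compact as possible: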

-- compact the regular tables
BEGIN
  FOR i IN (SELECT owner, table_name
            FROM dba_tables
            WHERE owner = 'TESTDATA'
            AND partitioned = 'NO'
            AND iot_type IS NULL
            AND external = 'NO') LOOP
    dbms_output.put_line('Changing: "' || i.owner || '"."' || i.table_name || '"');
    EXECUTE IMMEDIATE 'ALTER TABLE "' || i.owner || '"."' || i.table_name || '" PCTFREE 0 ROW STORE COMPRESS ADVANCED';
  END LOOP;
END;
/


-- compact the regular indexes
BEGIN
  FOR i IN (SELECT owner, index_name
            FROM dba_indexes
            WHERE owner = 'TESTDATA'
            AND partitioned = 'NO'
            AND index_type = 'NORMAL'
            -- exclude secondary indexes on IOTs
            AND pct_direct_access IS NULL ) LOOP
    dbms_output.put_line('Changing: "' || i.owner || '"."' || i.index_name || '"');
    EXECUTE IMMEDIATE 'ALTER INDEX "' || i.owner || '"."' || i.index_name || '" REBUILD PCTFREE 0 COMPRESS ADVANCED';
  END LOOP;
END;
/

In this schema I only have “regular” tables and indexes. The modification would be different for index-organized tables, for bitmap indexes and also for partitioned tables and indexes. Yes, by the way, we have bitmap indexes and partitioning in XE 18c, and we even have Advanced Compression and more things that are missing in Standard Edition 2. Cool, right!?
Have a look at https://docs.oracle.com/en/database/oracle/oracle-database/18/xelic/licensing-information.html for details.
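Just to illustrate the difference, for a partitioned table the change would be done per partition, roughly like this – a sketch for a hypothetical range-partitioned table sales_part with a partition p2018, which is not part of this test schema:

-- change the compression attribute for new data in this partition
ALTER TABLE sales_part MODIFY PARTITION p2018 ROW STORE COMPRESS ADVANCED;
-- move the partition to apply compression to the existing rows and keep the indexes usable
ALTER TABLE sales_part MOVE PARTITION p2018 UPDATE INDEXES;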

A little check now shows us that the definition really changed to compressed tables. “Row store compress advanced” was introduced as OLTP table compression in 11g; it is only the current name for the same feature, but in Enterprise Edition it is part of the Advanced Compression option. PCTFREE is set to the lowest possible value, 0, to make the data even more compact.

SELECT owner, pct_free, compression, compress_for, COUNT(*)
FROM dba_tables
WHERE owner = 'TESTDATA'
AND partitioned = 'NO'
GROUP BY owner, pct_free, compression, compress_for;

OWNER        PCT_FREE COMPRESS COMPRESS_FOR   COUNT(*)
---------- ---------- -------- ------------ ----------
TESTDATA            0 ENABLED  ADVANCED              8

We do it similarly for the indexes; here we use Advanced Index Compression, another feature of Advanced Compression.

SELECT owner, pct_free, compression, COUNT(*)
FROM dba_indexes
WHERE owner = 'TESTDATA'
AND partitioned = 'NO'
GROUP BY owner, pct_free, compression;

OWNER        PCT_FREE COMPRESSION     COUNT(*)
---------- ---------- ------------- ----------
TESTDATA            0 ADVANCED HIGH          7

Yep, everything is defined to be compressed. And now let us load the data again. Same settings as before with the desired 15.1 GB. In the last dialog we choose to keep the tables as they are (now compressed) and only to truncate them before the load.

Hours later we could load the data completely into the database. Thanks to compression it fit. The tables and indexes only use 12.398 GB in the segments, but we are complete.

SELECT round(SUM(bytes)/1024/1024/1024,3) GB
FROM dba_segments
WHERE owner='TESTDATA';

        GB
----------
    12.398

But we also paid a price for this. The load was much slower than before: in my lab hours instead of minutes. This is mainly because the indexes were enabled during the load, but also because the data is much more compact. Keep in mind never to compress tables which are updated frequently, as this would lead to chained or migrated rows. And compressing data and indexes can also lead to more contention if you need to access the data from a lot of concurrent sessions.
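If you do compress tables that still receive updates, keep an eye on row chaining. A quick way to check a single table – the CHAINED_ROWS helper table is created by a script shipped with the database:

-- create the CHAINED_ROWS table once
@?/rdbms/admin/utlchain.sql
-- list chained/migrated rows of a table into it and count them
ANALYZE TABLE testdata.orders LIST CHAINED ROWS INTO chained_rows;
SELECT COUNT(*) FROM chained_rows WHERE table_name = 'ORDERS';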

Let us analyze the data a little further and gather very accurate statistics for that.
This cannot be done in parallel in XE, so it takes quite a while:

EXEC dbms_stats.gather_schema_stats('TESTDATA', estimate_percent => NULL);

We have exactly the amount of data that the data generator's sizing wizard promised. And you can also see the space consumption per table here:

SELECT
  t.table_name, t.num_rows, t.blocks used_blocks,
  s.blocks segment_blocks, s.bytes/1024/1024 segment_mb
FROM dba_tables t, dba_segments s
WHERE t.owner = s.owner
AND t.table_name = s.segment_name
AND t.owner='TESTDATA'
ORDER BY t.table_name;

TABLE_NAME             NUM_ROWS USED_BLOCKS SEGMENT_BLOCKS SEGMENT_MB
-------------------- ---------- ----------- -------------- ----------
CUSTOMERS              53958516      449636         450560       3520
INVENTORIES              899476        1882           1920         15
LOGON                  13489629       20297          20480        160
ORDERS                 60703330      359100         359936       2812
ORDER_ITEMS           182491582      441452         442368       3456
PRODUCT_DESCRIPTIONS       1000          28             32        .25
PRODUCT_INFORMATION        1000          28             32        .25
WAREHOUSES                 1000           5              8      .0625

8 rows selected.

And this is the information about the 7 indexes; we only have primary key indexes in the example schema.

SELECT
  i.index_name, i.num_rows, i.leaf_blocks,
  s.blocks segment_blocks, s.bytes/1024/1024 segment_mb
FROM dba_indexes i, dba_segments s
WHERE i.owner = s.owner
AND i.index_name = s.segment_name
AND i.owner='TESTDATA'
ORDER BY i.table_name;

INDEX_NAME                 NUM_ROWS LEAF_BLOCKS SEGMENT_BLOCKS SEGMENT_MB
------------------------ ---------- ----------- -------------- ----------
CUSTOMERS_PK               53958516       56571          57728        451
INVENTORIES_PK               899476         999           1152          9
ORDERS_PK                  60703330       62962          63488        496
ORDER_ITEMS_PK            182491582      218870         227328       1776
PRODUCT_DESCRIPTIONS_PK        1000           2              8      .0625
PRODUCT_INFORMATION_PK         1000           1              8      .0625
WAREHOUSES_PK                  1000           1              8      .0625

7 rows selected.

Can we beat that?

Table compression in heap tables depends on redundant data per block and also works better if the data is loaded with direct path loads. So I will now do the following for the 3 biggest tables: export the data and sort it at the same time. The sort criteria were chosen by looking at the column statistics. I favored columns with very redundant data, i.e. with a small number of distinct values, as the first criterion in the ORDER BY list. But the average column length was also considered: the longer the column contents, the more relevant its sorting is. You might want to experiment with other ORDER BY clauses as well.
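A query like the following against the column statistics helps to judge the candidates – a sketch for the customers table; a small num_distinct and a large avg_col_len make a column a good leading criterion:

SELECT column_name, num_distinct, avg_col_len
FROM dba_tab_col_statistics
WHERE owner = 'TESTDATA'
AND table_name = 'CUSTOMERS'
ORDER BY num_distinct, avg_col_len DESC;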

This is how I reorganized the data:

customers table:

# expdp system/manager TABLES=testdata.customers CONTENT=data_only DUMPFILE=customers.dmp REUSE_DUMPFILES=YES LOGFILE=exp_customers.log \
  QUERY='"ORDER BY cust_first_name, cust_last_name, nls_language, nls_territory, cust_email, account_mgr_id, credit_limit, customer_id"'

ALTER TABLE testdata.customers DROP CONSTRAINT customers_pk;
TRUNCATE TABLE testdata.customers;

# impdp system/manager DUMPFILE=customers.dmp LOGFILE=imp_customers.log

ALTER TABLE testdata.customers
ADD CONSTRAINT customers_pk PRIMARY KEY (customer_id)
USING INDEX PCTFREE 0 COMPRESS ADVANCED HIGH;

orders table:

# expdp system/manager TABLES=testdata.orders CONTENT=data_only DUMPFILE=orders.dmp REUSE_DUMPFILES=YES LOGFILE=exp_orders.log \
  QUERY='"ORDER BY order_mode, order_status, sales_rep_id, warehouse_id, promotion_id, order_total, customer_id, order_date, order_id"'

ALTER TABLE testdata.orders DROP CONSTRAINT orders_pk;
TRUNCATE TABLE testdata.orders;

# impdp system/manager DUMPFILE=orders.dmp LOGFILE=imp_orders.log

ALTER TABLE testdata.orders ADD CONSTRAINT orders_pk PRIMARY KEY (order_id)
USING INDEX PCTFREE 0 COMPRESS ADVANCED HIGH;

order_items table:

# expdp system/manager TABLES=testdata.order_items CONTENT=data_only DUMPFILE=order_items.dmp REUSE_DUMPFILES=YES LOGFILE=exp_order_items.log \
  QUERY='"ORDER BY line_item_id, quantity, product_id, unit_price, order_id"'

ALTER TABLE testdata.order_items DROP CONSTRAINT order_items_pk;
TRUNCATE TABLE testdata.order_items;

# impdp system/manager DUMPFILE=order_items.dmp LOGFILE=imp_order_items.log

ALTER TABLE testdata.order_items ADD CONSTRAINT order_items_pk PRIMARY KEY (order_id, line_item_id)
USING INDEX PCTFREE 0 COMPRESS ADVANCED HIGH;
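As an alternative to the round trip through Data Pump, a CREATE TABLE AS SELECT with an ORDER BY gives you a similar sorted, direct-path reload inside the database. Just a sketch for order_items – it needs enough free space for the temporary copy, which is exactly what we do not have in XE, hence the export/import approach above:

CREATE TABLE testdata.order_items_sorted
  PCTFREE 0 ROW STORE COMPRESS ADVANCED
AS SELECT *
   FROM testdata.order_items
   ORDER BY line_item_id, quantity, product_id, unit_price, order_id;
-- then drop the original table, rename the copy and re-create the primary key as above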

Again I updated the statistics and queried the same as above:

TABLE_NAME             NUM_ROWS USED_BLOCKS SEGMENT_BLOCKS SEGMENT_MB
-------------------- ---------- ----------- -------------- ----------
CUSTOMERS              53958516      105447         105472        824
INVENTORIES              899476        1882           1920         15
LOGON                  13489629       20297          20480        160
ORDERS                 60703330      192882         196608       1536
ORDER_ITEMS           182491582      303003         303616       2372
PRODUCT_DESCRIPTIONS       1000          28             32        .25
PRODUCT_INFORMATION        1000          28             32        .25
WAREHOUSES                 1000           5              8      .0625

8 rows selected.


INDEX_NAME                 NUM_ROWS LEAF_BLOCKS SEGMENT_BLOCKS SEGMENT_MB
------------------------ ---------- ----------- -------------- ----------
CUSTOMERS_PK               53958516       69023          69632        544
INVENTORIES_PK               899476         999           1152          9
ORDERS_PK                  60703330       78875          79872        624
ORDER_ITEMS_PK            182491582      250648         253952       1984
PRODUCT_DESCRIPTIONS_PK        1000           2              8      .0625
PRODUCT_INFORMATION_PK         1000           1              8      .0625
WAREHOUSES_PK                  1000           1              8      .0625

7 rows selected.

Awesome, we saved a lot of space, but interestingly the indexes grew. Their size seems to depend on the sort order of the data in the heap segment.

The overall used GB in the segments is now:

SELECT round(SUM(bytes)/1024/1024/1024,3) GB
FROM dba_segments
WHERE owner='TESTDATA';

        GB
----------
      7.88

What else?

We could furthermore experiment with tablespaces with uniform extent sizes, using the minimum extent size of 40K for 8K block size tablespaces, to save the space which is left free in potentially larger extents. Of course larger block sizes could also improve compression ratios. And attribute clustering might help for special cases – another feature which is available in XE but not in Standard Edition 2. You could also move static tables without indexes to external tables.
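A sketch of the first and third ideas – a uniform 40K extent tablespace and attribute clustering on the customers table; note that attribute clustering only takes effect for direct path loads and moves:

CREATE TABLESPACE users_small
  DATAFILE '/u01/oradata/XE18/users_small01XE18.dbf' SIZE 10M
  AUTOEXTEND ON NEXT 10M MAXSIZE UNLIMITED
  EXTENT MANAGEMENT LOCAL UNIFORM SIZE 40K;

-- only affects future direct path loads / ALTER TABLE ... MOVE
ALTER TABLE testdata.customers
  ADD CLUSTERING BY LINEAR ORDER (cust_first_name, cust_last_name);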

Well, all that stuff is left as an exercise for you. Thanks for reading to the end.