In this post I will share how to create an external table using the DBMS_CLOUD package and then run a query against a csv file uploaded to a Pure Storage FlashBlade S3 Object Storage bucket.
I have previously shared how to install and configure the Oracle DBMS_CLOUD package; if you want to follow along with this blog post, you may find it useful to review that post first.
Upload Object
Let’s start by uploading a csv file using the configuration and S3 Object Storage credentials created in my working with Oracle 19c database and on-premises S3 Object Storage post.
% ls -l airports.csv
-rw-r--r--@ 1 rekins  staff  4965317  3 Aug 15:44 airports.csv
% wc -l airports.csv
   72911 airports.csv
From the above we can see airports.csv has 72,911 rows and is 4,965,317 bytes (~4.7 MB) in size.
Using the aws s3api utility we can upload the file into the S3 Object Storage bucket, for example:
BUCKET=ora-bucket
ENDPOINT=https://s3-fbstaines02.amazonaws.com
s3File=airports.csv
aws s3api put-object --bucket ${BUCKET} --key ${s3File} --body /Users/rekins/s3/${s3File} --endpoint-url ${ENDPOINT} --profile fbstaines02 --ca-bundle pureCA.pem
List Objects
We can confirm the object has been successfully uploaded with the aws s3api utility and the list-objects option, for example:
BUCKET=ora-bucket
ENDPOINT=https://s3-fbstaines02.amazonaws.com
aws s3api list-objects --bucket ${BUCKET} --query 'Contents[].{Key: Key, Size: Size}' --endpoint-url ${ENDPOINT} --profile fbstaines02 --ca-bundle pureCA.pem
[
    {
        "Key": "airports.csv",
        "Size": 4965317
    }
]
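If a DBMS_CLOUD credential has already been created in the database (fb_cred is used later in this post), the bucket contents can also be listed from inside the database with the DBMS_CLOUD.LIST_OBJECTS table function; the sketch below assumes the same credential name and bucket URL used elsewhere in this post.

-- list objects in the bucket from inside the database
select object_name, bytes
from dbms_cloud.list_objects('fb_cred', 'https://s3-fbstaines02.amazonaws.com/ora-bucket/');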
Create Table
Log on to an Oracle 19c Pluggable Database (PDB) schema that has execute privileges on the DBMS_CLOUD package and S3 credentials configured.
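If the fb_cred credential used below has not yet been created, it can be added with DBMS_CLOUD.CREATE_CREDENTIAL, where the username is the S3 access key and the password is the S3 secret key; the key values shown are placeholders.

begin
  -- username is the S3 access key and password is the S3 secret key (placeholder values shown)
  dbms_cloud.create_credential(
    credential_name => 'fb_cred',
    username        => '<S3_ACCESS_KEY>',
    password        => '<S3_SECRET_KEY>'
  );
end;
/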
Now create an external table using the SQL below; the example creates an external table called airports_fb from the airports.csv file stored in the S3 Object Storage bucket called ora-bucket.
begin
dbms_cloud.create_external_table(
table_name => 'airports_fb',
credential_name => 'fb_cred',
file_uri_list => 'https://s3-fbstaines02.amazonaws.com/ora-bucket/airports.csv',
column_list => 'ident varchar2(20),
type varchar2(20),
name varchar2(20),
elevation_ft number(6),
continent varchar2(20),
iso_country varchar2(20),
iso_region varchar2(20),
municipality varchar2(20)',
format => json_object('type' value 'csv', 'skipheaders' value '1', 'ignoremissingcolumns' value 'true', 'conversionerrors' value 'store_null', 'rejectlimit' value 'unlimited')
);
end;
/
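Before querying, the table definition can optionally be checked against the source file with DBMS_CLOUD.VALIDATE_EXTERNAL_TABLE, which scans the data and reports rows that cannot be converted; the call below is a minimal sketch using only the table_name parameter.

begin
  -- scan the source file and report any rows that do not match the column list
  dbms_cloud.validate_external_table(table_name => 'AIRPORTS_FB');
end;
/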
Query Table
Now that we have created the external S3 table, let’s confirm the record count.
SQL> set linesize 200
SQL> select count(*) from airports_fb;
COUNT(*)
----------
72911
Elapsed: 00:00:00.94
And then run a query against the external table, for example listing all London airports.
SQL> select name, iso_region, municipality from airports_fb where name like '%London%';
NAME ISO_REGION MUNICIPALITY
-------------------- -------------------- --------------------
London Airport CA-ON London
London Luton Airport GB-ENG London
London City Airport GB-ENG London
London Heliport GB-ENG London
London Colney GB-ENG
London Gliding Club GB-ENG Dunstable
New London Airport US-VA Forest
7 rows selected.
Elapsed: 00:00:00.98
Explain Plan
If we describe the table it looks like any other table, for example:
SQL> desc airports_fb;
Name Null? Type
----------------------------------------- -------- ---------------------------
IDENT VARCHAR2(20)
TYPE VARCHAR2(20)
NAME VARCHAR2(20)
ELEVATION_FT NUMBER(6)
CONTINENT VARCHAR2(20)
ISO_COUNTRY VARCHAR2(20)
ISO_REGION VARCHAR2(20)
MUNICIPALITY VARCHAR2(20)
However, one way we can confirm our query is using an external table is by generating an explain plan for it.
SQL> EXPLAIN PLAN FOR
2 select name, iso_region, municipality from airports_fb where name like '%London%' order by name;
Explained.
SQL> select plan_table_output from table(dbms_xplan.display('plan_table',null,'basic'));
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------
Plan hash value: 587737101
--------------------------------------------------------
| Id | Operation | Name |
--------------------------------------------------------
| 0 | SELECT STATEMENT | |
| 1 | PX COORDINATOR | |
| 2 | PX SEND QC (ORDER) | :TQ10001 |
| 3 | SORT ORDER BY | |
| 4 | PX RECEIVE | |
| 5 | PX SEND RANGE | :TQ10000 |
PLAN_TABLE_OUTPUT
-------------------------------------------------------------------------------
| 6 | PX BLOCK ITERATOR | |
| 7 | EXTERNAL TABLE ACCESS FULL| AIRPORTS_FB |
--------------------------------------------------------
14 rows selected.
SQL Developer
Alternatively, we can use SQL Developer to produce a visual explain plan.

SQL Developer has some great functionality; one example is a simple way of inspecting an external table's Access Parameters and Location, where we can see the URL for the S3 Object Storage.

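If SQL Developer is not to hand, the same details should also be available from the data dictionary; for example, the queries below against USER_EXTERNAL_TABLES and USER_EXTERNAL_LOCATIONS are expected to return the access parameters and the S3 Object Storage URL for the table.

-- access parameters generated by DBMS_CLOUD for the external table
select table_name, access_parameters from user_external_tables where table_name = 'AIRPORTS_FB';

-- location, which should include the S3 Object Storage URL
select table_name, location from user_external_locations where table_name = 'AIRPORTS_FB';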
Table Update
So what happens if we try to update a row?
SQL> update airports_fb set municipality = 'London' where name='London Colney';
update airports_fb set municipality = 'London' where name='London Colney'
*
ERROR at line 1:
ORA-30657: operation not supported on external organized table
Or delete a row?
SQL> delete from airports_fb where name='London Colney';
delete from airports_fb where name='London Colney'
*
ERROR at line 1:
ORA-30657: operation not supported on external organized table
As expected, both fail because external tables don't support DML; they are designed to be read-only, so any required updates need to be performed externally.
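If the data does need to be changed inside the database, one option is to copy it into a regular heap table first, for example with a simple CREATE TABLE AS SELECT, and modify the copy; the airports_local table below is just an illustrative name.

-- materialise the external data into a regular, updatable table
create table airports_local as select * from airports_fb;

-- DML now works against the local copy
update airports_local set municipality = 'London' where name = 'London Colney';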
Summary
In this post I have demonstrated how a csv data file can be uploaded to an on-premises FlashBlade S3 Object Storage bucket and queried directly from an Oracle 19c database using the Oracle DBMS_CLOUD package.
If you want to learn how to download and upload objects to an S3 Object Storage from within an Oracle 19c database, you may want to read this post next.