Cloud - Autonomous database integration with python

Saulo Toledo Pereira
Geek Culture
Published in
11 min readJun 23, 2022

--

Github project here

This article is part of a Project that I doing to practice some skills in Python, cloud, and SQL. You can read the first article here.

In this article, you will learn how to create a free account on Oracle Cloud, create an autonomous database and send data using Python and/or using an Open Source software Dbeaver.

Steps:
1- Creating an Oracle Cloud Infrastructure account
2 - Creating an autonomous database
3 - Create a connection to our database (OCI) with python
4 - Optional: Using Dbeaver to connect to our Database

1 - Creating an Oracle Cloud Infrastructure account

First, we have to create an account in Oracle Cloud, click on the link below, and fill in your details, including credit card details, but don’t worry, nothing will be charged. I’ve had an account for over a year, and I’ve never been charged for anything.

https://signup.cloud.oracle.com/

In this article, I won’t go into details about how the oracle cloud infrastructure works in this article, maybe someday I’ll do a detailed one explaining the possibilities and what we can do for free.This time we go straight to the point.

Inside your Oracle Cloud Console, for good practices, we will create a new compartment to create our autonomous database (ADB).

Click on ‘Create compartment’:

After clicking, the window below will appear:

1- Create a name for your compartment
2 - Write a little description of your compartment
3 - Click to create

That’s it, our compartment is created:

2 - Creating an autonomous database

Steps:
1- Creating an Oracle Cloud Infrastructure account ✔
2 - Creating an autonomous database
3 - Create a connection to our database (OCI) with python
4 - Optional: Using Dbeaver to connect to our Database

Now we can create our ADB. Back to the menu, click on ‘oracle database’ and ‘autonomous data warehouse’:

Now, on the left, select the compartment that we just created then click on ‘Create autonomous database’

After the click, a new window will open:

1- Select the compartment that we just created
2 - Database name (A user-friendly name to help you easily identify the resource.)
3 - Database name, you can repeat the last field (or not, it’s up to you)

In the next, select ‘Data Warehouse’ and Shared Infrastructure:

We’ll select the ‘always free’ button to ensure we're not using paid features, so only the free options will appear.

The username will be ADMIN for standard and cannot be changed, but you can create your password (don’t forget it, we will use it to access the database in the next steps).

The next options are just to leave it as is and then click on the blue button ‘create autonomous database’ to create the database.
That's it, our database is being created and will soon be available for us to create tables, insert data, and everything else.

When ‘ADW’ is green, our database is available.

For the last step on OCI, we need to download our wallet that will help us to access our database with Power BI. For that, click on ‘DB connection’:

Them click on ‘Download wallet’, and create another password (don’t forget this one too, we will need it for later).

Another way to connect Power BI to an oracle database is to create a file with the necessary credentials. To do this just click on ‘copy’ as shown in the image below and save it in a notepad with the name TNSnames.ora
If you have no idea how this works, follow the step of downloading the wallet as it is simpler initially.

Alright, for the next steps we will use only python and software called Dbeaver.

3 - Create a connection to our database (OCI) with python

Steps:
1- Creating an Oracle Cloud Infrastructure account ✔
2 - Creating an autonomous database ✔
3 - Create a connection to our database (OCI) with python
4 - Optional: Using Dbeaver to connect to our Database

For the next steps with Python, we will follow the documentation below:

https://www.oracle.com/database/technologies/appdev/python/quickstartpython.html

First, we need to install some libraries:

pip install cx_Oracle
pip install SQLAlchemy

Download the Oracle Instant Client (download the right files for your OS):

https://www.oracle.com/database/technologies/instant-client/downloads.html

In my case, I’m using windows 64-bits:

Create a new directory on ‘c:\’ named oracle then, when the download is finished, extract files to the following directory:

C:\oracle\instantclient_21_3

Remember the wallet that we had downloaded previously? extract files to the following directory:

C:\oracle\instantclient_21_3\network\admin

Remember this step for later, we will be back here sometimes.

After finishing, we have to create a connection to our database.

Lets code:

First, we have to import the library that will make our connection to oracle ADB:

# Library to connect to Oracle
import cx_Oracle

Then we have to set the path of our Oracle instant client:

# patch to instantclient_21_3
cx_Oracle.init_oracle_client(lib_dir=r"C:\\oracle\\instantclient_21_3")

You have to change the directory in bold to the one where you extracted the files previously. If you are following step by step I think you will not get any problems.

For the next, set your username, password, and dsn into variables:

# Variables to connect to Oracle
username = "ADMIN"
password = "topsecret123"
dsn = '(description= (retry_count=20)(retry_delay=3)(address=(protocol=tcps)(port=1522)(host=adb.sa-vinhedo-1.oraclecloud.com))(connect_data=(service_name=g19cc7d208b76c0_adbarticle01_high.adb.oraclecloud.com))(security=(ssl_server_cert_dn="CN=adb.sa-vinhedo-1.oraclecloud.com, O=Oracle Corporation, L=Redwood City, ST=California, C=US")))'

For the username, you will use ADMIN, for the password you can use the first password we created when we were creating the database, and finally, for dsn you can open the file ‘tnsnames’ and copy your net service name like the image below:

Alright with our credentials ready we can create our connection to Database, just code the line below and that's it we’re connected!

# Connect to Oracle
connection = cx_Oracle.connect(user=username, password=password, dsn=dsn)

Now just send the SQL commands preceded by “cursor.execute(“SQL here”)” that you are already managing your database, congratulations!!!

Let's create our First table to insert the data that we have scrapped in the previous article:

# Drop table if it exists
cursor.execute("""begin
execute immediate 'drop table data_tab';
exception when others then if sqlcode <> -942 then raise; end if;
end;""")
#Create table
cursor.execute("""create table data_tab(
ano INT,
mes INT,
id INT,
nome VARCHAR(255),
siglaPartido VARCHAR(255),
siglaUf VARCHAR(5),
urlFoto VARCHAR(255),
email VARCHAR(255),
tipoDespesa VARCHAR(255),
dataDocumento VARCHAR(255),
valorDocumento INT,
urlDocumento VARCHAR(255),
nomeFornecedor VARCHAR(255),
cnpjCpfFornecedor VARCHAR(255))""")
#Close connection
connection.commit()

With this code, we created a table named ‘data_tab’ and the columns that we will use to insert deputies' data. After that, we closed the connection with:

connection.commit()

Alright now we have to load our dataset and send it to the database:
Documentation here and here.

First, we have to import sqlalchemy and pandas:

# library to insert data to database
from sqlalchemy import create_engine
import pandas as pd

For the next, we load the dataset that we have created in the previous article, if you don't have it, click here to download it (expenses.csv) or here to read the previous article.

# import dataset from csv
main = pd.read_csv('expenses.csv' , sep=',')

Set your credentials:
username: ‘ADMIN’
password: ‘your database password’
dsn: ‘you can open the file ‘tnsnames’ and copy your net service name like before’.

# Variables to connect to Oracle
username = "ADMIN"
password="topsecret123"
dsn = '(description= (retry_count=20)(retry_delay=3)(address=(protocol=tcps)(port=1522)(host=adb.sa-vinhedo-1.oraclecloud.com))(connect_data=(service_name=g19cc7d208b76c0_adbarticle01_high.adb.oraclecloud.com))(security=(ssl_server_cert_dn="CN=adb.sa-vinhedo-1.oraclecloud.com, O=Oracle Corporation, L=Redwood City, ST=California, C=US")))'

To send our data to the database we have to create an ‘engine’ that will serve as a bridge for us to send our complete dataset:

# create engine
engine = create_engine(f'oracle://{username}:{password}@{dsn}/?encoding=UTF-8&nencoding=UTF-8', max_identifier_length=128)

The code below will send the entire data frame to our database.

# insert dataframe into oracle database
main.to_sql('data_tab', con = engine, if_exists = 'append', chunksize=1000)

Documentation here.

main: data frame with all data;
to_sql: this method will send the data frame already in SQL format without us having to type any SQL command;
data_tab: Name of SQL table
con: the connection that we just created before
if_exists: if data_tab already exists, the code will append the data, if not, it raises a value error.
chunksize: Specify the number of rows in each batch to be written at a time. By default, all rows will be written at once.

Alright, to send the data with python that’s it!

PS: it may take a long time depending on your connection and computer.
I have a very good computer and this whole process took about 3 hours. I went to research the reason for the delay and saw that the ‘to_sql’ method is not well optimized for what we just did and that’s why it takes so long. So if you don’t want to wait that long (and I recommend you don’t, it’s impossible to do this in a real project), we have another option that after everything is configured, it takes about 3 minutes to send our entire dataset to the database. Just follow the steps in the next section.

Optional: Using Dbeaver to connect to our Database

Steps:
1- Creating an Oracle Cloud Infrastructure account ✔
2 - Creating an autonomous database ✔
3 - Create a connection to our database (OCI) with python ✔
4 - Optional: Using Dbeaver to connect to our Database

Official page here

What is Dbeaver?
DBeaver is a free, open source, graphical database management tool for database developers and administrators. You can use DBeaver to create and manage databases across a wide range of database management systems”

We will connect to the database and send our dataset using Dbeaver, It’s an alternative way to send data if you cannot wait for python to take a long time to send the same data.

Download and install Dbeaver here according to your OS.

Next steps referenced here

After the installation is completed open Dbeaver and follows the steps below:

1- New database connection
2 - Select Oracle
3 - Click on next

PS: If any window pops up saying you need to download and install some drivers, install everything dbeaver asks (this step is very important if you don't it won’t work)

For the next, you will see a window like this:

1- Click on TNS;
2 - Select the directory where is your wallet (If you follow step by step I think you don't get any problems ¯\_(ツ)_/¯);
3 - Enter your database username ‘ADMIN’;
4 - Enter your database password;
5 - Click on Drivers properties;

The next window looks like this:

Search for ‘javax.net.ssl.keyStore’, in this line, you will click to set a value. You will enter the path that is your wallet + ‘\keystore.jks’. For me it was like this:

C:\oracle\instantclient_21_3\network\admin\keystore.jks

on the line below: ‘javax.net.ssl.keyStorePassword’, enter the wallet password that you have created on the Oracle cloud (image below step 1)

Enter the same on ‘javax.net.ssl.trustStore’ (path) and ‘javax.net.ssl.trustStorePassword’ (password).

Click on test connection. If you entered the right path and password a window will pop up saying ‘Connected’, your server and driver like the image below:

Ok, we have connected to our database, you can check the table that we have created before using python by following the steps below:

On the left:
1- Oracle database;
2 - Users
3 - User ADMIN
4 - Table ‘DATA_TAB’ (double click here)

On the right:
5 - Path information (where you are inside the database)
6 - Table name
7 - Database columns, type, etc.

Now we have to send our data to the database, for this, right-click on ‘DATA_TAB’ (table name that you want to send data) and click on ‘Import Data’.

A window will pop up like this below:

1- Select CSV;
2- Click next
PS: Select your CSV dataset after that

1- This column will show your CSV file;
2 - This column (target) will show where your file will be sent.

Click on next.

1- Clicking the arrow to expand the table will show us the source columns (dataset columns), the data and the type they are in, and where they go (target column). In our case, we had already created the columns with python before, but if the columns didn’t exist, Dbeaver would create them automatically.

Click on next.

Here you can adjust the quantities of rows that will be sent for time. If you don’t know what it means, just click next.

And finally the last step, the window below just shows us a summary of what will be done.
Click on Proceed.

For now, you just wait some minutes and that's it.

The data was sent to the Database!

You can click on ‘Data’ and see what you have sent to the database.

Or if you have some SQL skills just press F3 and code your queries, in this case, I just coded:

SELECT  *
FROM DATA_TAB

And Dbeaver returned to me the first 200 rows of my database.

That's it you now send about 300 thousand lines of data to your autonomous database in Oracle Cloud for free! Congrats!

Steps:
1- Creating an Oracle Cloud Infrastructure account ✔
2 - Creating an autonomous database ✔
3 - Create a connection to our database (OCI) with python ✔
4 - Optional: Using Dbeaver to connect to our Database ✔

I hope this article helps you in your journey in python, cloud, or DB. In the next article, we will use this data to make an interactive dashboard using Power BI. If you have any suggestions, doubts, criticisms, or if you just want to talk, contact me:

saulodetp@gmail.com

Next article here.

Previous article here.

Github project here

--

--

Saulo Toledo Pereira
Geek Culture

PhD student trying to learn some code and practice my English. Can we talk five minutes?