Skip to main content

Hierarchies in Oracle.


This article explores the functionality and features offered by CONNECT BY clause in Oracle with a hands-on exercise approach. Prerequisite: Oracle 9g or lastest installed, any oracle SQL client. We have used Oracle's sample schema for this article, you can download it too from the link below. Execute this SQL in your oracle client and you should be all set with data and schema.

https://download.oracle.com/oll/tutorials/DBXETutorial/html/module2/les02_load_data_sql.htm

Let's get started with CONNECT BY clause in Oracle. This is basically an oracle clause to place eligible datasets in a hierarchical fashion. Meaning, usage of this function is generally for creating a new resultant query that will elaborate hierarchical relations in a table. Here is the basic syntax

[ START WITH condition ]
CONNECT BY [ NOCYCLE ] condition

  • START WITH is an optional keyword that can be used as a starting point for hierarchy.
  • CONNECT BY describes the relationship between a child and parent row in the hierarchy.
  • PRIOR In connectby condition, at least one expression must be qualified with the prior operator that describes the parent record. As you can see the condition in connectby can have multiple expressions, and each time it should have at least 1 but not restricted to 1 Prior clause. Prior is a urinary operator which evaluates the immediate expression and is practically used with '=' operator. With other operators, it generally puts the flow in an infinite loop and makes oracle through error.

  • CONNECT BY last_name != 'Oracle' AND PRIOR employee_id = manager_id
  • CONNECT BY PRIOR employee_id = manager_id and PRIOR account_mgr_id = customer_id ...

SELECT EMPLOYEE_ID,

           FIRST_NAME || ' ' || LAST_NAME EMPLOYEE_NAME,

           MANAGER_ID,

           PRIOR FIRST_NAME || ' ' || PRIOR LAST_NAME MANAGER_NAME,

           LEVEL

           FROM EMPLOYEES

           START WITH EMPLOYEE_ID = 100

           CONNECT BY PRIOR EMPLOYEE_ID = MANAGER_ID

           FETCH FIRST 5 ROWS ONLY;



You might have noticed that the Prior operator can also be used in select part to refer to Parent records. Prior expression can not refer to sequences, NEXTVAL for instance.

CONNECT_BY_ROOT operator can be used to flatten the hierarchy of an associate to the root level and place it in a single column cell.

SELECT EMPLOYEE_ID,

           FIRST_NAME || ' ' || LAST_NAME EMPLOYEE_NAME,

           MANAGER_ID,

           PRIOR FIRST_NAME || ' ' || PRIOR LAST_NAME MANAGER_NAME,

           LEVEL, SYS_CONNECT_BY_PATH(last_name, '/') "Path"

           FROM EMPLOYEES

           START WITH EMPLOYEE_ID = 100

           CONNECT BY PRIOR EMPLOYEE_ID = MANAGER_ID

           FETCH FIRST 5 ROWS ONLY;





 Let's now explore the NOCYCLE operator by updating the manager id in the root record.

UPDATE EMPLOYEES SET MANAGER_ID=108 WHERE EMPLOYEE_ID=100

When you run the query now Oracle will throw an ORA-01436 error due to the loop created.
ORA-01436: CONNECT BY loop in user data
  SELECT EMPLOYEE_ID,

            FIRST_NAME || ' ' || LAST_NAME EMPLOYEE_NAME,

            MANAGER_ID,

            CONNECT_BY_ISCYCLE "Cycle",

            LEVEL,

            SYS_CONNECT_BY_PATH (LAST_NAME, '/') "Path"

            FROM EMPLOYEES

            START WITH EMPLOYEE_ID = 100

            CONNECT BY NOCYCLE PRIOR EMPLOYEE_ID = MANAGER_ID


NOCYCLE help us use connectby even in datasets with Loop, another operator CONNECT_BY_ISCYCLE can be used to list out all the rows with loop.

What if you want to identify the leaf node employee in a hierarchy?

CONNECT_BY_ISLEAF pseudo column can be used in select to cater to this requirement. Column retuns 1 if it is a leaf node record, 0 otherwise.

Hierarchy levels can be tricky and Parent nodes can be often misunderstood as leaf nodes.

 

Only Child/leaf nodes will have CONNECT_BY_ISLEAF set as 1


Thank you for reading through, stay tuned...

Comments

Popular posts from this blog

How to work with XML files in Databricks using Python

This article will walk you through the basic steps of accessing and reading XML files placed at the filestore using python code in the community edition databricks notebook. We will also explore a few important functions available in the Spark XML maven library. Think of this article as a stepping stone in the databricks community edition. Features and functionalities elaborated herein can be scaled at the enterprise level using Enterprise editions of databricks to design reusable file processing frameworks. Requirements We will be using the Spark-XML package from Maven. **Spark 3.0 or above is required on your cluster for working with XML files. This article uses Databricks Community edition, refer to  this  video tutorial from Pragmatic works for getting started with  Databricks Community Edition . Create your first cluster in seconds : The next step is to install the Spark-XML library on your cluster. The cluster needs to be in a running state to install this li...

Microsoft Fabric

Complete Analytics Platform  In its new Software as a service offering, Microsoft basically clubbed every tool in their Analytics portfolio and gave it a new name - Fabric :). Claims are Fabric can serve every Data stakeholder ranging from a developer working with Data Lake to a Sales associate working on a self-serve Powerbi report. Microsoft has implemented tenant centric architecture in Fabric like office 365, In optimal design an organization will have 1 fabric similar to 1 office 365 tenant for entire organization. Lake centric and Open  All the data and apps built on Fabric provided solutions will get stored at a single lake, It auto calculates the lineage for objects stored on a single data lake. It uses delta file format and parquet data storage for all the objects.  Advantage: Table storage is shared across the fabric workspace, suppose you have a data issue in a Synapse datawarehouse query, just run a fix on the data set using Synapse data engineering python not...