CONNECTing With Your Inner Hierarchy: Hierarchical Query Enhancements
Oracle already provides elegant features for traversing and
reporting on the contents of hierarchical data structures. Some of the
more common hierarchical structures that I have encountered and needed to parse
include:
-
Corporate organization structures, especially the
relationships of employees to their supervisors
-
Part-subpart relationships, e.g., components made up of
subcomponents that are themselves made up of subcomponents
-
Geopolitical boundaries that describe relationships
between countries, states, counties, townships, and cities
-
Taxing structures that are usually intimately connected to
geopolitical boundaries
Thankfully, Oracle offers the CONNECT BY operator for
connecting together elements of a hierarchy in a query. I will utilize a common
example with which we are all familiar -- a company's organization chart -- to
illustrate:
-- A traditional hierarchical query (before Oracle 10g)
COL mgr_id FORMAT 99999 HEADING 'MgrID'
COL mgr_name FORMAT A24 HEADING 'Manager Name'
COL level FORMAT 999 HEADING 'Lvl'
COL emp_id FORMAT 99999 HEADING 'EmpID'
COL emp_name FORMAT A24 HEADING 'Employee Name'
SELECT
E.manager_id mgr_id
,(M.last_name || ', ' || M.first_name) mgr_name
,LEVEL
,E.employee_id emp_id
,E.last_name || ', ' || E.first_name emp_name
FROM
hr.employees E
,hr.employees M
WHERE E.manager_id = M.employee_id
START WITH E.manager_id = 101
CONNECT BY PRIOR E.employee_id = E.manager_id
ORDER BY SIBLINGS e.manager_id, e.employee_id;
SQL>
MgrID Manager Name Lvl EmpID Employee Name
------ ------------------------ ---- ------ ------------------------
101 Kochhar, Neena 1 108 Greenberg, Nancy
101 Kochhar, Neena 1 200 Whalen, Jennifer
101 Kochhar, Neena 1 203 Mavris, Susan
101 Kochhar, Neena 1 204 Baer, Hermann
101 Kochhar, Neena 1 205 Higgins, Shelley
108 Greenberg, Nancy 2 109 Faviet, Daniel
108 Greenberg, Nancy 2 110 Chen, John
108 Greenberg, Nancy 2 111 Sciarra, Ismael
108 Greenberg, Nancy 2 112 Urman, Jose Manuel
108 Greenberg, Nancy 2 113 Popp, Luis
205 Higgins, Shelley 2 206 Gietz, William
11 rows selected.
Note that I used the PRIOR operator of the CONNECT BY operator to tell Oracle to link up
each employee to his or her manager based on the value contained in each row's
MANAGER_ID column. In this case, it interrogates the expression that follows
immediately after the PRIOR
operator for the parent row of the current row. I also used the
optional START WITH
operator to direct Oracle to start querying the hierarchy at employee #101 (Neena
Kochar) and to show only those employees that report up to that employee.
Finally, I employed the LEVEL
pseudo-column to show which level in the hierarchy each employee belongs
to.
As powerful as these methods are, Oracle 10g has expanded
upon them considerably:
Traversing To the Top Node in a Hierarchy. Oracle 10g's
new CONNECT_BY_ROOT
unary operator makes it easy to gather information directly from the "root"
or top node in a hierarchy for any child row. In addition, another new
function, SYS_CONNECT_BY_ROOT,
traverses the hierarchy from the child up to and including the root node in the
hierarchy. This function will then return a list of values separated by the
character string chosen as a delimiter.
Listing 1.2
illustrates two queries: one whose root node starts at the top of the
hierarchy, and one whose root node starts at a lower point in the
hierarchy based on the value supplied to the START WITH operator.
Establishing Child vs. Parent Status. Oracle 10g also
provides a new pseudo-column, CONNECT_BY_ISLEAF, that
establishes whether a returned row itself has additional children in the
hierarchy, or is at the top of the hierarchy. If a row has additional children,
it will return a value of zero (0) for this pseudo-column; otherwise, if the
row has no further descendant nodes, it will return a value of one (1). Listing 1.3
shows how to utilize this new pseudo-column to find all rows that have no
children.
CONNECT_BY_ISCYCLE: Handling Recursion Within
Hierarchies. Most hierarchies have a simple tree-like node structure that
can be traversed from the "root" entry or entries down through the
parent to child relationships. However, in some cases, it is possible that a
hierarchical relationship may "loop back" upon itself. This may occur
as the result of a data entry error, or it may even be intentional.
To illustrate, I will update the MANAGER_ID column value for
a few employees so that a recursion is introduced. Note that I am making
Employee #902 the parent of Employee #901, and further disrupting the hierarchy
by making Employee #903 the parent of Employee #902:
UPDATE hr.employees
SET manager_id = 902
WHERE employee_id = 901;
UPDATE hr.employees
SET manager_id = 903
WHERE employee_id = 902;
COMMIT;
Now when I attempt to run a query against the EMPLOYEES
table, Oracle will detect the "loop-back" condition and return an
error:
SELECT
employee_id
,(last_name || ', ' || first_name)
,manager_id
,CONNECT_BY_ROOT last_name "Manager"
,CONNECT_BY_ISLEAF "Leaf?"
,LEVEL lvl_ind
FROM
hr.employees
CONNECT BY PRIOR employee_id = manager_id
ORDER BY SIBLINGS employee_id;
ORA-01436: CONNECT BY loop in user data
To detect this recursion, I can use the new CONNECT_BY_ISCYCLE
pseudo-column. This introduces a powerful, elegant method for detecting
unwanted recursion, and would be especially useful as part of an AFTER
statement trigger for data validation of hierarchies. Note that CONNECT_BY_ISCYCLE
can only be queried when the new NOCYCLE directive of the CONNECT BY operator is employed. When a loop-back
condition is detected, CONNECT_BY_ISCYCLE
will return a value of one (1); otherwise, it will return a zero (0). The query
in Listing
1.4 returns the entries that are participating in the recursion.
New Nested Table Functions: COLLECT, CARDINALITY, and SET
Over the past year, I have spent a lot of my spare time
investigating, experimenting with, and transforming existing PL/SQL code to use
Oracle's implementation of PL/SQL collections. I have found that they
are an excellent alternative to global temporary tables for temporarily storing
and manipulating moderate amounts of data during complex processing (please see
my prior
article on creating table functions that return PL/SQL collections.)
Encouraged by my progress with collections, I have now
progressed to some initial implementations of nested tables. I have
found them extremely useful for storing multiple entries of complex data in a
single column in a table. The good news is that Oracle 10g has expanded support
for powerful capabilities by adding some new functions for querying,
manipulating, converting, and translating nested tables:
COLLECT. This new function is designed to transform
values from any column selected from a table in a SELECT statement to a
collection in the form of a nested table. COLLECT must be used within a call to the CAST
function; it accepts a column from any type of table as input, and then returns
a nested table of the input type from the rows selected. If the COLLECT
function is applied against a column that is itself a collection, then it will
return a nested table of collections.
CARDINALITY. On the other hand, the CARDINALITY
function simply returns the number of elements present in a table's nested table
column. This is extremely useful when determining the amount of data stored
in a nested table column, or for controlling a FOR loop. If the nested table
column is empty, the CARDINALITY
function simply returns a NULL value.
SET. Unfortunately, there is no DISTINCT operator
that can be applied against nested table values to compress them down to only
unique entries. However, the new SET function accepts the value(s) in an existing
nested table as input and returns a set of non-duplicate values with the
same TYPE as the nested table itself.
I will illustrate these three handy and powerful new
features with a simple but realistic business scenario: the need to capture and
query telephone numbers for every employee in a company. I will first create a
traditional table named HR.TELEPHONE_NBRS
to store this information, and then I will create a simple TYPE and a second
table named HR.LOCATION_COMM_IDS
to store each location's telephone numbers in a nested table column. Listing 1.5
shows the statements to create these new tables, create the new TYPE, and
then load the tables with sample data.
I will then query the traditional table's column with COLLECT to
return a collection, apply the CARDINALITY
function to count the number of entries for each of the nested table columns in
the second table, and engage the SET function to return only the unique phone
number values stored in the nested table column. See Listing 1.6
for the SQL queries and sample output.
Conclusion
Oracle 10g has significantly improved hierarchical
queries, expanded the utility of the MERGE statement for data maintenance, and
upgraded the capabilities for querying nested tables. These new features and
enhancements to Structured Query Language make an already-robust database query
and data manipulation engine even more powerful and make a positive impact on
any Oracle DBA's most mundane tasks.
The next article in this series will concentrate on the
new aggregate and statistical analysis functions that Oracle 10g provides with
its new interrow calculation features.
References and Additional Reading
While there is no substitute for direct experience, reading
the manual is not a bad idea, either. I have drawn upon the following Oracle
10g documentation for the deeper technical details of this article:
B10750-01 Oracle Database
New Features Guide
B10759-01 SQL Reference
Previous
Next
Back to DBAsupport.com