DBASupport

 The Knowledge Center for Oracle Professionals
HOME 11g Central 10g Central 9i Central 8i Central Oracle News Scripts FAQ OCP Zone Resources Technical Docs Tools & Utilities Forums

» HOME
» FEATURES
    11g Central
    10g Central
    9i Central
    8i Central
    Oracle News
» COMMUNITY
    Scripts
    Forums
    FAQ
    OCP Zone
» RESOURCES
    Resources
    Technical Docs
    Tools & Utilities
    Tech Jobs
Marketplace Partners
Become a Marketplace Partner






Internet News
Small Business

Advertise
Newsletters
Tech Jobs
E-mail Offers


   DBAsupport.com > Oracle > Oracle 9i Central > Featured Stories



 

Oracle Developer Jr - READY TO HIRE!
Next Step Systems
US-CA-Thousand Oaks

Justtechjobs.com Post A Job | Post A Resume

Data Warehousing Technology Review (Part 2)
By Alexzander Nepomnjashiy


Welcome to the follow-up of the first part in Data Warehousing Technology Review. Let's jump right in.

So, the main idea underlying Data Warehousing is that carrying out the analysis directly on OLTP-systems is ineffective and undesirable. Instead, the idea is to extract the necessary data from several OLTP-systems, convert it, and then place it in a single data source -- a Data Warehouse. This process is also known as the ETL (Extract Transform Load) process.

During the ETL process, the data is:

  • Cleared (unnecessary or housekeeping information is eliminated);
  • Aggregated (the sum and average values are calculated);
  • Transformed (data type conversion and storage  structures reorganization are executed);
  • Unified (from various data sources); and
  • Synchronized (resulting in one instant of time).

Data Warehouse is a ready platform for DSS (Decision Support Systems) and OLAP systems construction as it contains data having the following properties:

  • Internal integrity
  • Although the data is preempted from several independent operating data sources (OLTP-systems), they should be joined by unified naming laws, values measurement ways, and common attribute sets. It has the highest value in those organizations where a number of various systems are exploited and/or similar data is presented in different ways (for example, logical value «True» may be stored as 1,-1, True or T. Other variants of this include the same metric within several systems being named differently - for example «balance» and «totals ».)

    At the stage of storage data loading, the above mentioned mismatches should be eliminated (i.e. the data should be checked out, cleared and presented in a uniform view). The Mandatory execution of such a procedure allows us to avoid many potential problems. Also, the analysis of such integrated data can be performed much easier.

  • «Single subject» focus
  • OLTP systems contain gigabytes of data that are rarely interesting for analysis -- addresses, postal codes, record identifiers, etc. Such information is typically not loaded into a warehouse, which limits the considered data spectrum to a necessary minimum.

    The information in DW-storage is organized according to the primary aspects of company business activity (customers, sales, etc.); this distinguishes Data Warehouses from operating databases where the data are organized according to business processes (invoice shipment, calculations with customers, goods shipment, etc.). Subject orientation (due to corresponding data structures) promotes both analysis simplification and faster execution of analytical queries.

  • Historic nature
  • OLTP systems usually cover relatively limited time intervals, whereas Data Warehouses frequently contain data for up to several decades. This makes them an ideal base for revealing trends and long-term tendencies in business.

    From a technical point of view, the historical nature of data means that tables in storage contain a «temporary key», or data are distributed under several tables, each of them referring to a fixed-time interval.

  • «Read-only» availability
  • Depending of business needs, you choose one of three available methods to update data in a Warehouse – Full Process - Completely restructures a cube based on its current definition and then its data recalculation; Incremental Update - Adds new data to a partition in the cube and aggregations updating. This method does not process changes to a cube's structure (measures, dimensions) or changes to its existing source data. An incremental update creates a temporary partition from the new data and merges it into an existing partition); and Refresh Data - Clears and reloads a cube's data and recalculates its aggregations. This method is used if the cube's source data has changed but its structure has not.

    The hand-held data modification in storage is practically never performed (actually Microsoft SQL Server 7.0 OLAP Services cube’s may be updated by the user (so called «write-back capabilities»), as it results in data integrity violation. Typically, it is not required to minimize the time of DW data loading, so the storage structure may be optimized for predefined queries execution, achieved by relational scheme de-normalization, preliminary data aggregation and, «of most use», indexes creation.

Thus, the data loaded in DW-storage are organized in a completely integrated structure, have natural internal connections, and gain new properties that add them the status of INFORMATION.

Alexzander Nepomnjashiy


Back to DBAsupport.com