DBAsupport.com Forums - Powered by vBulletin
Results 1 to 10 of 10

Thread: Calculate percentage null per column

  1. #1
    Join Date
    Dec 1999
    Location
    Purgatory
    Posts
    346

    Calculate percentage null per column

    Hi All you code kiddies!!

    Just had a quick request from a client. He want to know the percentage of nulls for each column in his schema.

    So output would be

    Table_Name Column_Name Percent_Null

    Anyone got a quick solution in their bag of tricks??

    I'll be working on it in the meantime....

    Thanks.

  2. #2
    Join Date
    Nov 2002
    Location
    New Delhi, INDIA
    Posts
    1,796
    How about

    Code:
    scott@ACME> select a.table_name,a.column_name,round((a.NUM_NULLS/b.NUM_ROWS)*100,0) percen_null
      2    from all_tab_columns a,all_tables b
      3   where a.table_name='EMPLOYEES'
      4     and a.table_name=b.table_name
      5  /
    
    TABLE_NAME                     COLUMN_NAME                    PERCEN_NULL
    ------------------------------ ------------------------------ -----------
    EMPLOYEES                      EMPLOYEE_ID                              0
    EMPLOYEES                      FIRST_NAME                               0
    EMPLOYEES                      LAST_NAME                                0
    EMPLOYEES                      EMAIL                                    0
    EMPLOYEES                      PHONE_NUMBER                             0
    EMPLOYEES                      HIRE_DATE                                0
    EMPLOYEES                      JOB_ID                                   0
    EMPLOYEES                      SALARY                                   0
    EMPLOYEES                      COMMISSION_PCT                          67
    EMPLOYEES                      MANAGER_ID                               1
    EMPLOYEES                      DEPARTMENT_ID                            1
    
    11 rows selected.
    Make sure you have analyzed your tables first.

    HTH
    Amar
    "There is a difference between knowing the path and walking the path."

    Amar's Blog  Get Firefox!

  3. #3
    Join Date
    Dec 1999
    Location
    Purgatory
    Posts
    346
    Thanks for that adewri,

    but I need a completely accurate output, NUM_NULLS just isn't accurate enough.

  4. #4
    Join Date
    Jun 2000
    Location
    Madrid, Spain
    Posts
    7,447
    then do compute statistics then?

  5. #5
    Join Date
    Dec 1999
    Location
    Purgatory
    Posts
    346
    Unfortunately, the tables are too big for a 'compute stats'.

    Maybe I should post this in the Developers forum ?

  6. #6
    Join Date
    Aug 2002
    Location
    Colorado Springs
    Posts
    5,253
    How many significant figures does the client envision on this, if they want something completely accurate? You have three rows and one of the m is null, what's the completely accurate percentage for that?

    See http://oraclesponge.blogspot.com/200...-sampling.html and the article by JL I reference there http://www.jlcomp.demon.co.uk/stats_i.html for information on the accuracy of statistics.
    David Aldridge,
    "The Oracle Sponge"

    Senior Manager, Business Intelligence Development
    XM Satellite Radio
    Washington, DC

    Oracle ACE

  7. #7
    Join Date
    Dec 1999
    Location
    Purgatory
    Posts
    346
    Tables have multi-million rows but...
    No worries........ It's sorted it. One table scan as well......

    Code:
    set verify off
    var result_set refcursor
    set echo off
    col "Table" Format a30
    col "Column" Format a30
    
    DECLARE
      cursor c1 is SELECT table_name,column_name from user_tab_columns
                   WHERE table_name = UPPER('&&tablename');
      myquery   varchar2(32000) := NULL;
      mysql     varchar2(32000) := NULL;
      BEGIN
      FOR C in C1 LOOP
         myquery := myquery || ',sum(nvl2('||c.column_name||',0,1)) '||c.column_name;
         mysql   := mysql || 'UNION ALL select '''||c.table_name||'''"Table",'
                          || ''''||c.column_name ||'''"Column",'
                          ||' ROUND('||c.column_name|| '/total*100) "Percent Nulls" from temp ';
      END LOOP;
      myquery :=  'SELECT '||substr(myquery,2) ||',SUM(1) Total FROM &tablename';
      myquery := 'with temp as ('||myquery||')' || SUBSTR(mysql,10) || 'ORDER by "Column"';
    
      open :result_set for myquery;
    END;
    /
    print result_set
    undefine tablename

  8. #8
    Join Date
    Aug 2002
    Location
    Colorado Springs
    Posts
    5,253
    You might compare that with the cost of gathering statistics on the table ... it's mostly the same query
    David Aldridge,
    "The Oracle Sponge"

    Senior Manager, Business Intelligence Development
    XM Satellite Radio
    Washington, DC

    Oracle ACE

  9. #9
    Join Date
    Dec 1999
    Location
    Purgatory
    Posts
    346
    I was under the impression that computing statistics on very large tables was frowned upon ??

    Not sure where I got that impression from, is it unfounded ??

  10. #10
    Join Date
    Aug 2002
    Location
    Colorado Springs
    Posts
    5,253
    I don't think it necessarily is ... it's just that you can get extremely accurate statistics without going that far. Data with heavy clustering of values generally needs a higher percentage.

    I see from
    Code:
    ROUND('||c.column_name|| '/total*100)
    that you are not really looking for a completely accurate answer here .. it is possible that an analyze with an estimation percent of less than 10 would have got you into the ballpark. When a person (especially a manager!) says "I need a completely accurate answer" it usually makes me shrug ... what are they going to do if the percentage of nulls is 13% that they wouldn't do if it was 14%? What if it was 13.25%? I really doubt that this makes any difference.
    David Aldridge,
    "The Oracle Sponge"

    Senior Manager, Business Intelligence Development
    XM Satellite Radio
    Washington, DC

    Oracle ACE

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •  


Click Here to Expand Forum to Full Width