-
How to eliminate outliers.
I'd like to remove the top and bottom N values. I think I'm close.
Here's what I'm working with where I fudge the rownum values to make the query work
as I know there are 220 rows. So I want all values without the top and bottom N.
How can I at run time do something like
rownum <= count(*) -N
For rnum it would be easy enough as I could just do....
rnum >= N
Code:
select * from
(
select p.*, rownum rnum
from (
select PRCSNAME,datediff('mi',enddttm,begindttm )*-1
from PROCESS_SCHEDULER_HISTORY where prcsname=...some job name
order by datediff('mi',enddttm,begindttm )*-1 desc
) p
where rownum <= 215
)
where rnum >= 5
I'm stmontgo and I approve of this message
-
This might give you something to work with. You may need to add columns to the sub query so that you can have them in the final query. There are some functions that will give you average and standard distribution which will help you to better focus on removing the outliers.
Code:
select PRCSNAME, elapsed_days
from ( select PRCSNAME, datediff('mi',enddttm,begindttm )*-1 elapsed_days, rownum rn
from PROCESS_SCHEDULER_HISTORY
where prcsname='...some job name'
order by datediff('mi',enddttm,begindttm )*-1 desc ) p
where rn between (rn*.25) and (rn*.975)
order by elapsed_days;
-
thanks
it didn't work - now rows returned.
I think there is some magical distintion between the rownum and rnum.
I was able to get it work by creating a function - not terribly elegant but it's working
Code:
create or replace function count_job_occurances (p_prcsname PROCESS_SCHEDULER_HISTORY.PRCSNAME%type)
RETURN NUMBER
IS
v_count_occurances NUMBER;
BEGIN
SELECT count(prcsname) INTO v_count_occurances FROM PROCESS_SCHEDULER_HISTORY
where PRCSNAME = p_prcsname;
RETURN v_count_occurances;
END;
/
select * from
(
select p.*, rownum rnum
from (
select PRCSNAME,datediff('mi',enddttm,begindttm )*-1
from PROCESS_SCHEDULER_HISTORY where prcsname='PER099'
order by datediff('mi',enddttm,begindttm )*-1 desc
) p
where rownum < count_job_occurances('PER099')-5
)
where rnum >= 5
/
I'm stmontgo and I approve of this message
-
For the previous one try it without the outer where clause.
You can also try this. Analytics can be useful, but hard to get right.
Also I have not tried the sql to see if it compiles.
Code:
select PRCSNAME, elapsed_days
from (select PRCSNAME,
datediff('mi',enddttm,begindttm )*-1 elapsed_days,
row_number() over
( partition by datediff('mi',enddttm,begindttm )*-1
order by datediff('mi',enddttm,begindttm )*-1 ) rn
from PROCESS_SCHEDULER_HISTORY
) start with rn = 1)
WHERE rn between (rn*.25) and (rn*.975)
order by elapsed_days;
-
Thanks - unfortunately that did not work.
Forgive me for double dipping but I also posted on the metalink forum as well
and I wanted to share what I ended up using.
I really appreciate your help.
Steve
Code:
select * from
(
select p.*, rownum rnum
from (
select PRCSNAME,datediff('mi',enddttm,begindttm )*-1
from PROCESS_SCHEDULER_HISTORY where prcsname=
order by datediff('mi',enddttm,begindttm )*-1 desc
) p
where rownum <= (select count(*) - &&how_many from PROCESS_SCHEDULER_HISTORY where prcsname=)
)
where rnum >= &&how_many ;
I'm stmontgo and I approve of this message
-
Originally Posted by stmontgo
Thanks - unfortunately that did not work.
Forgive me for double dipping but I also posted on the metalink forum as well
and I wanted to share what I ended up using.
Fortunately double dipping in forums is not as bad as double dipping your chip in the dip.
You seemed like you were close to figuring it out anyway.
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|