SQL Question - Need a query
I have a SQL question (Using Oracle Release 9.2)
In our application we have item_cost_history table which stores cost of an item on a particular date.
Item_Cost_History: Item_Id, Item_Cost, Cost_Date.
Item_Cost is downloaded everyday and we maintain an archive of item_cost by cost_date so as to fetch the cost as of a particular date.
The query (Q1) used to get cost is
where item_id = given_item_id
and cost_date = given_date;
This query is present throughout the code.
Since the cost of an item does not change very frequntly, we are storing the same repeated cost over and over again with different dates. As a result the table size is huge because we keep atleast 30 day archive for 500,000 items.
To reduce the amount of data we are storing, we have decided to change our approach in storing item_cost and record changes to item_cost only We are going to create a new table Item_Cost_Archive and record only cost changes and create a view Item_Cost_History so that the schema change is transparent to the application.
Item_Cost_Archive : Item_Id, Item_Cost, Last_Cost_Modify_Date.
Item1 3.00 17-JAN-2004
Item1 3.20 31-JAN-2004
Item1 3.15 17-FEB-2004
Is there a way to create a view (Item_Cost_History) so that Q1 can be executed as it is and fetch results.
It'd be a pretty expensive query to run. You'd join from the archive table to a table of dates, and it'd be a lot of LIO's to do so.
You might consider storing the data in an index-organized structure -- that'd probably cut your total storage requirement in half straight away, reduce LIO's in retrieving costs, and the PK index would be a good candidate for compression ...
Note that I made date lead the time in the PK definition -- this would give the index a rightwards growing characteristic that would eliminate any need for maintenance in the future, and also would decrease the overhead on inserting new records. You'd have to think about whether this will work for all your queries though. You could supplement the PK index with a compressed alternative on item_id also.
PRIMARY KEY (DAY,ITEM_ID)
2 Solutions but still you need to edit the query
1] Solution without View, direct on the new table
from Item_Cost_History_New a
where item_id = &given_item_id
and Last_Cost_Modify_Date =(select max(Last_Cost_Modify_Date)
from Item_Cost_History_New b
where Last_Cost_Modify_Date <= &given_date
and a.item_id = b.item_id);
2] Solution with View and Minor change to your query
View Definition :
Select a.item_id, a.item_cost, a.Last_Cost_Modify_Date as "startDate" ,min(b.Last_Cost_Modify_Date) AS "EndDate"
from Item_Cost_History_New a, Item_Cost_History_New b
where a.Last_cost_modify_date < b.last_cost_modify_date
and a.item_id = b.item_id
group by a.item_id,a.item_cost, a.Last_Cost_Modify_Date
select a.item_id, a.item_cost, a.last_cost_modify_date as "StartDate", to_date('31-Dec-2100') as "MaxDate"
from Item_cost_History_New a, (Select item_id, max(Last_Cost_Modify_Date) as Max_Last_Cost_Modify_Date
From Item_Cost_History_New b
Group by Item_id) b
where a.item_id = b.Item_id
and a.Last_Cost_Modify_Date = b.Max_Last_Cost_Modify_Date
If the table is :
ITEM_ID ITEM_COST LAST_COST_M
--------------- --------------- -----------
10 100 01-JAN-2001
10 102 10-FEB-2002
10 103 10-MAR-2002
20 201 10-JAN-2010
20 202 30-JAN-2011
30 401 01-FEB-2004
Then The view will be
ITEM_ID ITEM_COST startDate EndDate
--------------- --------------- ----------- -----------
10 100 01-JAN-2001 10-FEB-2002
10 102 10-FEB-2002 10-MAR-2002
10 103 10-MAR-2002 31-DEC-2100
20 201 10-JAN-2010 30-JAN-2011
20 202 30-JAN-2011 31-DEC-2100
30 401 01-FEB-2004 31-DEC-2100
The query to find the item Cost:
where item_id = given_item_id
and startDate <= given_date
and EndDate > given_date;
-- As far as my knowledge goes Oracle offers a way to create a View which will return different records according to session environment variables (or context value). Combining that feature with the above view definition you can get the exact answer to your problem. If you get how to do it then please paste the result in the forum and also mail me.
Thanks for all the responses.
I finally got it working.
select all_dates.cost_date, date_range.item_id, date_range.item_cost
from (select trunc(sysdate) - (rownum - 1) cost_date
where rownum < 366
(select item_id, item_cost, last_cost_date start_date,
nvl((lead(last_cost_date,1) over (partition by item_id order by last_cost_date) - 1), trunc(sysdate)) end_date
where all_dates.cost_date between date_range.start_date and date_range.end_date
order by 2, 1 desc
Have you compared performance between this solution and using a compressed IOT-based solution, so that you know what you are sacrificing in terms of CPU usage, LIO's etc to save on so-many Mb of disk space?
Actually our server is more I/O bound than CPU bound. CPU usage is really low (< 50%).
Actualy I am going to create 2 views one with one year of cost view and another with last 30 cost view. The application will look only at last 30 days cost view.
What space saving did you achieve with the compressed IOT over theuncompressed HOT+index?
I would go for RANGE PARTITION with RANGE KEY on date column table rather than IOT.
Then create a LOCAL INDEX (date column , item_id) on the partitioned table.
This solution is just a temporary fix for the problem on hand. We are eventually going to change the application logic (gradually ofcourse) to start using between operator.
our new table structure is going to look like
item_id, cost, start_date and end_date
and do find a cost for a date we would just use date between start_date and end_date queries.
Click Here to Expand Forum to Full Width