-
Data Modeling Question
Hello, I am new here and not sure if this is the correct forum to post this.
This is a Data Modeling question.
It is for a personal finance system I've been wanting to build for the past 5 years. My goal is to develop a platform-independent solution, although I'll likely use PHP/MySQL for implementation.
(BTW, I chose to post in an Oracle group since I think Oracle Developers are some of the smartest and most enterprise-focused around!)
The gist of my system is (1) Expense has (m) ExpenseDetails, where "expenses" represent anything I spend money on, including Groceries, Clothing, Fuel, Utilities, Rent, Medical, Travel, etc.
Experience shows me that regardless of the "category", most receipts/expenses have these fields...
tblExpense (all receipts)
---------------------------
ExpenseID
ExpenseDate
MerchantName
ExpenseTotal
PaymentInfo
However, a problem arises - from a data modeling standpoint - with "Utilities", "Travel-related", "Insurance" and other types of expenses.
For these "categories"/"sub-categories", there are lots of other important fields that should be captured under tblExpense, for example...
tblExpense (fuel only)
----------------------
FuelType
TotalGallons
CostPerGallon
tblExpense (electricity only)
----------------------------
DeliveredTo
RateClass
MeterReading_Start
MeterReading_End
CostPerKWH
tblExpense (telephone only)
------------------------------
BillingStartDate
BillingEndDate
TotalLocalCharges
TotalLongDistanceCharges
tblExpense (auto ins only)
----------------------------
BillingStartDate
BillingEndDate
CoverageType
InstallmentFee
LawInforcementSurcharge
and the list goes on...
(While some people might try to cram this disparate data into generic fields, that would be the WRONG approach. A "Quantity" of 5 Apples is NOT the same thing as buying 5.27 Gallons of Gas!!)
While hard to show in just text, conceptually this is what I believe is know a Super-Type/Sub-Type problem. And unfortunately, simplier databases like MS Access - which I'm using to prototype this!! - aren't designed to easily handle this Data Construct. :(
How to handle this predicament?!
SCENARIO #1: If I create on monster tblExpense and include all fields, then I solve the issue of missing any important information that needs to be captured, however, I will ultimately have a large table with lots of empty cells.
SCENARIO #2: I could create a (super-type) tblExpense with a primary key "ExpenseID" and the common fields described earlier. Then I could create (sub-type) tables, e.g. tblFuel, tblElectric, tblTelephone, and tblAutoInsurance which also have a primary key "ExpenseID" and the additional respective fields in each. Then I suppose I could somehow manage the PK's and synch everything up?! ??? (I believe this is one reasonable path to follow, but I would definitely need "hand-holding" to properly implement this!)
SCENARIO #3: One "un-informed" (and rather pompous) MS Access know-it-all I was being lectured by said I needed to create a tblExpense, tblExpenseAttribute, and tblExpenseAttributeType which would form a M-to-M relationship. After some thought, this might logically work, but I think it goes for "logical eloquence" OVER "implementation practicality". (Remember, I have to build forms and queries and logic to support my back-end design. And, to me, it would be very confusing to store things like "FuelType", "MeterReading_Start", "TotalLocalCharges", and "CoverageType" all in one table as his model would demand.
SCENARIO #4: Be a wimp, and build a seperate system for each Expense-Type because what I am capturing is too disparate and therefore should have its own database/home.
SCENARIO #5: Stop being so "anal-retentive" and just capture "ExpenseID", "ExpenseDate", "TotalAmount" and "Category" and be happy!
===================================================================
**NOTE: A similar problem exists with tblExpenseDetails!!
The data seems to fall into Retail and Non-Retail buckets.
With the first, you will find classic classroom "Order Details" type fields... "OrderItemDescirption", "UnitPrice", "Quantity", etc.
With the second group - which is usually a Utility - you won't find ExpenseDetails because you don't buy Gas a gallon-at-a-time, or insurance in seperate parts. And while there might be Order SubCategories like "TotalLocalService" and "TotalLongDistance", in the end, everything relates back to tblExpense - and not tblExpenseDetail - because you are buying the product/service in totality - usually for the month - if you can follow that?!
===================================================================
In closing, what seemed like a very straight-forward system to build, is actually much more complicated when you look at the "big picture". At the same time, this isn't rocket science, and I am CERTAIN that I can build an intelligent, detailed, robust, and scalable system that meets MY NEEDS if I can just get a little help on the Data Modeling portion! ;D
** Hell of a first post, eh?! **
Sincerely,
Just Bob
-
There is no right or wrong answer to this as you are only building a small system with limited users. From a true modelling point of view then Scenario 2 is correct, but as you say that then makes your solution more complicated for a novice to use.
Scenario 1 will work and be simple to manage (apart from all the blank fields)
So it is really up to you which you would be happiest with.
Personally I would go with Scenario 2 - you will soon get the hang of how to piece it all together for your queries
-
> There is no right or wrong answer to this as you are only building a small
> system with limited users.
>
> From a true modelling point of view then Scenario 2 is correct, but as you say
>that then makes your solution more complicated for a novice to use.
Okay, good, so at least I was on track!
> Scenario 1 will work and be simple to manage (apart from all the blank fields)
Valid point.
> So it is really up to you which you would be happiest with.
>
> Personally I would go with Scenario 2 - you will soon get the hang of how
> to piece it all together for your queries
Again, that is good to know that I wasn't off my rocker!
So, how would you do Scenario #2?
(Confession: I have spent most of my time using MS Access for projects like this. It is a decent quick-n-dirty DB tool, but one key thing it lacks is that Access's entire paradigm is built around "bound forms" where you are typing directly into the DB! This makes for shorter dev times, but actually create a thousand times more work in the long run. Point is I have limited experience developing in REAL databases...)
Let's say I have...
tblExpense
tblFuel
tblElectric
tblNaturalGas
which all contain "Expense-level" data, but where tblExpense is my "super-type" and the others are "sub-type" tables.
The user ("me") decides to enter a Gas Receipt.
The user goes to the "Expense Entry Form" and enters...
ExpenseID *system generated unique number
ExpenseDate
MerchantName
ServiceType
PumpNo
FuelType
TotalGallons
CostPerGallon
PaymentType
CCName
CCLast4
OrderTotal
and then clicks "Submit".
At this point, I would use an SQL INSERT statement to write the BLUE fields to tblExpense and the RED fields to tblExpenseDetail, right??
Is that the way it would work in a "normal" (i.e. non MS Access) database, especially if it was using Web Forms??
Just Bob
-
Hi Bob,
I was waiting to see if anyone else wanted to pitch in here, but it seems not. So it is just me and you !
The answer yo your question is yes, that is how a "normal" application would do it. That is why applications take a while to write !
-
I'm really sorry Bob but this is all wrong.
Citldba was really nice to you when saying "there is no right or wrong answer" but the naked truth is that there is a right and wrong answer... and this is all wrong.
1- As stated you want to build a "generic solution" ideally PHP/MySQL
2- Your notation looks like SQL Server standard
3- You are asking in an Oracle forum
4- You are "modeling" -or did you mean "prototyping" - in MS-Access
5- Your concepts of data modeling are all over the place -not necessarily good places.
I'll point just to one random issue here...
Originally Posted by Bob Just
(While some people might try to cram this disparate data into generic fields, that would be the WRONG approach. A "Quantity" of 5 Apples is NOT the same thing as buying 5.27 Gallons of Gas!!)
...let me break the news for you, in your scenario a "quantity" of something you are buying is a "quantity" no matter what you are buying.
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
-
Originally Posted by PAVB
I'm really sorry Bob but this is all wrong.
Citldba was really nice to you when saying "there is no right or wrong answer" but the naked truth is that there is a right and wrong answer... and this is all wrong.
That is a harsh statement!
Originally Posted by PAVB
1- As stated you want to build a "generic solution" ideally PHP/MySQL
I said I was looking for a "platform independent" solution, but that I would likely implement - for now - using PHP/MySQL.
What is wrong with that?
Originally Posted by PAVB
2- Your notation looks like SQL Server standard
It is Microsoft nomenclature...
So?
Originally Posted by PAVB
3- You are asking in an Oracle forum
Right. Because I think Oracle people are some of the smartest and closer, philosophically - to what I want to achieve. So?
Originally Posted by PAVB
4- You are "modeling" -or did you mean "prototyping" - in MS-Access
Right, I started a prototype in MS Access, but I want a solution that meets enterprise database standards (e.g. Oracle). So?
Originally Posted by PAVB
5- Your concepts of data modeling are all over the place -not necessarily good places.
How so?
Then what do YOU propose as a solution? (I listed 5 different scenarios.)
Originally Posted by PAVB
I'll point just to one random issue here...
...let me break the news for you, in your scenario a "quantity" of something you are buying is a "quantity" no matter what you are buying.
Maybe, I know lots of people - Oracle included - who would disagree.
From a datatype standpoint alone, I disagree.
You don't buy 10.57 bananas!
"ItemQuantity" is an Integer. "TotalGallons" is a Real number. As such, they aren't logically the same thing. But regardless, that is the LEAST of my worries.
If you have a better "Data Modeling solution", then please share it here.
Thanks,
Just Bob
-
[QUOTE=citldba]Hi Bob,
I was waiting to see if anyone else wanted to pitch in here, but it seems not.[quote]
Well, PAVB sure just had a mouthfull to say!!
Originally Posted by citldba
So it is just me and you !
I welcome you help!!
Originally Posted by citldba
The answer yo your question is yes, that is how a "normal" application would do it. That is why applications take a while to write !
Sounds like PAVB disagrees with nearly everything I have said. (Maybe with what you have said also?!)
However, I have been taking with LOTS of people - including Oracle types - and the consensus so far is that Sceanrio #2 - and your advice - is pretty much "on track"...
Just Bob
-
Originally Posted by Bob Just
... that is the LEAST of my worries.
I've to agree on this one Bob, that should be the least of your worries, indeed.
By the way, I volunteer my time here to answer specific questions. Packing a whole Data Modeling training in a forum-alike answer is out of scope so I'll graciously let somebody else do it.
Best of luck.
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
-
Originally Posted by Bob Just
...and the consensus so far is that Sceanrio #2 - and your advice - is pretty much "on track"...
Where in hell are you reading I'm "on track" with your scenario #2? Never-ever, not in a zillion years, not when I'm drunk, not when I'm not, not under torture either
Pablo (Paul) Berzukov
Author of Understanding Database Administration available at amazon and other bookstores.
Disclaimer: Advice is provided to the best of my knowledge but no implicit or explicit warranties are provided. Since the advisor explicitly encourages testing any and all suggestions on a test non-production environment advisor should not held liable or responsible for any actions taken based on the given advice.
-
So Bob - is it just me and you again then ?
Posting Permissions
- You may not post new threads
- You may not post replies
- You may not post attachments
- You may not edit your posts
-
Forum Rules
|
Click Here to Expand Forum to Full Width
|