XML db vs Relation db
What would be the top (say 5 reasons) I would select
a) an XML db over a Relational db
b) a Relational db over XML db
Assuming that the db size is medium
The choice of how to store your data should depend on your business requirements rather than the database size.
Please read this Ask Tom Link - it gives excellent pointers that should help make the right choice.
In essence, use XML DB if:
- It is sufficient for your application that the XML data will be persisted and retrieved: the database does not really require an understanding of the XML structure.
- Data will be accessed mostly as XML, and less often relationally.
- There is a need to maintain document fidelity - that is, elements like whitespace, XML processing instructions, etc. are to be retained.
Oracle 9i and above provides a variety of XML capabilities such as XPath searching/extracts, XSLT transformations, converting your XML document into a user-defined type and Oracle Text to index XML data for XML-centric searching. You can make use of these features if you store XML documents as-is.
Use Relational DB if:
- XML is just the data-transfer mechanism: once the XML document reaches the application it is the contents that are of significance and not the XML document itself.
- Access to data will be mostly relational. For example, an application which received input XML data and thereafter has to produce reports on the data, would be better off storing this data in relational form.
- Your only temptation to use XML DB is that it sounds cool
In either case, you can generate data in other form if required - that is, access XML data through relational queries or programmatically re-construct the XML data from relational form. There is some overhead of course when doing that. You might even opt to store data in both XML and relational form. There is more data storage in the database, but it gives you the advantages of both approaches.
Go for what suits you!
Isn't it the weight and validity of reasons rather than the count that matters?
Originally Posted by Padma R
yes it would be based on weights assigned to a set of factors.
Was also wondering what is the equivalent for Locking/dirty reads, hiding of data in columns behind views in XML DB
You should check this out.
"The person who says it cannot be done should not interrupt the person doing it." --Chinese Proverb
@LKBrwn_DBA: I hope that the article you linked to was meant only in jest. If it is sincerely intended then IMHO it appears written with a myopic perspective. [Or maybe it is dated - many arguments it makes do not hold in the present times, either for Oracle or for XML.]
Just so that it does not misguide people who miss the joke and are seriously considering using XML DB in Oracle, here are some counter points.
Whenever data is transferred between disparate systems, it must be structured in some way that is understandable to the communicating systems. Structured information contains both content and some indication of what role that content plays. Tags provide structure and meaning to data.
By jamming all of those tags into the data you can easily more than double the size of the data files being sent across the network. And the bigger the file, the slower the transmission time.
The article does not propose an alternative approach to data transfer that provides the key advantages of XML - flexibility, simplicity, interoperability - without the use of tags. It is the problems faced by older complex data transfer mechanisms like EDI that led to the evolution and increased acceptance of XML.
Bottomline: The benefits of using XML in the right context outweigh the extra storage space it takes up in the database.
For a developer (the DBA?) who wishes to remain in the comfort zone of familiarity, the first approach would indeed be preferable. A developer (the DBA?) who learns that this can be done fairly easily in Oracle would make a more intelligent, rational choice based on what suits the application best, rather than what entails least work for him.
The first approach is to be preferred. It is less troublesome to administer but more complicated to implement. It requires someone (the DBA?) to map each tagged element to an appropriate column that already exists in a table.
Oracle has the capability for easy traversal, access, manipulation of the "giant clump of text" that is XML using SQL/XML or XML-specific packages - the parsing program is right there - and we do view that as progress.
And, once again, that data is of no use to anyone who does not have the parsing program. So the database is holding data that is not really defined in any way other than as a giant clump of text. Hello? Does anyone really view this as progress?
On the contrary, current versions of XML Schema languages allow sophisticated constraint definitions, including datatype specification (both primitive and user-defined), length, minimum/maximum occurences, default values and so on.
XML offers little control over data type and length
The acronym is derived phonetically. The primary letter from the first syllable of each word is chosen: the X sound of 'eXtensible' is arguably more pronounced than the E - which intuitively sounds good to me. At least I prefer XML to EML
XML stands for eXtensible Markup Language. And that is another thing wrong with it – I hate acronyms that use letters from the middle of words!
[How would the world be if we chose technologies based on our biases for their names? I would stay miles away from C# because I hate names with special characters ]
So you want relational transaction management and security features for XML.
Originally Posted by Padma R
The best you could do is
(i) build relational views over XMLType data, or
(ii) store XML in relational form after all!
Just to insure that people don't take the article above as a joke but as a warning, here is my view about xml .
Originally Posted by hobbes
XML is a very practical and flexible way to exchange data, for that it is just great, especially when exchanging data between different systems/languages. Now, even for that there are big drawbacks that often make you say at first "Wow, yeah xml would be great for that", and at second sight, "finally, I can certainly do without it" :
- let's face it, XML is huge in terms of size, so even if it may be very suitable for some systems, it may be simply out of way for some intensive real-time systems with remote data processing (through a leased line for example)
- furthermore, the parsing of xml uses a lot of memory and is very cpu intensive compared to other, less generic, exchange protocols (we are managing exchange data and are using a proprietary protocol because xml would have killed us)
Now concerning storage of xml data, IMO the only situation where it could be useful is when you receive structured XML data from some data provider and you have no control over data integrity. (in our case, we could have done so for our news provider, for example).
For any other storage concern, just use relational tables because you will gain performance, use less memory, use less storage size, have a more sophisticated constraint management... Data integrity is simply not maintainable with XML. XML Schema is good for XML, but it remains XML. There is no way XML will ever be a better storage management system than retational DBMSs. Objects may be someday, XML will never.
Just my 2 cents ,
Neither meant the above as a joke nor as a warning - only sought to give a more objective idea about XML, for itself and for its role in the database, as the linked article seemed unfairly biased. Do not see contention with the points you make: largely agree with you!
The key advantage of XML is interoperability, and where that is the deciding factor about the mode of data transfer, one opts for XML despite its size. If one can have a proprietary protocol which uses smaller-sized files - interfacing systems agree on it - one could easily do without XML.
Unfortunately typical EAI projects do not provide this kind of choice as interfacing systems are pre-designed and/or different technologically and/or exchanging common data files with multiple systems not just what we're building, and so one is not empowered to say let's have a customized protocol for us.
Given that the mode of transfer, for good or evil, is XML - the question now is: how does one store it in the database - dump the entire file in native form, or parse and store it relationally? The choice I'd say should depend on what is to be done with the data next. Is it to be accessed through queries, modified? Or is it to be retrieved only as XML, perhaps to be transferred to another application in its original form at a later time? In the first case, relational storage is the obvious choice; in the latter, XML DB. [Note that XML document fidelity is lost if data is converted to relational form. If one ever needs to regenerate the XML document, it makes sense to store it in native form as well.]
As you state, XML DB cannot hope to match up to the content management features that relational DB provides. Couldn't agree more - but I think of XML DB not as a substitute for relational DB but as a supplement to it. That's the way Oracle has designed it too: it supports XDB in addition to its relational features, not as a replacement for them, doesn't it?
Last edited by hobbes; 07-25-2006 at 09:14 AM.
We agree. Just thought your answer to LKBrwn_DBA was too XML-biased .
Click Here to Expand Forum to Full Width