XML db vs Relation db
What would be the top (say 5 reasons) I would select
a) an XML db over a Relational db
b) a Relational db over XML db
Assuming that the db size is medium
The choice of how to store your data should depend on your business requirements rather than the database size.
Please read this Ask Tom Link - it gives excellent pointers that should help make the right choice.
In essence, use XML DB if:
- It is sufficient for your application that the XML data will be persisted and retrieved: the database does not really require an understanding of the XML structure.
- Data will be accessed mostly as XML, and less often relationally.
- There is a need to maintain document fidelity - that is, elements like whitespace, XML processing instructions, etc. are to be retained.
Oracle 9i and above provides a variety of XML capabilities such as XPath searching/extracts, XSLT transformations, converting your XML document into a user-defined type and Oracle Text to index XML data for XML-centric searching. You can make use of these features if you store XML documents as-is.
Use Relational DB if:
- XML is just the data-transfer mechanism: once the XML document reaches the application it is the contents that are of significance and not the XML document itself.
- Access to data will be mostly relational. For example, an application which received input XML data and thereafter has to produce reports on the data, would be better off storing this data in relational form.
- Your only temptation to use XML DB is that it sounds cool
In either case, you can generate data in other form if required - that is, access XML data through relational queries or programmatically re-construct the XML data from relational form. There is some overhead of course when doing that. You might even opt to store data in both XML and relational form. There is more data storage in the database, but it gives you the advantages of both approaches.
Go for what suits you!
Isn't it the weight and validity of reasons rather than the count that matters?
Originally Posted by Padma R
yes it would be based on weights assigned to a set of factors.
Was also wondering what is the equivalent for Locking/dirty reads, hiding of data in columns behind views in XML DB
You should check this out.
"The person who says it cannot be done should not interrupt the person doing it." --Chinese Proverb
@LKBrwn_DBA: I hope that the article you linked to was meant only in jest. If it is sincerely intended then IMHO it appears written with a myopic perspective. [Or maybe it is dated - many arguments it makes do not hold in the present times, either for Oracle or for XML.]
Just so that it does not misguide people who miss the joke and are seriously considering using XML DB in Oracle, here are some counter points.
Whenever data is transferred between disparate systems, it must be structured in some way that is understandable to the communicating systems. Structured information contains both content and some indication of what role that content plays. Tags provide structure and meaning to data.
By jamming all of those tags into the data you can easily more than double the size of the data files being sent across the network. And the bigger the file, the slower the transmission time.
The article does not propose an alternative approach to data transfer that provides the key advantages of XML - flexibility, simplicity, interoperability - without the use of tags. It is the problems faced by older complex data transfer mechanisms like EDI that led to the evolution and increased acceptance of XML.
Bottomline: The benefits of using XML in the right context outweigh the extra storage space it takes up in the database.
For a developer (the DBA?) who wishes to remain in the comfort zone of familiarity, the first approach would indeed be preferable. A developer (the DBA?) who learns that this can be done fairly easily in Oracle would make a more intelligent, rational choice based on what suits the application best, rather than what entails least work for him.
The first approach is to be preferred. It is less troublesome to administer but more complicated to implement. It requires someone (the DBA?) to map each tagged element to an appropriate column that already exists in a table.
Oracle has the capability for easy traversal, access, manipulation of the "giant clump of text" that is XML using SQL/XML or XML-specific packages - the parsing program is right there - and we do view that as progress.
And, once again, that data is of no use to anyone who does not have the parsing program. So the database is holding data that is not really defined in any way other than as a giant clump of text. Hello? Does anyone really view this as progress?
On the contrary, current versions of XML Schema languages allow sophisticated constraint definitions, including datatype specification (both primitive and user-defined), length, minimum/maximum occurences, default values and so on.
XML offers little control over data type and length
The acronym is derived phonetically. The primary letter from the first syllable of each word is chosen: the X sound of 'eXtensible' is arguably more pronounced than the E - which intuitively sounds good to me. At least I prefer XML to EML
XML stands for eXtensible Markup Language. And that is another thing wrong with it – I hate acronyms that use letters from the middle of words!
[How would the world be if we chose technologies based on our biases for their names? I would stay miles away from C# because I hate names with special characters ]
Just to insure that people don't take the article above as a joke but as a warning, here is my view about xml .
Originally Posted by hobbes
XML is a very practical and flexible way to exchange data, for that it is just great, especially when exchanging data between different systems/languages. Now, even for that there are big drawbacks that often make you say at first "Wow, yeah xml would be great for that", and at second sight, "finally, I can certainly do without it" :
- let's face it, XML is huge in terms of size, so even if it may be very suitable for some systems, it may be simply out of way for some intensive real-time systems with remote data processing (through a leased line for example)
- furthermore, the parsing of xml uses a lot of memory and is very cpu intensive compared to other, less generic, exchange protocols (we are managing exchange data and are using a proprietary protocol because xml would have killed us)
Now concerning storage of xml data, IMO the only situation where it could be useful is when you receive structured XML data from some data provider and you have no control over data integrity. (in our case, we could have done so for our news provider, for example).
For any other storage concern, just use relational tables because you will gain performance, use less memory, use less storage size, have a more sophisticated constraint management... Data integrity is simply not maintainable with XML. XML Schema is good for XML, but it remains XML. There is no way XML will ever be a better storage management system than retational DBMSs. Objects may be someday, XML will never.
Just my 2 cents ,
You make a good argument for the use of XML as a data transmission/exchange format, which is what it is really useful for, but that is a matter independent of how the data is stored in the database. I don't recall anyone seriously arguning against the use of XML as a data exchange format, although it does have some disadvantages that ought to be acknowledged, whereas inside a database it is practically impossible to enforce consistency on it (data lengths, formats, foreign keys etc) without going to more trouble than you would experience in storing it relationally.
Originally Posted by hobbes
An RDBMS is not just a data dump.
So you want relational transaction management and security features for XML.
Originally Posted by Padma R
The best you could do is
(i) build relational views over XMLType data, or
(ii) store XML in relational form after all!
Click Here to Expand Forum to Full Width