Interview with database expert Ronald Bourret

Ronald Bourret is a freelance programmer, writer and XML researcher, specialising in databases and schemas. His web site is a very useful resource: http://www.rpbourret.com

newspaper techniques: What do you think of the XML services offered by traditional DBMS vendors such as Oracle, IBM, Microsoft and now MySQL? Do they perform as well as native XML databases, and would you recommend them for the publishing industry?

Ronald Bourret: The current version of Oracle and the next versions of IBM DB2 and Microsoft SQL Server all provide very good support for XML. The publishing industry will be most interested in the XML data type and XQuery support offered by these databases. The XML data type is a first class data type – at the same level as INTEGER or VARCHAR – designed to handle XML documents. Behind this data type are storage mechanisms similar to those found in native XML databases: Oracle uses indexed text, DB2 uses proprietary structures designed to store XML, and SQL Server stores XML both in a proprietary binary format and in tabular structures designed to store XML. All three databases are able to index XML data and markup.

On the query side, all three databases support XQuery queries over XML stored in the database. Depending on the database, XQuery statements can be embedded in SQL, run in standalone form, or have SELECT statements embedded in them. As a result, these databases are able to query XML directly, as well as to mix XML and relational data.
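As an illustration of mixing a relational filter with a query over stored XML, here is a minimal Python sketch. It is not XQuery and does not use any of the three products: it assumes SQLite with the XML kept in a plain text column, and it uses the limited XPath subset in Python's standard library. The `articles` table, its columns, and the `byline/name` element structure are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def make_demo_db():
    """Build an in-memory database with a hypothetical `articles` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE articles ("
        "id INTEGER PRIMARY KEY, section TEXT, title TEXT, body TEXT)"
    )
    conn.executemany(
        "INSERT INTO articles (section, title, body) VALUES (?, ?, ?)",
        [
            ("sports", "Cup final",
             "<article><byline><name>A. Writer</name></byline><p>Match report.</p></article>"),
            ("sports", "Transfer news",
             "<article><byline><name>B. Reporter</name></byline><p>Rumours.</p></article>"),
            ("business", "Markets",
             "<article><byline><name>A. Writer</name></byline><p>Shares rose.</p></article>"),
        ],
    )
    return conn


def articles_by_author(conn, section, author):
    """Return titles in `section` whose XML body has a byline naming `author`."""
    titles = []
    # Relational part: filter on an ordinary column with SQL.
    rows = conn.execute(
        "SELECT title, body FROM articles WHERE section = ? ORDER BY id",
        (section,),
    )
    for title, body in rows:
        # XML part: parse the stored document and apply a path expression.
        root = ET.fromstring(body)
        names = [n.text for n in root.findall(".//byline/name")]
        if author in names:
            titles.append(title)
    return titles


if __name__ == "__main__":
    print(articles_by_author(make_demo_db(), "sports", "A. Writer"))
```

In a database with a real XML data type, both conditions would be evaluated inside the database engine and could use its XML indexes. The sketch only shows the shape of such a combined query.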

Of less interest to the publishing industry – but of major interest to industries that use XML to transfer transactional data – are the abilities of these databases for transferring data between XML documents and relational tables.

All three databases support the ability to construct XML documents from relational data. In Oracle and DB2, this is done through SQL/XML – a standard set of extensions to SQL for working with XML. SQL Server has its own proprietary extensions to the SELECT statement that provide equivalent functionality.
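As an illustration of what constructing XML from relational data means, here is a minimal Python sketch. It is not SQL/XML (which does this inside the database with functions such as XMLELEMENT and XMLAGG) and not SQL Server's FOR XML clause: it assumes SQLite and builds the document in application code. The `orders` table, its columns, and the element names are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET


def orders_to_xml(conn):
    """Serialise every row of a hypothetical `orders` table as one XML document."""
    root = ET.Element("orders")
    rows = conn.execute("SELECT id, customer, total FROM orders ORDER BY id")
    for order_id, customer, total in rows:
        # One <order> element per row; the key becomes an attribute,
        # the remaining columns become child elements.
        order = ET.SubElement(root, "order", id=str(order_id))
        ET.SubElement(order, "customer").text = customer
        ET.SubElement(order, "total").text = f"{total:.2f}"
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT, total REAL)")
    conn.execute("INSERT INTO orders VALUES (1, 'Acme', 19.5)")
    print(orders_to_xml(conn))
```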

All three databases also support proprietary methods for mapping XML schemas to relational schemas using an object-relational mapping. This allows the database to transfer data between XML documents and relational tables based on a specific mapping.
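As an illustration of the opposite direction, here is a minimal Python sketch of one possible object-relational mapping: each `order` element becomes a row in one table, and each repeating `item` child becomes a row in a second table linked by the order's key. This is an assumed, hand-written mapping using SQLite, not the proprietary mapping mechanism of any of the three databases. The document structure, table names, and column names are hypothetical.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical data-centric document used for the example.
SAMPLE_XML = """
<orders>
  <order id="1">
    <customer>Acme</customer>
    <item sku="A-10"><quantity>2</quantity></item>
    <item sku="B-20"><quantity>5</quantity></item>
  </order>
</orders>
"""


def shred_orders(conn, xml_text):
    """Transfer an <orders> document into `orders` and `items` tables."""
    conn.execute("CREATE TABLE IF NOT EXISTS orders (id INTEGER PRIMARY KEY, customer TEXT)")
    conn.execute("CREATE TABLE IF NOT EXISTS items (order_id INTEGER, sku TEXT, quantity INTEGER)")
    root = ET.fromstring(xml_text)
    for order in root.findall("order"):
        order_id = int(order.get("id"))
        # Parent element maps to a row; its simple child maps to a column.
        conn.execute(
            "INSERT INTO orders (id, customer) VALUES (?, ?)",
            (order_id, order.findtext("customer")),
        )
        # Repeating children map to rows in a child table keyed by order_id.
        for item in order.findall("item"):
            conn.execute(
                "INSERT INTO items (order_id, sku, quantity) VALUES (?, ?, ?)",
                (order_id, item.get("sku"), int(item.findtext("quantity"))),
            )


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    shred_orders(conn, SAMPLE_XML)
    print(conn.execute("SELECT * FROM items ORDER BY sku").fetchall())
```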

Unfortunately, I don't have enough direct experience with any of these databases or with the publishing industry to state that these databases are or are not appropriate for the publishing industry. However, it seems likely that they are, if only as components in larger applications. (It seems unlikely that any of these companies would ignore the publishing industry, especially as all three companies offer content management solutions based on their databases.)

I'm not aware of significant support for XML in MySQL, although it has been a while since I have checked.

newspaper techniques: Several standards are emerging in the new DBMS environment. Which ones do you have the most confidence in for the future?

Ronald Bourret: XQuery and SQL/XML. XQuery is of more interest to the publishing industry, as it is designed for querying XML. While XQuery is not yet final, it has numerous implementations, including ones by database companies such as Microsoft, IBM, and Oracle. It is missing functionality that is important to the publishing industry – notably updates and full-text searches – but preliminary specifications are available in these areas and many companies offer proprietary solutions for them.

SQL/XML is of less interest to the publishing industry, as it is designed primarily for transforming relational data into XML.

And although it is not related to databases, the other standard that is of interest to the publishing industry is XSL-FO, which is used for high-quality rendering of XML content.

newspaper techniques: Can a publishing house that runs separate databases (a traditional one for commercial applications and an XML one for content management) link these databases and use a single tool to run efficient searches across all of them? Are some companies already offering this?

Ronald Bourret: Yes. Many native XML databases already offer this capability. Specifically, a native XML database is a good solution for storing and querying the kinds of document-centric XML found in the publishing industry. In addition, many native XML databases can be used to integrate data from other sources, including relational databases. Combining these two capabilities gives the ability to link relational and XML data.

For more information, see the sections on "Storing and Querying Document-centric XML" and "Data Integration" in:

http://www.xml.com/pub/a/2005/03/30/native.html
http://www.xml.com/pub/a/2005/04/13/going-native.html

A second solution is to migrate XML data to relational databases using the XML data type described above. As was mentioned earlier, these databases support the ability to integrate XQuery and SQL and can therefore directly link XML content and relational data.

