Xml/Xslt - Practical Uses for XML - Asked By Desert Ghost on 26-Oct-01 01:45 PM

I've been perusing some articles here, and have played with XML only briefly, so I have a few questions.

1.) What's the MAIN purpose of XML? To me, it seems like a heterogeneous data store to be parsed by XSL depending on individual needs and wants.  The only problem with it being a data store is that the parsers are quite slow compared to databases, and even compared to a text file <albeit not as structured>

I've got some data that needs to be shared between Linux and Windows, so I'm naturally thinking of XML to accomplish this.  What I'm confused about is the purpose.  I can easily make a db connection and yank appropriate data off the database and format the data at will.  I am concerned with speed here.

I've played around at XML 101, and have the Wrox Beginning XML, but I still haven't found a use for a real world, heavy traffic situation.  The examples I've seen are really just for data sharing through the Internet.  It HAS to be something more than just that.

Oh, BTW, Robbe -- I started looking through the articles for XML here, but you don't place them in categories? I would find it easier to go to XML Articles, Database Articles, .NET articles, etc, than having them listed all out -- and as usual, that's just my opinion, I could be wrong :)

Yes, you are wrong! - Asked By Robbe Morris on 26-Oct-01 01:55 PM

Chuckle...With the advent of some future features to egghead, we'll likely be reevaluating how our content is referenced.  On the other hand, we've gotten some compliments on keeping the content simple and easy to scroll through in one large page.  Nowadays, so many of these articles span across languages/applications that it is often useful to just look at everything...

On the XML point, you've really nailed the two primary purposes for XML.  Presentation of data and the transmission of data.

I'll deal with the transmission part and let others with more experience with XSL comment on that.

You are correct in your assessment of being able to format a data stream between systems.  However, with a standardized way of accessing somewhat complicated data streams, sharing data across companies or entities becomes much easier and less complicated to coordinate.  This becomes particularly useful if you want to output database objects into an XML stream and then reload it on the other side.
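
To make the "output database objects into an XML stream and then reload it on the other side" idea concrete, here is a minimal Python sketch using only the standard library. The row data, the element names (widgets, widget, name, price) and the helper names rows_to_xml / xml_to_rows are all made up for illustration; a real system would agree on its own schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical rows, standing in for whatever a database query returned.
rows = [
    {"id": "1", "name": "WidgetA", "price": "9.99"},
    {"id": "2", "name": "Widget <B>", "price": "14.50"},
]

def rows_to_xml(rows):
    """Serialize a list of dicts into an XML string (sending side)."""
    root = ET.Element("widgets")
    for row in rows:
        item = ET.SubElement(root, "widget", id=row["id"])
        ET.SubElement(item, "name").text = row["name"]
        ET.SubElement(item, "price").text = row["price"]
    return ET.tostring(root, encoding="unicode")

def xml_to_rows(xml_text):
    """Rebuild the list of dicts from the XML string (receiving side)."""
    root = ET.fromstring(xml_text)
    return [
        {
            "id": item.get("id"),
            "name": item.findtext("name"),
            "price": item.findtext("price"),
        }
        for item in root.findall("widget")
    ]

payload = rows_to_xml(rows)      # what would travel over the wire
restored = xml_to_rows(payload)  # what the other side reconstructs
```

The receiving side only needs to know the element names, not the sender's database or platform. Note that special characters such as < are escaped on the way out and restored on the way in.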

As more and more products are developed to support the XML standards, sharing data across different applications becomes much easier to implement.  In many ways, XML is the foundation on which current and future distributed apps are built across vendors.

Heh - Asked By Desert Ghost on 20-Jul-14 11:30 AM

Told ya... just my opinion.  When you only had a few articles, it wasn't bad, but you're getting significant now.  Plus, before I just read them cause I was interested, now I have a specific problem.

Ok, formatting a data stream, how is this different than just using a database? Or a text file? I don't have the luxury of doing this in ASP due to speed anymore... everything has to be a compiled C++ component, possibly used by CGI <linux> and ASP <Windows>

From my test experience, I've found it really slow... IMO, it really is just an organized text file <I say this in fear of being slapped>.

Speed is my primary concern. I guess the real question is, what is the advantage over a database? From what I've read, XML is not to replace the db, but to enhance it... but I can't help but worry about performance.

I really need performance right now... as an estimate, I just got the page hits for this last quarter -- 1.4 billion.

Now, obviously this won't be in the majority of the pages, but I need something that will allow us to grow.

All the examples I've seen in books and websites do things like grab info from another site, share data between two separate companies with different agendas, etc.  My case, however, is to share similar data between two different web servers and OS platforms.  Is this appropriate or am I looking at the wrong technology?

Hmmm... - Asked By Robbe Morris on 26-Oct-01 02:16 PM

If you are sharing data between two systems that your company supports, perhaps XML is not the answer.  The data stream option really comes into play when you want to marshal data over the net but don't have direct access to the actual database, just the web server.  You'll have fewer problems with database connectivity timeouts and such.

And no, XML really is just a highly organized text file.  So no one ought to be slapping anyone.  Chuckle...

Don't know about the performance of non-MS XML parsers.  But I've heard of drastic speed improvements in MSXML 4.0 over 3.0.  Is the performance problem on the MS OS, the other one, or both?

You may be right in your particular situation.  The DB may be the best answer especially if the DB is on the same local network as your various web servers.
Ah well - Asked By Desert Ghost on 26-Oct-01 02:25 PM
Was hoping I could use this for something.

But BINGO! What you say makes sense... IF you control your servers and your databases, use programs.  If you do NOT, XML is a good solution.

I think that's the best way to describe it so far... guess I'll have to vote now for ya -- just gotta learn how.

The test I did was on an NT machine... it seemed slower to load up 10,000 records and organize them into XML than it was for me to manipulate the data manually.  Then again, I'm not real experienced with this, so it could be just my coding of it.

That was all with MSXML 3.  Never used the *nix version of it, was hoping I didn't have to.  Figured there may be something unique to PHP that could've grabbed it.  Ah well... a solution for another time, perhaps.  Someday I'll find my XML ship and a good, practical reason to really use it, instead of just saying "We have an XML solution for that".

Oh wow... - Asked By Robbe Morris on 26-Oct-01 02:29 PM
The normal MSXML4.0 parser doesn't handle such large individual files real well.  Might try the SAX XML parser.  It is better suited for larger files.

XML is really better suited for small files.  Say maybe a few hundred records.

While it is supported, I don't think the designers anticipated a high number of transactions with that much data via XML.
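
For anyone wondering what a SAX-style parse looks like, here is a rough Python sketch using the standard library's xml.sax module rather than MSXML. It walks a 10,000-record document and keeps only running totals, so no tree of the whole file is ever built in memory. The record / qty element names, the RecordCounter class and the make_sample helper are invented for this example.

```python
import io
import xml.sax

class RecordCounter(xml.sax.ContentHandler):
    """Counts <record> elements and sums their <qty> values as events arrive."""

    def __init__(self):
        super().__init__()
        self.count = 0
        self.total_qty = 0
        self._in_qty = False
        self._buf = []

    def startElement(self, name, attrs):
        if name == "record":
            self.count += 1
        elif name == "qty":
            self._in_qty = True
            self._buf = []

    def characters(self, content):
        # Text may arrive in several pieces, so collect it until </qty>.
        if self._in_qty:
            self._buf.append(content)

    def endElement(self, name):
        if name == "qty":
            self.total_qty += int("".join(self._buf))
            self._in_qty = False

def make_sample(n):
    """Build a hypothetical test document with n records."""
    parts = ["<records>"]
    parts.extend(f"<record><qty>{i}</qty></record>" for i in range(n))
    parts.append("</records>")
    return "".join(parts).encode("utf-8")

handler = RecordCounter()
xml.sax.parse(io.BytesIO(make_sample(10000)), handler)
```

The trade-off is that the handler has to track its own state (here, whether it is inside a qty element), whereas a DOM parser hands back a tree you can walk freely at the cost of holding the whole document in memory.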
oooooh - Asked By Desert Ghost on 26-Oct-01 02:33 PM
Maybe that's why the performance was off :)

Yeah, I need to handle large data at times, small at others, but there is no "max" records.

SAX XML parser? Is that the company? Hmmm... I'll have to ask ole Google about this one.

Wondering though, how *difficult* could it be to write your own parser??? Maybe another option.

Thanks again!
Nope...I'd use theirs. - Asked By Robbe Morris on 26-Oct-01 02:38 PM
Their parsers (MS and SAX) are highly optimized.  While I'm certain you are a smart guy, I'm not sure it would be worth your time trying to write a parser that properly handles the seemingly endless number of different XML tags...

I'll see if either Peter or myself can round up a link to the other parser.  

Back to my wonderful in memory COM object hierarchy trees.  Man, I'm getting sick messing with this stuff...
Heh - Asked By Desert Ghost on 26-Oct-01 02:45 PM
I'll switch with ya... ever notice you can't return a UDT back to ASP? C++ is a Nazi with COM also...

I searched Google for it too.  FINALLY, I know what Xerces is... heard of it before, just didn't know what it was... it's the XML parser from Apache -- open source.

True, I'm in the mindset that developed products are (supposed to be) fully optimized before deployment.  So you're definitely right on about that one.
<note attitude="inquisitive">XML Redux</note> - Asked By Peter Bromberg on 26-Oct-01 02:51 PM
2 cents:
I've seen "XML" held up on a pedestal by the ill-informed and then terribly misused at both great expense and time lost.
On the other hand, if you need to either transfer amounts of information or to make remote procedure calls between disparate systems / OS's, XML has quickly become the de facto standard. SAX BTW is not good for streaming data over the wire, only for blasting through XML and responding appropriately to formalized events as it passes through. If you want to stream XML, you can do that directly out of many RDBMS's without incurring the overhead of loading it into a parser. Parsers are for creating a traversable DOM from the XML, a function which can often be unnecessary.
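
As a rough illustration of emitting XML without ever loading it into a parser, here is a small Python sketch. It is an assumption-laden stand-in: SQLite (in memory) plays the role of the RDBMS and does not produce XML itself, so the rows are wrapped by hand as they come off the cursor. The widgets table and the stream_widgets_xml function are hypothetical names.

```python
import sqlite3
from xml.sax.saxutils import escape, quoteattr

def stream_widgets_xml(conn):
    """Yield an XML document piece by piece, one chunk per database row."""
    yield "<widgets>"
    query = "SELECT id, name FROM widgets ORDER BY id"
    for widget_id, name in conn.execute(query):
        # escape() and quoteattr() take care of &, < and > and of quoting.
        yield f"<widget id={quoteattr(str(widget_id))}>{escape(name)}</widget>"
    yield "</widgets>"

# Hypothetical sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE widgets (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO widgets (name) VALUES (?)",
    [("WidgetA",), ("Nuts & Bolts",)],
)

# In a real server each chunk would be written to the response as it is produced.
chunks = list(stream_widgets_xml(conn))
document = "".join(chunks)
```

Because the generator only ever holds one row at a time, memory use stays flat no matter how many records are sent, and no DOM is created on the sending side.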
Here's the situation - Asked By Desert Ghost on 26-Oct-01 03:04 PM
Let's say I have a database that holds information about WidgetA.  Now, depending on certain conditions, the load balance can be taken by either Linux/Apache, or Windows/IIS <for reasons I don't want to get into here>.

The platforms are different, but the data needs to reach a program.  So, I've thought of a few things:

1. Use XML to make 'carbon copies' on each platform <data does not change very often>, updating appropriately
2. Keep one database on one OS, and just call to wherever the data resides

Currently, we have two separate dbs, one on each system... seems like a waste to me.  I thought, maybe even keeping one XML doc on one platform will clear it up.

And you're right, I'm ill-informed, and probably could write a more efficient database app than XML app, just due to my experience.  But if XML is the way to go, I'll spend the next several months perfecting it.

Oh, BTW... I just read your most recent article...  Had to laugh "I don't do as well sitting in classes..." <or something like that> LOL! You have a PhD! I'd imagine you spent an awful amount of time sitting in classes!
At first "blush"... - Asked By Peter Bromberg on 26-Oct-01 03:11 PM
Based on what you're describing, I would be inclined to try to set everything up on one database on one machine (whichever OS, it could be DB2 or Oracle on Linux, it could be SQL Server on Windows, doesn't matter) and have the various machines get their data from the one database system. Modern databases can be scaled up (multiple processors) and scaled out (federated partitioned views over x number of machines). You can access ODBC data sources over the wire with PHP, JSP, ASP, Python, etc.
Now you may not need XML at all...
Kinda thought so - Asked By Desert Ghost on 26-Oct-01 03:19 PM
I thought that would be the best way to go, but before Robbe's suggestion, I was still trying to find a niche for XML -- which is now cleared up.

Yup, access one database.  Luckily <and in a strange move>, MS just released a JDBC driver for SQL Server... what's up with that???

Thanks for the suggestions and clearing up the "XML-Files" for me...  back to work :(