Thursday, May 6, 2010

Comparison XPathDocument, XmlReader, XLINQ, XmlDocument

From the days of .net 1.0 many of us use XmlDocument for parsing and writing xml in our .Net code. If you ask anyone who started with .Net 3.0 or above, they might say Xlinq is the best method to work with XML. Well, when we conclude on the method that we are going to use in our applications, we also need to see how it will impact on our application or if there is any better method to use.


Let us consider an XMl which I need to just parse and get the data out of it. If I am using .Net 3.5 or 4.0, first thought that comes to mind is that use Xlinq. I did a small test just to check which method would be better to read the xml and parse through the value and the result was really surprising.


Here is my Test program

class Program
{
static void Main(string[] args)
{
Stopwatch sw = new Stopwatch();


string xmlinput = "<Customer><Item desc=\"item1 desc\" id=\"1\" /><Item desc=\"item2 desc\" id=\"2\" /><Item desc=\"item3 desc\" id=\"3\" /><Item desc=\"item4 desc\" id=\"4\" /><Item desc=\"item5 desc\" id=\"5\" /><Item desc=\"item1 desc\" id=\"1\" /><Item desc=\"item2 desc\" id=\"2\" /><Item desc=\"item3 desc\" id=\"3\" /><Item desc=\"item4 desc\" id=\"4\" /><Item desc=\"item5 desc\" id=\"5\" /><Item desc=\"item1 desc\" id=\"1\" /><Item desc=\"item2 desc\" id=\"2\" /><Item desc=\"item3 desc\" id=\"3\" /><Item desc=\"item4 desc\" id=\"4\" /><Item desc=\"item5 desc\" id=\"5\" /><Item desc=\"item1 desc\" id=\"1\" /><Item desc=\"item2 desc\" id=\"2\" /><Item desc=\"item3 desc\" id=\"3\" /><Item desc=\"item4 desc\" id=\"4\" /><Item desc=\"item5 desc\" id=\"5\" /></Customer>";


XmlReader objReader = System.Xml.XmlReader.Create(new System.IO.StringReader(xmlinput));


sw.Start();
for (int i = 0; i < 1000; i++)
{
 Program.NavigateXPathNavigate(objReader);
}
sw.Stop();
Console.WriteLine("XPATH ----- " + sw.ElapsedTicks.ToString());
Console.WriteLine(" ");
sw.Reset();
sw.Start();
for (int i = 0; i < 1000; i++)
{
Program.NavigateXmlDocument(xmlinput);
}
sw.Stop();
Console.WriteLine("XMLDOC ------ " + sw.ElapsedTicks.ToString());
Console.WriteLine(" ");
sw.Reset();
sw.Start();
for (int i = 0; i < 1000; i++)
{
Program.NavigateXmlReader(xmlinput);
}
sw.Stop();
Console.WriteLine("XML Reader ------ " + sw.ElapsedTicks.ToString());
Console.WriteLine(" ");
sw.Reset();
sw.Start();
for (int i = 0; i < 1000; i++)
{
Program.NavigateXlinq(xmlinput);
}
sw.Stop();
Console.WriteLine("XLINQ ------ " + sw.ElapsedTicks.ToString());
Console.WriteLine(" ");
Console.ReadKey();
}






private static void NavigateXPathNavigate(XmlReader xmlinput)
{
XPathDocument xpathDoc = new XPathDocument(xmlinput);
XPathNavigator xpathNavig = xpathDoc.CreateNavigator();
XPathNodeIterator pgItereator = xpathNavig.Select("/Customer/Item");
foreach (XPathNavigator n in pgItereator)
{
 string val = n.GetAttribute("id", "");
}
}




private static void NavigateXmlReader(string xmlinput)
{
XmlReader objReader = System.Xml.XmlReader.Create(new System.IO.StringReader(xmlinput));
while (objReader.Read())
{
if (objReader.NodeType == XmlNodeType.Element)
{
if (objReader.Name == "Item")
{
 objReader.MoveToAttribute("id");
 string id = objReader.Value;
}
}
}
}






private static void NavigateXmlDocument(string xmlinput)
{
XmlDocument objdoc = new XmlDocument();
objdoc.LoadXml(xmlinput);
XmlNodeList objlist = objdoc.SelectNodes("/Customer/Item");
foreach (XmlNode n in objlist)
{
  string val = n.Attributes["id"].Value.ToString();
}
}






private static void NavigateXlinq(string xmlinput)
{
XElement obj = XElement.Parse(xmlinput);
IEnumerable oldItem = (from c in obj.Elements("Item")
select c);
foreach (XElement ele in oldItem)
{
  string val = ele.Attribute("id").Value;
}
}
}


I have kept the watcher for each method to get the time taken to process the method. Here is the output of my program











The XPathDocument was always much lower than any other methods. And the performance of the method always fall into below order


• XPATHDocument
• XMlReader
• XLinq
• XMLDocument


This is purely based on my sample test and it may differ based on the situation. For any enterprise applications, where the number hits and concurrent users are huge, we need to definitely consider on these methods as you can see a drastic difference on the process time.

No comments:

Post a Comment