Thursday, May 3, 2012

Reading an Xml Stream with XmlTextReader



In a previous article on the XmlTextReader Constructor we discussed several ways to instantiate an XmlTextReader object. In this article we are going to read an RSS feed as a stream using System.Xml.XmlTextReader. RSS (Really Simple Syndication) works well for this because a feed is a read-only XML stream with a well-documented, established format. We'll pull headlines from the RSS feed and display them on a web page.

Writing out some headlines

We're going to start by using the XmlTextReader.Read() method to step through the nodes one at a time. We'll also use the XmlTextReader.Name property to find the nodes that we are interested in. Once we've determined that we are at a title node we'll write out its text using the XmlTextReader.ReadString() method.
//these snippets assume: using System.IO; using System.Net; using System.Xml;
WebRequest request = WebRequest.Create(UrlToRSSFeed); 
WebResponse response = request.GetResponse(); 

//Get the Response Stream from the URL 
Stream responseStream = response.GetResponseStream(); 

// Read the Response Stream using XmlTextReader 
XmlTextReader reader = new XmlTextReader(responseStream); 

//read through all the nodes
while(reader.Read())
{
  //the headlines we want are in the item nodes
  if(reader.Name == "item")
  {
    while(reader.Read())
    {
      //if the node is a title we want to read its value
      if(reader.Name == "title")
      {
        Response.Write(reader.ReadString() + "<br />");
      }
    }
  }
}

responseStream.Close (); 
response.Close();
While this gives us a list of headlines, it's not very interesting because they are just text. Also note how the two loops interact: the first while(reader.Read()) starts the node-by-node reading of the XML, but once we reach the first item node the second while(reader.Read()) takes over and is the loop that actually reads the rest of the stream. The only reason to check for reader.Name == "item" is to make sure we have reached an item node before we start pulling titles. If we'd left out the check for the item node and the second while loop, we would also have pulled the title of the RSS feed itself.
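
To make the effect of the item check concrete, here is a minimal sketch that runs the same idea against feed text held in a string instead of a live URL. The HeadlineSketch class, the ReadItemTitles helper and the sample feed are made up for illustration. It also tracks whether it is inside an item with a flag rather than a nested loop, so treat it as a variation on the technique, not a drop-in replacement.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class HeadlineSketch
{
    // Hypothetical helper: returns the title of every <item> in the feed text.
    public static List<string> ReadItemTitles(string rssXml)
    {
        List<string> titles = new List<string>();
        bool insideItem = false;

        using (XmlTextReader reader = new XmlTextReader(new StringReader(rssXml)))
        {
            while (reader.Read())
            {
                if (reader.Name == "item")
                {
                    // true on <item>, false again on </item> or an empty <item />
                    insideItem = reader.NodeType == XmlNodeType.Element
                                 && !reader.IsEmptyElement;
                }
                else if (insideItem
                         && reader.NodeType == XmlNodeType.Element
                         && reader.Name == "title")
                {
                    // ReadString returns the text and leaves the reader on </title>
                    titles.Add(reader.ReadString());
                }
            }
        }

        return titles;
    }
}
```

Given a feed whose channel title is "Feed" and whose two items are titled "One" and "Two", ReadItemTitles returns only "One" and "Two". The channel title is skipped because the reader is not inside an item when it is reached.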

Headlines with more details

Since a list of text-only headlines isn't that useful, we're going to extend our previous example by also pulling out the URL and the description for each headline. We'll use the URL to create a link and follow it with the description to flesh out our headlines. To do this we need the values of the title, link, and description nodes within each item node. To figure out when we've reached the end of an item node we're going to use the XmlTextReader.Depth property: we record the depth of the item's start tag, and the first node we meet at that same depth afterwards is its corresponding end tag.
WebRequest request = WebRequest.Create(UrlToRSSFeed); 
WebResponse response = request.GetResponse(); 

//Get the Response Stream from the URL 
Stream responseStream = response.GetResponseStream(); 

// Read the Response Stream using XmlTextReader 
XmlTextReader reader = new XmlTextReader(responseStream); 

//some string variables to hold the rss info we want
string title = String.Empty;
string link = String.Empty;
string description = String.Empty;

//a variable to track the depth of the item node
int itemDepth = 0;

while(reader.Read())
{
  //we are only interested in the item nodes 
  if(reader.Name == "item")
  {
    //set the item depth
    itemDepth = reader.Depth;

    //clear the values left over from the previous item
    title = link = description = String.Empty;
  
    //advance the reader one node
    reader.Read();
  
    //get all the information we are looking for until we hit the 
    //end tag, which is the same depth as the start tag.
    while(reader.Depth != itemDepth)
    {   
      //find the elements we are interested in and set the
      //string variables to their values
      if(reader.Name == "title")
      {
        title = reader.ReadString();
      }
      else if(reader.Name == "link")
      {
        link = reader.ReadString();
      }
      else if(reader.Name == "description")
      {
        description = reader.ReadString();
      }
      
      //advance the reader one node
      reader.Read();        
    }
    
    //now that we have our info write it to the page
    //(a regular C# string literal cannot span lines, so keep it on one)
    Response.Write(String.Format("<a href=\"{0}\">{1}</a><br />{2}<p />",
                    link, title, description));
  }

}

responseStream.Close (); 
response.Close();
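
If the depth bookkeeping is hard to picture, the following sketch lists the Depth, NodeType and name (or text) of every node in a small fragment. DepthSketch and DescribeNodes are hypothetical names used only for this illustration, and the fragment is invented sample data.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class DepthSketch
{
    // Hypothetical helper: one "depth nodeType nameOrText" line per node.
    public static List<string> DescribeNodes(string xml)
    {
        List<string> lines = new List<string>();

        using (XmlTextReader reader = new XmlTextReader(new StringReader(xml)))
        {
            while (reader.Read())
            {
                // Text nodes have an empty Name, so show their Value instead
                string label = reader.Name.Length > 0 ? reader.Name : reader.Value;
                lines.Add(String.Format("{0} {1} {2}",
                                        reader.Depth, reader.NodeType, label));
            }
        }

        return lines;
    }
}
```

For the fragment `<item><title>A</title></item>` this produces:

```
0 Element item
1 Element title
2 Text A
1 EndElement title
0 EndElement item
```

Everything inside the item sits at a greater depth than the item itself, and the closing tag is the first node that returns to the item's depth, which is what the while(reader.Depth != itemDepth) loop relies on.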

Headlines with more details - take 2

Another way to find the closing tag for a node is to use the XmlTextReader.Name property together with the XmlTextReader.NodeType property. If the NodeType is XmlNodeType.EndElement and the Name of the node matches the starting element, then you know you've hit the closing tag. We're also using the XmlTextReader.IsStartElement() method with the Name property to make sure that we are at the correct start tag.
WebRequest request = WebRequest.Create(UrlToRSSFeed); 
WebResponse response = request.GetResponse(); 

//Get the Response Stream from the URL 
Stream responseStream = response.GetResponseStream(); 


// Read the Response Stream using XmlTextReader 
XmlTextReader reader = new XmlTextReader(responseStream); 

//some string variables to hold the rss info we want
string title = String.Empty;
string link = String.Empty;
string description = String.Empty;

while(reader.Read())
{
  //we are only interested in the item nodes 
  if(reader.IsStartElement() && reader.Name == "item")
  { 
    //clear the values left over from the previous item
    title = link = description = String.Empty;

    while(reader.Read())
    {   
      //find the elements we are interested in and set the
      //string variables to their values
      if(reader.Name == "title")
      {
        title = reader.ReadString();
      }
      else if(reader.Name == "link")
      {
        link = reader.ReadString();
      }
      else if(reader.Name == "description")
      {
        description = reader.ReadString();
      }  
      
      //if we've hit the item node's closing tag we need to break
      //out of the loop and write out the values
      else if(reader.Name == "item" && reader.NodeType == XmlNodeType.EndElement)
      {
        break;
      }      
    }
    
    //now that we have our info write it to the page
    //(a regular C# string literal cannot span lines, so keep it on one)
    Response.Write(String.Format("<a href=\"{0}\">{1}</a><br />{2}<p />",
                    link, title, description));
  }

}

responseStream.Close (); 
response.Close();
In all of these examples, UrlToRSSFeed is a string holding the URL of any valid XML feed.
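
Because the examples above need a live feed and a page to write to, they are awkward to try out in isolation. As a rough sketch, the end-tag technique can be wrapped in a method that takes any TextReader and returns the items it found. FeedItemSketch, FeedItem and ReadItems are hypothetical names, not part of System.Xml, and the single-argument IsStartElement("item") overload is used here in place of the separate Name comparison.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class FeedItemSketch
{
    // Hypothetical container for the three values we pull from each item.
    public sealed class FeedItem
    {
        public string Title = String.Empty;
        public string Link = String.Empty;
        public string Description = String.Empty;
    }

    public static List<FeedItem> ReadItems(TextReader feedText)
    {
        List<FeedItem> items = new List<FeedItem>();

        using (XmlTextReader reader = new XmlTextReader(feedText))
        {
            while (reader.Read())
            {
                // skip everything that is not a non-empty <item> start tag
                if (!reader.IsStartElement("item") || reader.IsEmptyElement)
                    continue;

                // a fresh object per item, so nothing carries over between items
                FeedItem item = new FeedItem();

                while (reader.Read())
                {
                    // </item> ends this item
                    if (reader.NodeType == XmlNodeType.EndElement && reader.Name == "item")
                        break;
                    if (reader.NodeType != XmlNodeType.Element)
                        continue;

                    switch (reader.Name)
                    {
                        case "title": item.Title = reader.ReadString(); break;
                        case "link": item.Link = reader.ReadString(); break;
                        case "description": item.Description = reader.ReadString(); break;
                    }
                }

                items.Add(item);
            }
        }

        return items;
    }
}
```

Calling FeedItemSketch.ReadItems(new StringReader(feedXml)) with a small hand-written feed is enough to check the parsing. For a real feed, a StreamReader over the response stream could be passed instead.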

In Conclusion

You've just seen how to use a variety of XmlTextReader methods and properties to read specific nodes from an XML stream. These examples could be modified to write headlines or other XML content directly into a page, or to create a user control that displays one or more RSS feeds in a web page.
