Parsing the nodes of an XML file

乔比

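How can I parse all the XML files in a given directory as input to an application and write the output to a text file?

Note: the XML is not always the same. The nodes in the XML can vary and can have any number of child nodes.

Any help or guidance on this would be much appreciated :)

Sample XML file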

<CATALOG>
<CD>
<TITLE>Empire Burlesque</TITLE>
<ARTIST>Bob Dylan</ARTIST>
<COUNTRY>
<CNT>USA</CNT>
<CODE>3456</CODE>
</COUNTRY>
<COMPANY>Columbia</COMPANY>
<PRICE>10.90</PRICE>
<YEAR>1985</YEAR>
</CD>
<CD>
<TITLE>Hide your heart</TITLE>
<ARTIST>Bonnie Tyler</ARTIST>
<COUNTRY>UK</COUNTRY>
<COMPANY>CBS Records</COMPANY>
<PRICE>9.90</PRICE>
<YEAR>1988</YEAR>
</CD>
</CATALOG>

C# code

using System;
using System.Collections.Generic;
using System.Windows.Forms;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.IO;
using System.Data;
using System.Xml;
using System.Xml.Linq;

namespace XMLTagParser
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Please Enter the Location of the file");

            // get the location we want to get the sitemaps from 
            string dirLoc = Console.ReadLine();

            // get all the sitemaps 
            string[] sitemaps = Directory.GetFiles(dirLoc);
            StreamWriter sw = new StreamWriter(Application.StartupPath + @"\locs.txt", true);

            // loop through each file 
            foreach (string sitemap in sitemaps)
            {
                try
                {
                    // new xdoc instance 
                    XmlDocument xDoc = new XmlDocument();

                    //load up the xml from the location 
                    xDoc.Load(sitemap);

                    // cycle through each child node 
                    foreach (XmlNode node in xDoc.DocumentElement.ChildNodes)
                    {
                        // first node is the url ... have to go to nested loc node 
                        foreach (XmlNode locNode in node)
                        {
                            string loc = locNode.Name;

                            // write it to the console so you can see its working 
                            Console.WriteLine(loc + Environment.NewLine);

                            // write it to the file 
                            sw.Write(loc + Environment.NewLine);
                        }
                    }
                }
                catch
                {
                    Console.WriteLine("Error :-(");
                }
            }
            // flush buffered output and release the file handle
            sw.Close();
            Console.WriteLine("All Done :-)");
            Console.ReadLine();
        }
    }
}

Preferred output:

CATALOG/CD/TITLE
CATALOG/CD/ARTIST
CATALOG/CD/COUNTRY/CNT
CATALOG/CD/COUNTRY/CODE
CATALOG/CD/COMPANY
CATALOG/CD/PRICE
CATALOG/CD/YEAR

CATALOG/CD/TITLE
CATALOG/CD/ARTIST
CATALOG/CD/COUNTRY
CATALOG/CD/COMPANY
CATALOG/CD/PRICE
CATALOG/CD/YEAR

圣格里菲思

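This is a recursive problem, and what you are looking for is called "tree traversal". For each child node, you look at its children, then at their children (if any), and so on, recording the "path" as you go but printing it only when you reach a "leaf" node.

You will need a function like this to "traverse" the tree: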

static void traverse(XmlNodeList nodes, string parentPath)
{
    foreach (XmlNode node in nodes)
    {
        string thisPath = parentPath;
        if (node.NodeType != XmlNodeType.Text)
        {
            //Prevent adding "#text" at the end of every chain
            thisPath += "/" + node.Name;
        }

        if (!node.HasChildNodes)
        {
            //Only print out this path if it is at the end of a chain
            Console.WriteLine(thisPath);
        }

        //Look into the child nodes using this function recursively
        traverse(node.ChildNodes, thisPath);
    }
}
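
To see how the path builds up, here is a minimal sketch that runs the same walk over a trimmed copy of the first CD from the question, collecting the paths into a list instead of printing them. The class name TraverseDemo and the helpers Collect and CollectPaths are made up for this illustration and are not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

public static class TraverseDemo
{
    // Same walk as traverse, but each leaf path is appended to a list.
    static void Collect(XmlNodeList nodes, string parentPath, List<string> paths)
    {
        foreach (XmlNode node in nodes)
        {
            string thisPath = parentPath;
            if (node.NodeType != XmlNodeType.Text)
            {
                // Text nodes are named "#text"; leave them out of the path
                thisPath += "/" + node.Name;
            }

            if (!node.HasChildNodes)
            {
                // End of a chain: record the path that led here
                paths.Add(thisPath);
            }

            Collect(node.ChildNodes, thisPath, paths);
        }
    }

    // Hypothetical helper: parse an XML string and return its leaf paths.
    public static List<string> CollectPaths(string xml)
    {
        var xDoc = new XmlDocument();
        xDoc.LoadXml(xml);

        var paths = new List<string>();
        XmlElement root = xDoc.DocumentElement;
        Collect(root.ChildNodes, root.Name, paths);
        return paths;
    }

    public static void Main()
    {
        string xml =
            "<CATALOG><CD>" +
            "<TITLE>Empire Burlesque</TITLE>" +
            "<COUNTRY><CNT>USA</CNT><CODE>3456</CODE></COUNTRY>" +
            "</CD></CATALOG>";

        foreach (string path in CollectPaths(xml))
        {
            Console.WriteLine(path);
        }
    }
}
```

For this input the walk yields CATALOG/CD/TITLE, CATALOG/CD/COUNTRY/CNT and CATALOG/CD/COUNTRY/CODE, in that order. COUNTRY itself is not listed because it has child elements, while TITLE is listed once, when the walk reaches its text node.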

Then this is how I would add it to your program (inside the foreach sitemap loop):

try
{
    // new xdoc instance 
    XmlDocument xDoc = new XmlDocument();

    //load up the xml from the location 
    xDoc.Load(sitemap);

    // start traversing from the children of the root element;
    // DocumentElement skips a leading <?xml ...?> declaration, which FirstChild would return
    var rootNode = xDoc.DocumentElement;
    traverse(rootNode.ChildNodes, rootNode.Name);
}
catch (Exception ex)
{
    // report why the file was skipped instead of hiding the cause
    Console.WriteLine("Error :-( " + ex.Message);
}
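
The question also asks for the output to go to a text file. One way to do that, sketched here under some assumptions, is to pass a TextWriter into the recursive function and wrap the StreamWriter in a using block so it is flushed and closed. The class name LeafPathsToFile, the "*.xml" filter, and taking the directory and output file name from the command line are all choices made for this sketch, not something the original code does.

```csharp
using System;
using System.IO;
using System.Xml;

public static class LeafPathsToFile
{
    // Same recursion as traverse, with the destination passed in as a TextWriter.
    static void Traverse(XmlNodeList nodes, string parentPath, TextWriter output)
    {
        foreach (XmlNode node in nodes)
        {
            string thisPath = parentPath;
            if (node.NodeType != XmlNodeType.Text)
            {
                thisPath += "/" + node.Name;
            }

            if (!node.HasChildNodes)
            {
                output.WriteLine(thisPath);
            }

            Traverse(node.ChildNodes, thisPath, output);
        }
    }

    public static void Main(string[] args)
    {
        // Assumption: first argument is the directory, second is the output file.
        string dirLoc = args.Length > 0 ? args[0] : ".";
        string outFile = args.Length > 1 ? args[1] : "locs.txt";

        // List the inputs before the output file is created.
        string[] xmlFiles = Directory.GetFiles(dirLoc, "*.xml");

        // The using block flushes and closes the writer even if a file fails.
        using (var sw = new StreamWriter(outFile, append: false))
        {
            foreach (string file in xmlFiles)
            {
                try
                {
                    var xDoc = new XmlDocument();
                    xDoc.Load(file);

                    XmlElement root = xDoc.DocumentElement;
                    Traverse(root.ChildNodes, root.Name, sw);
                }
                catch (XmlException ex)
                {
                    // Malformed XML: report it and carry on with the next file
                    Console.WriteLine("Skipping " + file + ": " + ex.Message);
                }
            }
        }
    }
}
```

Run with the directory and the output file name as arguments. Each leaf path is written on its own line, one file after another, with no separator between files.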

I made use of another helpful answer: "Traverse XML using a recursive function"

Hope this helps! :)
