Iteration through XML?

I have a 6GB XML file and I'm using XmlReader to loop through it. The file is huge, but there's nothing I can do about that. I normally use LINQ, but the file is too large to load into an XDocument: I get an OutOfMemory error.

I'm using XmlReader to loop through the whole file and extract what I need. I'm including a sample XML file.

Essentially, this is what I do:

1. Find a CONTAINER tag.
2. If found, retrieve its "id" attribute.
3. If the id begins with LOCAL, this is a container I want to read.
4. Keep reading until I find a FAMILY tag with the value CELL_FDD.
5. Once found, loop on reader.Read() until I find the IMPORTANT_VALUE tag.
6. Read the value of IMPORTANT_VALUE.
7. I'm now done with this container, so continue looping until I find the next CONTAINER (that's where the break comes in).

This is the simplified version of how I've been reading the file and finding the relevant values.

    while (myReader.Read()) {
        if ((myReader.Name == "CONTAINER")) {
            if (myReader.HasAttributes) {
                string Attribute = myReader.GetAttribute("id");
                if (Attribute.IndexOf("LOCAL_") >= 0) {
                    while (myReader.Read()) {
                        if (myReader.Name == "FAMILY") {
                            myReader.Read(); // read value
                            string Family = myReader.Value;
                            if (Family == "CELL_FDD") {
                                while (myReader.Read()) {
                                    if ((myReader.Name == "IMPORTANT_VALUE")) {
                                        myReader.Read();
                                        string Counter = myReader.Value;
                                        Console.WriteLine(Attribute + " (found: " + Counter + ")");
                                        break;
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }

And this is the XML:

    <es:esFD xmlns:es="File.xsd">
      <vs:vsFD xmlns:vs="OTHER_FILE.xsd">
        <CONTAINER id="LOCAL_CONTAINER1">
          <ATTRIBUTES>
            <FAMILY>CELL_FDD</FAMILY>
            <CELL_FDD>
              <VAL1>1.1.2.3</VAL1>
              <VAL2>JSMITH</VAL2>
              <VAL3>320</VAL3>
              <IMPORTANT_VALUE>VERY</IMPORTANT_VALUE>
              <VAL4>320</VAL4>
            </CELL_FDD>
            <FAMILY>BLAH</FAMILY>
            <BLAH>
              <VAL1>1.4.43.3</VAL1>
              <VAL2>NA</VAL2>
              <VAL3>349</VAL3>
              <IMPORTANT_VALUE>NA</IMPORTANT_VALUE>
              <VAL4>43</VAL4>
              <VAL5>00</VAL5>
              <VAL6>12</VAL6>
            </BLAH>
          </ATTRIBUTES>
        </CONTAINER>
        <CONTAINER id="FOREIGN_ELEMENT1">
          <ATTRIBUTES>
            <FAMILY>CELL_FDD</FAMILY>
            <CELL_FDD>
              <VAL1>1.1.2.3</VAL1>
              <VAL2>JSMITH</VAL2>
              <VAL3>320</VAL3>
              <IMPORTANT_VALUE>VERY</IMPORTANT_VALUE>
              <VAL4>320</VAL4>
            </CELL_FDD>
            <FAMILY>BLAH</FAMILY>
            <BLAH>
              <VAL1>1.4.43.3</VAL1>
              <VAL2>NA</VAL2>
              <VAL3>349</VAL3>
              <IMPORTANT_VALUE>NA</IMPORTANT_VALUE>
              <VAL4>43</VAL4>
              <VAL5>00</VAL5>
              <VAL6>12</VAL6>
            </BLAH>
          </ATTRIBUTES>
        </CONTAINER>
      </vs:vsFD>
    </es:esFD>

How can I break out of the innermost loop so that I get back to the outermost loop?

Best answer

Using a separate method should make it easier to control the loops:

    while (myReader.Read()) {
        if ((myReader.Name == "CONTAINER")) {
            ProcessContainerElement(myReader);
        }
    }

Inside the ProcessContainerElement method, you can simply return as soon as you decide it is time to start looking for the next CONTAINER element.

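    private void ProcessContainerElement(XmlReader myReader) {
        // The reader is still positioned on the CONTAINER start tag here,
        // so the id can be read inside this method.
        string Attribute = myReader.GetAttribute("id");

        // The LOCAL_ and FAMILY checks from the question are omitted for brevity.
        while (myReader.Read()) {
            if ((myReader.Name == "IMPORTANT_VALUE")) {
                myReader.Read(); // move to the text node
                string Counter = myReader.Value;
                Console.WriteLine(Attribute + " (found: " + Counter + ")");
                return; // hands control back to the outer CONTAINER loop
            }
        }
    }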
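For reference, here is a fuller sketch of the same idea with the checks from the question put back in. It is only one possible way to write it, and it makes a few assumptions that are not in the original posts: the names ContainerScanner and Scan are made up, StartsWith replaces the question's IndexOf test (the question says the id "begins with" LOCAL), and NodeType checks are added so that closing tags are not mistaken for opening ones.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class ContainerScanner
{
    // Hypothetical helper: collects one "id (found: value)" string per matching container.
    public static List<string> Scan(XmlReader reader)
    {
        var results = new List<string>();
        while (reader.Read())
        {
            if (reader.NodeType == XmlNodeType.Element && reader.Name == "CONTAINER")
            {
                string found = ProcessContainerElement(reader);
                if (found != null) results.Add(found);
            }
        }
        return results;
    }

    // Returns as soon as the container is finished with, which replaces the nested breaks.
    private static string ProcessContainerElement(XmlReader reader)
    {
        string id = reader.GetAttribute("id");
        if (id == null || !id.StartsWith("LOCAL_", StringComparison.Ordinal) || reader.IsEmptyElement)
            return null; // not a container of interest

        bool inWantedFamily = false;
        while (reader.Read())
        {
            // Stop at the end of this container so we never run into the next one.
            if (reader.NodeType == XmlNodeType.EndElement && reader.Name == "CONTAINER")
                return null;
            if (reader.NodeType != XmlNodeType.Element || reader.IsEmptyElement)
                continue;

            if (reader.Name == "FAMILY")
            {
                reader.Read(); // move to the text node inside FAMILY
                inWantedFamily = reader.NodeType == XmlNodeType.Text && reader.Value == "CELL_FDD";
            }
            else if (inWantedFamily && reader.Name == "IMPORTANT_VALUE")
            {
                reader.Read(); // move to the text node inside IMPORTANT_VALUE
                return id + " (found: " + reader.Value + ")";
            }
        }
        return null;
    }
}

public static class Program
{
    public static void Main()
    {
        // Shortened version of the sample XML from the question.
        string xml = @"<es:esFD xmlns:es=""File.xsd""><vs:vsFD xmlns:vs=""OTHER_FILE.xsd"">
  <CONTAINER id=""LOCAL_CONTAINER1""><ATTRIBUTES>
    <FAMILY>CELL_FDD</FAMILY><CELL_FDD><IMPORTANT_VALUE>VERY</IMPORTANT_VALUE></CELL_FDD>
    <FAMILY>BLAH</FAMILY><BLAH><IMPORTANT_VALUE>NA</IMPORTANT_VALUE></BLAH>
  </ATTRIBUTES></CONTAINER>
  <CONTAINER id=""FOREIGN_ELEMENT1""><ATTRIBUTES>
    <FAMILY>CELL_FDD</FAMILY><CELL_FDD><IMPORTANT_VALUE>VERY</IMPORTANT_VALUE></CELL_FDD>
  </ATTRIBUTES></CONTAINER>
</vs:vsFD></es:esFD>";

        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            foreach (string line in ContainerScanner.Scan(reader))
                Console.WriteLine(line); // prints: LOCAL_CONTAINER1 (found: VERY)
        }
    }
}
```

For the real 6GB file you would pass XmlReader.Create(path) instead of the StringReader; the reader still only moves forward through the file once.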

Following svick's comment, I ended up combining XmlReader with LINQ to XML. Once I reached the correct element and checked that the attribute had the correct ID, I loaded it with XElement.Load.
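The exact code for that hybrid approach was not posted, so the following is only a rough sketch of how it might look. Note the swap: it uses XNode.ReadFrom rather than XElement.Load to materialize each element, and the names ContainerStream, LocalContainers and ImportantValues are invented for the example. It also assumes, as in the sample XML, that the values for a family sit inside an element named after that family (CELL_FDD).

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

public static class ContainerStream
{
    // Lazily yields each CONTAINER whose id starts with "LOCAL_" as an XElement.
    // Only one container is held in memory at a time.
    public static IEnumerable<XElement> LocalContainers(XmlReader reader)
    {
        reader.MoveToContent();
        while (!reader.EOF)
        {
            if (reader.NodeType == XmlNodeType.Element && reader.Name == "CONTAINER")
            {
                string id = reader.GetAttribute("id");
                if (id != null && id.StartsWith("LOCAL_", StringComparison.Ordinal))
                {
                    // ReadFrom consumes the whole element and leaves the reader
                    // on the node that follows it, so no extra Read() is needed.
                    yield return (XElement)XNode.ReadFrom(reader);
                }
                else
                {
                    reader.Skip(); // jump past the unwanted container in one step
                }
            }
            else
            {
                reader.Read();
            }
        }
    }

    // Runs an ordinary LINQ to XML query against each small in-memory container.
    public static IEnumerable<string> ImportantValues(XmlReader reader)
    {
        foreach (XElement container in LocalContainers(reader))
        {
            string value = (string)container
                .Descendants("CELL_FDD")
                .Elements("IMPORTANT_VALUE")
                .FirstOrDefault();
            if (value != null)
                yield return (string)container.Attribute("id") + " (found: " + value + ")";
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // Shortened version of the sample XML from the question.
        string xml = @"<es:esFD xmlns:es=""File.xsd""><vs:vsFD xmlns:vs=""OTHER_FILE.xsd"">
  <CONTAINER id=""LOCAL_CONTAINER1""><ATTRIBUTES>
    <FAMILY>CELL_FDD</FAMILY><CELL_FDD><IMPORTANT_VALUE>VERY</IMPORTANT_VALUE></CELL_FDD>
  </ATTRIBUTES></CONTAINER>
  <CONTAINER id=""FOREIGN_ELEMENT1""><ATTRIBUTES>
    <FAMILY>CELL_FDD</FAMILY><CELL_FDD><IMPORTANT_VALUE>VERY</IMPORTANT_VALUE></CELL_FDD>
  </ATTRIBUTES></CONTAINER>
</vs:vsFD></es:esFD>";

        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            foreach (string line in ContainerStream.ImportantValues(reader))
                Console.WriteLine(line); // prints: LOCAL_CONTAINER1 (found: VERY)
        }
    }
}
```

The nested loops disappear entirely here: XmlReader does the streaming, and the per-container logic becomes a short query. The trade-off is that each matching container is fully loaded, which is fine as long as individual containers are small.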
