I have a 6GB XML file and I'm using XmlReader to loop through the file. The file's huge but there's nothing I can do about that. I use LINQ, but the size doesn't let me use XDocument as I get an OutOfMemory error.
I'm using XmlReader to loop through the whole file and extract what I need. I'm including a sample XML file.
Essentially, this is what I do:
1. Find a CONTAINER tag.
2. If found, retrieve its "id" attribute.
3. If the "id" begins with LOCAL, this is a container I want to read.
4. Loop the reader until I find a FAMILY tag with the value CELL_FDD.
5. When found, loop reader.Read() until I find the IMPORTANT_VALUE tag.
6. Once found, read the value of IMPORTANT_VALUE.
7. I'm done with this container, so continue looping until I find the next CONTAINER (that's where the break comes in).

This is the simplified version of how I've been reading the file and finding the relevant values.
    while (myReader.Read())
    {
        if ((myReader.Name == "CONTAINER"))
        {
            if (myReader.HasAttributes)
            {
                string Attribute = myReader.GetAttribute("id");
                if (Attribute.IndexOf("LOCAL_") >= 0)
                {
                    while (myReader.Read())
                    {
                        if (myReader.Name == "FAMILY")
                        {
                            myReader.Read(); //read value
                            string Family = myReader.Value;
                            if (Family == "CELL_FDD")
                            {
                                while (myReader.Read())
                                {
                                    if ((myReader.Name == "IMPORTANT_VALUE"))
                                    {
                                        myReader.Read();
                                        string Counter = myReader.Value;
                                        Console.WriteLine(Attribute + " (found: " + Counter + ")");
                                        break;
                                    }
                                }
                            }
                        }
                    }
                }
            }
        }
    }

And this is the XML:
    <es:esFD xmlns:es="File.xsd">
      <vs:vsFD xmlns:vs="OTHER_FILE.xsd">
        <CONTAINER id="LOCAL_CONTAINER1">
          <ATTRIBUTES>
            <FAMILY>CELL_FDD</FAMILY>
            <CELL_FDD>
              <VAL1>1.1.2.3</VAL1>
              <VAL2>JSMITH</VAL2>
              <VAL3>320</VAL3>
              <IMPORTANT_VALUE>VERY</IMPORTANT_VALUE>
              <VAL4>320</VAL4>
            </CELL_FDD>
            <FAMILY>BLAH</FAMILY>
            <BLAH>
              <VAL1>1.4.43.3</VAL1>
              <VAL2>NA</VAL2>
              <VAL3>349</VAL3>
              <IMPORTANT_VALUE>NA</IMPORTANT_VALUE>
              <VAL4>43</VAL4>
              <VAL5>00</VAL5>
              <VAL6>12</VAL6>
            </BLAH>
          </ATTRIBUTES>
        </CONTAINER>
        <CONTAINER id="FOREIGN_ELEMENT1">
          <ATTRIBUTES>
            <FAMILY>CELL_FDD</FAMILY>
            <CELL_FDD>
              <VAL1>1.1.2.3</VAL1>
              <VAL2>JSMITH</VAL2>
              <VAL3>320</VAL3>
              <IMPORTANT_VALUE>VERY</IMPORTANT_VALUE>
              <VAL4>320</VAL4>
            </CELL_FDD>
            <FAMILY>BLAH</FAMILY>
            <BLAH>
              <VAL1>1.4.43.3</VAL1>
              <VAL2>NA</VAL2>
              <VAL3>349</VAL3>
              <IMPORTANT_VALUE>NA</IMPORTANT_VALUE>
              <VAL4>43</VAL4>
              <VAL5>00</VAL5>
              <VAL6>12</VAL6>
            </BLAH>
          </ATTRIBUTES>
        </CONTAINER>
      </vs:vsFD>
    </es:esFD>

How can I break out of the innermost loop so that I get back to the top-most loop?
Best answer

Using a separate method should make it easier to control the loop:
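    while (myReader.Read())
    {
        // Match only the start tag; Name is also "CONTAINER" on the end tag.
        if (myReader.NodeType == XmlNodeType.Element && myReader.Name == "CONTAINER")
        {
            ProcessContainerElement(myReader);
        }
    }

In the ProcessContainerElement method, you can return as soon as you determine that you need to start looking for the next CONTAINER element.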
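    private void ProcessContainerElement(XmlReader myReader)
    {
        // The reader is positioned on the <CONTAINER> start tag here.
        string Attribute = myReader.GetAttribute("id");

        // Add the "LOCAL_" and FAMILY checks from the question as needed.
        while (myReader.Read())
        {
            if (myReader.NodeType == XmlNodeType.Element && myReader.Name == "IMPORTANT_VALUE")
            {
                myReader.Read(); // move to the text node
                string Counter = myReader.Value;
                Console.WriteLine(Attribute + " (found: " + Counter + ")");
                return; // back to the outer loop, which looks for the next CONTAINER
            }
        }
    }

Follow-up from the asker: Following svick's comment, I ended up combining XmlReader with LINQ to XML. Once I reached the correct element and checked that the attribute had the correct ID, I passed it to XElement.Load.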
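The combined XmlReader plus LINQ to XML approach could look roughly like the sketch below. It is not the asker's actual code: `ContainerScanner` and `FindImportantValues` are hypothetical names, and `XNode.ReadFrom` is used here (rather than `XElement.Load`) to materialize a single CONTAINER at a time. It also assumes, as in the sample XML, that the elements belonging to a family follow its FAMILY tag as siblings.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

public static class ContainerScanner
{
    // Hypothetical helper: streams through the document and yields
    // "id (found: value)" for every CONTAINER whose id starts with "LOCAL_".
    public static IEnumerable<string> FindImportantValues(XmlReader reader)
    {
        while (!reader.EOF)
        {
            if (reader.NodeType == XmlNodeType.Element && reader.Name == "CONTAINER")
            {
                string id = reader.GetAttribute("id");
                if (id != null && id.StartsWith("LOCAL_", StringComparison.Ordinal))
                {
                    // Loads only this CONTAINER into memory and leaves the
                    // reader on the node that follows </CONTAINER>.
                    var container = (XElement)XNode.ReadFrom(reader);

                    // First FAMILY whose text is CELL_FDD, then the first
                    // IMPORTANT_VALUE inside the siblings that follow it.
                    XElement family = container.Descendants("FAMILY")
                        .FirstOrDefault(f => (string)f == "CELL_FDD");
                    XElement important = family?.ElementsAfterSelf()
                        .Descendants("IMPORTANT_VALUE")
                        .FirstOrDefault();

                    if (important != null)
                    {
                        yield return id + " (found: " + important.Value + ")";
                    }
                }
                else
                {
                    // Not interesting: jump past the whole element.
                    reader.Skip();
                }
                // ReadFrom and Skip have already advanced the reader,
                // so do not call Read() again for this iteration.
                continue;
            }
            reader.Read();
        }
    }
}
```

Because each container is handled as a small in-memory XElement, there are no nested reader loops to break out of, and memory use stays bounded by the size of the largest CONTAINER rather than the whole file. It would be called with something like `XmlReader.Create(path)` and a `foreach` over the results.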