Hello all,
I am seeing some strange and curious behaviour with lxml.
I am using iterparse to run through XML files. I find a certain parameter in each line and then use getparent to get the enclosing block, so that I can run through each element of the block that I want.
The only issue is that when I do this, one or two elements will not return the correct number of children. That is, I call getparent and, instead of getting an element block with 5 children, I get one with only 2 or 3.
I have dug around the net and seen that it could be something to do with not clearing the memory of each element. So I added element.clear() and the ancestor clean-up loop at the end of the code to try to alleviate this, but no dice.
Has anyone come across this issue before?
Code is below:
import glob
import ntpath
import os

# ET is assumed to be lxml.etree (getparent and sourceline are lxml features)
from lxml import etree as ET

for root, dirs, files in os.walk(levelDir):
    exceptions = ['mission', 'continent', 'editor_only', 'lights']
    for d in dirs:
        continentFiles = glob.glob(os.path.join(root, d, '*.continent'))
        for cf in continentFiles:
            # Using ntpath to get the basename of the file
            baseFileName = ntpath.basename(cf).split('.')[0]
            if baseFileName in exceptions:
                continue
            else:
                # Need to run through the elements of tag type entry
                for event, element in ET.iterparse(cf, events=('end',), tag="entry"):
                    attribs = element.attrib
                    block, path = verifyUnitElement(element, attribs)
                    if block == '':
                        continue
                    # TODO: Debug and fix this, not loading blocks with correct number of elements???!!!!
                    if len(block.getchildren()) < 5:
                        print 'Bad block loading', path[0], cf.split('/')[-1], block.sourceline
                        continue
                    tempDict = returnPositionValues(block, path[0], 'name_id')
                    unitInfo.update(tempDict)
                    element.clear()
                    # New addition to help clear memory
                    for ancestor in element.xpath('ancestor-or-self::*'):
                        while ancestor.getprevious() is not None:
                            del ancestor.getparent()[0]
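
One possible explanation, offered as a guess rather than a confirmed diagnosis: iterparse builds the tree incrementally, so when the 'end' event for an entry fires, its parent may only contain the children that have been parsed so far. The sketch below illustrates that effect with the standard library's xml.etree.ElementTree.XMLPullParser instead of lxml (Python 3). The sample XML, the helper name child_counts_at_entry_end, and the split position are all made up for illustration and are not taken from my real files.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data: one block with three entry children.
XML = "<root><block><entry id='1'/><entry id='2'/><entry id='3'/></block></root>"


def child_counts_at_entry_end(xml_text, split_at):
    """Feed xml_text in two chunks and record, at each entry 'end' event,
    how many children the entry's parent holds at that moment."""
    parser = ET.XMLPullParser(events=("start", "end"))
    flush = getattr(parser, "flush", None)  # only present on newer Pythons
    stack = []   # currently open elements; the stdlib has no getparent()
    counts = []
    for chunk in (xml_text[:split_at], xml_text[split_at:]):
        parser.feed(chunk)
        if flush is not None:
            flush()  # ask the parser to process everything fed so far
        for event, elem in parser.read_events():
            if event == "start":
                stack.append(elem)
            else:
                stack.pop()
                if elem.tag == "entry":
                    parent = stack[-1]
                    counts.append(len(parent))
    parser.close()
    return counts


if __name__ == "__main__":
    # Whole document in one chunk: the parent is complete when events are read.
    print(child_counts_at_entry_end(XML, len(XML)))
    # Chunk boundary right after the first entry: the parent is still partial.
    print(child_counts_at_entry_end(XML, XML.index("<entry id='2'")))
```

If this is what is happening in my case, handling the 'end' event of the enclosing block rather than of each entry should guarantee that all of its children are present, since an element is only complete once its own 'end' event has fired.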