Recently, I have been working on manipulating XML and CSV files using Python. I was amazed by how quick and easy it is to manipulate these files in Python. In this article I cover the flexibility of Python's xml.etree.ElementTree API. Let us jump into it.

Problem

We are given a dictionary of words and their meanings in multiple languages, stored as XML. We want to manipulate the XML and add another meaning, in French, for the word beauty.

<dictionary>
    <definition>
        <word lang="en-US">beauty</word>
        <meaning lang="en-US">
            a combination of qualities, such as shape, color, or form, that pleases the aesthetic senses, especially the sight.
        </meaning>
    </definition>
</dictionary>

Approach 1

The simplest approach is to find the definition element and then add a meaning sub-element to it. This looks something like the code below.

import xml.etree.ElementTree as ET

# Parse the XML file
tree = ET.parse('sample.xml')
root = tree.getroot()

beauty_meaning_fr = "La beauté est l'harmonie des formes et des couleurs qui évoque un sentiment d'émerveillement et d'admiration."

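# Collect every <definition> that is a direct child of the root
def_elements = root.findall('definition')

for definition in def_elements:
    word = definition.find('word')
    # Skip definitions that have no <word> child
    if word is not None and word.text == 'beauty':
        # SubElement creates <meaning> and attaches it to definition in one step
        new_meaning = ET.SubElement(definition, 'meaning')
        new_meaning.text = beauty_meaning_fr
        new_meaning.set('lang', 'fr-FR')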

tree.write('output.xml')

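This approach works and produces the desired output. Note that the new element is appended after the last existing child without any indentation, and that tree.write defaults to US-ASCII encoding, so accented characters are written as numeric character references such as &#233;.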

<dictionary>
    <definition>
        <word lang="en-US">beauty</word>
        <meaning lang="en-US">
            a combination of qualities, such as shape, color, or form, that pleases the aesthetic senses, especially the sight.
        </meaning>
    <meaning lang="fr-FR">La beaut&#233; est l'harmonie des formes et des couleurs qui &#233;voque un sentiment d'&#233;merveillement et d'admiration.</meaning></definition>
</dictionary>
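If the character references and the missing indentation in the output above are unwanted, the following is a minimal sketch of one way to avoid them. It assumes Python 3.9 or later (for ET.indent) and uses a shortened, hypothetical in-memory copy of the sample document (SAMPLE) so that it runs without any files.

```python
import xml.etree.ElementTree as ET

# Hypothetical, shortened in-memory copy of the sample document.
SAMPLE = """<dictionary>
    <definition>
        <word lang="en-US">beauty</word>
        <meaning lang="en-US">a combination of qualities</meaning>
    </definition>
</dictionary>"""

root = ET.fromstring(SAMPLE)
definition = root.find('definition')

new_meaning = ET.SubElement(definition, 'meaning')
new_meaning.set('lang', 'fr-FR')
new_meaning.text = "La beauté est l'harmonie des formes et des couleurs."

# Re-indent the whole tree in place (ET.indent exists from Python 3.9).
ET.indent(root, space='    ')

# encoding='unicode' returns a str and keeps 'é' as a literal character.
pretty_xml = ET.tostring(root, encoding='unicode')
print(pretty_xml)
```

When writing to a file instead, calling tree.write('output.xml', encoding='utf-8', xml_declaration=True) should likewise keep the accented characters as they are.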

However, when I was working on my real-life problem, the above code did not add the meaning element for some reason. Hence, I started looking for another approach to solve my issue.
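I never pinned down the cause, so the following is only a guess: a common reason for findall('definition') to match nothing is that the real document declares a default XML namespace. The sketch below illustrates that situation; the namespace URI http://example.com/dict and the prefix d are made up for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical variant of the sample that declares a default namespace.
NAMESPACED = """<dictionary xmlns="http://example.com/dict">
    <definition>
        <word lang="en-US">beauty</word>
    </definition>
</dictionary>"""

root = ET.fromstring(NAMESPACED)

# The plain tag name matches nothing, so the loop body would never run.
plain_matches = root.findall('definition')

# Mapping a prefix to the namespace URI makes the search succeed.
ns = {'d': 'http://example.com/dict'}
ns_matches = root.findall('d:definition', ns)

print(len(plain_matches), len(ns_matches))  # 0 1
```

If this is the situation, the inner find('word') call needs the same prefix and mapping, and the new element's tag needs the namespace as well, for example '{http://example.com/dict}meaning'.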

Approach 2

Another option is to create meaning as a standalone element instead of a sub-element. Of course, the element then needs to be added to another element. To meet our requirement, we reach the definition element and insert or append the meaning element that we created. The code below summarizes this approach:

import xml.etree.ElementTree as ET

# Parse the XML file
tree = ET.parse('sample.xml')
root = tree.getroot()

beauty_meaning_fr = "La beauté est l'harmonie des formes et des couleurs qui évoque un sentiment d'émerveillement et d'admiration."

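# Collect every <definition> that is a direct child of the root
def_elements = root.findall('definition')

for definition in def_elements:
    word = definition.find('word')
    # Skip definitions that have no <word> child
    if word is not None and word.text == 'beauty':
        # Create a detached <meaning> element first
        new_meaning = ET.Element("meaning")
        new_meaning.text = beauty_meaning_fr
        new_meaning.set('lang', 'fr-FR')
        # Attach it as the last child of definition
        definition.append(new_meaning)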

tree.write('output1.xml')

The code works as expected and generates the same output as Approach 1.
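To check that claim without touching any files, here is a small sketch that applies both approaches to the same in-memory fragment and compares the serialized results. The fragment (SAMPLE) and the shortened French text (FRENCH) are stand-ins for the real data.

```python
import xml.etree.ElementTree as ET

# Stand-in data for the comparison.
SAMPLE = '<definition><word lang="en-US">beauty</word></definition>'
FRENCH = "La beauté est l'harmonie des formes et des couleurs."

# Approach 1: create and attach in one call.
first = ET.fromstring(SAMPLE)
meaning = ET.SubElement(first, 'meaning')
meaning.text = FRENCH
meaning.set('lang', 'fr-FR')

# Approach 2: create a detached element, then attach it.
second = ET.fromstring(SAMPLE)
meaning = ET.Element('meaning')
meaning.text = FRENCH
meaning.set('lang', 'fr-FR')
second.append(meaning)

same_output = ET.tostring(first) == ET.tostring(second)
print(same_output)  # True
```

The practical difference is that a detached element can be placed anywhere: definition.insert(1, new_meaning), for instance, would put it right after the word element instead of at the end.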
