Recently, I have been working on manipulating xmls and csvs using Python. I was amazed by how quick and easy it is manipulate these files in Python. In this article I cover the flexibility of the python and xml.etree.ElementTree API. Let us jump into it.

Problem

Given a dictionary of words and meaning in multiple language. We want to manipulate the xml and add another meaning in French language for the word beauty. 

<dictionary>
    <definition>
        <word lang="en-US">beauty</word>
        <meaning lang="en-US">
            a combination of qualities, such as shape, color, or form, that pleases the aesthetic senses, especially the sight.
        </meaning>
    </definition>
</dictionary>

Approach 1

The simplest approach is to find / reach the element of definition. Then add a sub-element meaning to the definition. This would look something like below.

import xml.etree.ElementTree as ET

# Parse the XML file
tree = ET.parse('sample.xml')
root = tree.getroot()

beauty_meaning_fr = "La beauté est l'harmonie des formes et des couleurs qui évoque un sentiment d'émerveillement et d'admiration."

def_elements = root.findall('definition')

for definition in def_elements:
    word = definition.find('word')
    if word.text == 'beauty':
        new_meaning = ET.SubElement(definition, 'meaning')
        new_meaning.text = beauty_meaning_fr
        new_meaning.set('lang', 'fr-FR')

tree.write('output.xml')

This approach works great and provides the desired output.

<dictionary>
    <definition>
        <word lang="en-US">beauty</word>
        <meaning lang="en-US">
            a combination of qualities, such as shape, color, or form, that pleases the aesthetic senses, especially the sight.
        </meaning>
    <meaning lang="fr-FR">La beaut&#233; est l'harmonie des formes et des couleurs qui &#233;voque un sentiment d'&#233;merveillement et d'admiration.</meaning></definition>
</dictionary>

However, when I was working on my real-life problem, the above code did not add the meaning element for some reason. Hence, I started looking for another approach to solve my issue.

Approach 2

Another option is to add a new element meaning, instead of a sub-element. Of course, the element will then need to added to another element. To achieve our requirement, we need to reach the definition element and insert or append the meaning element that we initially created. The below code summarizes this approach:

import xml.etree.ElementTree as ET

# Parse the XML file
tree = ET.parse('sample.xml')
root = tree.getroot()

beauty_meaning_fr = "La beauté est l'harmonie des formes et des couleurs qui évoque un sentiment d'émerveillement et d'admiration."

def_elements = root.findall('definition')

for definition in def_elements:
    word = definition.find('word')
    if word.text == 'beauty':
        new_meaning = ET.Element("meaning")
        new_meaning.text = beauty_meaning_fr
        new_meaning.set('lang', 'fr-FR')
        definition.append(new_meaning)

tree.write('output1.xml')

The code works as expected and generates the output similar to that of approach 1.

No comments

Leave your comment

In reply to Some User
We use cookies

We use cookies on our website. Some of them are essential for the operation of the site, while others help us to improve this site and the user experience (tracking cookies). You can decide for yourself whether you want to allow cookies or not. Please note that if you reject them, you may not be able to use all the functionalities of the site.