Extracting a value after a find_all()

jehoshua

Nov 30, 2024, 1:18:36 AM

to beautifulsoup

Data ..

Code

#!/usr/bin/python

import bs4
from bs4 import BeautifulSoup

with open('testxml.xml', 'r') as f:
    file = f.read()

soup = BeautifulSoup(file, 'xml')

# only transactions with account A000831 or A000832
for tag in soup.find_all("SPLIT", account=["A000831","A000832"]):
    print("Anchor: ", tag)

print('finished')

OUTPUT

Anchor: <SPLIT account="A000831" action="" bankid="" id="S0001" memo="AAA Duracell - 48 pack" number="" payee="P000035" price="1/1" reconciledate="" reconcileflag="2" shares="-1499/50" value="-1499/50"/>

How do I extract the "value"? Is it a tag within a tag? Once I can extract the "-1499/50", it is just a matter of converting it to a number and doing the math: -1499/50 = -29.98

Isaac Muse

Nov 30, 2024, 12:46:30 PM

to beautifulsoup

See example below:

import bs4
from bs4 import BeautifulSoup

XML = """
<TRANSACTION id="T000000000000014727" entrydate="2024-04-20" memo="" commodity="AUD" postdate="2024-04-20">
  <SPLITS>
    <SPLIT id="S0001" number="" reconciledate="" price="1/1" payee="P000035" account="A000831" value="-1499/50" memo="AAA Duracell - 48 pack" action="" bankid="" reconcileflag="2" shares="-1499/50"/>
    <SPLIT id="S0002" number="" reconciledate="" price="1/1" payee="P000035" account="A000620" value="1499/50" memo="AAA Duracell - 48 pack" action="" bankid="" reconcileflag="0" shares="1499/50"/>
  </SPLITS>
</TRANSACTION>
"""

soup = BeautifulSoup(XML, 'xml')

# only transactions with account A000831 or A000832
for tag in soup.find_all("SPLIT", account=["A000831","A000832"]):
    print("Anchor: ", tag)
    print("Value:", tag['value'])

print('finished')
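
For the follow-up step mentioned in the question (turning a string like "-1499/50" into a number), one possible sketch uses the standard library's fractions.Fraction, which parses "numerator/denominator" strings exactly. The helper name to_amount is hypothetical, and rounding to two decimal places is an assumption about how the amounts should be presented.

```python
from decimal import Decimal
from fractions import Fraction


def to_amount(raw: str) -> Decimal:
    """Convert a rational string such as "-1499/50" to a two-decimal amount.

    Hypothetical helper, not part of Beautiful Soup.
    """
    frac = Fraction(raw)  # Fraction accepts "numerator/denominator" strings
    exact = Decimal(frac.numerator) / Decimal(frac.denominator)
    # Assumption: amounts are wanted with exactly two decimal places
    return exact.quantize(Decimal("0.01"))


print(to_amount("-1499/50"))  # -29.98
```

Inside the loop above this would be called as to_amount(tag['value']). Using Decimal rather than float is a design choice here, so that sums of many splits do not pick up binary rounding noise.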

jehoshua

Nov 30, 2024, 5:22:42 PM

to beautifulsoup

Great, that works fine. Thank you Isaac :)

jehoshua

Nov 30, 2024, 6:56:15 PM

to beautifulsoup

Is there any difference between this ..

print("Value:", tag['value'])

and this ?

print("Value:", tag.get('value'))

Isaac Muse

Nov 30, 2024, 7:59:49 PM

to beautifulsoup

.get() is good for returning a default in case the attribute is not found: .get('attr', 'some-default').
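
A small sketch of the difference, for illustration: indexing with tag['attr'] raises KeyError when the attribute is absent, while .get() returns None or the supplied default. The one-line XML fragment is made up for the example, and "html.parser" is used only so the sketch runs without an extra parser install; attribute access behaves the same way with the 'xml' parser used earlier in the thread.

```python
from bs4 import BeautifulSoup

# html.parser lowercases tag names, hence "split" rather than "SPLIT"
soup = BeautifulSoup('<SPLIT account="A000831" value="-1499/50"/>', "html.parser")
tag = soup.find("split")

value = tag["value"]                        # attribute exists: both forms agree
missing = tag.get("memo")                   # attribute absent: None
fallback = tag.get("memo", "some-default")  # attribute absent: the default

try:
    tag["memo"]                             # attribute absent: KeyError
    raised = False
except KeyError:
    raised = True

print(value, missing, fallback, raised)
# -1499/50 None some-default True
```

So tag['value'] fits attributes that must be present, where a missing one should fail loudly, and .get() fits optional attributes.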

jehoshua

Nov 30, 2024, 8:07:44 PM

to beautifulsoup

Okay, thank you :)
