Hi there,
Thank you for spending your time here and helping out. I really appreciate it.
I have multiple Excel files to loop through.
In each file I only scan columns C:D. If a column contains the keyword "Abbreviation", I want to extract all the values in that column.
I scan both columns because the keyword could be in either column C or column D.
My columns will look something like this:
| Col C | Col D |
| --- | --- |
| Abbreviation | Description |
| urre_no | sdS |
| BELLO | helllo |
| | |
OR
| Col C | Col D |
| --- | --- |
| Description | Abbreviation |
| hehehe | abc |
| PEARRR | ab33_2_ |
| FJF | a_rr |
| kkkkk | IIRR_VF |
| llkl | jk_ff |
| | |
| | |
After importing my Excel files, here is how I loop through the columns I want to scan:
import re
from os.path import join

from openpyxl import load_workbook

wb1 = load_workbook(join(dict_folder, file), data_only=True)
ws = wb1.active
for rowofcellobj in ws["C":"D"]:
    for cellobj in rowofcellobj:
        if cellobj.value == "Abbreviation":
            # extract all values in that column, but I don't know how to do
            # this step or whether my steps above are correct
            if cellobj.value is not None:
                data = re.findall(r"\b\w+_.*?\w+|[A-Z]*$\b", str(cellobj.value))
                # filtering out blank rows here:
                if data != []:
                    if data != [' ']:
                        # extracting words from square brackets in list:
                        fields = data[0]
                        print(fields)
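I am stuck at the step marked by the first comment in the code above: once I find the "Abbreviation" header, I don't know how to extract the rest of the values in that column.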
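
For reference, here is a minimal sketch of what I think the missing step could look like. It is not a confirmed solution. It assumes that `ws["C":"D"]` yields one tuple of cells per column (so each item in the outer loop is a column, not a row), and that the values I want are all the non-blank cells below the header. The helper names `values_under_header` and `abbreviations_in_file` are made up for illustration.

```python
def values_under_header(columns, header="Abbreviation"):
    """Return the non-blank values below `header` in the first column containing it.

    `columns` is an iterable of columns, each a sequence of cell-like objects
    with a `.value` attribute (assumed to match what ws["C":"D"] yields).
    """
    for column in columns:
        for index, cell in enumerate(column):
            if cell.value == header:
                # Everything after the header cell belongs to this column.
                below = column[index + 1:]
                # Skip empty cells and cells that hold only whitespace.
                return [c.value for c in below
                        if c.value is not None and str(c.value).strip()]
    # The header was not found in any of the scanned columns.
    return []


def abbreviations_in_file(path):
    """Hypothetical wrapper: open one workbook and scan columns C:D."""
    # Imported here so values_under_header can be used without openpyxl.
    from openpyxl import load_workbook

    wb = load_workbook(path, data_only=True)
    ws = wb.active
    return values_under_header(ws["C":"D"])
```

With this, the loop over files would call `abbreviations_in_file(join(dict_folder, file))` for each file. I left the regex out on purpose, since the blank-cell check seems to cover the filtering, but it could be applied to each returned value if only certain patterns should be kept.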