I load my texts from a CSV file, so I'm familiar with this issue.
I first build a list of dictionaries, because each item has three possible text fields to pick from:
import csv

# Creating the lists needed to get data from the items
items = []

# Open the items CSV file and cycle through it
with open(s_ItemsFile, newline='', encoding='utf-8') as csvfile:
    reader = csv.reader(csvfile, dialect='excel')
    # Skip the header row
    next(reader)
    for row in reader:
        an_item = dict(Id=row[0],
                       Text1=row[1],
                       Text2=row[2],
                       Text3=row[3])
        items.append(an_item)
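# The csvfile file object is closed automatically when the with block ends

texts = []
counta = 0  # number of items that had at least one non-empty text
for item in items:
    # Use the first non-empty field, in the order Text1, Text2, Text3
    text = ''
    if item['Text1'].strip() != '':
        text = item['Text1'].strip()
        counta = counta + 1
    elif item['Text2'].strip() != '':
        text = item['Text2'].strip()
        counta = counta + 1
    elif item['Text3'].strip() != '':
        text = item['Text3'].strip()
        counta = counta + 1
    # preprocess() is your own cleaning/tokenizing function
    text = preprocess(text)
    texts.append(text)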
Then you can go ahead and create your Dictionary from the texts.
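For what it's worth, the "take the first non-empty field" step can probably be written more compactly. This is only a sketch: the helper name first_nonempty, the column headers Id/Text1/Text2/Text3, and the in-memory sample data are all assumptions for illustration (the real file's header names may differ), and it uses csv.DictReader instead of indexing rows by position.

```python
import csv
import io


def first_nonempty(row, fields=("Text1", "Text2", "Text3")):
    """Return the first non-blank field value (stripped), or '' if all are blank."""
    for name in fields:
        value = (row.get(name) or "").strip()
        if value:
            return value
    return ""


# Hypothetical sample data standing in for the real items file
sample = io.StringIO(
    "Id,Text1,Text2,Text3\n"
    "1,first text,,\n"
    "2,,second text,\n"
    "3,,,third text\n"
    "4,,,\n"
)

# DictReader uses the header row as keys, so there is no need to skip it by hand
reader = csv.DictReader(sample)
chosen = [first_nonempty(row) for row in reader]

# Drop rows where every text field was blank
texts = [t for t in chosen if t]
print(texts)
```

Unlike the loop above, this version skips items with no usable text instead of passing an empty string along, which may or may not be what you want. You would still run each entry through your own preprocess() before building the Dictionary.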