string theory and web woes

Mike Dewhirst

Apr 6, 2017, 12:44:44 AM
to Django users
I'm collecting strings from web page sources and storing extracted data
in a TextField. If it is done again, it gets added into the TextField.

My code goes something like ...

if data not in textfield:
    insert_data(data, textfield)

... but no matter how I wrinkle my brow and try harder and harder I
simply cannot get x in y == True when it most definitely appears true.

Therefore it must be a unicode problem. And unicode defeats me every time.

Can anyone suggest a recipe?

I hesitate to ask for unicode tutorial links because I have read them
all and always come away needing counselling.

I'm using Python 3.5, Python 2.7, Django 1.8.18 with the Admin and plain
TextField without any editor widgets.

Thanks heaps

Mike


Mike Dewhirst

Apr 6, 2017, 1:37:55 AM
to Django users
On 6/04/2017 2:44 PM, Mike Dewhirst wrote:
> I'm collecting strings from web page sources and storing extracted
> data in a TextField. If it is done again, it gets added into the
> TextField.
>
> My code goes something like ...
>
> if data not in textfield:
>     insert_data(data, textfield)

Got it working in the ugliest way possible ...

note is the proposed value of the TextField
content is the actual content of the TextField

x = bytes(note.strip(), 'utf8')
y = bytes(content.strip().replace("\r", ""), 'utf8')
if x not in y:
    print("\n x in y is %s " % (x in y))
    print("\n x = %s " % x)
    print("\n y = %s " % y)
    # append note to the TextField


The problem wasn't unicode; it was line endings. My putting in '\n' or
the web providing me with '\n' in the scraped page somehow automatically
caused my blessed operating system to think I really meant '\r\n' and
helpfully did what it thought I wanted.
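
For anyone hitting the same thing, here is a minimal sketch of the same check without the round trip through bytes. It assumes the only difference between the two strings is the line-ending style. The helper names normalize_newlines and contains_note and the sample strings are made up for illustration and are not part of the code above.

```python
def normalize_newlines(text):
    # Convert CRLF (Windows) and bare CR line endings to LF.
    # CRLF is replaced first so it does not turn into two LFs.
    return text.replace("\r\n", "\n").replace("\r", "\n")


def contains_note(note, content):
    # True if note already appears in content, ignoring line-ending style
    # and leading/trailing whitespace on either side.
    return normalize_newlines(note).strip() in normalize_newlines(content).strip()


# Hypothetical sample data: the note uses "\n", the stored content uses "\r\n".
note = "first line\nsecond line"
content = "earlier text\r\nfirst line\r\nsecond line\r\n"

print(note in content)               # False: "\n" does not match "\r\n"
print(contains_note(note, content))  # True once both sides are normalized
```

Printing repr(note) and repr(content) is a quick way to see the stray "\r" characters that a plain print() hides.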

No further comment. I'm speechless.

Mike