Using FeedController get_feed_data, mysterious "return unicode(value, encoding) type error: function takes exactly 5 arguments (1 given)"
From: buffalob <bruce....@gmail.com>
Date: Wed, 27 Jun 2007 02:06:07 -0000
Local: Tues, Jun 26 2007 10:06 pm
Subject: Using FeedController get_feed_data, mysterious "return unicode(value, encoding) type error: function takes exactly 5 arguments (1 given)"
I'm using FeedController with get_feed_data, successfully for the most
part, but very intermittently I get the following mysterious crash
(and I say mysterious because my code does not directly call anything
listed below, so I've no idea why the "5 args expected, 1 provided"
should be happening).

I invoke from my web browser as follows:

http://localhost:8080/feed/rss2.0

...and then here's what happens (and I'll post my code in a response
momentarily):

500 Internal error

The server encountered an unexpected condition which prevented it from
fulfilling the request.

Page handler: <bound method Feed.rss2_0 of <hello.controllers.Feed object at 0x01300310>>
Traceback (most recent call last):
  File "c:\python24\lib\site-packages\CherryPy-2.2.1-py2.4.egg\cherrypy\_cphttptools.py", line 105, in _run
    self.main()
  File "c:\python24\lib\site-packages\CherryPy-2.2.1-py2.4.egg\cherrypy\_cphttptools.py", line 254, in main
    body = page_handler(*virtual_path, **self.params)
  File "<string>", line 3, in rss2_0
  File "c:\python24\lib\site-packages\TurboGears-1.0.2.2-py2.4.egg\turbogears\controllers.py", line 334, in expose
    output = database.run_with_transaction(
  File "<string>", line 5, in run_with_transaction
  File "c:\python24\lib\site-packages\TurboGears-1.0.2.2-py2.4.egg\turbogears\database.py", line 303, in so_rwt
    retval = func(*args, **kw)
  File "<string>", line 5, in _expose
  File "c:\python24\lib\site-packages\TurboGears-1.0.2.2-py2.4.egg\turbogears\controllers.py", line 351, in <lambda>
    mapping, fragment, args, kw)))
  File "c:\python24\lib\site-packages\TurboGears-1.0.2.2-py2.4.egg\turbogears\controllers.py", line 391, in _execute_func
    return _process_output(output, template, format, content_type, mapping, fragment)
  File "c:\python24\lib\site-packages\TurboGears-1.0.2.2-py2.4.egg\turbogears\controllers.py", line 82, in _process_output
    fragment=fragment)
  File "c:\python24\lib\site-packages\TurboGears-1.0.2.2-py2.4.egg\turbogears\view\base.py", line 131, in render
    return engine.render(**kw)
  File "c:\python24\lib\site-packages\TurboKid-1.0.1-py2.4.egg\turbokid\kidsupport.py", line 192, in render
    return t.serialize(encoding=self.defaultencoding, output=format, fragment=fragment)
  File "c:\python24\lib\site-packages\kid-0.9.5-py2.4.egg\kid\__init__.py", line 299, in serialize
    raise_template_error(module=self.__module__)
  File "c:\python24\lib\site-packages\kid-0.9.5-py2.4.egg\kid\__init__.py", line 297, in serialize
    return serializer.serialize(self, encoding, fragment, format)
  File "c:\python24\lib\site-packages\kid-0.9.5-py2.4.egg\kid\serialization.py", line 105, in serialize
    text = ''.join(self.generate(stream, encoding, fragment, format))
  File "c:\python24\lib\site-packages\kid-0.9.5-py2.4.egg\kid\serialization.py", line 343, in generate
    for ev, item in self.apply_filters(stream, format):
  File "c:\python24\lib\site-packages\kid-0.9.5-py2.4.egg\kid\serialization.py", line 163, in format_stream
    for ev, item in stream:
  File "c:\python24\lib\site-packages\kid-0.9.5-py2.4.egg\kid\parser.py", line 219, in _coalesce
    for ev, item in stream:
  File "c:\python24\lib\site-packages\kid-0.9.5-py2.4.egg\kid\parser.py", line 177, in _track
    for p in stream:
  File "c:\python24\lib\site-packages\kid-0.9.5-py2.4.egg\kid\filter.py", line 24, in apply_matches
    for ev, item in stream:
  File "c:\python24\lib\site-packages\kid-0.9.5-py2.4.egg\kid\parser.py", line 177, in _track
    for p in stream:
  File "c:\python24\lib\site-packages\kid-0.9.5-py2.4.egg\kid\parser.py", line 227, in _coalesce
    text += to_unicode(value, encoding)
  File "c:\python24\lib\site-packages\kid-0.9.5-py2.4.egg\kid\parser.py", line 204, in to_unicode
    return unicode(value, encoding)
type error: function takes exactly 5 arguments (1 given)
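
A note on the message itself: the wording "function takes exactly 5 arguments (1 given)" matches what CPython reports when a UnicodeDecodeError is constructed with a single argument, because that exception class requires five (encoding, object, start, end, reason). The traceback does pass through kid's raise_template_error, so one possible reading is that a decode failure in unicode(value, encoding) was re-raised by code that rebuilt the exception from one message string. That reading is an assumption, not something the traceback confirms. The sketch below only demonstrates the constructor behaviour; it is written for Python 3 rather than the Python 2.4 used in this thread, and the helper name is hypothetical.

```python
def construct_with_one_argument():
    """Try to build a UnicodeDecodeError from a single message string.

    Returns the text of the resulting TypeError, or None if the
    construction unexpectedly succeeds.
    """
    try:
        UnicodeDecodeError("decode failed")
    except TypeError as exc:
        return str(exc)
    return None


message = construct_with_one_argument()
print(message)
```

If this reading is right, the "5 arguments" text hides the real problem, which would be an ordinary decode error on the template data.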


 
From: buffalob <bruce....@gmail.com>
Date: Wed, 27 Jun 2007 02:06:59 -0000
Local: Tues, Jun 26 2007 10:06 pm
Subject: Re: Using FeedController get_feed_data, mysterious "return unicode(value, encoding) type error: function takes exactly 5 arguments (1 given)"
Here's my code that causes the above:

from turbogears import controllers, expose, flash, redirect
from turbogears.feed import FeedController

test_string = "default"
input_feed_url_string = "none_yet"

import logging
log = logging.getLogger("hello.controllers")

import xml.sax.handler
import sgmllib

class ParseError(Exception):
    pass

class HTML_Stripper(sgmllib.SGMLParser):
    def __init__(self):
        sgmllib.SGMLParser.__init__(self)

    def strip(self, some_html):
        self.theString = ""
        self.feed(some_html)
        self.close()
        return self.theString

    def handle_data(self, data):
        self.theString += data

class Feed(FeedController):

    def get_feed_data(self, **kwargs):
        input_feed_url_string = "http://rss.news.yahoo.com/rss/health"
        feed_title = "modified version"
        html_stripper = HTML_Stripper()
        entries = []
        import feedparser
        feed_data = feedparser.parse(input_feed_url_string)
        for e in feed_data.entries:
            item = {}
            item["title"] = "some title"
            from datetime import datetime
            item["published"] = datetime.now()
            item["updated"] = datetime.now()
            item["author"] = "B"
            safe_summary = e.summary_detail.value.encode('ascii', 'ignore')
            modified_summary = safe_summary
            log.error("safe_summary=" + safe_summary)
            modified_summary = html_stripper.strip(safe_summary)
            log.error("modified_summary=" + modified_summary)
            item["summary"] = modified_summary
            item["link"] = e.link
            entries.append(item)
        return dict(
            title = feed_title, link = "http://some_link.com",
            author = {"name": "B", "email": "test@some_link.com"},
            subtitle = "info", id = "http://id_link.com",
            entries = entries)

class Root(controllers.RootController):
    feed = Feed()

    def __init__(self):
        controllers.RootController.__init__(self)

    @expose(template="hello.templates.welcome")
    def index(self):
        import time
        log.debug("TurboGears Controller Responding For Duty")
        flash("index called... Your application is now running")
        return dict(now=time.ctime())

    @expose(template="hello.templates.hello")
    def hello(self, *args, **kwargs):
        return dict(greeting="Greetings again from the Controller")

From: buffalob <bruce....@gmail.com>
Date: Wed, 27 Jun 2007 02:11:26 -0000
Local: Tues, Jun 26 2007 10:11 pm
Subject: Re: Using FeedController get_feed_data, mysterious "return unicode(value, encoding) type error: function takes exactly 5 arguments (1 given)"
One more quick note about the "intermittently" aspect:
It seems likely that the intermittence is due to varying data content
in this RSS feed which my code reads via "feedparser" Python module
(which is Mark Pilgrim's well-known open source RSS reading software
Universal Feed Parser):

http://rss.news.yahoo.com/rss/health

Perhaps some data in the above feed occasionally triggers the "5 args
vs. 1 arg" error in a behind-the-scenes Unicode processing step of
TG.

From: buffalob <bruce....@gmail.com>
Date: Thu, 05 Jul 2007 02:42:18 -0000
Local: Wed, Jul 4 2007 10:42 pm
Subject: Re: Using FeedController get_feed_data, mysterious "return unicode(value, encoding) type error: function takes exactly 5 arguments (1 given)"
The "type error: function takes exactly 5 arguments (1 given)" crash I
was getting a couple weeks ago has returned today, and I think I've
narrowed down the data that is causing it to happen.

As before, the crash occurs in some internal TG functions for KID
processing (none of which I have ever changed at all) toward the
latter part of the call to my "get_feed_data" function in my
FeedControllers module, ending with the following lines (full set of
lines posted previously in this thread so I won't repeat them here):
...
\parser.py", line 227, in _coalesce
    text += to_unicode(value, encoding)
  File "c:\python24\lib\site-packages\kid-0.9.5-py2.4.egg\kid\parser.py", line 204, in to_unicode
    return unicode(value, encoding)
type error: function takes exactly 5 arguments (1 given)

Processing the following snippet of XML seems to cause the crash.
It's a portion of an RSS news feed, and when I add a debug condition
that skips this snippet, the crash does not happen.

I am guessing that the offending character is the "&amp;#151"
following the word "education".

<item>
<title>Review finds nutrition education failing (AP)</title>
<link>http://us.rd.yahoo.com/dailynews/rss/health/*http://news.yahoo.com/s/ap/20070704/ap_on_he_me/failing_to_fight_fat</link>
 <guid isPermaLink="false">ap/20070704/failing_to_fight_fat</guid>
<pubDate>Wed, 04 Jul 2007 21:06:50 GMT</pubDate>
<description>AP - The federal government will spend more than &#36;1
billion this year on nutrition education &amp;#151; fresh carrot and
celery snacks, videos of dancing fruit, hundreds of hours of lively
lessons about how great you will feel if you eat well.</description>
</item>

I notice that in a browser the "&amp;#151" character is displayed as a
long dash.

Can anyone please offer me any suggestions for work-arounds I can add
to my "get_feed_data" function so that when an external RSS feed I
process happens to have a character like this, it can recover and
proceed without crashing?
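
One possible work-around, sketched under assumptions: resolve numeric character references into real Unicode characters before the summary reaches the template, so that no raw byte in the 128-159 range is left for a later decode step. The function name resolve_char_refs is hypothetical. The sketch is written for Python 3, where the standard library provides html.unescape; the Python 2.4 setup in this thread would need an equivalent routine. The sample string assumes the feed parser has already turned "&amp;#151;" into "&#151;".

```python
import html


def resolve_char_refs(summary):
    """Return summary with character references turned into Unicode text.

    html.unescape follows the HTML5 rules, under which a numeric
    reference such as 151 is read as its Windows-1252 character.
    """
    return html.unescape(summary)


sample = "nutrition education &#151; fresh carrot and celery snacks"
cleaned = resolve_char_refs(sample)
print(repr(cleaned))
```

Working with Unicode text all the way to the template, instead of encoding to ASCII early, avoids having a parser turn the reference back into a single non-ASCII byte.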

Thanks much in advance for any help.  This problem is beyond the outer
edge of my Python expertise, but I hope the solution can help me
advance that a bit and also make my project perform much more
reliably.

Researching this a little, I found this discussion of the character
code range 128-159, which includes character 151, so maybe that has
something to do with this?
http://www.cs.tut.fi/~jkorpela/chars.html#win
"In the Windows character set, some positions in the range 128 - 159
are assigned to printable characters, such as "smart quotes", em dash,
en dash, and trademark symbol. Thus, the character repertoire is
larger than ISO Latin 1. The use of octets in the range 128 - 159 in
any data to be processed by a program that expects ISO 8859-1 encoded
data is an error which might cause just anything. They might for
example get ignored, or be processed in a manner which looks
meaningful, or be interpreted as control characters. See my document
On the use of some MS Windows characters in HTML for a discussion of
the problems of using these characters."
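
The quoted passage can be checked directly. The sketch below uses Python 3 syntax (an assumption, since this thread runs Python 2.4) and decodes the single byte 151 under three encodings: Windows-1252 yields the em dash, ISO 8859-1 yields a C1 control character, and UTF-8 rejects the lone byte.

```python
raw = bytes([151])  # the single octet 0x97

as_windows = raw.decode("cp1252")     # Windows-1252 interpretation
as_latin1 = raw.decode("iso-8859-1")  # ISO 8859-1 interpretation

# A lone 0x97 is a continuation byte, so it is not valid UTF-8 by itself.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

print(hex(ord(as_windows)), hex(ord(as_latin1)), utf8_ok)
```

This suggests why a summary containing that byte could fail when a later step decodes it with a different encoding than the one it was written in.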


 