Author: richard.m.tew
Date: Thu Feb 26 18:27:07 2009
New Revision: 6
Added:
trunk/README.libgmail (contents, props changed)
trunk/libgmail/
trunk/libgmail/CHANGELOG (contents, props changed)
trunk/libgmail/COPYING (contents, props changed)
trunk/libgmail/PKG-INFO (contents, props changed)
trunk/libgmail/README (contents, props changed)
trunk/libgmail/__init__.py (contents, props changed)
trunk/libgmail/gmail_transport.py (contents, props changed)
trunk/libgmail/lgconstants.py (contents, props changed)
trunk/libgmail/libgmail.py (contents, props changed)
trunk/libgmail/mechanize/
trunk/libgmail/mechanize/COPYING.txt (contents, props changed)
trunk/libgmail/mechanize/ClientForm.py (contents, props changed)
trunk/libgmail/mechanize/__init__.py (contents, props changed)
trunk/libgmail/mechanize/_auth.py (contents, props changed)
trunk/libgmail/mechanize/_beautifulsoup.py (contents, props changed)
trunk/libgmail/mechanize/_clientcookie.py (contents, props changed)
trunk/libgmail/mechanize/_debug.py (contents, props changed)
trunk/libgmail/mechanize/_file.py (contents, props changed)
trunk/libgmail/mechanize/_firefox3cookiejar.py (contents, props changed)
trunk/libgmail/mechanize/_gzip.py (contents, props changed)
trunk/libgmail/mechanize/_headersutil.py (contents, props changed)
trunk/libgmail/mechanize/_html.py (contents, props changed)
trunk/libgmail/mechanize/_http.py (contents, props changed)
trunk/libgmail/mechanize/_lwpcookiejar.py (contents, props changed)
trunk/libgmail/mechanize/_mechanize.py (contents, props changed)
trunk/libgmail/mechanize/_mozillacookiejar.py (contents, props changed)
trunk/libgmail/mechanize/_msiecookiejar.py (contents, props changed)
trunk/libgmail/mechanize/_opener.py (contents, props changed)
trunk/libgmail/mechanize/_pullparser.py (contents, props changed)
trunk/libgmail/mechanize/_request.py (contents, props changed)
trunk/libgmail/mechanize/_response.py (contents, props changed)
trunk/libgmail/mechanize/_rfc3986.py (contents, props changed)
trunk/libgmail/mechanize/_seek.py (contents, props changed)
trunk/libgmail/mechanize/_sockettimeout.py (contents, props changed)
trunk/libgmail/mechanize/_testcase.py (contents, props changed)
trunk/libgmail/mechanize/_upgrade.py (contents, props changed)
trunk/libgmail/mechanize/_urllib2.py (contents, props changed)
trunk/libgmail/mechanize/_useragent.py (contents, props changed)
trunk/libgmail/mechanize/_util.py (contents, props changed)
trunk/libgmail/setup.py (contents, props changed)
Modified:
trunk/README
trunk/updateCache.py
Log:
Added a snapshot of libgmail and all its dependencies. Made a bug fix to
libgmail to deal with unicode text, and bad decoding of it, which broke the
backup process.
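
As a rough illustration only (this is not the patch itself; the real change
is in the libgmail.py part of this revision), the kind of defensive decoding
that keeps one badly encoded message from aborting a whole backup run can be
sketched as below. The helper name `decode_lenient` and the fallback order
are assumptions made for this example, and it is written for current Python
rather than the Python 2 of the time.

```python
def decode_lenient(raw, encodings=("utf-8", "latin-1")):
    """Return text for `raw`, trying each encoding in turn.

    Hypothetical helper for illustration; not part of libgmail.
    """
    if isinstance(raw, str):
        return raw  # already text, nothing to decode
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            continue  # try the next candidate encoding
    # Last resort: never raise, substitute U+FFFD for undecodable bytes.
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    print(decode_lenient(b"caf\xc3\xa9"))  # valid UTF-8 bytes
    print(decode_lenient(b"caf\xe9"))      # Latin-1 bytes, invalid as UTF-8
```

The trade-off is that a wrong guess yields slightly garbled text rather than
an exception, which is usually the better outcome for a backup tool.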
Modified: trunk/README
==============================================================================
--- trunk/README (original)
+++ trunk/README Thu Feb 26 18:27:07 2009
@@ -32,24 +32,30 @@
data when it prompts until you are sure the account has been
updated.
-In order to use the 'updateCache.py' script, you will need to have
-libgmail installed, which you can obtain from:
+All of the dependencies required, are included with the gmail-backup
+source code under the 'libgmail' directory. These include:
http://libgmail.sourceforge.net/
-
-There are further comments at the top of the 'updateCache.py' file.
+ http://wwwsearch.sourceforge.net/mechanize
+ http://wwwsearch.sourceforge.net/ClientForm
+See 'README.libgmail' for information on their individual licenses.
+
POTENTIAL PROBLEMS:
-You may have to tweak the script to get it to work. I backed up my
-account while developing it, and do not know how it handles backing
-up an account from the beginning. Let me know if this is beyond
-you, if it does not work, and I will sort this out.
+libgmail seems to be buggy. I have experienced bugs in its unicode
+support. And I've had bugs reported in its attachment handling, as
+bugs with this program.
+
+gmail may change its interface. This may require the updating of
+the libgmail source code which comes with this, to the latest
+version, which will presumably be upgraded to deal with these
+changes.
WHY USE THIS SCRIPT:
Quoted from my blog:
- http://nameless-sorrows.blogspot.com/2006/08/backing-up-gmail-account.html
+ http://posted-stuff.blogspot.com/2006/08/backing-up-gmail-account.html
"There are a lot of interesting and useful services on the internet these
days, like gmail and blogger. But when I take advantage of them, I cannot
Added: trunk/README.libgmail
==============================================================================
--- (empty file)
+++ trunk/README.libgmail Thu Feb 26 18:27:07 2009
@@ -0,0 +1,29 @@
+libgmail has a nest of dependencies which are awkward to locate
+and install. The source to it, and its dependencies are lumped
+in the 'libgmail' folder.
+
+These projects have their own licenses, and are not governed in
+any way by the license for the 'gmail-backup' project.
+
+There are unicode related fixes to libgmail, given errors discovered
+in its use.
+
+libgmail
+ http://libgmail.sourceforge.net
+
+ libgmail is licensed under the GPL.
+ See the file named libgmail/COPYING for more information.
+
+mechanize
+ http://wwwsearch.sourceforge.net/mechanize
+
+ This code is free software; you can redistribute it and/or modify it
+ under the terms of the BSD or ZPL 2.1 licenses (see the file
+ libgmail/mechanize/COPYING.txt included with the distribution).
+
+ClientForm.py
+ http://wwwsearch.sourceforge.net/ClientForm
+
+ This code is free software; you can redistribute it and/or modify it
+ under the terms of the BSD or ZPL 2.1 licenses (see the file
+ libgmail/mechanize/COPYING.txt included with the distribution).
Added: trunk/libgmail/CHANGELOG
==============================================================================
--- (empty file)
+++ trunk/libgmail/CHANGELOG Thu Feb 26 18:27:07 2009
@@ -0,0 +1,332 @@
+== Version 0.1.11 ==
+libgmail.py
+ * Fixed bug that broke attachment support (SF bug #2034927)
+ * added .author_fullname field for messages
+ * Don't crash on threads with google chat log (Debian bug #502458)
+
+== Version 0.1.10 ==
+libgmail.py
+ * Use mechanize instead of ClientCookie [Patch #2014779]
+ * Very basic Unicode support [Patch #1926861]
+
+gmail_transport.py
+ * New version that uses mechanize
+ (owing again to Jose Rodriguez)
+
+NOTE: libgmail now depends on mechanize, which
+can be downloaded from:
+ http://wwwsearch.sourceforge.net/mechanize/#download
+ (in Debian/Ubuntu as python-mechanize, and an easy_install
+ installer is also available)
+
+== Version 0.1.9 ==
+libgmail.py
+ * Fixed login that was broken for a bunch of new
+ gmail accounts, thanks to a patch by rhauer
+
+NOTE: libgmail now depends on ClientCookie, which
+can be downloaded from:
+ http://wwwsearch.sourceforge.net/ClientCookie/#download
+
+== Version 0.1.8 ==
+libgmail.py
+ * Added 'search' method to contactLists that returns
+ an array of contacts who match a given search term
+ (at some point, the contacts API is long overdue
+ for a revamp, but for now, hey, why not)
+ This is a patch by Alex Chiang --WD--
+ * libgmail now asks for the old Gmail interface,
+ so that it isn't broken by the new Gmail updates.
+ (Thanks to Aaron and Stu for work on this)
+ (Fixes SF bug #1822662)
+
+== Version 0.1.7 ==
+libgmail.py
+gmail_transport.py
+ * Applied patch that adds proxy support, both
+ for passwordless and password-ful proxies
+ (is that a word?), by Jose Rodriguez --WD+SZ--
+
+== Version 0.1.6.2 ==
+libgmail.py
+ * Bugfix for attachment problems --WD--
+ (SF Bug #1793026, Patch #1799605 by 'stephster')
+archive.py
+ * Protect messages with a "from" line in them --WD--
+ (SF Patch #1790809 by 'scop')
+
+== Version 0.1.6.1 ==
+libgmail.py
+ * Bugfix for login problems --WD--
+
+== Version 0.1.6 ==
+libgmail.py
+ * Added support for "Gmail Apps" aka "Gmail For Your Domain" --WD--
+
+== Version 0.1.5.1 ==
+libgmail.py
+ * Minor bugfix release -- logging in with the wrong
+ username and password caused a crash instead of
+ the appropriate thrown exception --WD--
+
+== Version 0.1.5 ==
+libgmail.py
+ * Fixed exception in the testcode (SF bug #1486703) --SZ--
+ * Fixed broken login caused by slight format change
+ (SF Bug #1534275 - Thanks, anonymous tipster!) --WD--
+ * Added another attribute to the message class: to
+ (SF Bug #1528766) --WD--
+ * Fixed problems caused by repeated commas
+ (SF Bug #1512361) --SZ--
+
+== Version 0.1.4 ==
+libgmail.py
+ * Started new contacts code. --SZ--
+ * Bugfix involving 404 error raised when trying to send
+ an email (SF bug #1398323) --WD--
+ * Bugfix for broken len() iterator in GmailSearchResult
+ (SF bug #1365166) --WD--
+ * Bugfix for improper marking of messages as read
+ (SF bug #1365188) --WD--
+
+ NOTE: Expect an improved Contacts API in the next release.
+ We will strive for backwards-compatibility where
+ possible, but be prepared for possible changes.
+ Please feel free to contact us if you have
+ questions/comments/concerns about this.
+
+== Version 0.1.3.3 ==
+libgmail.py
+ * Fixed some bugs in the return values of the label methods. --SZ--
+
+== Version 0.1.3.2 ==
+libgmail.py
+ * Added some attributes to the message class: cc, bcc, sender. --SZ--
+ * Fixed the value returnt by a __len__ call to the threads. --SZ--
+ * Fixed bug in the sendmessage result --SZ--
+ * Added a exception catch to the getUnreadMsgCount method. --SZ--
+ * Added a method to only retrieve the unread messages from the inbox. --SZ--
+
+== Version 0.1.3.1 ==
+libgmail.py
+ * Fixed the problem that not all the messages from a thread were
+ returnt. --SZ--
+ * Added a exception catch for a "500 server error" --SZ--
+
+== Version 0.1.3 ==
+ libgmail.py
+ * Fixed bugs that crashes libgmail when accessing an empty account --SZ--
+ * Fixed returning not all the messages in large accounts. --SZ--
+
+== Version 0.1.2 ==
+libgmail.py
+ * Added a \r to the line endings in the VCard export function. This is done
+ to comply with rfc2425 section 5.8.1 --SZ--
+ * Fixed a security bug in the page parser. --SZ--
+
+
+== Version: 0.1.1 ==
+All
+ * Renamed the shabang to use the 'env' program in all executables. --SZ--
+ * Fixed the redirect bug caused by the changed Gmail login pages. --WD--
+
+== Version: 0.1.0 ==
+libgmail.py
+ * Added contacts support. --WD--
+ * Added contacts test suite. --WD--
+ * Added finer-grained debugging control --WD--
+ * Applied patch that handles login redirect URL properly now
+ Login now works. --WD--
+ * Removed fork message. It was a left over from the initial forking. --SZ--
+
+constants.py
+ * Renamed to lgconstants.py to avoid name conflicts --WD--
+
+== Version: 0.0.8 (23 August 2004) ==
+libgmail.py
+ * Fixed login to work again after it was broken by a Gmail change.
+ Centralised cookie extraction. Added debug-level logging of cookie
+ extraction & storage.
+
+ * Add trash/delete message thread functionality to account object.
+
+constants.py, libgmail.py, mkconstants.py
+ * Add trash/delete single message functionality to account object.
+
+demos/gmailpopd.py
+ * Initial rough POP3 proxy server demo. Works with Mail.app when I
+ tried it... :-) Sometimes causes items to be downloaded even when
+ they don't *really* need to be. Causes some items to be marked as
+ read even if the client doesn't actually request them.
+
+ * Refactored message retrieval from account snapshot to allow
+ partial message retrieval (for TOP functionality).
+
+ * Added POP3 TOP command functionality which is required by Mozilla as it
+ (wrongly) doesn't work with the absolute minimum command set
+ specified by the RFC and requires TOP.
+
+ * Fixed copy/paste error to change 'ftp_QUIT' to 'pop_QUIT'.
+
+ * Moved byte-stuffing and message massaging into separate functions.
+
+libgmail.py, demos/archive.py, demos/gmailftpd.py, demos/gmailpopd.py, demos/gmailsmtp.py, demos/sendmsg.py
+ * Added `GmailLoginFailure` exception to enable tidier handling of
+ login failures (which could be bad username/password or a Gmail
+ change).
+
+ * Updated demos to catch `GmailLoginFailure` exception.
+
+ * Removed non-supported "LOGIN" authentication method in SMTP demo
+ that was included in the server capability response in error.
+
+ANNOUNCE
+ * Minor typo fix.
+
+
+== Version: 0.0.7 (03 August 2004) ==
+
+constants.py, mkconstants.py
+ * Added attachment related constants.
+
+libgmail.py, demos/gmailsmtp.py
+ * Allow file data to be specified directly (rather than via an on-
+ disk file) when specifying attachments (this allows using existing
+ Message instance payloads mostly directly). Modify SMTP Proxy demo
+ to handle sending attachments.
+
+demos/gmailftpd.py
+ * Initial import of Gmail attachments FTP Proxy!
+
+libgmail.py
+ * Corrected version info for previous release.
+
+ * Added 'getMessagesByQuery' function. Added initial attachment
+ retrieval handling. Clean up handling of references to parent
+ objects & account objects. Version info update.
+
+ * Handle sending attachments. Works, but implementation is extremely
+ *cough* sub-optimal...
+
+ * Don't try to attach files if there are none.
+
+
+== Version: 0.0.6 (15 July 2004) ==
+
+demos/gmailsmtp.py
+ * That was too easy, there oughta be a law! Thanks to Python's
+ undocumented SMTP server module we can now send mail with a
+ standard mail client via (E)SMTP. Extended standard SMTP class to
+ handle ESMTP EHLO & AUTH PLAIN commands.
+
+libgmail.py
+ * Added utility function '_retrieveJavascript' to 'GmailAccount' to
+ help developers who want to look at it. (In theory also so you can
+ regenerate 'constants.py' but the Javascript Gmail now uses isn't
+ actually useful for that anymore...) (Added by request.)
+
+
+== Version: 0.0.5 (11 July 2004) ==
+
+libgmail.py, demos/sendmsg.py
+ * Added functionality to enable message sending. Modified automatic
+ cookie handling. Added command line example to send a message.
+ Enabled page requests to be either a URL or a Request instance.
+
+constants.py, mkconstants.py
+ * Added more useful constants.
+
+
+== Version: 0.0.4 (11 July 2004) ==
+
+constants.py, mkconstants.py
+ * Include standard folder/search name constants.
+
+ * Add more useful constants.
+
+constants.py, libgmail.py, mkconstants.py
+ * Added category name retrieval.
+
+mkconstants.py
+ * 'mkconstants' isn't really useful anymore with the new JS version.
+
+libgmail.py
+ * Add ability to get number of unread messages.
+
+ * Handle items that might be 'bunched' such as thread lists better.
+
+ * Only warn about mismatched Javascript versions once module import.
+ (Note: This may mean the Javascript version may change more than
+ once in a session and the second change won't be warned, but that
+ shouldn't be much of an issue...)
+
+ * Refactor URL construction. Refactor query/search operation in
+ preparation for adding searches.
+
+ * More refactoring. Made thread search query more generic to allow
+ use by (to come) label searches etc. Threads now belong to
+ 'GmailSearchResult' instances rather than folders. Threads now
+ retrieve their own messages rather than relying on their parent to
+ do so.
+
+ * We now refer to categories as labels, as the UI does. Enable
+ retrieval by label.
+
+libgmail.py, demos/archive.py
+ * Allow all pages of results to be returned for a 'getFolder'
+ request. (Not tested much.)
+
+ * Provide easy access to standard folder names. Added length
+ property to folders. Examples now handle empty folders gracefully.
+
+ * Now uses 'getMessagesByXXXXX' style method names for folders &
+ labels. Now refer to original message source as 'source' & not
+ 'body'. Enable demos to search by folder name or label name.
+
+
+
+== Version: 0.0.3 (8 July 2004) ==
+
+libgmail.py
+ * Allow username to be specified on the command line instead of prompting.
+ * Rough special case handling of when more than one set of thread information data is present on a page (seemed to occur when using 'all' search after a certain number of items). TODO: Make this fix work at the page parsing level, but splitting all tuples into individual items.
+ * Add cookie handling code to enable us to remove requirement for ClientCookie package. (Especially for Adrian... :-) )
+
+demos/archive.py
+ * *Extremely* rough mbox creation--turns out the mails retrieved had '\r' characters at the end of the headers. The mbox file appears to be successfully imported by OS X's Mail.app client.
+ * Allow username to be specified on the command line instead of prompting.
+
+
+== Version: 0.0.2a (~6 July 2004) ==
+
+* No code change, renamed to try to avoid SourceForge mirroring problems.
+
+
+== Version: 0.0.2 (5 July 2004) ==
+
+constants.py
+ * Useful constants from the Gmail Javascript code as Python module.
+ * Update to match current live Javascript.
+ * Fudge some enumerations that we need to start at 0.
+
+libgmail.py
+ * Refactor to make use of Folder/Thread/Message model. Standardised some naming. Make use of imported Gmail constants. Centralise page retrieval & parsing.
+ * Calculate number of messages in thread.
+ * Refactor & reorganise code. Minor style edits. Refine design of folder, thread & message classes. Modify folders, threads & messages to be as lazy as possible when it comes to retrieving data from the net. Enable message instances to retrieve their original mail text. Add Gmail implementation notes. Hide password entry. Demo now displays threads & messages.
+ * Version date change.
+
+mkconstants.py
+ * Tool to make useful constants from the Gmail Javascript code available via a Python module.
+ * Fudge some enumerations that we need to start at 0.
+
+demos/archive.py
+ * Initial rough demo to archive all messages into text files.
+
+CHANGELOG
+ * Added.
+
+
+== Version: 0.0.1 (2 July 2004) ==
+
+libgmail.py
+ * Initial import of version 0.0.1 (as posted in comp.lang.python).
Added: trunk/libgmail/COPYING
==============================================================================
--- (empty file)
+++ trunk/libgmail/COPYING Thu Feb 26 18:27:07 2009
@@ -0,0 +1,340 @@
+ GNU GENERAL PUBLIC LICENSE
+ Version 2, June 1991
+
+ Copyright (C) 1989, 1991 Free Software Foundation, Inc.
+ 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+ Everyone is permitted to copy and distribute verbatim copies
+ of this license document, but changing it is not allowed.
+
+ Preamble
+
+ The licenses for most software are designed to take away your
+freedom to share and change it. By contrast, the GNU General Public
+License is intended to guarantee your freedom to share and change free
+software--to make sure the software is free for all its users. This
+General Public License applies to most of the Free Software
+Foundation's software and to any other program whose authors commit to
+using it. (Some other Free Software Foundation software is covered by
+the GNU Library General Public License instead.) You can apply it to
+your programs, too.
+
+ When we speak of free software, we are referring to freedom, not
+price. Our General Public Licenses are designed to make sure that you
+have the freedom to distribute copies of free software (and charge for
+this service if you wish), that you receive source code or can get it
+if you want it, that you can change the software or use pieces of it
+in new free programs; and that you know you can do these things.
+
+ To protect your rights, we need to make restrictions that forbid
+anyone to deny you these rights or to ask you to surrender the rights.
+These restrictions translate to certain responsibilities for you if you
+distribute copies of the software, or if you modify it.
+
+ For example, if you distribute copies of such a program, whether
+gratis or for a fee, you must give the recipients all the rights that
+you have. You must make sure that they, too, receive or can get the
+source code. And you must show them these terms so they know their
+rights.
+
+ We protect your rights with two steps: (1) copyright the software, and
+(2) offer you this license which gives you legal permission to copy,
+distribute and/or modify the software.
+
+ Also, for each author's protection and ours, we want to make certain
+that everyone understands that there is no warranty for this free
+software. If the software is modified by someone else and passed on, we
+want its recipients to know that what they have is not the original, so
+that any problems introduced by others will not reflect on the original
+authors' reputations.
+
+ Finally, any free program is threatened constantly by software
+patents. We wish to avoid the danger that redistributors of a free
+program will individually obtain patent licenses, in effect making the
+program proprietary. To prevent this, we have made it clear that any
+patent must be licensed for everyone's free use or not licensed at all.
+
+ The precise terms and conditions for copying, distribution and
+modification follow.
+
+ GNU GENERAL PUBLIC LICENSE
+ TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION
+
+ 0. This License applies to any program or other work which contains
+a notice placed by the copyright holder saying it may be distributed
+under the terms of this General Public License. The "Program", below,
+refers to any such program or work, and a "work based on the Program"
+means either the Program or any derivative work under copyright law:
+that is to say, a work containing the Program or a portion of it,
+either verbatim or with modifications and/or translated into another
+language. (Hereinafter, translation is included without limitation in
+the term "modification".) Each licensee is addressed as "you".
+
+Activities other than copying, distribution and modification are not
+covered by this License; they are outside its scope. The act of
+running the Program is not restricted, and the output from the Program
+is covered only if its contents constitute a work based on the
+Program (independent of having been made by running the Program).
+Whether that is true depends on what the Program does.
+
+ 1. You may copy and distribute verbatim copies of the Program's
+source code as you receive it, in any medium, provided that you
+conspicuously and appropriately publish on each copy an appropriate
+copyright notice and disclaimer of warranty; keep intact all the
+notices that refer to this License and to the absence of any warranty;
+and give any other recipients of the Program a copy of this License
+along with the Program.
+
+You may charge a fee for the physical act of transferring a copy, and
+you may at your option offer warranty protection in exchange for a fee.
+
+ 2. You may modify your copy or copies of the Program or any portion
+of it, thus forming a work based on the Program, and copy and
+distribute such modifications or work under the terms of Section 1
+above, provided that you also meet all of these conditions:
+
+ a) You must cause the modified files to carry prominent notices
+ stating that you changed the files and the date of any change.
+
+ b) You must cause any work that you distribute or publish, that in
+ whole or in part contains or is derived from the Program or any
+ part thereof, to be licensed as a whole at no charge to all third
+ parties under the terms of this License.
+
+ c) If the modified program normally reads commands interactively
+ when run, you must cause it, when started running for such
+ interactive use in the most ordinary way, to print or display an
+ announcement including an appropriate copyright notice and a
+ notice that there is no warranty (or else, saying that you provide
+ a warranty) and that users may redistribute the program under
+ these conditions, and telling the user how to view a copy of this
+ License. (Exception: if the Program itself is interactive but
+ does not normally print such an announcement, your work based on
+ the Program is not required to print an announcement.)
+
+These requirements apply to the modified work as a whole. If
+identifiable sections of that work are not derived from the Program,
+and can be reasonably considered independent and separate works in
+themselves, then this License, and its terms, do not apply to those
+sections when you distribute them as separate works. But when you
+distribute the same sections as part of a whole which is a work based
+on the Program, the distribution of the whole must be on the terms of
+this License, whose permissions for other licensees extend to the
+entire whole, and thus to each and every part regardless of who wrote it.
+
+Thus, it is not the intent of this section to claim rights or contest
+your rights to work written entirely by you; rather, the intent is to
+exercise the right to control the distribution of derivative or
+collective works based on the Program.
+
+In addition, mere aggregation of another work not based on the Program
+with the Program (or with a work based on the Program) on a volume of
+a storage or distribution medium does not bring the other work under
+the scope of this License.
+
+ 3. You may copy and distribute the Program (or a work based on it,
+under Section 2) in object code or executable form under the terms of
+Sections 1 and 2 above provided that you also do one of the following:
+
+ a) Accompany it with the complete corresponding machine-readable
+ source code, which must be distributed under the terms of Sections
+ 1 and 2 above on a medium customarily used for software interchange; or,
+
+ b) Accompany it with a written offer, valid for at least three
+ years, to give any third party, for a charge no more than your
+ cost of physically performing source distribution, a complete
+ machine-readable copy of the corresponding source code, to be
+ distributed under the terms of Sections 1 and 2 above on a medium
+ customarily used for software interchange; or,
+
+ c) Accompany it with the information you received as to the offer
+ to distribute corresponding source code. (This alternative is
+ allowed only for noncommercial distribution and only if you
+ received the program in object code or executable form with such
+ an offer, in accord with Subsection b above.)
+
+The source code for a work means the preferred form of the work for
+making modifications to it. For an executable work, complete source
+code means all the source code for all modules it contains, plus any
+associated interface definition files, plus the scripts used to
+control compilation and installation of the executable. However, as a
+special exception, the source code distributed need not include
+anything that is normally distributed (in either source or binary
+form) with the major components (compiler, kernel, and so on) of the
+operating system on which the executable runs, unless that component
+itself accompanies the executable.
+
+If distribution of executable or object code is made by offering
+access to copy from a designated place, then offering equivalent
+access to copy the source code from the same place counts as
+distribution of the source code, even though third parties are not
+compelled to copy the source along with the object code.
+
+ 4. You may not copy, modify, sublicense, or distribute the Program
+except as expressly provided under this License. Any attempt
+otherwise to copy, modify, sublicense or distribute the Program is
+void, and will automatically terminate your rights under this License.
+However, parties who have received copies, or rights, from you under
+this License will not have their licenses terminated so long as such
+parties remain in full compliance.
+
+ 5. You are not required to accept this License, since you have not
+signed it. However, nothing else grants you permission to modify or
+distribute the Program or its derivative works. These actions are
+prohibited by law if you do not accept this License. Therefore, by
+modifying or distributing the Program (or any work based on the
+Program), you indicate your acceptance of this License to do so, and
+all its terms and conditions for copying, distributing or modifying
+the Program or works based on it.
+
+ 6. Each time you redistribute the Program (or any work based on the
+Program), the recipient automatically receives a license from the
+original licensor to copy, distribute or modify the Program subject to
+these terms and conditions. You may not impose any further
+restrictions on the recipients' exercise of the rights granted herein.
+You are not responsible for enforcing compliance by third parties to
+this License.
+
+ 7. If, as a consequence of a court judgment or allegation of patent
+infringement or for any other reason (not limited to patent issues),
+conditions are imposed on you (whether by court order, agreement or
+otherwise) that contradict the conditions of this License, they do not
+excuse you from the conditions of this License. If you cannot
+distribute so as to satisfy simultaneously your obligations under this
+License and any other pertinent obligations, then as a consequence you
+may not distribute the Program at all. For example, if a patent
+license would not permit royalty-free redistribution of the Program by
+all those who receive copies directly or indirectly through you, then
+the only way you could satisfy both it and this License would be to
+refrain entirely from distribution of the Program.
+
+If any portion of this section is held invalid or unenforceable under
+any particular circumstance, the balance of the section is intended to
+apply and the section as a whole is intended to apply in other
+circumstances.
+
+It is not the purpose of this section to induce you to infringe any
+patents or other property right claims or to contest validity of any
+such claims; this section has the sole purpose of protecting the
+integrity of the free software distribution system, which is
+implemented by public license practices. Many people have made
+generous contributions to the wide range of software distributed
+through that system in reliance on consistent application of that
+system; it is up to the author/donor to decide if he or she is willing
+to distribute software through any other system and a licensee cannot
+impose that choice.
+
+This section is intended to make thoroughly clear what is believed to
+be a consequence of the rest of this License.
+
+ 8. If the distribution and/or use of the Program is restricted in
+certain countries either by patents or by copyrighted interfaces, the
+original copyright holder who places the Program under this License
+may add an explicit geographical distribution limitation excluding
+those countries, so that distribution is permitted only in or among
+countries not thus excluded. In such case, this License incorporates
+the limitation as if written in the body of this License.
+
+ 9. The Free Software Foundation may publish revised and/or new versions
+of the General Public License from time to time. Such new versions will
+be similar in spirit to the present version, but may differ in detail to
+address new problems or concerns.
+
+Each version is given a distinguishing version number. If the Program
+specifies a version number of this License which applies to it and "any
+later version", you have the option of following the terms and conditions
+either of that version or of any later version published by the Free
+Software Foundation. If the Program does not specify a version number of
+this License, you may choose any version ever published by the Free Software
+Foundation.
+
+ 10. If you wish to incorporate parts of the Program into other free
+programs whose distribution conditions are different, write to the author
+to ask for permission. For software which is copyrighted by the Free
+Software Foundation, write to the Free Software Foundation; we sometimes
+make exceptions for this. Our decision will be guided by the two goals
+of preserving the free status of all derivatives of our free software and
+of promoting the sharing and reuse of software generally.
+
+ NO WARRANTY
+
+ 11. BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY
+FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN
+OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES
+PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED
+OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
+MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS
+TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE
+PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING,
+REPAIR OR CORRECTION.
+
+ 12. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING
+WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR
+REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES,
+INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING
+OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED
+TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY
+YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER
+PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE
+POSSIBILITY OF SUCH DAMAGES.
+
+ END OF TERMS AND CONDITIONS
+
+ How to Apply These Terms to Your New Programs
+
+ If you develop a new program, and you want it to be of the greatest
+possible use to the public, the best way to achieve this is to make it
+free software which everyone can redistribute and change under these terms.
+
+ To do so, attach the following notices to the program. It is safest
+to attach them to the start of each source file to most effectively
+convey the exclusion of warranty; and each file should have at least
+the "copyright" line and a pointer to where the full notice is found.
+
+ <one line to give the program's name and a brief idea of what it does.>
+ Copyright (C) <year> <name of author>
+
+ This program is free software; you can redistribute it and/or modify
+ it under the terms of the GNU General Public License as published by
+ the Free Software Foundation; either version 2 of the License, or
+ (at your option) any later version.
+
+ This program is distributed in the hope that it will be useful,
+ but WITHOUT ANY WARRANTY; without even the implied warranty of
+ MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ GNU General Public License for more details.
+
+ You should have received a copy of the GNU General Public License
+ along with this program; if not, write to the Free Software
+ Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
+
+
+Also add information on how to contact you by electronic and paper mail.
+
+If the program is interactive, make it output a short notice like this
+when it starts in an interactive mode:
+
+ Gnomovision version 69, Copyright (C) year name of author
+ Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type `show w'.
+ This is free software, and you are welcome to redistribute it
+ under certain conditions; type `show c' for details.
+
+The hypothetical commands `show w' and `show c' should show the appropriate
+parts of the General Public License. Of course, the commands you use may
+be called something other than `show w' and `show c'; they could even be
+mouse-clicks or menu items--whatever suits your program.
+
+You should also get your employer (if you work as a programmer) or your
+school, if any, to sign a "copyright disclaimer" for the program, if
+necessary. Here is a sample; alter the names:
+
+ Yoyodyne, Inc., hereby disclaims all copyright interest in the program
+ `Gnomovision' (which makes passes at compilers) written by James Hacker.
+
+ <signature of Ty Coon>, 1 April 1989
+ Ty Coon, President of Vice
+
+This General Public License does not permit incorporating your program into
+proprietary programs. If your program is a subroutine library, you may
+consider it more useful to permit linking proprietary applications with the
+library. If this is what you want to do, use the GNU Library General
+Public License instead of this License.
Added: trunk/libgmail/PKG-INFO
==============================================================================
--- (empty file)
+++ trunk/libgmail/PKG-INFO Thu Feb 26 18:27:07 2009
@@ -0,0 +1,10 @@
+Metadata-Version: 1.0
+Name: libgmail
+Version: 0.1.11
+Summary: python bindings to access Gmail
+Home-page: http://libgmail.sourceforge.net/
+Author: wda...@mit.edu, st...@linux.isbeter.nl, foll...@myrealbox.com
+Author-email: libgmail-...@lists.sf.net
+License: GPL
+Description: UNKNOWN
+Platform: UNKNOWN
Added: trunk/libgmail/README
==============================================================================
--- (empty file)
+++ trunk/libgmail/README Thu Feb 26 18:27:07 2009
@@ -0,0 +1,47 @@
+libgmail is licensed under the GPL.
+See the file named COPYING for more information.
+
+Please refer to the libgmail website or project page at sourceforge if
+you encounter problems using libgmail.
+http://libgmail.sf.net/
+http://sourceforge.net/projects/libgmail/
+
+You can contact us by email:
+libgmail-...@lists.sf.net,
+or, individually at
+stas.zy...@gmail.com
+wdaher AT mit DOT edu
+follower AT myrealbox DOT com
+
+-----------------------------------------------
+Possible usage:
+
+Run this:
+
+ python libgmail.py
+
+When you have the demos package installed you could do this:
+
+ python demos/archive.py
+
+or even this:
+
+ python demos/sendmsg.py <account> <to address> <subject> <body>
+
+or perhaps this:
+
+ python demos/gmailsmtp.py # (Then connect to SMTP proxy on local port 8025)
+
+or how about this:
+
+ python demos/gmailftpd.py # (Then connect to FTP proxy on local port 8021,
+ # after creating a label named 'ftp' and
+ # applying it to some messages with attachments.)
+
+or maybe this:
+
+ python demos/gmailpopd.py # (Then connect to POP3 proxy on local port 8110)
+
+for hours of fun!(*)
+
+(*) Note: Fun may not last for hours. Use at your own risk, blah, blah, etc...
\ No newline at end of file
Added: trunk/libgmail/__init__.py
==============================================================================
Added: trunk/libgmail/gmail_transport.py
==============================================================================
--- (empty file)
+++ trunk/libgmail/gmail_transport.py Thu Feb 26 18:27:07 2009
@@ -0,0 +1,143 @@
+#!/usr/bin/env python
+# -*- coding: utf-8 -*-
+
+# ----------------------------------------------------------------------------------
+# Copyleft (K) by Jose Rodriguez. This source is free (GPL)
+# Partially based on John Nielsen ASPN recipe (http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/301740)
+# Partially based on Alessandro Budai recipe (http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/456195)
+# ----------------------------------------------------------------------------------
+
+
+# ClientCookie handlers for connecting through a proxy using the CONNECT method (useful for SSL)
+# tested with python 2.4
+
+import mechanize as ClientCookie
+import urllib
+import httplib
+import socket
+import base64
+
+
+def split_proxy_URL(proxy):
+ if proxy is None:
+ return None, None, None
+
+ try:
+ if proxy[:7] != 'http://': # Ensures proxy string begins with 'http://'
+ proxy = 'http://' + proxy
+ except:
+ pass
+
+ proxy_username = proxy_password = None
+
+ urltype, r_type = urllib.splittype(proxy)
+ proxy, XXX = urllib.splithost(r_type)
+ if '@' in proxy:
+ proxy_username, proxy = proxy.split('@', 1)
+ if ':' in proxy_username:
+ proxy_username, proxy_password = proxy_username.split(':', 1)
+
+ return proxy, proxy_username, proxy_password
+
+
+
+class ProxyHTTPConnection(httplib.HTTPConnection):
+
+ _ports = {'http' : 80, 'https' : 443}
+
+ def request(self, method, url, body=None, headers={}):
+ #request is called before connect, so can interpret url and get
+ #real host/port to be used to make CONNECT request to proxy
+ proto, rest = urllib.splittype(url)
+ if proto is None:
+ raise ValueError, "unknown URL type: %s" % url
+
+ host, rest = urllib.splithost(rest) # get host
+ host, port = urllib.splitport(host) #try to get port
+
+ #if port is not defined try to get from proto
+ if port is None:
+ try:
+ port = self._ports[proto]
+ except KeyError:
+ raise ValueError, "unknown protocol for: %s" % url
+
+ self._real_host = host
+ self._real_port = port
+ httplib.HTTPConnection.request(self, method, url, body, headers)
+
+
+ def connect(self):
+ httplib.HTTPConnection.connect(self)
+
+ self.send("CONNECT %s:%d HTTP/1.0\r\n" % (self._real_host, self._real_port))
+ if self.proxy_user is not None and self.proxy_passwd is not None:
+ cred = base64.encodestring("%s:%s" % (urllib.unquote(self.proxy_user), urllib.unquote(self.proxy_passwd))).strip()
+ self.send("Proxy-authorization: Basic %s\r\n" % cred)
+
+ self.send("User-Agent: Mozilla/5.0 (Compatible; libgmail-python)\r\n\r\n")
+ response = self.response_class(self.sock, strict=self.strict, method=self._method)
+ (version, code, message) = response._read_status()
+ #probably here we can handle auth requests...
+ if code != 200:
+ #proxy returned an error, abort connection, and raise exception
+ self.close()
+ raise socket.error, "Proxy connection failed: %d %s" % (code, message.strip())
+
+ #eat up header block from proxy....
+ while True:
+ line = response.fp.readline() #should probably not use fp directly
+ if line == '\r\n': break
+
+
+ @classmethod
+ def new_auth(cls, proxy_host, proxy_user = None, proxy_passwd = None):
+ cls.proxy_host = proxy_host
+ cls.proxy_user = proxy_user
+ cls.proxy_passwd = proxy_passwd
+
+ return cls
+
+
+
+class ProxyHTTPSConnection(ProxyHTTPConnection):
+
+ default_port = 443
+
+ def __init__(self, host, port = None, key_file = None, cert_file = None, strict = None):
+ ProxyHTTPConnection.__init__(self, host, port)
+ self.key_file = key_file
+ self.cert_file = cert_file
+
+ def connect(self):
+ ProxyHTTPConnection.connect(self)
+ #make the sock ssl-aware
+ ssl = socket.ssl(self.sock, self.key_file, self.cert_file)
+ self.sock = httplib.FakeSocket(self.sock, ssl)
+
+
+class ConnectHTTPHandler(ClientCookie.HTTPHandler):
+
+ def __init__(self, proxy=None, debuglevel=0):
+ self.proxy, self.proxy_user, self.proxy_passwd = split_proxy_URL(proxy)
+ ClientCookie.HTTPHandler.__init__(self, debuglevel)
+
+ def do_open(self, http_class, req):
+ if self.proxy is not None:
+ req.set_proxy(self.proxy, 'http')
+ return ClientCookie.HTTPHandler.do_open(self, ProxyHTTPConnection.new_auth(self.proxy, self.proxy_user, self.proxy_passwd), req)
+
+
+
+class ConnectHTTPSHandler(ClientCookie.HTTPSHandler):
+
+ def __init__(self, proxy=None, debuglevel=0):
+ self.proxy, self.proxy_user, self.proxy_passwd = split_proxy_URL(proxy)
+ ClientCookie.HTTPSHandler.__init__(self, debuglevel)
+
+ def do_open(self, http_class, req):
+ if self.proxy is not None:
+ req.set_proxy(self.proxy, 'https')
+ return ClientCookie.HTTPSHandler.do_open(self, ProxyHTTPSConnection.new_auth(self.proxy, self.proxy_user, self.proxy_passwd), req)
+
+
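Reviewer note (not part of the commit): `split_proxy_URL` above relies on the Python 2 helpers `urllib.splittype` and `urllib.splithost`. The sketch below shows roughly the same splitting with `urllib.parse` on Python 3. The name `split_proxy_url` is hypothetical, and the code is an illustration under that assumption rather than a drop-in replacement.

```python
from urllib.parse import urlsplit


def split_proxy_url(proxy):
    """Return (host[:port], username, password) for a proxy string."""
    if proxy is None:
        return None, None, None

    # Mirror the original behaviour: prepend a scheme when it is missing.
    if not proxy.startswith("http://"):
        proxy = "http://" + proxy

    # netloc is the "user:password@host:port" part of the URL.
    netloc = urlsplit(proxy).netloc

    username = password = None
    if "@" in netloc:
        # Split on the first '@', as the original does.
        username, netloc = netloc.split("@", 1)
        if ":" in username:
            username, password = username.split(":", 1)

    return netloc, username, password
```

Like the original, it returns the credentials still percent-encoded; `ProxyHTTPConnection.connect` unquotes them before building the `Proxy-authorization` header.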
Added: trunk/libgmail/lgconstants.py
==============================================================================
--- (empty file)
+++ trunk/libgmail/lgconstants.py Thu Feb 26 18:27:07 2009
@@ -0,0 +1,231 @@
+#
+# Generated file -- DO NOT EDIT
+#
+# Note: This file is now edited! 2005-04-25
+#
+# constants.py -- Useful constants extracted from Gmail Javascript code
+#
+# Source version: 44f09303f2d4f76f
+#
+# Generated: 2004-08-10 13:08 UTC
+#
+
+
+URL_LOGIN = "https://www.google.com/accounts/ServiceLoginBoxAuth"
+URL_GMAIL = "https://mail.google.com/mail/"
+
+
+# Constants with names not from the Gmail Javascript:
+U_SAVEDRAFT_VIEW = "sd"
+
+D_DRAFTINFO = "di"
+# NOTE: All other DI_* field offsets seem to match the MI_* field offsets
+DI_BODY = 19
+
+versionWarned = False # If the Javascript version is different have we
+ # warned about it?
+
+
+js_version = '44f09303f2d4f76f'
+
+D_VERSION = "v"
+D_QUOTA = "qu"
+D_DEFAULTSEARCH_SUMMARY = "ds"
+D_THREADLIST_SUMMARY = "ts"
+D_THREADLIST_END = "te"
+D_THREAD = "t"
+D_CONV_SUMMARY = "cs"
+D_CONV_END = "ce"
+D_MSGINFO = "mi"
+D_MSGBODY = "mb"
+D_MSGATT = "ma"
+D_COMPOSE = "c"
+D_CONTACT = "co"
+D_CATEGORIES = "ct"
+D_CATEGORIES_COUNT_ALL = "cta"
+D_ACTION_RESULT = "ar"
+D_SENDMAIL_RESULT = "sr"
+D_PREFERENCES = "p"
+D_PREFERENCES_PANEL = "pp"
+D_FILTERS = "fi"
+D_GAIA_NAME = "gn"
+D_INVITE_STATUS = "i"
+D_END_PAGE = "e"
+D_LOADING = "l"
+D_LOADED_SUCCESS = "ld"
+D_LOADED_ERROR = "le"
+D_QUICKLOADED = "ql"
+QU_SPACEUSED = 0
+QU_QUOTA = 1
+QU_PERCENT = 2
+QU_COLOR = 3
+TS_START = 0
+TS_NUM = 1
+TS_TOTAL = 2
+TS_ESTIMATES = 3
+TS_TITLE = 4
+TS_TIMESTAMP = 5 + 1
+TS_TOTAL_MSGS = 6 + 1
+T_THREADID = 0
+T_UNREAD = 1
+T_STAR = 2
+T_DATE_HTML = 3
+T_AUTHORS_HTML = 4
+T_FLAGS = 5
+T_SUBJECT_HTML = 6
+T_SNIPPET_HTML = 7
+T_CATEGORIES = 8
+T_ATTACH_HTML = 9
+T_MATCHING_MSGID = 10
+T_EXTRA_SNIPPET = 11
+CS_THREADID = 0
+CS_SUBJECT = 1
+CS_TITLE_HTML = 2
+CS_SUMMARY_HTML = 3
+CS_CATEGORIES = 4
+CS_PREVNEXTTHREADIDS = 5
+CS_THREAD_UPDATED = 6
+CS_NUM_MSGS = 7
+CS_ADKEY = 8
+CS_MATCHING_MSGID = 9
+MI_FLAGS = 0
+MI_NUM = 1
+MI_MSGID = 2
+MI_STAR = 3
+MI_REFMSG = 4
+MI_AUTHORNAME = 5
+MI_AUTHORFIRSTNAME = 6 # ? -- Name supplied by rj
+MI_AUTHOREMAIL = 6 + 1
+MI_MINIHDRHTML = 7 + 1
+MI_DATEHTML = 8 + 1
+MI_TO = 9 + 1
+MI_CC = 10 + 1
+MI_BCC = 11 + 1
+MI_REPLYTO = 12 + 1
+MI_DATE = 13 + 1
+MI_SUBJECT = 14 + 1
+MI_SNIPPETHTML = 15 + 1
+MI_ATTACHINFO = 16 + 1
+MI_KNOWNAUTHOR = 17 + 1
+MI_PHISHWARNING = 18 + 1
+A_ID = 0
+A_FILENAME = 1
+A_MIMETYPE = 2
+A_FILESIZE = 3
+CT_NAME = 0
+CT_COUNT = 1
+AR_SUCCESS = 0
+AR_MSG = 1
+SM_COMPOSEID = 0
+SM_SUCCESS = 1
+SM_MSG = 2
+SM_NEWTHREADID = 3
+CMD_SEARCH = "SEARCH"
+ACTION_TOKEN_COOKIE = "GMAIL_AT"
+U_VIEW = "view"
+U_PAGE_VIEW = "page"
+U_THREADLIST_VIEW = "tl"
+U_CONVERSATION_VIEW = "cv"
+U_COMPOSE_VIEW = "cm"
+U_PRINT_VIEW = "pt"
+U_PREFERENCES_VIEW = "pr"
+U_JSREPORT_VIEW = "jr"
+U_UPDATE_VIEW = "up"
+U_SENDMAIL_VIEW = "sm"
+U_AD_VIEW = "ad"
+U_REPORT_BAD_RELATED_INFO_VIEW = "rbri"
+U_ADDRESS_VIEW = "address"
+U_ADDRESS_IMPORT_VIEW = "ai"
+U_SPELLCHECK_VIEW = "sc"
+U_INVITE_VIEW = "invite"
+U_ORIGINAL_MESSAGE_VIEW = "om"
+U_ATTACHMENT_VIEW = "att"
+U_DEBUG_ADS_RESPONSE_VIEW = "da"
+U_SEARCH = "search"
+U_INBOX_SEARCH = "inbox"
+U_STARRED_SEARCH = "starred"
+U_ALL_SEARCH = "all"
+U_DRAFTS_SEARCH = "drafts"
+U_SENT_SEARCH = "sent"
+U_SPAM_SEARCH = "spam"
+U_TRASH_SEARCH = "trash"
+U_QUERY_SEARCH = "query"
+U_ADVANCED_SEARCH = "adv"
+U_CREATEFILTER_SEARCH = "cf"
+U_CATEGORY_SEARCH = "cat"
+U_AS_FROM = "as_from"
+U_AS_TO = "as_to"
+U_AS_SUBJECT = "as_subj"
+U_AS_SUBSET = "as_subset"
+U_AS_HAS = "as_has"
+U_AS_HASNOT = "as_hasnot"
+U_AS_ATTACH = "as_attach"
+U_AS_WITHIN = "as_within"
+U_AS_DATE = "as_date"
+U_AS_SUBSET_ALL = "all"
+U_AS_SUBSET_INBOX = "inbox"
+U_AS_SUBSET_STARRED = "starred"
+U_AS_SUBSET_SENT = "sent"
+U_AS_SUBSET_DRAFTS = "drafts"
+U_AS_SUBSET_SPAM = "spam"
+U_AS_SUBSET_TRASH = "trash"
+U_AS_SUBSET_ALLSPAMTRASH = "ast"
+U_AS_SUBSET_READ = "read"
+U_AS_SUBSET_UNREAD = "unread"
+U_AS_SUBSET_CATEGORY_PREFIX = "cat_"
+U_THREAD = "th"
+U_PREV_THREAD = "prev"
+U_NEXT_THREAD = "next"
+U_DRAFT_MSG = "draft"
+U_START = "start"
+U_ACTION = "act"
+U_ACTION_TOKEN = "at"
+U_INBOX_ACTION = "ib"
+U_MARKREAD_ACTION = "rd"
+U_MARKUNREAD_ACTION = "ur"
+U_MARKSPAM_ACTION = "sp"
+U_UNMARKSPAM_ACTION = "us"
+U_MARKTRASH_ACTION = "tr"
+U_ADDCATEGORY_ACTION = "ac_"
+U_REMOVECATEGORY_ACTION = "rc_"
+U_ADDSTAR_ACTION = "st"
+U_REMOVESTAR_ACTION = "xst"
+U_ADDSENDERTOCONTACTS_ACTION = "astc"
+U_DELETEMESSAGE_ACTION = "dm"
+U_DELETE_ACTION = "dl"
+U_EMPTYSPAM_ACTION = "es_"
+U_EMPTYTRASH_ACTION = "et_"
+U_SAVEPREFS_ACTION = "prefs"
+U_ADDRESS_ACTION = "a"
+U_CREATECATEGORY_ACTION = "cc_"
+U_DELETECATEGORY_ACTION = "dc_"
+U_RENAMECATEGORY_ACTION = "nc_"
+U_CREATEFILTER_ACTION = "cf"
+U_REPLACEFILTER_ACTION = "rf"
+U_DELETEFILTER_ACTION = "df_"
+U_ACTION_THREAD = "t"
+U_ACTION_MESSAGE = "m"
+U_ACTION_PREF_PREFIX = "p_"
+U_REFERENCED_MSG = "rm"
+U_COMPOSEID = "cmid"
+U_COMPOSE_MODE = "cmode"
+U_COMPOSE_SUBJECT = "su"
+U_COMPOSE_TO = "to"
+U_COMPOSE_CC = "cc"
+U_COMPOSE_BCC = "bcc"
+U_COMPOSE_BODY = "body"
+U_PRINT_THREAD = "pth"
+CONV_VIEW = "conv"
+TLIST_VIEW = "tlist"
+PREFS_VIEW = "prefs"
+HIST_VIEW = "hist"
+COMPOSE_VIEW = "comp"
+HIDDEN_ACTION = 0
+USER_ACTION = 1
+BACKSPACE_ACTION = 2
+
+# TODO: Get these on the fly?
+STANDARD_FOLDERS = [U_INBOX_SEARCH, U_STARRED_SEARCH,
+ U_ALL_SEARCH, U_DRAFTS_SEARCH,
+ U_SENT_SEARCH, U_SPAM_SEARCH]
+
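Reviewer note (not part of the commit): the `D_*` names in lgconstants.py tag the records returned by Gmail, and the numeric constants (`T_*`, `MI_*`, `SM_*`, ...) are field offsets into those records. libgmail.py uses them this way, for example `resultInfo[SM_SUCCESS]`. A minimal Python 3 sketch of that lookup style follows. The sample record and the `summarize` helper are made up for illustration.

```python
# Offsets copied from the T_* block of lgconstants.py.
T_THREADID = 0
T_UNREAD = 1
T_SUBJECT_HTML = 6

# Made-up record shaped like a "t" (D_THREAD) entry; the values are
# illustrative only and do not come from a real Gmail response.
thread = ["101a", 1, 0, "Feb 26", "someone", "", "Hello",
          "snippet", [], "", "101a", ""]


def summarize(record):
    """Pick a few named fields out of a positional thread record."""
    return {
        "id": record[T_THREADID],
        "unread": bool(record[T_UNREAD]),
        "subject": record[T_SUBJECT_HTML],
    }
```

Named offsets keep call sites readable and confine layout changes, such as the `+ 1` adjustments in the `MI_*` block, to this one file.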
Added: trunk/libgmail/libgmail.py
==============================================================================
--- (empty file)
+++ trunk/libgmail/libgmail.py Thu Feb 26 18:27:07 2009
@@ -0,0 +1,1630 @@
+#!/usr/bin/env python
+#
+# libgmail -- Gmail access via Python
+#
+## To get the version number of the available libgmail version.
+## Reminder: add date before next release. This attribute is also
+## used in the setup script.
+Version = '0.1.11' # (August 2008)
+
+# Original author: foll...@rancidbacon.com
+# Maintainers: Waseem (wda...@mit.edu) and Stas Z (st...@linux.isbeter.nl)
+#
+# License: GPL 2.0
+#
+# NOTE:
+# You should ensure you are permitted to use this script before using it
+# to access Google's Gmail servers.
+#
+#
+# Gmail Implementation Notes
+# ==========================
+#
+# * Folders contain message threads, not individual messages. At present I
+# do not know any way to list all messages without processing thread list.
+#
+
+LG_DEBUG=0
+from lgconstants import *
+
+import os,pprint
+import re
+import urllib
+import urllib2
+import mimetypes
+import types
+import mechanize as ClientCookie
+from cPickle import load, dump
+
+from email.MIMEBase import MIMEBase
+from email.MIMEText import MIMEText
+from email.MIMEMultipart import MIMEMultipart
+
+GMAIL_URL_LOGIN = "https://www.google.com/accounts/ServiceLoginBoxAuth"
+GMAIL_URL_GMAIL = "https://mail.google.com/mail/?ui=1&"
+
+# Set to any value to use proxy.
+PROXY_URL = None # e.g. libgmail.PROXY_URL = 'myproxy.org:3128'
+
+# TODO: Get these on the fly?
+STANDARD_FOLDERS = [U_INBOX_SEARCH, U_STARRED_SEARCH,
+ U_ALL_SEARCH, U_DRAFTS_SEARCH,
+ U_SENT_SEARCH, U_SPAM_SEARCH]
+
+# Constants with names not from the Gmail Javascript:
+# TODO: Move to `lgconstants.py`?
+U_SAVEDRAFT_VIEW = "sd"
+
+D_DRAFTINFO = "di"
+# NOTE: All other DI_* field offsets seem to match the MI_* field offsets
+DI_BODY = 19
+
+versionWarned = False # If the Javascript version is different have we
+ # warned about it?
+
+
+RE_SPLIT_PAGE_CONTENT = re.compile("D\((.*?)\);", re.DOTALL)
+
+class GmailError(Exception):
+ '''
+ Exception thrown upon gmail-specific failures, in particular a
+ failure to log in and a failure to parse responses.
+
+ '''
+ pass
+
+class GmailSendError(Exception):
+ '''
+ Exception to throw if we are unable to send a message
+ '''
+ pass
+
+def _parsePage(pageContent):
+ """
+ Parse the supplied HTML page and extract useful information from
+ the embedded Javascript.
+
+ """
+ lines = pageContent.splitlines()
+ data = '\n'.join([x for x in lines if x and x[0] in ['D', ')', ',', ']']])
+ #data = data.replace(',,',',').replace(',,',',')
+ data = re.sub(r'("(?:[^\\"]|\\.)*")', r'u\1', data)
+ data = re.sub(',{2,}', ',', data)
+
+ result = []
+ try:
+ exec data in {'__builtins__': None}, {'D': lambda x: result.append(x)}
+ except SyntaxError,info:
+ print info
+ raise GmailError, 'Failed to parse data returned from gmail.'
+
+ items = result
+ itemsDict = {}
+ namesFoundTwice = []
+ for item in items:
+ name = item[0]
+ try:
+ parsedValue = item[1:]
+ except Exception:
+ parsedValue = ['']
+ if itemsDict.has_key(name):
+ # This handles the case where a name key is used more than
+ # once (e.g. mail items, mail body etc) and automatically
+ # places the values into list.
+ # TODO: Check this actually works properly, it's early... :-)
+
+ if len(parsedValue) and type(parsedValue[0]) is types.ListType:
+ for item in parsedValue:
+ itemsDict[name].append(item)
+ else:
+ itemsDict[name].append(parsedValue)
+ else:
+ if len(parsedValue) and type(parsedValue[0]) is types.ListType:
+ itemsDict[name] = []
+ for item in parsedValue:
+ itemsDict[name].append(item)
+ else:
+ itemsDict[name] = [parsedValue]
+
+ return itemsDict
+
+def _splitBunches(infoItems):# Is this still needed ?? Stas
+ """
+ Utility to help make it easy to iterate over each item separately,
+ even if they were bunched on the page.
+ """
+ result= []
+ # TODO: Decide if this is the best approach.
+ for group in infoItems:
+ if type(group) == tuple:
+ result.extend(group)
+ else:
+ result.append(group)
+ return result
+
+class SmartRedirectHandler(ClientCookie.HTTPRedirectHandler):
+ def __init__(self, cookiejar):
+ self.cookiejar = cookiejar
+
+ def http_error_302(self, req, fp, code, msg, headers):
+ # The location redirect doesn't seem to change
+ # the hostname header appropriately, so we do
+ # by hand. (Is this a bug in urllib2?)
+ new_host = re.match(r'http[s]*://(.*?\.google\.com)',
+ headers.getheader('Location'))
+ if new_host:
+ req.add_header("Host", new_host.groups()[0])
+ result = ClientCookie.HTTPRedirectHandler.http_error_302(
+ self, req, fp, code, msg, headers)
+ return result
+
+
+def _buildURL(**kwargs):
+ """
+ """
+ return "%s%s" % (URL_GMAIL, urllib.urlencode(kwargs))
+
+
+
+def _paramsToMime(params, filenames, files):
+ """
+ """
+ mimeMsg = MIMEMultipart("form-data")
+
+ for name, value in params.iteritems():
+ mimeItem = MIMEText(value)
+ mimeItem.add_header("Content-Disposition", "form-data", name=name)
+
+ # TODO: Handle this better...?
+ for hdr in ['Content-Type','MIME-Version','Content-Transfer-Encoding']:
+ del mimeItem[hdr]
+
+ mimeMsg.attach(mimeItem)
+
+ if filenames or files:
+ filenames = filenames or []
+ files = files or []
+ for idx, item in enumerate(filenames + files):
+ # TODO: This is messy, tidy it...
+ if isinstance(item, str):
+ # We assume it's a file path...
+ filename = item
+ contentType = mimetypes.guess_type(filename)[0]
+ payload = open(filename, "rb").read()
+ else:
+ # We assume it's an `email.Message.Message` instance...
+ # TODO: Make more use of the pre-encoded information?
+ filename = item.get_filename()
+ contentType = item.get_content_type()
+ payload = item.get_payload(decode=True)
+
+ if not contentType:
+ contentType = "application/octet-stream"
+
+ mimeItem = MIMEBase(*contentType.split("/"))
+ mimeItem.add_header("Content-Disposition", "form-data",
+ name="file%s" % idx, filename=filename)
+ # TODO: Encode the payload?
+ mimeItem.set_payload(payload)
+
+ # TODO: Handle this better...?
+ for hdr in ['MIME-Version','Content-Transfer-Encoding']:
+ del mimeItem[hdr]
+
+ mimeMsg.attach(mimeItem)
+
+ del mimeMsg['MIME-Version']
+
+ return mimeMsg
+
+
+class GmailLoginFailure(Exception):
+ """
+ Raised whenever the login process fails--could be wrong username/password,
+ or Gmail service error, for example.
+ Extract the error message like this:
+ try:
+ foobar
+ except GmailLoginFailure,e:
+ mesg = e.message# or
+ print e# uses the __str__
+ """
+ def __init__(self,message):
+ self.message = message
+ def __str__(self):
+ return repr(self.message)
+
+class GmailAccount:
+ """
+ """
+
+ def __init__(self, name = "", pw = "", state = None, domain = None):
+ global URL_LOGIN, URL_GMAIL
+ """
+ """
+ self.domain = domain
+ if self.domain:
+ URL_LOGIN = "https://www.google.com/a/" + self.domain + "/LoginAction2"
+ URL_GMAIL = "http://mail.google.com/a/" + self.domain + "/?ui=1&"
+
+ else:
+ URL_LOGIN = GMAIL_URL_LOGIN
+ URL_GMAIL = GMAIL_URL_GMAIL
+ if name and pw:
+ self.name = name
+ self._pw = pw
+
+ self._cookieJar = ClientCookie.LWPCookieJar()
+ opener = ClientCookie.build_opener(ClientCookie.HTTPCookieProcessor(self._cookieJar))
+ ClientCookie.install_opener(opener)
+
+ if PROXY_URL is not None:
+ import gmail_transport
+
+ self.opener = ClientCookie.build_opener(gmail_transport.ConnectHTTPHandler(proxy = PROXY_URL),
+ gmail_transport.ConnectHTTPSHandler(proxy = PROXY_URL),
+ SmartRedirectHandler(self._cookieJar))
+ else:
+ self.opener = ClientCookie.build_opener(
+ ClientCookie.HTTPHandler(),
+ ClientCookie.HTTPSHandler(),
+ SmartRedirectHandler(self._cookieJar))
+ elif state:
+ # TODO: Check for stale state cookies?
+ self.name, self._cookieJar = state.state
+ else:
+ raise ValueError("GmailAccount must be instantiated with " \
+ "either GmailSessionState object or name " \
+ "and password.")
+
+ self._cachedQuotaInfo = None
+ self._cachedLabelNames = None
+
+
+ def login(self):
+ """
+ """
+ # TODO: Throw exception if we were instantiated with state?
+ if self.domain:
+ data = urllib.urlencode({'continue': URL_GMAIL,
+ 'at' : 'null',
+ 'service' : 'mail',
+ 'Email': self.name,
+ 'Passwd': self._pw,
+ })
+ else:
+ data = urllib.urlencode({'continue': URL_GMAIL,
+ 'Email': self.name,
+ 'Passwd': self._pw,
+ })
+
+ headers = {'Host': 'www.google.com',
+ 'User-Agent': 'Mozilla/5.0 (Compatible; libgmail-python)'}
+
+ req = ClientCookie.Request(URL_LOGIN, data=data, headers=headers)
+ pageData = self._retrievePage(req)
+
+ if not self.domain:
+ # The GV cookie no longer comes in this page for
+ # "Apps", so this bottom portion is unnecessary for it.
+ # This requests the page that provides the required "GV" cookie.
+ RE_PAGE_REDIRECT = 'CheckCookie\?continue=([^"\']+)'
+
+ # TODO: Catch more failure exceptions here...?
+ try:
+ link = re.search(RE_PAGE_REDIRECT, pageData).group(1)
+ redirectURL = urllib2.unquote(link)
+ redirectURL = redirectURL.replace('\\x26', '&')
+
+ except AttributeError:
+ raise GmailLoginFailure("Login failed. (Wrong username/password?)")
+ # We aren't concerned with the actual content of this page,
+ # just the cookie that is returned with it.
+ pageData = self._retrievePage(redirectURL)
+
+ def getCookie(self,cookiename):
+ # TODO: Is there a way to extract the value directly?
+ for index, cookie in enumerate(self._cookieJar):
+ if cookie.name == cookiename:
+ return cookie.value
+ return ""
+
+ def _retrievePage(self, urlOrRequest):
+ """
+ """
+ if self.opener is None:
+ raise GmailError("Cannot find urlopener")
+
+ # ClientCookieify it, if it hasn't been already
+ if not isinstance(urlOrRequest, urllib2.Request):
+ req = ClientCookie.Request(urlOrRequest)
+ else:
+ req = urlOrRequest
+
+ req.add_header('User-Agent',
+ 'Mozilla/5.0 (Compatible; libgmail-python)')
+
+ try:
+ resp = self.opener.open(req)
+ except urllib2.HTTPError,info:
+ print info
+ return None
+ pageData = resp.read()
+
+ # TODO: This, for some reason, is still necessary?
+ self._cookieJar.extract_cookies(resp, req)
+
+ # TODO: Enable logging of page data for debugging purposes?
+ return pageData
+
+ def _parsePage(self, urlOrRequest):
+ """
+ Retrieve & then parse the requested page content.
+
+ """
+ items = _parsePage(self._retrievePage(urlOrRequest))
+ # Automatically cache some things like quota usage.
+ # TODO: Cache more?
+ # TODO: Expire cached values?
+ # TODO: Do this better.
+ try:
+ self._cachedQuotaInfo = items[D_QUOTA]
+ except KeyError:
+ pass
+ #pprint.pprint(items)
+
+ try:
+ self._cachedLabelNames = [category[CT_NAME] for category in items[D_CATEGORIES][0]]
+ except KeyError:
+ pass
+
+ return items
+
+
+ def _parseSearchResult(self, searchType, start = 0, **kwargs):
+ """
+ """
+ params = {U_SEARCH: searchType,
+ U_START: start,
+ U_VIEW: U_THREADLIST_VIEW,
+ }
+ params.update(kwargs)
+ return self._parsePage(_buildURL(**params))
+
+
+ def _parseThreadSearch(self, searchType, allPages = False, **kwargs):
+ """
+
+ Only works for thread-based results at present. # TODO: Change this?
+ """
+ start = 0
+ tot = 0
+ threadsInfo = []
+ # Option to get *all* threads if multiple pages are used.
+ while (start == 0) or (allPages and
+ len(threadsInfo) < threadListSummary[TS_TOTAL]):
+
+ items = self._parseSearchResult(searchType, start, **kwargs)
+ #TODO: Handle single & zero result case better? Does this work?
+ try:
+ threads = items[D_THREAD]
+ except KeyError:
+ break
+ else:
+ for th in threads:
+ if not type(th[0]) is types.ListType:
+ th = [th]
+ threadsInfo.append(th)
+ # TODO: Check if the total or per-page values have changed?
+ threadListSummary = items[D_THREADLIST_SUMMARY][0]
+ threadsPerPage = threadListSummary[TS_NUM]
+
+ start += threadsPerPage
+
+ # TODO: Record whether or not we retrieved all pages..?
+ return GmailSearchResult(self, (searchType, kwargs), threadsInfo)
+
+
+ def _retrieveJavascript(self, version = ""):
+ """
+
+ Note: `version` seems to be ignored.
+ """
+ return self._retrievePage(_buildURL(view = U_PAGE_VIEW,
+ name = "js",
+ ver = version))
+
+
+ def getMessagesByFolder(self, folderName, allPages = False):
+ """
+
+ Folders contain conversation/message threads.
+
+ `folderName` -- As set in Gmail interface.
+
+ Returns a `GmailSearchResult` instance.
+
+ *** TODO: Change all "getMessagesByX" to "getThreadsByX"? ***
+ """
+ return self._parseThreadSearch(folderName, allPages = allPages)
+
+
+ def getMessagesByQuery(self, query, allPages = False):
+ """
+
+ Returns a `GmailSearchResult` instance.
+ """
+ return self._parseThreadSearch(U_QUERY_SEARCH, q = query,
+ allPages = allPages)
+
+
+ def getQuotaInfo(self, refresh = False):
+ """
+
+ Return MB used, Total MB and percentage used.
+ """
+ # TODO: Change this to a property.
+ if not self._cachedQuotaInfo or refresh:
+ # TODO: Handle this better...
+ self.getMessagesByFolder(U_INBOX_SEARCH)
+
+ return self._cachedQuotaInfo[0][:3]
+
+
+ def getLabelNames(self, refresh = False):
+ """
+ """
+ # TODO: Change this to a property?
+ if not self._cachedLabelNames or refresh:
+ # TODO: Handle this better...
+ self.getMessagesByFolder(U_INBOX_SEARCH)
+
+ return self._cachedLabelNames
+
+
+ def getMessagesByLabel(self, label, allPages = False):
+ """
+ """
+ return self._parseThreadSearch(U_CATEGORY_SEARCH,
+ cat=label, allPages = allPages)
+
+ def getRawMessage(self, msgId):
+ """
+ """
+ # U_ORIGINAL_MESSAGE_VIEW seems the only one that returns a page.
+ # All the other U_* results in a 404 exception. Stas
+ PageView = U_ORIGINAL_MESSAGE_VIEW
+ return self._retrievePage(
+ _buildURL(view=PageView, th=msgId))
+
+ def getUnreadMessages(self):
+ """
+ """
+ return self._parseThreadSearch(U_QUERY_SEARCH,
+ q = "is:" + U_AS_SUBSET_UNREAD)
+
+
+ def getUnreadMsgCount(self):
+ """
+ """
+ items = self._parseSearchResult(U_QUERY_SEARCH,
+ q = "is:" + U_AS_SUBSET_UNREAD)
+ try:
+ result = items[D_THREADLIST_SUMMARY][0][TS_TOTAL_MSGS]
+ except KeyError:
+ result = 0
+ return result
+
+
+ def _getActionToken(self):
+ """
+ """
+ # getCookie() returns "" rather than raising when the cookie is absent.
+ at = self.getCookie(ACTION_TOKEN_COOKIE)
+ if not at:
+ self.getLabelNames(True)
+ at = self.getCookie(ACTION_TOKEN_COOKIE)
+
+ return at
+
+
+ def sendMessage(self, msg, asDraft = False, _extraParams = None):
+ """
+
+ `msg` -- `GmailComposedMessage` instance.
+
+ `_extraParams` -- Dictionary containing additional parameters
+ to put into POST message. (Not officially
+ for external use, more to make feature
+ additions a little easier to play with.)
+
+ Note: Now returns `GmailMessageStub` instance with populated
+ `id` (and `_account`) fields on success or None on failure.
+
+ """
+ # TODO: Handle drafts separately?
+ params = {U_VIEW: [U_SENDMAIL_VIEW, U_SAVEDRAFT_VIEW][asDraft],
+ U_REFERENCED_MSG: "",
+ U_THREAD: "",
+ U_DRAFT_MSG: "",
+ U_COMPOSEID: "1",
+ U_ACTION_TOKEN: self._getActionToken(),
+ U_COMPOSE_TO: msg.to,
+ U_COMPOSE_CC: msg.cc,
+ U_COMPOSE_BCC: msg.bcc,
+ "subject": msg.subject,
+ "msgbody": msg.body,
+ }
+
+ if _extraParams:
+ params.update(_extraParams)
+
+ # Amongst other things, I used the following post to work out this:
+ # <http://groups.google.com/groups?
+ # selm=mailman.1047080233.20095.python-list%40python.org>
+ mimeMessage = _paramsToMime(params, msg.filenames, msg.files)
+
+ #### TODO: Ughh, tidy all this up & do it better...
+ ## This horrible mess is here for two main reasons:
+ ## 1. The `Content-Type` header (which also contains the boundary
+ ## marker) needs to be extracted from the MIME message so
+ ## we can send it as the request `Content-Type` header instead.
+ ## 2. It seems the form submission needs to use "\r\n" for new
+ ## lines instead of the "\n" returned by `as_string()`.
+ ## I tried changing the value of `NL` used by the `Generator` class
+ ## but it didn't work so I'm doing it this way until I figure
+ ## out how to do it properly. Of course, first try, if the payloads
+ ## contained "\n" sequences they got replaced too, which corrupted
+ ## the attachments. I could probably encode the submission,
+ ## which would probably be nicer, but in the meantime I'm kludging
+ ## this workaround that replaces all non-text payloads with a
+ ## marker, changes all "\n" to "\r\n" and finally replaces the
+ ## markers with the original payloads.
+ ## Yeah, I know, it's horrible, but hey it works doesn't it? If you've
+ ## got a problem with it, fix it yourself & give me the patch!
+ ##
+ origPayloads = {}
+ FMT_MARKER = "&&&&&&%s&&&&&&"
+
+ for i, m in enumerate(mimeMessage.get_payload()):
+ if not isinstance(m, MIMEText): #Do we care if we change text ones?
+ origPayloads[i] = m.get_payload()
+ m.set_payload(FMT_MARKER % i)
+
+ mimeMessage.epilogue = ""
+ msgStr = mimeMessage.as_string()
+ contentTypeHeader, data = msgStr.split("\n\n", 1)
+ contentTypeHeader = contentTypeHeader.split(":", 1)
+ data = data.replace("\n", "\r\n")
+ for k,v in origPayloads.iteritems():
+ data = data.replace(FMT_MARKER % k, v)
+ ####
+
+ req = ClientCookie.Request(_buildURL(), data = data)
+ req.add_header(*contentTypeHeader)
+ items = self._parsePage(req)
+
+ # TODO: Check composeid?
+ # Sometimes we get the success message
+ # but the id is 0 and no message is sent
+ result = None
+ resultInfo = items[D_SENDMAIL_RESULT][0]
+
+ if resultInfo[SM_SUCCESS]:
+ result = GmailMessageStub(id = resultInfo[SM_NEWTHREADID],
+ _account = self)
+ else:
+ raise GmailSendError, resultInfo[SM_MSG]
+ return result
+
+
+ def trashMessage(self, msg):
+ """
+ """
+ # TODO: Decide if we should make this a method of `GmailMessage`.
+ # TODO: Should we check we have been given a `GmailMessage` instance?
+ params = {
+ U_ACTION: U_DELETEMESSAGE_ACTION,
+ U_ACTION_MESSAGE: msg.id,
+ U_ACTION_TOKEN: self._getActionToken(),
+ }
+
+ items = self._parsePage(_buildURL(**params))
+
+ # TODO: Mark as trashed on success?
+ return (items[D_ACTION_RESULT][0][AR_SUCCESS] == 1)
+
+
+ def _doThreadAction(self, actionId, thread):
+ """
+ """
+ # TODO: Decide if we should make this a method of `GmailThread`.
+ # TODO: Should we check we have been given a `GmailThread` instance?
+ params = {
+ U_SEARCH: U_ALL_SEARCH, #TODO:Check this search value always works.
+ U_VIEW: U_UPDATE_VIEW,
+ U_ACTION: actionId,
+ U_ACTION_THREAD: thread.id,
+ U_ACTION_TOKEN: self._getActionToken(),
+ }
+
+ items = self._parsePage(_buildURL(**params))
+
+ return (items[D_ACTION_RESULT][0][AR_SUCCESS] == 1)
+
+
+ def trashThread(self, thread):
+ """
+ """
+ # TODO: Decide if we should make this a method of `GmailThread`.
+ # TODO: Should we check we have been given a `GmailThread` instance?
+
+ result = self._doThreadAction(U_MARKTRASH_ACTION, thread)
+
+ # TODO: Mark as trashed on success?
+ return result
+
+
+ def _createUpdateRequest(self, actionId): #extraData):
+ """
+ Helper method to create a Request instance for an update (view)
+ action.
+
+ Returns populated `Request` instance.
+ """
+ params = {
+ U_VIEW: U_UPDATE_VIEW,
+ }
+
+ data = {
+ U_ACTION: actionId,
+ U_ACTION_TOKEN: self._getActionToken(),
+ }
+
+ #data.update(extraData)
+
+ req = ClientCookie.Request(_buildURL(**params),
+ data = urllib.urlencode(data))
+
+ return req
+
+
+ # TODO: Extract additional common code from handling of labels?
+ def createLabel(self, labelName):
+ """
+ """
+ req = self._createUpdateRequest(U_CREATECATEGORY_ACTION + labelName)
+
+ # Note: Label name cache is updated by this call as well. (Handy!)
+ items = self._parsePage(req)
+ print items
+ return (items[D_ACTION_RESULT][0][AR_SUCCESS] == 1)
+
+
+ def deleteLabel(self, labelName):
+ """
+ """
+ # TODO: Check labelName exists?
+ req = self._createUpdateRequest(U_DELETECATEGORY_ACTION + labelName)
+
+ # Note: Label name cache is updated by this call as well. (Handy!)
+ items = self._parsePage(req)
+
+ return (items[D_ACTION_RESULT][0][AR_SUCCESS] == 1)
+
+
+ def renameLabel(self, oldLabelName, newLabelName):
+ """
+ """
+ # TODO: Check oldLabelName exists?
+ req = self._createUpdateRequest("%s%s^%s" % (U_RENAMECATEGORY_ACTION,
+ oldLabelName, newLabelName))
+
+ # Note: Label name cache is updated by this call as well. (Handy!)
+ items = self._parsePage(req)
+
+ return (items[D_ACTION_RESULT][0][AR_SUCCESS] == 1)
+
+ def storeFile(self, filename, label = None):
+ """
+ """
+ # TODO: Handle files larger than single attachment size.
+ # TODO: Allow file data objects to be supplied?
+ FILE_STORE_VERSION = "FSV_01"
+ FILE_STORE_SUBJECT_TEMPLATE = "%s %s" % (FILE_STORE_VERSION, "%s")
+
+ subject = FILE_STORE_SUBJECT_TEMPLATE % os.path.basename(filename)
+
+ msg = GmailComposedMessage(to="", subject=subject, body="",
+ filenames=[filename])
+
+ draftMsg = self.sendMessage(msg, asDraft = True)
+
+ if draftMsg and label:
+ draftMsg.addLabel(label)
+
+ return draftMsg
+
+ ## CONTACTS SUPPORT
+ def getContacts(self):
+ """
+ Returns a GmailContactList object
+ that has all the contacts in it as
+ GmailContacts
+ """
+ contactList = []
+ # pnl = a is necessary to get *all* contacts
+ myUrl = _buildURL(view='cl',search='contacts', pnl='a')
+ myData = self._parsePage(myUrl)
+ # This comes back with a dictionary
+ # with entry 'cl'
+ addresses = myData['cl']
+ for entry in addresses:
+ if len(entry) >= 6 and entry[0]=='ce':
+ newGmailContact = GmailContact(entry[1], entry[2], entry[4], entry[5])
+ #### new code used to get all the notes
+ #### not used yet due to lockdown problems
+ ##rawnotes = self._getSpecInfo(entry[1])
+ ##print rawnotes
+ ##newGmailContact = GmailContact(entry[1], entry[2], entry[4],rawnotes)
+ contactList.append(newGmailContact)
+
+ return GmailContactList(contactList)
+
+ def addContact(self, myContact, *extra_args):
+ """
+ Attempts to add a GmailContact to the gmail
+ address book. Returns true if successful,
+ false otherwise
+
+ Please note that after version 0.1.3.3,
+ addContact takes one argument of type
+ GmailContact, the contact to add.
+
+ The old signature of:
+ addContact(name, email, notes='') is still
+ supported, but deprecated.
+ """
+ if len(extra_args) > 0:
+ # The user has passed in extra arguments
+ # He/she is probably trying to invoke addContact
+ # using the old, deprecated signature of:
+ # addContact(self, name, email, notes='')
+ # Build a GmailContact object and use that instead
+ (name, email) = (myContact, extra_args[0])
+ if len(extra_args) > 1:
+ notes = extra_args[1]
+ else:
+ notes = ''
+ myContact = GmailContact(-1, name, email, notes)
+
+ # TODO: In the ideal world, we'd extract these specific
+ # constants into a nice constants file
+
+ # This mostly comes from the Johnvey Gmail API,
+ # but also from the gmail.py cited earlier
+ myURL = _buildURL(view='up')
+
+ myDataList = [ ('act','ec'),
+ ('at', self.getCookie(ACTION_TOKEN_COOKIE)),
+ ('ct_nm', myContact.getName()),
+ ('ct_em', myContact.getEmail()),
+ ('ct_id', -1 )
+ ]
+
+ notes = myContact.getNotes()
+ if notes != '':
+ myDataList.append( ('ctf_n', notes) )
+
+ validinfokeys = [
+ 'i', # IM
+ 'p', # Phone
+ 'd', # Company
+ 'a', # ADR
+ 'e', # Email
+ 'm', # Mobile
+ 'b', # Pager
+ 'f', # Fax
+ 't', # Title
+ 'o', # Other
+ ]
+
+ moreInfo = myContact.getMoreInfo()
+ ctsn_num = -1
+ if moreInfo != {}:
+ for ctsf,ctsf_data in moreInfo.items():
+ ctsn_num += 1
+ # data section header, WORK, HOME,...
+ sectionenum ='ctsn_%02d' % ctsn_num
+ myDataList.append( ( sectionenum, ctsf ))
+ ctsf_num = -1
+
+ if isinstance(ctsf_data[0],str):
+ ctsf_num += 1
+ # data section
+ subsectionenum = 'ctsf_%02d_%02d_%s' % (ctsn_num, ctsf_num, ctsf_data[0]) # ie. ctsf_00_01_p
+ myDataList.append( (subsectionenum, ctsf_data[1]) )
+ else:
+ for info in ctsf_data:
+ if validinfokeys.count(info[0]) > 0:
+ ctsf_num += 1
+ # data section
+ subsectionenum = 'ctsf_%02d_%02d_%s' % (ctsn_num, ctsf_num, info[0]) # ie. ctsf_00_01_p
+ myDataList.append( (subsectionenum, info[1]) )
+
+ myData = urllib.urlencode(myDataList)
+ request = ClientCookie.Request(myURL,
+ data = myData)
+ pageData = self._retrievePage(request)
+
+ if pageData.find("The contact was successfully added") == -1:
+ print pageData
+ if pageData.find("already has the email address") != -1:
+ raise Exception("Someone with same email already exists in Gmail.")
+ elif pageData.find("https://www.google.com/accounts/ServiceLogin") != -1:
+ raise Exception("Login has expired.")
+ return False
+ else:
+ return True
+
+ def _removeContactById(self, id):
+ """
+ Attempts to remove the contact that occupies
+ id "id" from the gmail address book.
+ Returns True if successful,
+ False otherwise.
+
+ This is a little dangerous since you don't really
+ know who you're deleting. Really,
+ this should return the name or something of the
+ person we just killed.
+
+ Don't call this method.
+ You should be using removeContact instead.
+ """
+ myURL = _buildURL(search='contacts', ct_id = id, c=id, act='dc', at=self.getCookie(ACTION_TOKEN_COOKIE), view='up')
+ pageData = self._retrievePage(myURL)
+
+ if pageData.find("The contact has been deleted") == -1:
+ return False
+ else:
+ return True
+
+ def removeContact(self, gmailContact):
+ """
+ Attempts to remove the GmailContact passed in
+ Returns True if successful, False otherwise.
+ """
+ # Let's re-fetch the contact list to make
+ # sure we're really deleting the guy
+ # we think we're deleting
+ newContactList = self.getContacts()
+ newVersionOfPersonToDelete = newContactList.getContactById(gmailContact.getId())
+ # Ok, now we need to ensure that gmailContact
+ # is the same as newVersionOfPersonToDelete
+ # and then we can go ahead and delete him/her
+ if (gmailContact == newVersionOfPersonToDelete):
+ return self._removeContactById(gmailContact.getId())
+ else:
+ # We have a cache coherency problem -- someone
+ # else now occupies this ID slot.
+ # TODO: Perhaps signal this in some nice way
+ # to the end user?
+
+ print "Unable to delete."
+ print "Has someone else been modifying the contacts list while we have?"
+ print "Old version of person:",gmailContact
+ print "New version of person:",newVersionOfPersonToDelete
+ return False
+
+## Don't remove this. contact stas
+## def _getSpecInfo(self,id):
+## """
+## Return all the notes data.
+## This is currently not used due to the fact that it requests pages in
+## a dos attack manner.
+## """
+## myURL =_buildURL(search='contacts',ct_id=id,c=id,\
+## at=self._cookieJar._cookies['GMAIL_AT'],view='ct')
+## pageData = self._retrievePage(myURL)
+## myData = self._parsePage(myURL)
+## #print "\nmyData form _getSpecInfo\n",myData
+## rawnotes = myData['cov'][7]
+## return rawnotes
+
+class GmailContact:
+ """
+ Class for storing a Gmail Contacts list entry
+ """
+ def __init__(self, name, email, *extra_args):
+ """
+ Returns a new GmailContact object
+ (you can then call addContact on this to commit
+ it to the Gmail addressbook, for example)
+
+ Consider calling setNotes() and setMoreInfo()
+ to add extended information to this contact
+ """
+ # Support populating other fields if we're trying
+ # to invoke this the old way, with the old constructor
+ # whose signature was __init__(self, id, name, email, notes='')
+ id = -1
+ notes = ''
+
+ if len(extra_args) > 0:
+ (id, name) = (name, email)
+ email = extra_args[0]
+ if len(extra_args) > 1:
+ notes = extra_args[1]
+ else:
+ notes = ''
+
+ self.id = id
+ self.name = name
+ self.email = email
+ self.notes = notes
+ self.moreInfo = {}
+ def __str__(self):
+ return "%s %s %s %s" % (self.id, self.name, self.email, self.notes)
+ def __eq__(self, other):
+ if not isinstance(other, GmailContact):
+ return False
+ return (self.getId() == other.getId()) and \
+ (self.getName() == other.getName()) and \
+ (self.getEmail() == other.getEmail()) and \
+ (self.getNotes() == other.getNotes())
+ def getId(self):
+ return self.id
+ def getName(self):
+ return self.name
+ def getEmail(self):
+ return self.email
+ def getNotes(self):
+ return self.notes
+ def setNotes(self, notes):
+ """
+ Sets the notes field for this GmailContact
+ Note that this does NOT change the note
+ field on Gmail's end; only adding or removing
+ contacts modifies them
+ """
+ self.notes = notes
+
+ def getMoreInfo(self):
+ return self.moreInfo
+ def setMoreInfo(self, moreInfo):
+ """
+ moreInfo format
+ ---------------
+ Use special key values::
+ 'i' = IM
+ 'p' = Phone
+ 'd' = Company
+ 'a' = ADR
+ 'e' = Email
+ 'm' = Mobile
+ 'b' = Pager
+ 'f' = Fax
+ 't' = Title
+ 'o' = Other
+
+ Simple example::
+
+ moreInfo = {'Home': ( ('a','852 W Barry'),
+ ('p', '1-773-244-1980'),
+ ('i', 'aim:brianray34') ) }
+
+ Complex example::
+
+ moreInfo = {
+ 'Personal': (('e', 'Home Email'),
+ ('f', 'Home Fax')),
+ 'Work': (('d', 'Sample Company'),
+ ('t', 'Job Title'),
+ ('o', 'Department: Department1'),
+ ('o', 'Department: Department2'),
+ ('p', 'Work Phone'),
+ ('m', 'Mobile Phone'),
+ ('f', 'Work Fax'),
+ ('b', 'Pager')) }
+ """
+ self.moreInfo = moreInfo
+ def getVCard(self):
+ """Returns a vCard 3.0 for this
+ contact, as a string"""
+ # The \r is to comply with the RFC2425 section 5.8.1
+ vcard = "BEGIN:VCARD\r\n"
+ vcard += "VERSION:3.0\r\n"
+ ## Deal with multiline notes
+ ##vcard += "NOTE:%s\n" % self.getNotes().replace("\n","\\n")
+ vcard += "NOTE:%s\r\n" % self.getNotes()
+ # Fake-out N by splitting up whatever we get out of getName
+ # This might not always do 'the right thing'
+ # but it's a *reasonable* compromise
+ fullname = self.getName().split()
+ fullname.reverse()
+ vcard += "N:%s" % ';'.join(fullname) + "\r\n"
+ vcard += "FN:%s\r\n" % self.getName()
+ vcard += "EMAIL;TYPE=INTERNET:%s\r\n" % self.getEmail()
+ vcard += "END:VCARD\r\n\r\n"
+ # Final newline in case we want to put more than one in a file
+ return vcard
+
+class GmailContactList:
+ """
+ Class for storing an entire Gmail contacts list
+ and retrieving contacts by Id, Email address, and name
+ """
+ def __init__(self, contactList):
+ self.contactList = contactList
+ def __str__(self):
+ return '\n'.join([str(item) for item in self.contactList])
+ def getCount(self):
+ """
+ Returns number of contacts
+ """
+ return len(self.contactList)
+ def getAllContacts(self):
+ """
+ Returns an array of all the
+ GmailContacts
+ """
+ return self.contactList
+ def getContactByName(self, name):
+ """
+ Gets the first contact in the
+ address book whose name is 'name'.
+
+ Returns False if no contact
+ could be found
+ """
+ nameList = self.getContactListByName(name)
+ if len(nameList) > 0:
+ return nameList[0]
+ else:
+ return False
+ def getContactByEmail(self, email):
+ """
+ Gets the first contact in the
+ address book whose email is 'email'.
+ As of this writing, Gmail insists
+ upon a unique email; i.e. two contacts
+ cannot share an email address.
+
+ Returns False if no contact
+ could be found
+ """
+ emailList = self.getContactListByEmail(email)
+ if len(emailList) > 0:
+ return emailList[0]
+ else:
+ return False
+ def getContactById(self, myId):
+ """
+ Gets the first contact in the
+ address book whose id is 'myId'.
+
+ REMEMBER: ID IS A STRING
+
+ Returns False if no contact
+ could be found
+ """
+ idList = self.getContactListById(myId)
+ if len(idList) > 0:
+ return idList[0]
+ else:
+ return False
+ def getContactListByName(self, name):
+ """
+ This function returns a LIST
+ of GmailContacts whose name is
+ 'name'.
+
+ Returns an empty list if no contacts
+ were found
+ """
+ nameList = []
+ for entry in self.contactList:
+ if entry.getName() == name:
+ nameList.append(entry)
+ return nameList
+ def getContactListByEmail(self, email):
+ """
+ This function returns a LIST
+ of GmailContacts whose email is
+ 'email'. As of this writing, two contacts
+ cannot share an email address, so this
+ should only return just one item.
+ But it doesn't hurt to be prepared?
+
+ Returns an empty list if no contacts
+ were found
+ """
+ emailList = []
+ for entry in self.contactList:
+ if entry.getEmail() == email:
+ emailList.append(entry)
+ return emailList
+ def getContactListById(self, myId):
+ """
+ This function returns a LIST
+ of GmailContacts whose id is
+ 'myId'. We expect there only to
+ be one, but just in case!
+
+ Remember: ID IS A STRING
+
+ Returns an empty list if no contacts
+ were found
+ """
+ idList = []
+ for entry in self.contactList:
+ if entry.getId() == myId:
+ idList.append(entry)
+ return idList
+ def search(self, searchTerm):
+ """
+ This function returns a LIST
+ of GmailContacts whose name or
+ email address matches the 'searchTerm'.
+
+ Returns an empty list if no matches
+ were found.
+ """
+ searchResults = []
+ for entry in self.contactList:
+ p = re.compile(searchTerm, re.IGNORECASE)
+ if p.search(entry.getName()) or p.search(entry.getEmail()):
+ searchResults.append(entry)
+ return searchResults
+
+class GmailSearchResult:
+ """
+ """
+
+ def __init__(self, account, search, threadsInfo):
+ """
+
+ `threadsInfo` -- As returned from Gmail but unbunched.
+ """
+ #print "\nthreadsInfo\n",threadsInfo
+ try:
+ if not type(threadsInfo[0]) is types.ListType:
+ threadsInfo = [threadsInfo]
+ except IndexError:
+ # print "No messages found"
+ pass
+
+ self._account = account
+ self.search = search # TODO: Turn into object + format nicely.
+ self._threads = []
+
+ for thread in threadsInfo:
+ self._threads.append(GmailThread(self, thread[0]))
+
+
+ def __iter__(self):
+ """
+ """
+ return iter(self._threads)
+
+ def __len__(self):
+ """
+ """
+ return len(self._threads)
+
+ def __getitem__(self,key):
+ """
+ """
+ return self._threads.__getitem__(key)
+
+
+class GmailSessionState:
+ """
+ """
+
+ def __init__(self, account = None, filename = ""):
+ """
+ """
+ if account:
+ self.state = (account.name, account._cookieJar)
+ elif filename:
+ self.state = load(open(filename, "rb"))
+ else:
+ raise ValueError("GmailSessionState must be instantiated with " \
+ "either GmailAccount object or filename.")
+
+
+ def save(self, filename):
+ """
+ """
+ dump(self.state, open(filename, "wb"), -1)
+
+
+class _LabelHandlerMixin(object):
+ """
+
+ Note: Because a message id can be used as a thread id this works for
+ messages as well as threads.
+ """
+ def __init__(self):
+ self._labels = None
+
+ def _makeLabelList(self, labelList):
+ self._labels = labelList
+
+ def addLabel(self, labelName):
+ """
+ """
+ # Note: It appears this also automatically creates new labels.
+ result = self._account._doThreadAction(U_ADDCATEGORY_ACTION+labelName,
+ self)
+ if not self._labels:
+ self._makeLabelList([])
+ # TODO: Caching this seems a little dangerous; suppress duplicates maybe?
+ self._labels.append(labelName)
+ return result
+
+
+ def removeLabel(self, labelName):
+ """
+ """
+ # TODO: Check label is already attached?
+ # Note: An error is not generated if the label is not already attached.
+ result = \
+ self._account._doThreadAction(U_REMOVECATEGORY_ACTION+labelName,
+ self)
+
+ removeLabel = True
+ try:
+ self._labels.remove(labelName)
+ except:
+ removeLabel = False
+ pass
+
+ # If we don't check both, we might end up in some weird inconsistent state
+ return result and removeLabel
+
+ def getLabels(self):
+ return self._labels
+
+
+
+class GmailThread(_LabelHandlerMixin):
+ """
+ Note: As far as I can tell, the "canonical" thread id is always the same
+ as the id of the last message in the thread. But it appears that
+ the id of any message in the thread can be used to retrieve
+ the thread information.
+
+ """
+
+ def __init__(self, parent, threadsInfo):
+ """
+ """
+ _LabelHandlerMixin.__init__(self)
+
+ # TODO Handle this better?
+ self._parent = parent
+ self._account = self._parent._account
+
+ self.id = threadsInfo[T_THREADID] # TODO: Change when canonical updated?
+ self.subject = threadsInfo[T_SUBJECT_HTML]
+
+ self.snippet = threadsInfo[T_SNIPPET_HTML]
+ #self.extraSummary = threadInfo[T_EXTRA_SNIPPET] #TODO: What is this?
+
+ # TODO: Store other info?
+ # Extract number of messages in thread/conversation.
+
+ self._authors = threadsInfo[T_AUTHORS_HTML]
+ self.info = threadsInfo
+
+ try:
+ # TODO: Find out if this information can be found another way...
+ # (Without another page request.)
+ self._length = int(re.search("\((\d+?)\)\Z",
+ self._authors).group(1))
+ except AttributeError,info:
+ # If there's no message count then the thread only has one message.
+ self._length = 1
+
+ # TODO: Store information known about the last message (e.g. id)?
+ self._messages = []
+
+ # Populate labels
+ self._makeLabelList(threadsInfo[T_CATEGORIES])
+
+ def __getattr__(self, name):
+ """
+ Dynamically dispatch some interesting thread properties.
+ """
+ attrs = { 'unread': T_UNREAD,
+ 'star': T_STAR,
+ 'date': T_DATE_HTML,
+ 'authors': T_AUTHORS_HTML,
+ 'flags': T_FLAGS,
+ 'subject': T_SUBJECT_HTML,
+ 'snippet': T_SNIPPET_HTML,
+ 'categories': T_CATEGORIES,
+ 'attach': T_ATTACH_HTML,
+ 'matching_msgid': T_MATCHING_MSGID,
+ 'extra_snippet': T_EXTRA_SNIPPET }
+ if name in attrs:
+ return self.info[attrs[name]]
+
+ raise AttributeError("no attribute %s" % name)
+
+ def __len__(self):
+ """
+ """
+ return self._length
+
+
+ def __iter__(self):
+ """
+ """
+ if not self._messages:
+ self._messages = self._getMessages(self)
+
+ return iter(self._messages)
+
+ def __getitem__(self, key):
+ """
+ """
+ if not self._messages:
+ self._messages = self._getMessages(self)
+ try:
+ result = self._messages.__getitem__(key)
+ except IndexError:
+ result = []
+ return result
+
+ def _getMessages(self, thread):
+ """
+ """
+ # TODO: Do this better.
+ # TODO: Specify the query folder using our specific search?
+ items = self._account._parseSearchResult(U_QUERY_SEARCH,
+ view = U_CONVERSATION_VIEW,
+ th = thread.id,
+ q = "in:anywhere")
+ result = []
+ # TODO: Handle this better?
+ # Note: This handles both draft & non-draft messages in a thread...
+ for key, isDraft in [(D_MSGINFO, False), (D_DRAFTINFO, True)]:
+ try:
+ msgsInfo = items[key]
+ except KeyError:
+ # No messages of this type (e.g. draft or non-draft)
+ continue
+ else:
+ # TODO: Handle special case of only 1 message in thread better?
+ if type(msgsInfo[0]) != types.ListType:
+ msgsInfo = [msgsInfo]
+ for msg in msgsInfo:
+ result += [GmailMessage(thread, msg, isDraft = isDraft)]
+
+ return result
+
+class GmailMessageStub(_LabelHandlerMixin):
+ """
+
+ Intended to be used where not all message information is known/required.
+
+ NOTE: This may go away.
+ """
+
+ # TODO: Provide way to convert this to a full `GmailMessage` instance
+ # or allow `GmailMessage` to be created without all info?
+
+ def __init__(self, id = None, _account = None):
+ """
+ """
+ _LabelHandlerMixin.__init__(self)
+ self.id = id
+ self._account = _account
+
+
+
+class GmailMessage(object):
+ """
+ """
+
+ def __init__(self, parent, msgData, isDraft = False):
+ """
+
+ Note: `msgData` can be from either D_MSGINFO or D_DRAFTINFO.
+ """
+ # TODO: Automatically detect if it's a draft or not?
+ # TODO Handle this better?
+ self._parent = parent
+ self._account = self._parent._account
+
+ self.author = msgData[MI_AUTHORFIRSTNAME]
+ self.author_fullname = msgData[MI_AUTHORNAME]
+ self.id = msgData[MI_MSGID]
+ self.number = msgData[MI_NUM]
+ self.subject = msgData[MI_SUBJECT]
+ self.to = [x for x in msgData[MI_TO]]
+ self.cc = [x for x in msgData[MI_CC]]
+ self.bcc = [x for x in msgData[MI_BCC]]
+ self.sender = msgData[MI_AUTHOREMAIL]
+
+ # Messages created by google chat (from reply with chat, etc.)
+ # don't have any attachments, so we need this check not to choke
+ # on them
+ try:
+ self.attachments = [GmailAttachment(self, attachmentInfo)
+ for attachmentInfo in msgData[MI_ATTACHINFO]]
+ except TypeError:
+ self.attachments = []
+
+ # TODO: Populate additional fields & cache...(?)
+
+ # TODO: Handle body differently if it's from a draft?
+ self.isDraft = isDraft
+
+ self._source = None
+
+
+ def _getSource(self):
+ """
+ """
+ if not self._source:
+ # TODO: Do this more nicely...?
+ # TODO: Strip initial white space & fix up last line ending
+ # to make it legal as per RFC?
+ self._source = self._account.getRawMessage(self.id)
+
+ return self._source
+
+ source = property(_getSource, doc = "")
+
+
+
+class GmailAttachment:
+ """
+ """
+
+ def __init__(self, parent, attachmentInfo):
+ """
+ """
+ # TODO Handle this better?
+ self._parent = parent
+ self._account = self._parent._account
+
+ self.id = attachmentInfo[A_ID]
+ self.filename = attachmentInfo[A_FILENAME]
+ self.mimetype = attachmentInfo[A_MIMETYPE]
+ self.filesize = attachmentInfo[A_FILESIZE]
+
+ self._content = None
+
+
+ def _getContent(self):
+ """
+ """
+ if not self._content:
+ # TODO: Do this more nicely...?
+ self._content = self._account._retrievePage(
+ _buildURL(view=U_ATTACHMENT_VIEW, disp="attd",
+ attid=self.id, th=self._parent._parent.id))
+
+ return self._content
+
+ content = property(_getContent, doc = "")
+
+
+ def _getFullId(self):
+ """
+
+ Returns the "full path"/"full id" of the attachment. (Used
+ to refer to the file when forwarding.)
+
+ The id is of the form: "<thread_id>_<msg_id>_<attachment_id>"
+
+ """
+ return "%s_%s_%s" % (self._parent._parent.id,
+ self._parent.id,
+ self.id)
+
+ _fullId = property(_getFullId, doc = "")
+
+
+
+class GmailComposedMessage:
+ """
+ """
+
+ def __init__(self, to, subject, body, cc = None, bcc = None,
+ filenames = None, files = None):
+ """
+
+ `filenames` - list of the file paths of the files to attach.
+ `files` - list of objects implementing sub-set of
+ `email.Message.Message` interface (`get_filename`,
+ `get_content_type`, `get_payload`). This is to
+ allow use of payloads from Message instances.
+ TODO: Change this to be simpler class we define ourselves?
+ """
+ self.to = to
+ self.subject = subject
+ self.body = body
+ self.cc = cc
+ self.bcc = bcc
+ self.filenames = filenames
+ self.files = files
+
+
+
+if __name__ == "__main__":
+ import sys
+ from getpass import getpass
+
+ try:
+ name = sys.argv[1]
+ except IndexError:
+ name = raw_input("Gmail account name: ")
+
+ pw = getpass("Password: ")
+ domain = raw_input("Domain? [leave blank for Gmail]: ")
+
+ ga = GmailAccount(name, pw, domain=domain)
+
+ print "\nPlease wait, logging in..."
+
+ try:
+ ga.login()
+ except GmailLoginFailure,e:
+ print "\nLogin failed. (%s)" % e.message
+ else:
+ print "Login successful.\n"
+
+ # TODO: Use properties instead?
+ quotaInfo = ga.getQuotaInfo()
+ quotaMbUsed = quotaInfo[QU_SPACEUSED]
+ quotaMbTotal = quotaInfo[QU_QUOTA]
+ quotaPercent = quotaInfo[QU_PERCENT]
+ print "%s of %s used. (%s)\n" % (quotaMbUsed, quotaMbTotal, quotaPercent)
+
+ searches = STANDARD_FOLDERS + ga.getLabelNames()
+ name = None
+ while 1:
+ try:
+ print "Select folder or label to list: (Ctrl-C to exit)"
+ for optionId, optionName in enumerate(searches):
+ print " %d. %s" % (optionId, optionName)
+ while not name:
+ try:
+ name = searches[int(raw_input("Choice: "))]
+ except ValueError,info:
+ print info
+ name = None
+ if name in STANDARD_FOLDERS:
+ result = ga.getMessagesByFolder(name, True)
+ else:
+ result = ga.getMessagesByLabel(name, True)
+
+ if not len(result):
+ print "No threads found in `%s`." % name
+ break
+ name = None
+ tot = len(result)
+
+ i = 0
+ for thread in result:
+ print "%s messages in thread" % len(thread)
+ print thread.id, len(thread), thread.subject
+ for msg in thread:
+ print "\n ", msg.id, msg.number, msg.author,msg.subject
+ # Just as an example of other useful things
+ #print " ", msg.cc, msg.bcc,msg.sender
+ i += 1
+ print
+ print "number of threads:",tot
+ print "number of messages:",i
+ except KeyboardInterrupt:
+ break
+
+ print "\n\nDone."
Added: trunk/libgmail/mechanize/COPYING.txt
==============================================================================
--- (empty file)
+++ trunk/libgmail/mechanize/COPYING.txt Thu Feb 26 18:27:07 2009
@@ -0,0 +1,101 @@
+All the code with the exception of _gzip.py is covered under either
+the BSD-style license immediately below, or (at your choice) the ZPL
+2.1. The code in _gzip.py is taken from the effbot.org library, and
+falls under the effbot.org license (also BSD-style) that appears at
+the end of this file.
+
+
+Copyright (c) 2002-2006 John J. Lee j...@pobox.com
+Copyright (c) 1997-1999 Gisle Aas
+Copyright (c) 1997-1999 Johnny Lee
+Copyright (c) 2003 Andy Lester
+
+
+BSD-style License
+==================
+
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are
+met:
+
+Redistributions of source code must retain the above copyright notice,
+this list of conditions and the following disclaimer.
+
+Redistributions in binary form must reproduce the above copyright
+notice, this list of conditions and the following disclaimer in the
+documentation and/or other materials provided with the distribution.
+
+Neither the name of the contributors nor the names of their employers
+may be used to endorse or promote products derived from this software
+without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
+"AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
+LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
+A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
+OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
+SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
+LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
+DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
+THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
+(INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+
+
+
+
+ZPL 2.1
+==================
+
+Zope Public License (ZPL) Version 2.1
+
+A copyright notice accompanies this license document that identifies the copyright holders.
+
+This license has been certified as open source. It has also been designated as GPL compatible by the Free Software Foundation (FSF).
+
+Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
+
+ 1. Redistributions in source code must retain the accompanying copyright notice, this list of conditions, and the following disclaimer.
+ 2. Redistributions in binary form must reproduce the accompanying copyright notice, this list of conditions, and the following disclaimer in the documentation and/or other materials provided with the distribution.
+ 3. Names of the copyright holders must not be used to endorse or promote products derived from this software without prior written permission from the copyright holders.
+ 4. The right to distribute this software or to use it for any purpose does not give you the right to use Servicemarks (sm) or Trademarks (tm) of the copyright holders. Use of them is covered by separate agreement with the copyright holders.
+ 5. If any files are modified, you must cause the modified files to carry prominent notices stating that you changed the files and the date of any change.
+
+Disclaimer
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY EXPRESSED OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDERS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
+
+
+
+
+
+--------------------------------------------------------------------
+The effbot.org Library is
+
+Copyright (c) 1999-2004 by Secret Labs AB
+Copyright (c) 1999-2004 by Fredrik Lundh
+
+By obtaining, using, and/or copying this software and/or its
+associated documentation, you agree that you have read, understood,
+and will comply with the following terms and conditions:
+
+Permission to use, copy, modify, and distribute this software and its
+associated documentation for any purpose and without fee is hereby
+granted, provided that the above copyright notice appears in all
+copies, and that both that copyright notice and this permission notice
+appear in supporting documentation, and that the name of Secret Labs
+AB or the author not be used in advertising or publicity pertaining to
+distribution of the software without specific, written prior
+permission.
+
+SECRET LABS AB AND THE AUTHOR DISCLAIMS ALL WARRANTIES WITH REGARD TO
+THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND
+FITNESS. IN NO EVENT SHALL SECRET LABS AB OR THE AUTHOR BE LIABLE FOR
+ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT
+OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+--------------------------------------------------------------------
Added: trunk/libgmail/mechanize/ClientForm.py
==============================================================================
--- (empty file)
+++ trunk/libgmail/mechanize/ClientForm.py Thu Feb 26 18:27:07 2009
@@ -0,0 +1,3405 @@
+"""HTML form handling for web clients.
+
+ClientForm is a Python module for handling HTML forms on the client
+side, useful for parsing HTML forms, filling them in and returning the
+completed forms to the server. It has developed from a port of Gisle
+Aas' Perl module HTML::Form, from the libwww-perl library, but the
+interface is not the same.
+
+The most useful docstring is the one for HTMLForm.
+
+RFC 1866: HTML 2.0
+RFC 1867: Form-based File Upload in HTML
+RFC 2388: Returning Values from Forms: multipart/form-data
+HTML 3.2 Specification, W3C Recommendation 14 January 1997 (for ISINDEX)
+HTML 4.01 Specification, W3C Recommendation 24 December 1999
+
+
+Copyright 2002-2007 John J. Lee <j...@pobox.com>
+Copyright 2005 Gary Poster
+Copyright 2005 Zope Corporation
+Copyright 1998-2000 Gisle Aas.
+
+This code is free software; you can redistribute it and/or modify it
+under the terms of the BSD or ZPL 2.1 licenses (see the file
+COPYING.txt included with the distribution).
+
+"""
+
+# XXX
+# Remove parser testing hack
+# safeUrl()-ize action
+# Switch to unicode throughout (would be 0.3.x)
+# See Wichert Akkerman's 2004-01-22 message to c.l.py.
+# Add charset parameter to Content-type headers? How to find value??
+# Add some more functional tests
+# Especially single and multiple file upload on the internet.
+# Does file upload work when name is missing? Sourceforge tracker form
+# doesn't like it. Check standards, and test with Apache. Test
+# binary upload with Apache.
+# mailto submission & enctype text/plain
+# I'm not going to fix this unless somebody tells me what real servers
+# that want this encoding actually expect: If enctype is
+# application/x-www-form-urlencoded and there's a FILE control present.
+# Strictly, it should be 'name=data' (see HTML 4.01 spec., section
+# 17.13.2), but I send "name=" ATM. What about multiple file upload??
+
+# Would be nice, but I'm not going to do it myself:
+# -------------------------------------------------
+# Maybe a 0.4.x?
+# Replace by_label etc. with moniker / selector concept. Allows, eg.,
+# a choice between selection by value / id / label / element
+# contents. Or choice between matching labels exactly or by
+# substring. Etc.
+# Remove deprecated methods.
+# ...what else?
+# Work on DOMForm.
+# XForms? Don't know if there's a need here.
+
+__all__ = ['AmbiguityError', 'CheckboxControl', 'Control',
+ 'ControlNotFoundError', 'FileControl', 'FormParser', 'HTMLForm',
+ 'HiddenControl', 'IgnoreControl', 'ImageControl', 'IsindexControl',
+ 'Item', 'ItemCountError', 'ItemNotFoundError', 'Label',
+ 'ListControl', 'LocateError', 'Missing', 'ParseError', 'ParseFile',
+ 'ParseFileEx', 'ParseResponse', 'ParseResponseEx','PasswordControl',
+ 'RadioControl', 'ScalarControl', 'SelectControl',
+ 'SubmitButtonControl', 'SubmitControl', 'TextControl',
+ 'TextareaControl', 'XHTMLCompatibleFormParser']
+
+try: True
+except NameError:
+ True = 1
+ False = 0
+
+try: bool
+except NameError:
+ def bool(expr):
+ if expr: return True
+ else: return False
+
+try:
+ import logging
+ import inspect
+except ImportError:
+ def debug(msg, *args, **kwds):
+ pass
+else:
+ _logger = logging.getLogger("ClientForm")
+ OPTIMIZATION_HACK = True
+
+ def debug(msg, *args, **kwds):
+ if OPTIMIZATION_HACK:
+ return
+
+ caller_name = inspect.stack()[1][3]
+ extended_msg = '%%s %s' % msg
+ extended_args = (caller_name,)+args
+ debug = _logger.debug(extended_msg, *extended_args, **kwds)
+
+ def _show_debug_messages():
+ global OPTIMIZATION_HACK
+ OPTIMIZATION_HACK = False
+ _logger.setLevel(logging.DEBUG)
+ handler = logging.StreamHandler(sys.stdout)
+ handler.setLevel(logging.DEBUG)
+ _logger.addHandler(handler)
+
+import sys, urllib, urllib2, types, mimetools, copy, urlparse, \
+ htmlentitydefs, re, random
+from cStringIO import StringIO
+
+import sgmllib
+# monkeypatch to fix http://www.python.org/sf/803422 :-(
+sgmllib.charref = re.compile("&#(x?[0-9a-fA-F]+)[^0-9a-fA-F]")
+
+# HTMLParser.HTMLParser is recent, so live without it if it's not available
+# (also, sgmllib.SGMLParser is much more tolerant of bad HTML)
+try:
+ import HTMLParser
+except ImportError:
+ HAVE_MODULE_HTMLPARSER = False
+else:
+ HAVE_MODULE_HTMLPARSER = True
+
+try:
+ import warnings
+except ImportError:
+ def deprecation(message, stack_offset=0):
+ pass
+else:
+ def deprecation(message, stack_offset=0):
+ warnings.warn(message, DeprecationWarning, stacklevel=3+stack_offset)
+
+VERSION = "0.2.11"
+
+CHUNK = 1024 # size of chunks fed to parser, in bytes
+
+DEFAULT_ENCODING = "latin-1"
+
+class Missing: pass
+
+_compress_re = re.compile(r"\s+")
+def compress_text(text): return _compress_re.sub(" ", text.strip())
+
+def normalize_line_endings(text):
+ return re.sub(r"(?:(?<!\r)\n)|(?:\r(?!\n))", "\r\n", text)
+
+
+# This version of urlencode is from my Python 1.5.2 back-port of the
+# Python 2.1 CVS maintenance branch of urllib. It will accept a sequence
+# of pairs instead of a mapping -- the 2.0 version only accepts a mapping.
+def urlencode(query,doseq=False,):
+ """Encode a sequence of two-element tuples or dictionary into a URL query \
+string.
+
+ If any values in the query arg are sequences and doseq is true, each
+ sequence element is converted to a separate parameter.
+
+ If the query arg is a sequence of two-element tuples, the order of the
+ parameters in the output will match the order of parameters in the
+ input.
+ """
+
+ if hasattr(query,"items"):
+ # mapping objects
+ query = query.items()
+ else:
+ # it's a bother at times that strings and string-like objects are
+ # sequences...
+ try:
+ # non-sequence items should not work with len()
+ x = len(query)
+ # non-empty strings will fail this
+ if len(query) and type(query[0]) != types.TupleType:
+ raise TypeError()
+ # zero-length sequences of all types will get here and succeed,
+ # but that's a minor nit - since the original implementation
+ # allowed empty dicts that type of behavior probably should be
+ # preserved for consistency
+ except TypeError:
+ ty,va,tb = sys.exc_info()
+ raise TypeError("not a valid non-string sequence or mapping "
+ "object", tb)
+
+ l = []
+ if not doseq:
+ # preserve old behavior
+ for k, v in query:
+ k = urllib.quote_plus(str(k))
+ v = urllib.quote_plus(str(v))
+ l.append(k + '=' + v)
+ else:
+ for k, v in query:
+ k = urllib.quote_plus(str(k))
+ if type(v) == types.StringType:
+ v = urllib.quote_plus(v)
+ l.append(k + '=' + v)
+ elif type(v) == types.UnicodeType:
+ # is there a reasonable way to convert to ASCII?
+ # encode generates a string, but "replace" or "ignore"
+ # lose information and "strict" can raise UnicodeError
+ v = urllib.quote_plus(v.encode("ASCII","replace"))
+ l.append(k + '=' + v)
+ else:
+ try:
+ # is this a sufficient test for sequence-ness?
+ x = len(v)
+ except TypeError:
+ # not a sequence
+ v = urllib.quote_plus(str(v))
+ l.append(k + '=' + v)
+ else:
+ # loop over the sequence
+ for elt in v:
+ l.append(k + '=' + urllib.quote_plus(str(elt)))
+ return '&'.join(l)
+
+def unescape(data, entities, encoding=DEFAULT_ENCODING):
+ if data is None or "&" not in data:
+ return data
+
+ def replace_entities(match, entities=entities, encoding=encoding):
+ ent = match.group()
+ if ent[1] == "#":
+ return unescape_charref(ent[2:-1], encoding)
+
+ repl = entities.get(ent)
+ if repl is not None:
+ if type(repl) != type(""):
+ try:
+ repl = repl.encode(encoding)
+ except UnicodeError:
+ repl = ent
+ else:
+ repl = ent
+
+ return repl
+
+ return re.sub(r"&#?[A-Za-z0-9]+?;", replace_entities, data)
+
+def unescape_charref(data, encoding):
+ name, base = data, 10
+ if name.startswith("x"):
+ name, base= name[1:], 16
+ uc = unichr(int(name, base))
+ if encoding is None:
+ return uc
+ else:
+ try:
+ repl = uc.encode(encoding)
+ except UnicodeError:
+ repl = "&#%s;" % data
+ return repl
+
+def get_entitydefs():
+ import htmlentitydefs
+ from codecs import latin_1_decode
+ entitydefs = {}
+ try:
+ htmlentitydefs.name2codepoint
+ except AttributeError:
+ entitydefs = {}
+ for name, char in htmlentitydefs.entitydefs.items():
+ uc = latin_1_decode(char)[0]
+ if uc.startswith("&#") and uc.endswith(";"):
+ uc = unescape_charref(uc[2:-1], None)
+ entitydefs["&%s;" % name] = uc
+ else:
+ for name, codepoint in htmlentitydefs.name2codepoint.items():
+ entitydefs["&%s;" % name] = unichr(codepoint)
+ return entitydefs
+
+
+def issequence(x):
+ try:
+ x[0]
+ except (TypeError, KeyError):
+ return False
+ except IndexError:
+ pass
+ return True
+
+def isstringlike(x):
+ try: x+""
+ except: return False
+ else: return True
+
+
+def choose_boundary():
+ """Return a string usable as a multipart boundary."""
+ # follow IE and firefox
+ nonce = "".join([str(random.randint(0, sys.maxint-1)) for i in 0,1,2])
+ return "-"*27 + nonce
+
+# This cut-n-pasted MimeWriter from standard library is here so can add
+# to HTTP headers rather than message body when appropriate. It also uses
+# \r\n in place of \n. This is a bit nasty.
+class MimeWriter:
+
+ """Generic MIME writer.
+
+ Methods:
+
+ __init__()
+ addheader()
+ flushheaders()
+ startbody()
+ startmultipartbody()
+ nextpart()
+ lastpart()
+
+ A MIME writer is much more primitive than a MIME parser. It
+ doesn't seek around on the output file, and it doesn't use large
+ amounts of buffer space, so you have to write the parts in the
+ order they should occur on the output file. It does buffer the
+ headers you add, allowing you to rearrange their order.
+
+ General usage is:
+
+ f = <open the output file>
+ w = MimeWriter(f)
+ ...call w.addheader(key, value) 0 or more times...
+
+ followed by either:
+
+ f = w.startbody(content_type)
+ ...call f.write(data) for body data...
+
+ or:
+
+ w.startmultipartbody(subtype)
+ for each part:
+ subwriter = w.nextpart()
+ ...use the subwriter's methods to create the subpart...
+ w.lastpart()
+
+ The subwriter is another MimeWriter instance, and should be
+ treated in the same way as the toplevel MimeWriter. This way,
+ writing recursive body parts is easy.
+
+ Warning: don't forget to call lastpart()!
+
+ XXX There should be more state so calls made in the wrong order
+ are detected.
+
+ Some special cases:
+
+ - startbody() just returns the file passed to the constructor;
+ but don't use this knowledge, as it may be changed.
+
+ - startmultipartbody() actually returns a file as well;
+ this can be used to write the initial 'if you can read this your
+ mailer is not MIME-aware' message.
+
+ - If you call flushheaders(), the headers accumulated so far are
+ written out (and forgotten); this is useful if you don't need a
+ body part at all, e.g. for a subpart of type message/rfc822
+ that's (mis)used to store some header-like information.
+
+ - Passing a keyword argument 'prefix=<flag>' to addheader(),
+ start*body() affects where the header is inserted; 0 means
+ append at the end, 1 means insert at the start; default is
+ append for addheader(), but insert for start*body(), which use
+ it to determine where the Content-type header goes.
+
+ """
+
+ def __init__(self, fp, http_hdrs=None):
+ self._http_hdrs = http_hdrs
+ self._fp = fp
+ self._headers = []
+ self._boundary = []
+ self._first_part = True
+
+ def addheader(self, key, value, prefix=0,
+ add_to_http_hdrs=0):
+ """
+ prefix is ignored if add_to_http_hdrs is true.
+ """
+ lines = value.split("\r\n")
+ while lines and not lines[-1]: del lines[-1]
+ while lines and not lines[0]: del lines[0]
+ if add_to_http_hdrs:
+ value = "".join(lines)
+ # 2.2 urllib2 doesn't normalize header case
+ self._http_hdrs.append((key.capitalize(), value))
+ else:
+ for i in range(1, len(lines)):
+ lines[i] = " " + lines[i].strip()
+ value = "\r\n".join(lines) + "\r\n"
+ line = key.title() + ": " + value
+ if prefix:
+ self._headers.insert(0, line)
+ else:
+ self._headers.append(line)
+
+ def flushheaders(self):
+ self._fp.writelines(self._headers)
+ self._headers = []
+
+ def startbody(self, ctype=None, plist=[], prefix=1,
+ add_to_http_hdrs=0, content_type=1):
+ """
+ prefix is ignored if add_to_http_hdrs is true.
+ """
+ if content_type and ctype:
+ for name, value in plist:
+ ctype = ctype + ';\r\n %s=%s' % (name, value)
+ self.addheader("Content-Type", ctype, prefix=prefix,
+ add_to_http_hdrs=add_to_http_hdrs)
+ self.flushheaders()
+ if not add_to_http_hdrs: self._fp.write("\r\n")
+ self._first_part = True
+ return self._fp
+
+ def startmultipartbody(self, subtype, boundary=None, plist=[], prefix=1,
+ add_to_http_hdrs=0, content_type=1):
+ boundary = boundary or choose_boundary()
+ self._boundary.append(boundary)
+ return self.startbody("multipart/" + subtype,
+ [("boundary", boundary)] + plist,
+ prefix=prefix,
+ add_to_http_hdrs=add_to_http_hdrs,
+ content_type=content_type)
+
+ def nextpart(self):
+ boundary = self._boundary[-1]
+ if self._first_part:
+ self._first_part = False
+ else:
+ self._fp.write("\r\n")
+ self._fp.write("--" + boundary + "\r\n")
+ return self.__class__(self._fp)
+
+ def lastpart(self):
+ if self._first_part:
+ self.nextpart()
+ boundary = self._boundary.pop()
+ self._fp.write("\r\n--" + boundary + "--\r\n")
+
+
+class LocateError(ValueError): pass
+class AmbiguityError(LocateError): pass
+class ControlNotFoundError(LocateError): pass
+class ItemNotFoundError(LocateError): pass
+
+class ItemCountError(ValueError): pass
+
+# for backwards compatibility, ParseError derives from exceptions that were
+# raised by versions of ClientForm <= 0.2.5
+if HAVE_MODULE_HTMLPARSER:
+ SGMLLIB_PARSEERROR = sgmllib.SGMLParseError
+ class ParseError(sgmllib.SGMLParseError,
+ HTMLParser.HTMLParseError,
+ ):
+ pass
+else:
+ if hasattr(sgmllib, "SGMLParseError"):
+ SGMLLIB_PARSEERROR = sgmllib.SGMLParseError
+ class ParseError(sgmllib.SGMLParseError):
+ pass
+ else:
+ SGMLLIB_PARSEERROR = RuntimeError
+ class ParseError(RuntimeError):
+ pass
+
+
+class _AbstractFormParser:
+ """forms attribute contains HTMLForm instances on completion."""
+ # thanks to Moshe Zadka for an example of sgmllib/htmllib usage
+ def __init__(self, entitydefs=None, encoding=DEFAULT_ENCODING):
+ if entitydefs is None:
+ entitydefs = get_entitydefs()
+ self._entitydefs = entitydefs
+ self._encoding = encoding
+
+ self.base = None
+ self.forms = []
+ self.labels = []
+ self._current_label = None
+ self._current_form = None
+ self._select = None
+ self._optgroup = None
+ self._option = None
+ self._textarea = None
+
+ # forms[0] will contain all controls that are outside of any form
+ # self._global_form is an alias for self.forms[0]
+ self._global_form = None
+ self.start_form([])
+ self.end_form()
+ self._current_form = self._global_form = self.forms[0]
+
+ def do_base(self, attrs):
+ debug("%s", attrs)
+ for key, value in attrs:
+ if key == "href":
+ self.base = self.unescape_attr_if_required(value)
+
+ def end_body(self):
+ debug("")
+ if self._current_label is not None:
+ self.end_label()
+ if self._current_form is not self._global_form:
+ self.end_form()
+
+ def start_form(self, attrs):
+ debug("%s", attrs)
+ if self._current_form is not self._global_form:
+ raise ParseError("nested FORMs")
+ name = None
+ action = None
+ enctype = "application/x-www-form-urlencoded"
+ method = "GET"
+ d = {}
+ for key, value in attrs:
+ if key == "name":
+ name = self.unescape_attr_if_required(value)
+ elif key == "action":
+ action = self.unescape_attr_if_required(value)
+ elif key == "method":
+ method = self.unescape_attr_if_required(value.upper())
+ elif key == "enctype":
+ enctype = self.unescape_attr_if_required(value.lower())
+ d[key] = self.unescape_attr_if_required(value)
+ controls = []
+ self._current_form = (name, action, method, enctype), d, controls
+
+ def end_form(self):
+ debug("")
+ if self._current_label is not None:
+ self.end_label()
+ if self._current_form is self._global_form:
+ raise ParseError("end of FORM before start")
+ self.forms.append(self._current_form)
+ self._current_form = self._global_form
+
+ def start_select(self, attrs):
+ debug("%s", attrs)
+ if self._select is not None:
+ raise ParseError("nested SELECTs")
+ if self._textarea is not None:
+ raise ParseError("SELECT inside TEXTAREA")
+ d = {}
+ for key, val in attrs:
+ d[key] = self.unescape_attr_if_required(val)
+
+ self._select = d
+ self._add_label(d)
+
+ self._append_select_control({"__select": d})
+
+ def end_select(self):
+ debug("")
+ if self._select is None:
+ raise ParseError("end of SELECT before start")
+
+ if self._option is not None:
+ self._end_option()
+
+ self._select = None
+
+ def start_optgroup(self, attrs):
+ debug("%s", attrs)
+ if self._select is None:
+ raise ParseError("OPTGROUP outside of SELECT")
+ d = {}
+ for key, val in attrs:
+ d[key] = self.unescape_attr_if_required(val)
+
+ self._optgroup = d
+
+ def end_optgroup(self):
+ debug("")
+ if self._optgroup is None:
+ raise ParseError("end of OPTGROUP before start")
+ self._optgroup = None
+
+ def _start_option(self, attrs):
+ debug("%s", attrs)
+ if self._select is None:
+ raise ParseError("OPTION outside of SELECT")
+ if self._option is not None:
+ self._end_option()
+
+ d = {}
+ for key, val in attrs:
+ d[key] = self.unescape_attr_if_required(val)
+
+ self._option = {}
+ self._option.update(d)
+ if (self._optgroup and self._optgroup.has_key("disabled") and
+ not self._option.has_key("disabled")):
+ self._option["disabled"] = None
+
+ def _end_option(self):
+ debug("")
+ if self._option is None:
+ raise ParseError("end of OPTION before start")
+
+ contents = self._option.get("contents", "").strip()
+ self._option["contents"] = contents
+ if not self._option.has_key("value"):
+ self._option["value"] = contents
+ if not self._option.has_key("label"):
+ self._option["label"] = contents
+ # stuff dict of SELECT HTML attrs into a special private key
+ # (gets deleted again later)
+ self._option["__select"] = self._select
+ self._append_select_control(self._option)
+ self._option = None
+
+ def _append_select_control(self, attrs):
+ debug("%s", attrs)
+ controls = self._current_form[2]
+ name = self._select.get("name")
+ controls.append(("select", name, attrs))
+
+ def start_textarea(self, attrs):
+ debug("%s", attrs)
+ if self._textarea is not None:
+ raise ParseError("nested TEXTAREAs")
+ if self._select is not None:
+ raise ParseError("TEXTAREA inside SELECT")
+ d = {}
+ for key, val in attrs:
+ d[key] = self.unescape_attr_if_required(val)
+ self._add_label(d)
+
+ self._textarea = d
+
+ def end_textarea(self):
+ debug("")
+ if self._textarea is None:
+ raise ParseError("end of TEXTAREA before start")
+ controls = self._current_form[2]
+ name = self._textarea.get("name")
+ controls.append(("textarea", name, self._textarea))
+ self._textarea = None
+
+ def start_label(self, attrs):
+ debug("%s", attrs)
+ if self._current_label:
+ self.end_label()
+ d = {}
+ for key, val in attrs:
+ d[key] = self.unescape_attr_if_required(val)
+ taken = bool(d.get("for")) # empty id is invalid
+ d["__text"] = ""
+ d["__taken"] = taken
+ if taken:
+ self.labels.append(d)
+ self._current_label = d
+
+ def end_label(self):
+ debug("")
+ label = self._current_label
+ if label is None:
+ # something is ugly in the HTML, but we're ignoring it
+ return
+ self._current_label = None
+ # if it is staying around, it is True in all cases
+ del label["__taken"]
+
+ def _add_label(self, d):
+ #debug("%s", d)
+ if self._current_label is not None:
+ if not self._current_label["__taken"]:
+ self._current_label["__taken"] = True
+ d["__label"] = self._current_label
+
+ def handle_data(self, data):
+ debug("%s", data)
+
+ if self._option is not None:
+ # self._option is a dictionary of the OPTION element's HTML
+ # attributes, but it has two special keys, one of which is the
+ # special "contents" key contains text between OPTION tags (the
+ # other is the "__select" key: see the end_option method)
+ map = self._option
+ key = "contents"
+ elif self._textarea is not None:
+ map = self._textarea
+ key = "value"
+ data = normalize_line_endings(data)
+ # not if within option or textarea
+ elif self._current_label is not None:
+ map = self._current_label
+ key = "__text"
+ else:
+ return
+
+ if data and not map.has_key(key):
+ # according to
+ # http://www.w3.org/TR/html4/appendix/notes.html#h-B.3.1 line break
+ # immediately after start tags or immediately before end tags must
+ # be ignored, but real browsers only ignore a line break after a
+ # start tag, so we'll do that.
+ if data[0:2] == "\r\n":
+ data = data[2:]
+ elif data[0:1] in ["\n", "\r"]:
+ data = data[1:]
+ map[key] = data
+ else:
+ map[key] = map[key] + data
+
+ def do_button(self, attrs):
+ debug("%s", attrs)
+ d = {}
+ d["type"] = "submit" # default
+ for key, val in attrs:
+ d[key] = self.unescape_attr_if_required(val)
+ controls = self._current_form[2]
+
+ type = d["type"]
+ name = d.get("name")
+ # we don't want to lose information, so use a type string that
+ # doesn't clash with INPUT TYPE={SUBMIT,RESET,BUTTON}
+ # e.g. type for BUTTON/RESET is "resetbutton"
+ # (type for INPUT/RESET is "reset")
+ type = type+"button"
+ self._add_label(d)
+ controls.append((type, name, d))
+
+ def do_input(self, attrs):
+ debug("%s", attrs)
+ d = {}
+ d["type"] = "text" # default
+ for key, val in attrs:
+ d[key] = self.unescape_attr_if_required(val)
+ controls = self._current_form[2]
+
+ type = d["type"]
+ name = d.get("name")
+ self._add_label(d)
+ controls.append((type, name, d))
+
+ def do_isindex(self, attrs):
+ debug("%s", attrs)
+ d = {}
+ for key, val in attrs:
+ d[key] = self.unescape_attr_if_required(val)
+ controls = self._current_form[2]
+
+ self._add_label(d)
+ # isindex doesn't have type or name HTML attributes
+ controls.append(("isindex", None, d))
+
+ def handle_entityref(self, name):
+ #debug("%s", name)
+ self.handle_data(unescape(
+ '&%s;' % name, self._entitydefs, self._encoding))
+
+ def handle_charref(self, name):
+ #debug("%s", name)
+ self.handle_data(unescape_charref(name, self._encoding))
+
+ def unescape_attr(self, name):
+ #debug("%s", name)
+ return unescape(name, self._entitydefs, self._encoding)
+
+ def unescape_attrs(self, attrs):
+ #debug("%s", attrs)
+ escaped_attrs = {}
+ for key, val in attrs.items():
+ try:
+ val.items
+ except AttributeError:
+ escaped_attrs[key] = self.unescape_attr(val)
+ else:
+ # e.g. "__select" -- yuck!
+ escaped_attrs[key] = self.unescape_attrs(val)
+ return escaped_attrs
+
+ def unknown_entityref(self, ref): self.handle_data("&%s;" % ref)
+ def unknown_charref(self, ref): self.handle_data("&#%s;" % ref)
+
+
+if not HAVE_MODULE_HTMLPARSER:
+ class XHTMLCompatibleFormParser:
+ def __init__(self, entitydefs=None, encoding=DEFAULT_ENCODING):
+ raise ValueError("HTMLParser could not be imported")
+else:
+ class XHTMLCompatibleFormParser(_AbstractFormParser, HTMLParser.HTMLParser):
+ """Good for XHTML, bad for tolerance of incorrect HTML."""
+ # thanks to Michael Howitz for this!
+ def __init__(self, entitydefs=None, encoding=DEFAULT_ENCODING):
+ HTMLParser.HTMLParser.__init__(self)
+ _AbstractFormParser.__init__(self, entitydefs, encoding)
+
+ def feed(self, data):
+ try:
+ HTMLParser.HTMLParser.feed(self, data)
+ except HTMLParser.HTMLParseError, exc:
+ raise ParseError(exc)
+
+ def start_option(self, attrs):
+ _AbstractFormParser._start_option(self, attrs)
+
+ def end_option(self):
+ _AbstractFormParser._end_option(self)
+
+ def handle_starttag(self, tag, attrs):
+ try:
+ method = getattr(self, "start_" + tag)
+ except AttributeError:
+ try:
+ method = getattr(self, "do_" + tag)
+ except AttributeError:
+ pass # unknown tag
+ else:
+ method(attrs)
+ else:
+ method(attrs)
+
+ def handle_endtag(self, tag):
+ try:
+ method = getattr(self, "end_" + tag)
+ except AttributeError:
+ pass # unknown tag
+ else:
+ method()
+
+ def unescape(self, name):
+ # Use the entitydefs passed into constructor, not
+ # HTMLParser.HTMLParser's entitydefs.
+ return self.unescape_attr(name)
+
+ def unescape_attr_if_required(self, name):
+ return name # HTMLParser.HTMLParser already did it
+ def unescape_attrs_if_required(self, attrs):
+ return attrs # ditto
+
+ def close(self):
+ HTMLParser.HTMLParser.close(self)
+ self.end_body()
+
+
+class _AbstractSgmllibParser(_AbstractFormParser):
+
+ def do_option(self, attrs):
+ _AbstractFormParser._start_option(self, attrs)
+
+ if sys.version_info[:2] >= (2,5):
+ # we override this attr to decode hex charrefs
+ entity_or_charref = re.compile(
+ '&(?:([a-zA-Z][-.a-zA-Z0-9]*)|#(x?[0-9a-fA-F]+))(;?)')
+ def convert_entityref(self, name):
+ return unescape("&%s;" % name, self._entitydefs, self._encoding)
+ def convert_charref(self, name):
+ return unescape_charref("%s" % name, self._encoding)
+ def unescape_attr_if_required(self, name):
+ return name # sgmllib already did it
+ def unescape_attrs_if_required(self, attrs):
+ return attrs # ditto
+ else:
+ def unescape_attr_if_required(self, name):
+ return self.unescape_attr(name)
+ def unescape_attrs_if_required(self, attrs):
+ return self.unescape_attrs(attrs)
+
+
+class FormParser(_AbstractSgmllibParser, sgmllib.SGMLParser):
+ """Good for tolerance of incorrect HTML, bad for XHTML."""
+ def __init__(self, entitydefs=None, encoding=DEFAULT_ENCODING):
+ sgmllib.SGMLParser.__init__(self)
+ _AbstractFormParser.__init__(self, entitydefs, encoding)
+
+ def feed(self, data):
+ try:
+ sgmllib.SGMLParser.feed(self, data)
+ except SGMLLIB_PARSEERROR, exc:
+ raise ParseError(exc)
+
+ def close(self):
+ sgmllib.SGMLParser.close(self)
+ self.end_body()
+
+
+# sigh, must support mechanize by allowing dynamic creation of classes based on
+# its bundled copy of BeautifulSoup (which was necessary because of dependency
+# problems)
+
+def _create_bs_classes(bs,
+ icbinbs,
+ ):
+ class _AbstractBSFormParser(_AbstractSgmllibParser):
+ bs_base_class = None
+ def __init__(self, entitydefs=None, encoding=DEFAULT_ENCODING):
+ _AbstractFormParser.__init__(self, entitydefs, encoding)
+ self.bs_base_class.__init__(self)
+ def handle_data(self, data):
+ _AbstractFormParser.handle_data(self, data)
+ self.bs_base_class.handle_data(self, data)
+ def feed(self, data):
+ try:
+ self.bs_base_class.feed(self, data)
+ except SGMLLIB_PARSEERROR, exc:
+ raise ParseError(exc)
+ def close(self):
+ self.bs_base_class.close(self)
+ self.end_body()
+
+ class RobustFormParser(_AbstractBSFormParser, bs):
+ """Tries to be highly tolerant of incorrect HTML."""
+ pass
+ RobustFormParser.bs_base_class = bs
+ class NestingRobustFormParser(_AbstractBSFormParser, icbinbs):
+ """Tries to be highly tolerant of incorrect HTML.
+
+ Different from RobustFormParser in that it more often guesses nesting
+ above missing end tags (see BeautifulSoup docs).
+
+ """
+ pass
+ NestingRobustFormParser.bs_base_class = icbinbs
+
+ return RobustFormParser, NestingRobustFormParser
+
+try:
+ if sys.version_info[:2] < (2, 2):
+ raise ImportError # BeautifulSoup uses generators
+ import BeautifulSoup
+except ImportError:
+ pass
+else:
+ RobustFormParser, NestingRobustFormParser = _create_bs_classes(
+ BeautifulSoup.BeautifulSoup, BeautifulSoup.ICantBelieveItsBeautifulSoup
+ )
+ __all__ += ['RobustFormParser', 'NestingRobustFormParser']
+
+
+#FormParser = XHTMLCompatibleFormParser # testing hack
+#FormParser = RobustFormParser # testing hack
+
+
+def ParseResponseEx(response,
+ select_default=False,
+ form_parser_class=FormParser,
+ request_class=urllib2.Request,
+ entitydefs=None,
+ encoding=DEFAULT_ENCODING,
+
+ # private
+ _urljoin=urlparse.urljoin,
+ _urlparse=urlparse.urlparse,
+ _urlunparse=urlparse.urlunparse,
+ ):
+ """Identical to ParseResponse, except that:
+
+ 1. The returned list contains an extra item. The first form in the list
+ contains all controls not contained in any FORM element.
+
+ 2. The arguments ignore_errors and backwards_compat have been removed.
+
+ 3. Backwards-compatibility mode (backwards_compat=True) is not available.
+ """
+ return _ParseFileEx(response, response.geturl(),
+ select_default,
+ False,
+ form_parser_class,
+ request_class,
+ entitydefs,
+ False,
+ encoding,
+ _urljoin=_urljoin,
+ _urlparse=_urlparse,
+ _urlunparse=_urlunparse,
+ )
+
+def ParseFileEx(file, base_uri,
+ select_default=False,
+ form_parser_class=FormParser,
+ request_class=urllib2.Request,
+ entitydefs=None,
+ encoding=DEFAULT_ENCODING,
+
+ # private
+ _urljoin=urlparse.urljoin,
+ _urlparse=urlparse.urlparse,
+ _urlunparse=urlparse.urlunparse,
+ ):
+ """Identical to ParseFile, except that:
+
+ 1. The returned list contains an extra item. The first form in the list
+ contains all controls not contained in any FORM element.
+
+ 2. The arguments ignore_errors and backwards_compat have been removed.
+
+ 3. Backwards-compatibility mode (backwards_compat=True) is not available.
+ """
+ return _ParseFileEx(file, base_uri,
+ select_default,
+ False,
+ form_parser_class,
+ request_class,
+ entitydefs,
+ False,
+ encoding,
+ _urljoin=_urljoin,
+ _urlparse=_urlparse,
+ _urlunparse=_urlunparse,
+ )
+
+def ParseString(text, base_uri, *args, **kwds):
+ fh = StringIO(text)
+ return ParseFileEx(fh, base_uri, *args, **kwds)
+
+def ParseResponse(response, *args, **kwds):
+ """Parse HTTP response and return a list of HTMLForm instances.
+
+ The return value of urllib2.urlopen can be conveniently passed to this
+ function as the response parameter.
+
+ ClientForm.ParseError is raised on parse errors.
+
+ response: file-like object (supporting read() method) with a method
+ geturl(), returning the URI of the HTTP response
+ select_default: for multiple-selection SELECT controls and RADIO controls,
+ pick the first item as the default if none are selected in the HTML
+ form_parser_class: class to instantiate and use to parse
+ request_class: class to return from .click() method (default is
+ urllib2.Request)
+ entitydefs: mapping like {"&amp;": "&", ...} containing HTML entity
+ definitions (a sensible default is used)
+ encoding: character encoding used for encoding numeric character references
+ when matching link text. ClientForm does not attempt to find the encoding
+ in a META HTTP-EQUIV attribute in the document itself (mechanize, for
+ example, does do that and will pass the correct value to ClientForm using
+ this parameter).
+
+ backwards_compat: boolean that determines whether the returned HTMLForm
+ objects are backwards-compatible with old code. If backwards_compat is
+ true:
+
+ - ClientForm 0.1 code will continue to work as before.
+
+ - Label searches that do not specify a nr (number or count) will always
+ get the first match, even if other controls match. If
+ backwards_compat is False, label searches that have ambiguous results
+ will raise an AmbiguityError.
+
+ - Item label matching is done by strict string comparison rather than
+ substring matching.
+
+ - De-selecting individual list items is allowed even if the Item is
+ disabled.
+
+ The backwards_compat argument will be deprecated in a future release.
+
+ Pass a true value for select_default if you want the behaviour specified by
+ RFC 1866 (the HTML 2.0 standard), which is to select the first item in a
+ RADIO or multiple-selection SELECT control if none were selected in the
+ HTML. Most browsers (including Microsoft Internet Explorer (IE) and
+ Netscape Navigator) instead leave all items unselected in these cases. The
+ W3C HTML 4.0 standard leaves this behaviour undefined in the case of
+ multiple-selection SELECT controls, but insists that at least one RADIO
+ button should be checked at all times, in contradiction to browser
+ behaviour.
+
+ There is a choice of parsers. ClientForm.XHTMLCompatibleFormParser (uses
+ HTMLParser.HTMLParser) works best for XHTML, ClientForm.FormParser (uses
+ sgmllib.SGMLParser) (the default) works better for ordinary grubby HTML.
+ Note that HTMLParser is only available in Python 2.2 and later. You can
+ pass your own class in here as a hack to work around bad HTML, but at your
+ own risk: there is no well-defined interface.
+
+ """
+ return _ParseFileEx(response, response.geturl(), *args, **kwds)[1:]
+
+def ParseFile(file, base_uri, *args, **kwds):
+ """Parse HTML and return a list of HTMLForm instances.
+
+ ClientForm.ParseError is raised on parse errors.
+
+ file: file-like object (supporting read() method) containing HTML with zero
+ or more forms to be parsed
+ base_uri: the URI of the document (note that the base URI used to submit
+ the form will be that given in the BASE element if present, not that of
+ the document)
+
+ For the other arguments and further details, see ParseResponse.__doc__.
+
+ """
+ return _ParseFileEx(file, base_uri, *args, **kwds)[1:]
+
+def _ParseFileEx(file, base_uri,
+ select_default=False,
+ ignore_errors=False,
+ form_parser_class=FormParser,
+ request_class=urllib2.Request,
+ entitydefs=None,
+ backwards_compat=True,
+ encoding=DEFAULT_ENCODING,
+ _urljoin=urlparse.urljoin,
+ _urlparse=urlparse.urlparse,
+ _urlunparse=urlparse.urlunparse,
+ ):
+ if backwards_compat:
+ deprecation("operating in backwards-compatibility mode", 1)
+ fp = form_parser_class(entitydefs, encoding)
+ while 1:
+ data = file.read(CHUNK)
+ try:
+ fp.feed(data)
+ except ParseError, e:
+ e.base_uri = base_uri
+ raise
+ if len(data) != CHUNK: break
+ fp.close()
+ if fp.base is not None:
+ # HTML BASE element takes precedence over document URI
+ base_uri = fp.base
+ labels = [] # Label(label) for label in fp.labels]
+ id_to_labels = {}
+ for l in fp.labels:
+ label = Label(l)
+ labels.append(label)
+ for_id = l["for"]
+ coll = id_to_labels.get(for_id)
+ if coll is None:
+ id_to_labels[for_id] = [label]
+ else:
+ coll.append(label)
+ forms = []
+ for (name, action, method, enctype), attrs, controls in fp.forms:
+ if action is None:
+ action = base_uri
+ else:
+ action = _urljoin(base_uri, action)
+ # would be nice to make HTMLForm class (form builder) pluggable
+ form = HTMLForm(
+ action, method, enctype, name, attrs, request_class,
+ forms, labels, id_to_labels, backwards_compat)
+ form._urlparse = _urlparse
+ form._urlunparse = _urlunparse
+ for ii in range(len(controls)):
+ type, name, attrs = controls[ii]
+ # index=ii*10 allows ImageControl to return multiple ordered pairs
+ form.new_control(
+ type, name, attrs, select_default=select_default, index=ii*10)
+ forms.append(form)
+ for form in forms:
+ form.fixup()
+ return forms
+
+
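Reviewer note on the read loop in _ParseFileEx: the parser is fed fixed-size chunks, and a read shorter than the chunk size is taken as end of input. A minimal standalone sketch of that loop follows, written for Python 3 rather than the Python 2 of the commit. The names feed_in_chunks and CHUNK_SIZE are illustrative assumptions, not ClientForm API, and the chunk size is chosen arbitrarily.

```python
import io

CHUNK_SIZE = 1024  # assumed value for illustration only


def feed_in_chunks(fileobj, feed, chunk_size=CHUNK_SIZE):
    """Read fileobj in fixed-size chunks, passing each chunk to feed().

    Returns the total number of characters read.
    """
    total = 0
    while True:
        data = fileobj.read(chunk_size)
        feed(data)
        total += len(data)
        # A short (possibly empty) read is treated as end of input.
        if len(data) != chunk_size:
            break
    return total


pieces = []
total = feed_in_chunks(io.StringIO("x" * 2500), pieces.append)
```

One consequence of this stopping rule is that input whose length is an exact multiple of the chunk size causes one final feed() call with an empty string, which a parser's feed method would normally need to tolerate.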
+class Label:
+ def __init__(self, attrs):
+ self.id = attrs.get("for")
+ self._text = attrs.get("__text").strip()
+ self._ctext = compress_text(self._text)
+ self.attrs = attrs
+ self._backwards_compat = False # maintained by HTMLForm
+
+ def __getattr__(self, name):
+ if name == "text":
+ if self._backwards_compat:
+ return self._text
+ else:
+ return self._ctext
+ return getattr(Label, name)
+
+ def __setattr__(self, name, value):
+ if name == "text":
+ # don't see any need for this, so make it read-only
+ raise AttributeError("text attribute is read-only")
+ self.__dict__[name] = value
+
+ def __str__(self):
+ return "<Label(id=%r, text=%r)>" % (self.id, self.text)
+
+
+def _get_label(attrs):
+ text = attrs.get("__label")
+ if text is not None:
+ return Label(text)
+ else:
+ return None
+
+class Control:
+ """An HTML form control.
+
+ An HTMLForm contains a sequence of Controls. The Controls in an HTMLForm
+ are accessed using the HTMLForm.find_control method or the
+ HTMLForm.controls attribute.
+
+ Control instances are usually constructed using the ParseFile /
+ ParseResponse functions. If you use those functions, you can ignore the
+ rest of this paragraph. A Control is only properly initialised after the
+ fixup method has been called. In fact, this is only strictly necessary for
+ ListControl instances. This is necessary because ListControls are built up
+ from ListControls each containing only a single item, and their initial
+ value(s) can only be known after the sequence is complete.
+
+ The types and values that are acceptable for assignment to the value
+ attribute are defined by subclasses.
+
+ If the disabled attribute is true, this represents the state typically
+ represented by browsers by 'greying out' a control. If the disabled
+ attribute is true, the Control will raise AttributeError if an attempt is
+ made to change its value. In addition, the control will not be considered
+ 'successful' as defined by the W3C HTML 4 standard -- ie. it will
+ contribute no data to the return value of the HTMLForm.click* methods. To
+ enable a control, set the disabled attribute to a false value.
+
+ If the readonly attribute is true, the Control will raise AttributeError if
+ an attempt is made to change its value. To make a control writable, set
+ the readonly attribute to a false value.
+
+ All controls have the disabled and readonly attributes, not only those that
+ may have the HTML attributes of the same names.
+
+ On assignment to the value attribute, the following exceptions are raised:
+ TypeError, AttributeError (if the value attribute should not be assigned
+ to, because the control is disabled, for example) and ValueError.
+
+ If the name or value attributes are None, or the value is an empty list, or
+ if the control is disabled, the control is not successful.
+
+ Public attributes:
+
+ type: string describing type of control (see the keys of the
+ HTMLForm.type2class dictionary for the allowable values) (readonly)
+ name: name of control (readonly)
+ value: current value of control (subclasses may allow a single value, a
+ sequence of values, or either)
+ disabled: disabled state
+ readonly: readonly state
+ id: value of id HTML attribute
+
+ """
+ def __init__(self, type, name, attrs, index=None):
+ """
+ type: string describing type of control (see the keys of the
+ HTMLForm.type2class dictionary for the allowable values)
+ name: control name
+ attrs: HTML attributes of control's HTML element
+
+ """
+ raise NotImplementedError()
+
+ def add_to_form(self, form):
+ self._form = form
+ form.controls.append(self)
+
+ def fixup(self):
+ pass
+
+ def is_of_kind(self, kind):
+ raise NotImplementedError()
+
+ def clear(self):
+ raise NotImplementedError()
+
+ def __getattr__(self, name): raise NotImplementedError()
+ def __setattr__(self, name, value): raise NotImplementedError()
+
+ def pairs(self):
+ """Return list of (key, value) pairs suitable for passing to urlencode.
+ """
+ return [(k, v) for (i, k, v) in self._totally_ordered_pairs()]
+
+ def _totally_ordered_pairs(self):
+ """Return list of (index, key, value) tuples.
+
+ Like pairs, but allows preserving correct ordering even where several
+ controls are involved.
+
+ """
+ raise NotImplementedError()
+
+ def _write_mime_data(self, mw, name, value):
+ """Write data for a subitem of this control to a MimeWriter."""
+ # called by HTMLForm
+ mw2 = mw.nextpart()
+ mw2.addheader("Content-Disposition",
+ 'form-data; name="%s"' % name, 1)
+ f = mw2.startbody(prefix=0)
+ f.write(value)
+
+ def __str__(self):
+ raise NotImplementedError()
+
+ def get_labels(self):
+ """Return all labels (Label instances) for this control.
+
+ If the control was surrounded by a <label> tag, that will be the first
+ label; all other labels, connected by 'for' and 'id', are in the order
+ that they appear in the HTML.
+
+ """
+ res = []
+ if self._label:
+ res.append(self._label)
+ if self.id:
+ res.extend(self._form._id_to_labels.get(self.id, ()))
+ return res
+
+
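Reviewer note on label lookup: get_labels relies on an index from element id to the labels whose "for" attribute names that id, built once at parse time in document order. A small standalone sketch of that idea, in Python 3, is below. The function collect_labels and the sample data are hypothetical, and plain strings stand in for Label instances.

```python
def collect_labels(wrapping_label, control_id, id_to_labels):
    """Return a control's labels: the wrapping label first, then any
    labels attached through for/id, in document order."""
    res = []
    if wrapping_label:
        res.append(wrapping_label)
    if control_id:
        res.extend(id_to_labels.get(control_id, ()))
    return res


# Build the id -> labels index in the order the labels were parsed.
parsed = [("email", "Email"), ("name", "Name"), ("email", "Contact address")]
id_to_labels = {}
for for_id, text in parsed:
    id_to_labels.setdefault(for_id, []).append(text)

labels = collect_labels("Your email", "email", id_to_labels)
```

Using a dictionary of lists keeps the per-control lookup cheap and preserves the order in which the labels appeared in the document.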
+#---------------------------------------------------
+class ScalarControl(Control):
+ """Control whose value is not restricted to one of a prescribed set.
+
+ Some ScalarControls don't accept any value attribute. Otherwise, takes a
+ single value, which must be string-like.
+
+ Additional read-only public attribute:
+
+ attrs: dictionary mapping the names of original HTML attributes of the
+ control to their values
+
+ """
+ def __init__(self, type, name, attrs, index=None):
+ self._index = index
+ self._label = _get_label(attrs)
+ self.__dict__["type"] = type.lower()
+ self.__dict__["name"] = name
+ self._value = attrs.get("value")
+ self.disabled = attrs.has_key("disabled")
+ self.readonly = attrs.has_key("readonly")
+ self.id = attrs.get("id")
+
+ self.attrs = attrs.copy()
+
+ self._clicked = False
+
+ self._urlparse = urlparse.urlparse
+ self._urlunparse = urlparse.urlunparse
+
+ def __getattr__(self, name):
+ if name == "value":
+ return self.__dict__["_value"]
+ else:
+ raise AttributeError("%s instance has no attribute '%s'" %
+ (self.__class__.__name__, name))
+
+ def __setattr__(self, name, value):
+ if name == "value":
+ if not isstringlike(value):
+ raise TypeError("must assign a string")
+ elif self.readonly:
+ raise AttributeError("control '%s' is readonly" % self.name)
+ elif self.disabled:
+ raise AttributeError("control '%s' is disabled" % self.name)
+ self.__dict__["_value"] = value
+ elif name in ("name", "type"):
+ raise AttributeError("%s attribute is readonly" % name)
+ else:
+ self.__dict__[name] = value
+
+ def _totally_ordered_pairs(self):
+ name = self.name
+ value = self.value
+ if name is None or value is None or self.disabled:
+ return []
+ return [(self._index, name, value)]
+
+ def clear(self):
+ if self.readonly:
+ raise AttributeError("control '%s' is readonly" % self.name)
+ self.__dict__["_value"] = None
+
+ def __str__(self):
+ name = self.name
+ value = self.value
+ if name is None: name = "<None>"
+ if value is None: value = "<None>"
+
+ infos = []
+ if self.disabled: infos.append("disabled")
+ if self.readonly: infos.append("readonly")
+ info = ", ".join(infos)
+ if info: info = " (%s)" % info
+
+ return "<%s(%s=%s)%s>" % (self.__class__.__name__, name, value, info)
+
+
+#---------------------------------------------------
+class TextControl(ScalarControl):
+ """Textual input control.
+
+ Covers:
+
+ INPUT/TEXT
+ INPUT/PASSWORD
+ INPUT/HIDDEN
+ TEXTAREA
+
+ """
+ def __init__(self, type, name, attrs, index=None):
+ ScalarControl.__init__(self, type, name, attrs, index)
+ if self.type == "hidden": self.readonly = True
+ if self._value is None:
+ self._value = ""
+
+ def is_of_kind(self, kind): return kind == "text"
+
+#---------------------------------------------------
+class FileControl(ScalarControl):
+ """File upload with INPUT TYPE=FILE.
+
+ The value attribute of a FileControl is always None. Use add_file instead.
+
+ Additional public method: add_file
+
+ """
+
+ def __init__(self, type, name, attrs, index=None):
+ ScalarControl.__init__(self, type, name, attrs, index)
+ self._value = None
+ self._upload_data = []
+
+ def is_of_kind(self, kind): return kind == "file"
+
+ def clear(self):
+ if self.readonly:
+ raise AttributeError("control '%s' is readonly" % self.name)
+ self._upload_data = []
+
+ def __setattr__(self, name, value):
+ if name in ("value", "name", "type"):
+ raise AttributeError("%s attribute is readonly" % name)
+ else:
+ self.__dict__[name] = value
+
+ def add_file(self, file_object, content_type=None, filename=None):
+ if not hasattr(file_object, "read"):
+ raise TypeError("file-like object must have read method")
+ if content_type is not None and not isstringlike(content_type):
+ raise TypeError("content type must be None or string-like")
+ if filename is not None and not isstringlike(filename):
+ raise TypeError("filename must be None or string-like")
+ if content_type is None:
+ content_type = "application/octet-stream"
+ self._upload_data.append((file_object, content_type, filename))
+
+ def _totally_ordered_pairs(self):
+ # XXX should it be successful even if unnamed?
+ if self.name is None or self.disabled:
+ return []
+ return [(self._index, self.name, "")]
+
+ def _write_mime_data(self, mw, _name, _value):
+ # called by HTMLForm
+ # assert _name == self.name and _value == ''
+ if len(self._upload_data) < 2:
+ if len(self._upload_data) == 0:
+ file_object = StringIO()
+ content_type = "application/octet-stream"
+ filename = ""
+ else:
+ file_object, content_type, filename = self._upload_data[0]
+ if filename is None:
+ filename = ""
+ mw2 = mw.nextpart()
+ fn_part = '; filename="%s"' % filename
+ disp = 'form-data; name="%s"%s' % (self.name, fn_part)
+ mw2.addheader("Content-Disposition", disp, prefix=1)
+ fh = mw2.startbody(content_type, prefix=0)
+ fh.write(file_object.read())
+ else:
+ # multiple files
+ mw2 = mw.nextpart()
+ disp = 'form-data; name="%s"' % self.name
+ mw2.addheader("Content-Disposition", disp, prefix=1)
+ fh = mw2.startmultipartbody("mixed", prefix=0)
+ for file_object, content_type, filename in self._upload_data:
+ mw3 = mw2.nextpart()
+ if filename is None:
+ filename = ""
+ fn_part = '; filename="%s"' % filename
+ disp = "file%s" % fn_part
+ mw3.addheader("Content-Disposition", disp, prefix=1)
+ fh2 = mw3.startbody(content_type, prefix=0)
+ fh2.write(file_object.read())
+ mw2.lastpart()
+
+ def __str__(self):
+ name = self.name
+ if name is None: name = "<None>"
+
+ if not self._upload_data:
+ value = "<No files added>"
+ else:
+ value = []
+ for file, ctype, filename in self._upload_data:
+ if filename is None:
+ value.append("<Unnamed file>")
+ else:
+ value.append(filename)
+ value = ", ".join(value)
+
+ info = []
+ if self.disabled: info.append("disabled")
+ if self.readonly: info.append("readonly")
+ info = ", ".join(info)
+ if info: info = " (%s)" % info
+
+ return "<%s(%s=%s)%s>" % (self.__class__.__name__, name, value, info)
+
+
+#---------------------------------------------------
+class IsindexControl(ScalarControl):
+ """ISINDEX control.
+
+ ISINDEX is the odd-one-out of HTML form controls. In fact, it isn't really
+ part of regular HTML forms at all, and predates them. You're only allowed
+ one ISINDEX per HTML document. ISINDEX and regular form submission are
+ mutually exclusive -- either submit a form, or the ISINDEX.
+
+ Having said this, since ISINDEX controls may appear in forms (which is
+ probably bad HTML), ParseFile / ParseResponse will include them in the
+ HTMLForm instances they return. You can set the ISINDEX's value, as with
+ any other control (but note that ISINDEX controls have no name, so you'll
+ need to use the type argument of set_value!). When you submit the form,
+ the ISINDEX will not be successful (ie., no data will get returned to the
+ server as a result of its presence), unless you click on the ISINDEX
+ control, in which case the ISINDEX gets submitted instead of the form:
+
+ form.set_value("my isindex value", type="isindex")
+ urllib2.urlopen(form.click(type="isindex"))
+
+ ISINDEX elements outside of FORMs are ignored. If you want to submit one
+ by hand, do it like so:
+
+ url = urlparse.urljoin(page_uri, "?"+urllib.quote_plus("my isindex value"))
+ result = urllib2.urlopen(url)
+
+ """
+ def __init__(self, type, name, attrs, index=None):
+ ScalarControl.__init__(self, type, name, attrs, index)
+ if self._value is None:
+ self._value = ""
+
+ def is_of_kind(self, kind): return kind in ["text", "clickable"]
+
+ def _totally_ordered_pairs(self):
+ return []
+
+ def _click(self, form, coord, return_type, request_class=urllib2.Request):
+ # Relative URL for ISINDEX submission: instead of "foo=bar+baz",
+ # want "bar+baz".
+ # This doesn't seem to be specified in HTML 4.01 spec. (ISINDEX is
+ # deprecated in 4.01, but it should still say how to submit it).
+ # Submission of ISINDEX is explained in the HTML 3.2 spec, though.
+ parts = self._urlparse(form.action)
+ rest, (query, frag) = parts[:-2], parts[-2:]
+ parts = rest + (urllib.quote_plus(self.value), None)
+ url = self._urlunparse(parts)
+ req_data = url, None, []
+
+ if return_type == "pairs":
+ return []
+ elif return_type == "request_data":
+ return req_data
+ else:
+ return request_class(url)
+
+ def __str__(self):
+ value = self.value
+ if value is None: value = "<None>"
+
+ infos = []
+ if self.disabled: infos.append("disabled")
+ if self.readonly: infos.append("readonly")
+ info = ", ".join(infos)
+ if info: info = " (%s)" % info
+
+ return "<%s(%s)%s>" % (self.__class__.__name__, value, info)
+
+
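Reviewer note on ISINDEX submission: _click replaces the query component of the form's action URL with the quoted value and drops the fragment, giving "bar+baz" rather than "foo=bar+baz". The sketch below shows the same URL surgery using Python 3's urllib.parse, whereas the commit uses the Python 2 urlparse and urllib modules. The function name isindex_url and the example URL are illustrative assumptions.

```python
from urllib.parse import quote_plus, urlparse, urlunparse


def isindex_url(action, value):
    """Return action with its query replaced by the quoted ISINDEX value."""
    parts = tuple(urlparse(action))
    # Keep scheme, netloc, path and params; replace the query and
    # discard any fragment.
    return urlunparse(parts[:4] + (quote_plus(value), ""))


url = isindex_url("http://example.com/search?old=1#top", "my isindex value")
```

Any query string already present on the action URL is discarded, which matches the behaviour of the slicing in _click.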
+#---------------------------------------------------
+class IgnoreControl(ScalarControl):
+ """Control that we're not interested in.
+
+ Covers:
+
+ INPUT/RESET
+ BUTTON/RESET
+ INPUT/BUTTON
+ BUTTON/BUTTON
+
+ These controls are always unsuccessful, in the terminology of HTML 4 (ie.
+ they never require any information to be returned to the server).
+
+ BUTTON/BUTTON is used to generate events for script embedded in HTML.
+
+ The value attribute of IgnoreControl is always None.
+
+ """
+ def __init__(self, type, name, attrs, index=None):
+ ScalarControl.__init__(self, type, name, attrs, index)
+ self._value = None
+
+ def is_of_kind(self, kind): return False
+
+ def __setattr__(self, name, value):
+ if name == "value":
+ raise AttributeError(
+ "control '%s' is ignored, hence read-only" % self.name)
+ elif name in ("name", "type"):
+ raise AttributeError("%s attribute is readonly" % name)
+ else:
+ self.__dict__[name] = value
+
+
+#---------------------------------------------------
+# ListControls
+
+# helpers and subsidiary classes
+
+class Item:
+ def __init__(self, control, attrs, index=None):
+ label = _get_label(attrs)
+ self.__dict__.update({
+ "name": attrs["value"],
+ "_labels": label and [label] or [],
+ "attrs": attrs,
+ "_control": control,
+ "disabled": attrs.has_key("disabled"),
+ "_selected": False,
+ "id": attrs.get("id"),
+ "_index": index,
+ })
+ control.items.append(self)
+
+ def get_labels(self):
+ """Return all labels (Label instances) for this item.
+
+ For items that represent radio buttons or checkboxes, if the item was
+ surrounded by a <label> tag, that will be the first label; all other
+ labels, connected by 'for' and 'id', are in the order that they appear in
+ the HTML.
+
+ For items that represent select options, if the option had a label
+ attribute, that will be the first label. If the option has contents
+ (text within the option tags) and it is not the same as the label
+ attribute (if any), that will be a label. There is nothing in the
+ spec to my knowledge that makes an option with an id unable to be the
+ target of a label's for attribute, so those are included, if any, for
+ the sake of consistency and completeness.
+
+ """
+ res = []
+ res.extend(self._labels)
+ if self.id:
+ res.extend(self._control._form._id_to_labels.get(self.id, ()))
+ return res
+
+ def __getattr__(self, name):
+ if name=="selected":
+ return self._selected
+ raise AttributeError(name)
+
+ def __setattr__(self, name, value):
+ if name == "selected":
+ self._control._set_selected_state(self, value)
+ elif name == "disabled":
+ self.__dict__["disabled"] = bool(value)
+ else:
+ raise AttributeError(name)
+
+ def __str__(self):
+ res = self.name
+ if self.selected:
+ res = "*" + res
+ if self.disabled:
+ res = "(%s)" % res
+ return res
+
+ def __repr__(self):
+ # XXX appending the attrs without distinguishing them from name and id
+ # is silly
+ attrs = [("name", self.name), ("id", self.id)]+self.attrs.items()
+ return "<%s %s>" % (
+ self.__class__.__name__,
+ " ".join(["%s=%r" % (k, v) for k, v in attrs])
+ )
+
+def disambiguate(items, nr, **kwds):
+ msgs = []
+ for key, value in kwds.items():
+ msgs.append("%s=%r" % (key, value))
+ msg = " ".join(msgs)
+ if not items:
+ raise ItemNotFoundError(msg)
+ if nr is None:
+ if len(items) > 1:
+ raise AmbiguityError(msg)
+ nr = 0
+ if len(items) <= nr:
+ raise ItemNotFoundError(msg)
+ return items[nr]
+
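Reviewer note on disambiguate: the rule is that an empty match list is "not found", several matches with no nr is "ambiguous", and an nr past the end of the matches is again "not found". The standalone Python 3 sketch below restates that rule. The function name pick_item is hypothetical, and the two exception classes are defined locally with base classes chosen here for illustration; they are not imported from ClientForm.

```python
class ItemNotFoundError(LookupError):
    """Raised when no item matches, or nr is out of range."""


class AmbiguityError(Exception):
    """Raised when several items match and no nr was given."""


def pick_item(items, nr=None):
    """Return the single matching item, or the nr-th match if nr is given."""
    if not items:
        raise ItemNotFoundError("no matching item")
    if nr is None:
        if len(items) > 1:
            raise AmbiguityError("%d items match" % len(items))
        nr = 0
    if len(items) <= nr:
        raise ItemNotFoundError("nr=%d is out of range" % nr)
    return items[nr]


only = pick_item(["a"])
second = pick_item(["a", "b"], nr=1)
```

Raising on ambiguity instead of silently returning the first match is what distinguishes this from the backwards-compatible behaviour, where nr defaults to 0.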
+class ListControl(Control):
+ """Control representing a sequence of items.
+
+ The value attribute of a ListControl represents the successful list items
+ in the control. The successful list items are those that are selected and
+ not disabled.
+
+ ListControl implements both list controls that take a length-1 value
+ (single-selection) and those that take length >1 values
+ (multiple-selection).
+
+ ListControls accept sequence values only. Some controls only accept
+ sequences of length 0 or 1 (RADIO, and single-selection SELECT).
+ In those cases, ItemCountError is raised if len(sequence) > 1. CHECKBOXes
+ and multiple-selection SELECTs (those having the "multiple" HTML attribute)
+ accept sequences of any length.
+
+ Note the following mistake:
+
+ control.value = some_value
+ assert control.value == some_value # not necessarily true
+
+ The reason for this is that the value attribute always gives the list items
+ in the order they were listed in the HTML.
+
+ ListControl items can also be referred to by their labels instead of names.
+ Use the label argument to .get(), and the .set_value_by_label(),
+ .get_value_by_label() methods.
+
+ Note that, rather confusingly, though SELECT controls are represented in
+ HTML by SELECT elements (which contain OPTION elements, representing
+ individual list items), CHECKBOXes and RADIOs are not represented by *any*
+ element. Instead, those controls are represented by a collection of INPUT
+ elements. For example, this is a SELECT control, named "control1":
+
+ <select name="control1">
+ <option>foo</option>
+ <option value="1">bar</option>
+ </select>
+
+ and this is a CHECKBOX control, named "control2":
+
+ <input type="checkbox" name="control2" value="foo" id="cbe1">
+ <input type="checkbox" name="control2" value="bar" id="cbe2">
+
+ The id attribute of a CHECKBOX or RADIO ListControl is always that of its
+ first element (for example, "cbe1" above).
+
+
+ Additional read-only public attribute: multiple.
+
+ """
+
+ # ListControls are built up by the parser from their component items by
+ # creating one ListControl per item, consolidating them into a single
+ # master ListControl held by the HTMLForm:
+
+ # -User calls form.new_control(...)
+ # -Form creates Control, and calls control.add_to_form(self).
+ # -Control looks for a Control with the same name and type in the form,
+ # and if it finds one, merges itself with that control by calling
+ # control.merge_control(self). The first Control added to the form, of
+ # a particular name and type, is the only one that survives in the
+ # form.
+ # -Form calls control.fixup for all its controls. ListControls in the
+ # form know they can now safely pick their default values.
+
+ # To create a ListControl without an HTMLForm, use:
+
+ # control.merge_control(new_control)
+
+ # (actually, it's much easier just to use ParseFile)
+
+ _label = None
+
+ def __init__(self, type, name, attrs={}, select_default=False,
+ called_as_base_class=False, index=None):
+ """
+ select_default: for RADIO and multiple-selection SELECT controls, pick
+ the first item as the default if no 'selected' HTML attribute is
+ present
+
+ """
+ if not called_as_base_class:
+ raise NotImplementedError()
+
+ self.__dict__["type"] = type.lower()
+ self.__dict__["name"] = name
+ self._value = attrs.get("value")
+ self.disabled = False
+ self.readonly = False
+ self.id = attrs.get("id")
+ self._closed = False
+
+ # As Controls are merged in with .merge_control(), self.attrs will
+ # refer to each Control in turn -- always the most recently merged
+ # control. Each merged-in Control instance corresponds to a single
+ # list item: see ListControl.__doc__.
+ self.items = []
+ self._form = None
+
+ self._select_default = select_default
+ self._clicked = False
+
+ def clear(self):
+ self.value = []
+
+ def is_of_kind(self, kind):
+ if kind == "list":
+ return True
+ elif kind == "multilist":
+ return bool(self.multiple)
+ elif kind == "singlelist":
+ return not self.multiple
+ else:
+ return False
+
+ def get_items(self, name=None, label=None, id=None,
+ exclude_disabled=False):
+ """Return matching items by name or label.
+
+ For argument docs, see the docstring for .get()
+
+ """
+ if name is not None and not isstringlike(name):
+ raise TypeError("item name must be string-like")
+ if label is not None and not isstringlike(label):
+ raise TypeError("item label must be string-like")
+ if id is not None and not isstringlike(id):
+ raise TypeError("item id must be string-like")
+ items = [] # order is important
+ compat = self._form.backwards_compat
+ for o in self.items:
+ if exclude_disabled and o.disabled:
+ continue
+ if name is not None and o.name != name:
+ continue
+ if label is not None:
+ for l in o.get_labels():
+ if ((compat and l.text == label) or
+ (not compat and l.text.find(label) > -1)):
+ break
+ else:
+ continue
+ if id is not None and o.id != id:
+ continue
+ items.append(o)
+ return items
+
+ def get(self, name=None, label=None, id=None, nr=None,
+ exclude_disabled=False):
+ """Return item by name or label, disambiguating if necessary with nr.
+
+ All arguments must be passed by name, with the exception of 'name',
+ which may be used as a positional argument.
+
+ If name is specified, then the item must have the indicated name.
+
+ If label is specified, then the item must have a label whose
+ whitespace-compressed, stripped, text substring-matches the indicated
+ label string (eg. label="please choose" will match
+ " Do please choose an item ").
+
+ If id is specified, then the item must have the indicated id.
+
+ nr is an optional 0-based index of the items matching the query.
+
+ If nr is the default None value and more than one item is found, raises
+ AmbiguityError (unless the HTMLForm instance's backwards_compat
+ attribute is true).
+
+ If no item is found, or if items are found but nr is specified and not
+ found, raises ItemNotFoundError.
+
+ Optionally excludes disabled items.
+
+ """
+ if nr is None and self._form.backwards_compat:
+ nr = 0 # :-/
+ items = self.get_items(name, label, id, exclude_disabled)
+ return disambiguate(items, nr, name=name, label=label, id=id)
+
+ def _get(self, name, by_label=False, nr=None, exclude_disabled=False):
+ # strictly for use by deprecated methods
+ if by_label:
+ name, label = None, name
+ else:
+ name, label = name, None
+ return self.get(name, label, nr=nr, exclude_disabled=exclude_disabled)
+
+ def toggle(self, name, by_label=False, nr=None):
+ """Deprecated: given a name or label and optional disambiguating index
+ nr, toggle the matching item's selection.
+
+ Selecting items follows the behavior described in the docstring of the
+ 'get' method.
+
+ If the item is disabled, or this control is disabled or readonly,
+ raise AttributeError.
+
+ """
+ deprecation(
+ "item = control.get(...); item.selected = not item.selected")
+ o = self._get(name, by_label, nr)
+ self._set_selected_state(o, not o.selected)
+
+ def set(self, selected, name, by_label=False, nr=None):
+ """Deprecated: given a name or label and optional disambiguating index
+ nr, set the matching item's selection to the bool value of selected.
+
+ Selecting items follows the behavior described in the docstring of the
+ 'get' method.
+
+ If the item is disabled, or this control is disabled or readonly,
+ raise AttributeError.
+
+ """
+ deprecation(
+ "control.get(...).selected = <boolean>")
+ self._set_selected_state(self._get(name, by_label, nr), selected)
+
+ def _set_selected_state(self, item, action):
+ # action:
+ # bool False: off
+ # bool True: on
+ if self.disabled:
+ raise AttributeError("control '%s' is disabled" % self.name)
+ if self.readonly:
+ raise AttributeError("control '%s' is readonly" % self.name)
+ action = bool(action)
+ compat = self._form.backwards_compat
+ if not compat and item.disabled:
+ raise AttributeError("item is disabled")
+ else:
+ if compat and item.disabled and action:
+ raise AttributeError("item is disabled")
+ if self.multiple:
+ item.__dict__["_selected"] = action
+ else:
+ if not action:
+ item.__dict__["_selected"] = False
+ else:
+ for o in self.items:
+ o.__dict__["_selected"] = False
+ item.__dict__["_selected"] = True
+
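Reviewer note on selection state: in a multiple-selection control each item is switched independently, while in a single-selection control switching one item on switches every other item off, and switching an item off can leave nothing selected. A standalone Python 3 sketch of that rule follows. The function set_selected and its list-of-flags representation are assumptions made for illustration and do not mirror the Item objects used in the commit.

```python
def set_selected(flags, index, action, multiple):
    """Return a new list of selection flags after changing one item.

    flags: current selection state, one bool per item
    index: position of the item being changed
    action: truthy to select the item, falsy to deselect it
    multiple: True for multiple-selection controls
    """
    action = bool(action)
    result = list(flags)
    if multiple or not action:
        # Independent toggle, or a plain deselect in a single-selection list.
        result[index] = action
    else:
        # Single selection: selecting one item clears all the others.
        result = [False] * len(result)
        result[index] = True
    return result


radio = set_selected([True, False, False], 2, True, multiple=False)
checks = set_selected([True, False, False], 2, True, multiple=True)
```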
+ def toggle_single(self, by_label=None):
+ """Deprecated: toggle the selection of the single item in this control.
+
+ Raises ItemCountError if the control does not contain only one item.
+
+ by_label argument is ignored, and included only for backwards
+ compatibility.
+
+ """
+ deprecation(
+ "control.items[0].selected = not control.items[0].selected")
+ if len(self.items) != 1:
+ raise ItemCountError(
+ "'%s' is not a single-item control" % self.name)
+ item = self.items[0]
+ self._set_selected_state(item, not item.selected)
+
+ def set_single(self, selected, by_label=None):
+ """Deprecated: set the selection of the single item in this control.
+
+ Raises ItemCountError if the control does not contain only one item.
+
+ by_label argument is ignored, and included only for backwards
+ compatibility.
+
+ """
+ deprecation(
+ "control.items[0].selected = <boolean>")
+ if len(self.items) != 1:
+ raise ItemCountError(
+ "'%s' is not a single-item control" % self.name)
+ self._set_selected_state(self.items[0], selected)
+
+ def get_item_disabled(self, name, by_label=False, nr=None):
+ """Get disabled state of named list item in a ListControl."""
+ deprecation(
+ "control.get(...).disabled")
+ return self._get(name, by_label, nr).disabled
+
+ def set_item_disabled(self, disabled, name, by_label=False, nr=None):
+ """Set disabled state of named list item in a ListControl.
+
+ disabled: boolean disabled state
+
+ """
+ deprecation(
+ "control.get(...).disabled = <boolean>")
+ self._get(name, by_label, nr).disabled = disabled
+
+ def set_all_items_disabled(self, disabled):
+ """Set disabled state of all list items in a ListControl.
+
+ disabled: boolean disabled state
+
+ """
+ for o in self.items:
+ o.disabled = disabled
+
+ def get_item_attrs(self, name, by_label=False, nr=None):
+ """Return dictionary of HTML attributes for a single ListControl item.
+
+ The HTML element types that describe list items are: OPTION for SELECT
+ controls, INPUT for the rest. These elements have HTML attributes that
+ you may occasionally want to know about -- for example, the "alt" HTML
+ attribute gives a text string describing the item (graphical browsers
+ usually display this as a tooltip).
+
+ The returned dictionary maps HTML attribute names to values. The names
+ and values are taken from the original HTML.
+
+ """
+ deprecation(
+ "control.get(...).attrs")
+ return self._get(name, by_label, nr).attrs
+
+ def close_control(self):
+ self._closed = True
+
+ def add_to_form(self, form):
+ assert self._form is None or form == self._form, (
+ "can't add control to more than one form")
+ self._form = form
+ if
self.name is None:
+ # always count nameless elements as separate controls
+ Control.add_to_form(self, form)
+ else:
+ for ii in range(len(form.controls)-1, -1, -1):
+ control = form.controls[ii]
+ if control.name == self.name and control.type == self.type:
+ if control._closed:
+ Control.add_to_form(self, form)
+ else:
+ control.merge_control(self)
+ break
+ else:
+ Control.add_to_form(self, form)
+
+ def merge_control(self, control):
+ assert bool(control.multiple) == bool(self.multiple)
+ # usually, isinstance(control, self.__class__)
+ self.items.extend(control.items)
+
+ def fixup(self):
+ """
+ ListControls are built up from component list items (which are also
+ ListControls) during parsing. This method should be called after all
+ items have been added. See ListControl.__doc__ for the reason this is
+ required.
+
+ """
+ # Need to set default selection where no item was indicated as being
+ # selected by the HTML:
+
+ # CHECKBOX:
+ # Nothing should be selected.
+ # SELECT/single, SELECT/multiple and RADIO:
+ # RFC 1866 (HTML 2.0): says first item should be selected.
+ # W3C HTML 4.01 Specification: says that client behaviour is
+ # undefined in this case. For RADIO, exactly one must be selected,
+ # though which one is undefined.
+ # Both Netscape and Microsoft Internet Explorer (IE) choose first
+ # item for SELECT/single. However, both IE5 and Mozilla (both 1.0
+ # and Firebird 0.6) leave all items unselected for RADIO and
+ # SELECT/multiple.
+
+ # Since both Netscape and IE choose the first item for
+ # SELECT/single, we do the same. OTOH, both Netscape and IE
+ # leave SELECT/multiple with nothing selected, in violation of RFC 1866
+ # (but not in violation of the W3C HTML 4 standard); the same is true
+ # of RADIO (which *is* in violation of the HTML 4 standard). We follow
+ # RFC 1866 if the _select_default attribute is set, and Netscape and IE
+ # otherwise. RFC 1866 and HTML 4 are always violated insofar as you
+ # can deselect all items in a RadioControl.
+
+ for o in self.items:
+ # set items' controls to self, now that we've merged
+ o.__dict__["_control"] = self
+
+ def __getattr__(self, name):
+ if name == "value":
+ compat = self._form.backwards_compat
+ if self.name is None:
+ return []
+ return [o.name for o in self.items if o.selected and
+ (not o.disabled or compat)]
+ else:
+ raise AttributeError("%s instance has no attribute '%s'" %
+ (self.__class__.__name__, name))
+
+ def __setattr__(self, name, value):
+ if name == "value":
+ if self.disabled:
+ raise AttributeError("control '%s' is disabled" % self.name)
+ if self.readonly:
+ raise AttributeError("control '%s' is readonly" % self.name)
+ self._set_value(value)
+ elif name in ("name", "type", "multiple"):
+ raise AttributeError("%s attribute is readonly" % name)
+ else:
+ self.__dict__[name] = value
+
+ def _set_value(self, value):
+ if value is None or isstringlike(value):
+ raise TypeError("ListControl, must set a sequence")
+ if not value:
+ compat = self._form.backwards_compat
+ for o in self.items:
+ if not o.disabled or compat:
+ o.selected = False
+ elif self.multiple:
+ self._multiple_set_value(value)
+ elif len(value) > 1:
+ raise ItemCountError(
+ "single selection list, must set sequence of "
+ "length 0 or 1")
+ else:
+ self._single_set_value(value)
+
+ def _get_items(self, name, target=1):
+ all_items = self.get_items(name)
+ items = [o for o in all_items if not o.disabled]
+ if len(items) < target:
+ if len(all_items) < target:
+ raise ItemNotFoundError(
+ "insufficient items with name %r" % name)
+ else:
+ raise AttributeError(
+ "insufficient non-disabled items with name %s" % name)
+ on = []
+ off = []
+ for o in items:
+ if o.selected:
+ on.append(o)
+ else:
+ off.append(o)
+ return on, off
+
+ def _single_set_value(self, value):
+ assert len(value) == 1
+ on, off = self._get_items(value[0])
+ assert len(on) <= 1
+ if not on:
+ off[0].selected = True
+
+ def _multiple_set_value(self, value):
+ compat = self._form.backwards_compat
+ turn_on = [] # transactional-ish
+ turn_off = [item for item in self.items if
+ item.selected and (not item.disabled or compat)]
+ names = {}
+ for nn in value:
+ if nn in names.keys():
+ names[nn] += 1
+ else:
+ names[nn] = 1
+ for name, count in names.items():
+ on, off = self._get_items(name, count)
+ for i in range(count):
+ if on:
+ item = on[0]
+ del on[0]
+ del turn_off[turn_off.index(item)]
+ else:
+ item = off[0]
+ del off[0]
+ turn_on.append(item)
+ for item in turn_off:
+ item.selected = False
+ for item in turn_on:
+ item.selected = True
+
+ def set_value_by_label(self, value):
+ """Set the value of control by item labels.
+
+ value is expected to be an iterable of strings that are substrings of
+ the item labels that should be selected. Before substring matching is
+ performed, the original label text is whitespace-compressed
+ (consecutive whitespace characters are converted to a single space
+ character) and leading and trailing whitespace is stripped. Ambiguous
+ labels are accepted without complaint if the form's backwards_compat is
+ True; otherwise, it will not complain as long as all ambiguous labels
+ share the same item name (e.g. OPTION value).
+
+ """
+ if isstringlike(value):
+ raise TypeError(value)
+ if not self.multiple and len(value) > 1:
+ raise ItemCountError(
+ "single selection list, must set sequence of "
+ "length 0 or 1")
+ items = []
+ for nn in value:
+ found = self.get_items(label=nn)
+ if len(found) > 1:
+ if not self._form.backwards_compat:
+ # ambiguous labels are fine as long as item names (e.g.
+ # OPTION values) are same
+ opt_name = found[0].name
+ if [o for o in found[1:] if o.name != opt_name]:
+ raise AmbiguityError(nn)
+ else:
+ # OK, we'll guess :-( Assume first available item.
+ found = found[:1]
+ for o in found:
+ # For the multiple-item case, we could try to be smarter,
+ # saving them up and trying to resolve, but that's too much.
+ if self._form.backwards_compat or o not in items:
+ items.append(o)
+ break
+ else: # all of them are used
+ raise ItemNotFoundError(nn)
+ # now we have all the items that should be on
+ # let's just turn everything off and then back on.
+ self.value = []
+ for o in items:
+ o.selected = True
+
+ def get_value_by_label(self):
+ """Return the value of the control as given by normalized labels."""
+ res = []
+ compat = self._form.backwards_compat
+ for o in self.items:
+ if (not o.disabled or compat) and o.selected:
+ for l in o.get_labels():
+ if l.text:
+ res.append(l.text)
+ break
+ else:
+ res.append(None)
+ return res
+
+ def possible_items(self, by_label=False):
+ """Deprecated: return the names or labels of all possible items.
+
+ Includes disabled items, which may be misleading for some use cases.
+
+ """
+ deprecation(
+ "[item.name for item in self.items]")
+ if by_label:
+ res = []
+ for o in self.items:
+ for l in o.get_labels():
+ if l.text:
+ res.append(l.text)
+ break
+ else:
+ res.append(None)
+ return res
+ return [o.name for o in self.items]
+
+ def _totally_ordered_pairs(self):
+ if self.disabled or self.name is None:
+ return []
+ else:
+ return [(o._index, self.name, o.name) for o in self.items
+ if o.selected and not o.disabled]
+
+ def __str__(self):
+ name = self.name
+ if name is None: name = "<None>"
+
+ display = [str(o) for o in self.items]
+
+ infos = []
+ if self.disabled: infos.append("disabled")
+ if self.readonly: infos.append("readonly")
+ info = ", ".join(infos)
+ if info: info = " (%s)" % info
+
+ return "<%s(%s=[%s])%s>" % (self.__class__.__name__,
+ name, ", ".join(display), info)
+
+
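The label matching described in the set_value_by_label docstring above can be sketched outside ClientForm as follows. This is a minimal illustration, not the library's code: normalize_label, find_by_label and is_ambiguous are hypothetical helpers, and items are modelled as plain (name, raw_label) tuples rather than Item objects.

```python
def normalize_label(text):
    """Collapse runs of whitespace to one space and strip both ends."""
    return " ".join(text.split())


def find_by_label(items, wanted):
    """Return (name, label) pairs whose normalized label contains `wanted`.

    `items` is a hypothetical list of (name, raw_label) tuples standing in
    for ClientForm Item objects.
    """
    return [(name, normalize_label(raw)) for name, raw in items
            if wanted in normalize_label(raw)]


def is_ambiguous(matches):
    """Several matches are only a problem if they name different items."""
    return len({name for name, _ in matches}) > 1


# Made-up sample data: two of the three entries share the item name "1".
items = [("0", "  current\n   year "),
         ("1", "last year"),
         ("1", "Last  year (2001)")]

all_years = find_by_label(items, "year")      # matches names "0" and "1"
last_only = find_by_label(items, "ast year")  # matches name "1" twice
```

Here is_ambiguous(all_years) is true because the matches span two item names, while is_ambiguous(last_only) is false, which corresponds to the non-backwards_compat case that the docstring says is accepted without complaint.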
+class RadioControl(ListControl):
+ """
+ Covers:
+
+ INPUT/RADIO
+
+ """
+ def __init__(self, type, name, attrs, select_default=False, index=None):
+ attrs.setdefault("value", "on")
+ ListControl.__init__(self, type, name, attrs, select_default,
+ called_as_base_class=True, index=index)
+ self.__dict__["multiple"] = False
+ o = Item(self, attrs, index)
+ o.__dict__["_selected"] = attrs.has_key("checked")
+
+ def fixup(self):
+ ListControl.fixup(self)
+ found = [o for o in self.items if o.selected and not o.disabled]
+ if not found:
+ if self._select_default:
+ for o in self.items:
+ if not o.disabled:
+ o.selected = True
+ break
+ else:
+ # Ensure only one item selected. Choose the last one,
+ # following IE and Firefox.
+ for o in found[:-1]:
+ o.selected = False
+
+ def get_labels(self):
+ return []
+
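The selection policy in RadioControl.fixup above can be tried out in isolation. The sketch below mirrors that logic on plain dictionaries; fixup_radio and the 'selected'/'disabled' keys are hypothetical stand-ins for the control and its Item objects, not part of ClientForm.

```python
def fixup_radio(items, select_default=False):
    """Apply the radio default-selection policy to a list of item dicts.

    Each item is a dict with boolean 'selected' and 'disabled' keys
    (a hypothetical stand-in for ClientForm's Item).
    """
    found = [o for o in items if o["selected"] and not o["disabled"]]
    if not found:
        if select_default:
            # RFC 1866 behaviour: pick the first item that is not disabled.
            for o in items:
                if not o["disabled"]:
                    o["selected"] = True
                    break
    else:
        # Several items checked in the HTML: keep only the last one.
        for o in found[:-1]:
            o["selected"] = False
    return items


def make(selected, disabled=False):
    return {"selected": selected, "disabled": disabled}


two_checked = fixup_radio([make(True), make(False), make(True)])
browser_like = fixup_radio([make(False), make(False)])
rfc_like = fixup_radio([make(False, disabled=True), make(False)],
                       select_default=True)
```

With nothing checked and select_default left at False, every item stays unselected, which is the browser behaviour the ListControl.fixup comments describe for RADIO.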
+class CheckboxControl(ListControl):
+ """
+ Covers:
+
+ INPUT/CHECKBOX
+
+ """
+ def __init__(self, type, name, attrs, select_default=False, index=None):
+ attrs.setdefault("value", "on")
+ ListControl.__init__(self, type, name, attrs, select_default,
+ called_as_base_class=True, index=index)
+ self.__dict__["multiple"] = True
+ o = Item(self, attrs, index)
+ o.__dict__["_selected"] = attrs.has_key("checked")
+
+ def get_labels(self):
+ return []
+
+
+class SelectControl(ListControl):
+ """
+ Covers:
+
+ SELECT (and OPTION)
+
+
+ OPTION 'values', in HTML parlance, are Item 'names' in ClientForm parlance.
+
+ SELECT control values and labels are subject to some messy defaulting
+ rules. For example, if the HTML representation of the control is:
+
+ <SELECT name=year>
+ <OPTION value=0 label="2002">current year</OPTION>
+ <OPTION value=1>2001</OPTION>
+ <OPTION>2000</OPTION>
+ </SELECT>
+
+ The items, in order, have labels "2002", "2001" and "2000", whereas their
+ names (the OPTION values) are "0", "1" and "2000" respectively. Note that
+ the value of the last OPTION in this example defaults to its contents, as
+ specified by RFC 1866, as do the labels of the second and third OPTIONs.
+
+ The OPTION labels are sometimes more meaningful than the OPTION values,
+ which can make for more maintainable code.
+
+ Additional read-only public attribute: attrs
+
+ The attrs attribute is a dictionary of the original HTML attributes of the
+ SELECT element. Other ListControls do not have this attribute, because in
+ other cases the control as a whole does not correspond to any single HTML
+ element. control.get(...).attrs may be used as usual to get at the HTML
+ attributes of the HTML elements corresponding to individual list items (for
+ SELECT controls, these are OPTION elements).
+
+ Another special case is that the Item.attrs dictionaries have a special key
+ "contents" which does not correspond to any real HTML attribute, but rather
+ contains the contents of the OPTION element:
+
+ <OPTION>this bit</OPTION>
+
+ """
+ # HTML attributes here are treated slightly differently from other list
+ # controls:
+ # -The SELECT HTML attributes dictionary is stuffed into the OPTION
+ # HTML attributes dictionary under the "__select" key.
+ # -The content of each OPTION element is stored under the special
+ # "contents" key of the dictionary.
+ # After all this, the dictionary is passed to the SelectControl constructor
+ # as the attrs argument, as usual. However:
+ # -The first SelectControl constructed when building up a SELECT control
+ # has a constructor attrs argument containing only the __select key -- so
+ # this SelectControl represents an empty SELECT control.
+ # -Subsequent SelectControls have both the OPTION HTML attributes in attrs
+ # and the __select dictionary containing the SELECT HTML attributes.
+
+ def __init__(self, type, name, attrs, select_default=False, index=None):
+ # fish out the SELECT HTML attributes from the OPTION HTML attributes
+ # dictionary
+ self.attrs = attrs["__select"].copy()
+ self.__dict__["_label"] = _get_label(self.attrs)
+ self.__dict__["id"] = self.attrs.get("id")
+ self.__dict__["multiple"] = self.attrs.has_key("multiple")
+ # the majority of the contents, label, and value dance already happened
+ contents = attrs.get("contents")
+ attrs = attrs.copy()
+ del attrs["__select"]
+
+ ListControl.__init__(self, type, name, self.attrs, select_default,
+ called_as_base_class=True, index=index)
+ self.disabled = self.attrs.has_key("disabled")
+ self.readonly = self.attrs.has_key("readonly")
+ if attrs.has_key("value"):
+ # otherwise it is a marker 'select started' token
+ o = Item(self, attrs, index)
+ o.__dict__["_selected"] = attrs.has_key("selected")
+ # add 'label' label and contents label, if different. If both are
+ # provided, the 'label' label is used for display in HTML
+ # 4.0-compliant browsers (and any lower spec? not sure) while the
+ # contents are used for display in older or less-compliant
+ # browsers. We make label objects for both, if the values are
+ # different.
+ label = attrs.get("label")
+ if label:
+ o._labels.append(Label({"__text": label}))
+ if contents and contents != label:
+ o._labels.append(Label({"__text": contents}))
+ elif contents:
+ o._labels.append(Label({"__text": contents}))
+
+ def fixup(self):
+ ListControl.fixup(self)
+ # Firefox doesn't exclude disabled items from those considered here
+ # (i.e. from 'found', for both branches of the if below). Note that
+ # IE6 doesn't support the disabled attribute on OPTIONs at all.
+ found = [o for o in self.items if o.selected]
+ if not found:
+ if not self.multiple or self._select_default:
+ for o in self.items:
+ if not o.disabled:
+ was_disabled = self.disabled
+ self.disabled = False
+ try:
+ o.selected = True
+ finally:
+ self.disabled = was_disabled
+ break
+ elif not self.multiple:
+ # Ensure only one item selected. Choose the last one,
+ # following IE and Firefox.
+ for o in found[:-1]:
+ o.selected = False
+
+
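The defaulting rules from the SelectControl docstring (value falls back to the OPTION contents; the label attribute, when present, comes before the contents) can be checked with a small standalone sketch. option_name_and_labels is a hypothetical helper that mirrors the label handling in SelectControl.__init__ above; in ClientForm itself the value defaulting happens earlier, during parsing.

```python
def option_name_and_labels(attrs, contents):
    """Return (item name, list of label texts) for one OPTION element.

    `attrs` is a dict of the OPTION's HTML attributes and `contents` is the
    text between the tags. Hypothetical helper for illustration only.
    """
    # RFC 1866: a missing value attribute defaults to the element contents.
    name = attrs.get("value", contents)
    labels = []
    label = attrs.get("label")
    if label:
        labels.append(label)
        # Keep the contents as a second label only if it differs.
        if contents and contents != label:
            labels.append(contents)
    elif contents:
        labels.append(contents)
    return name, labels


# The three OPTIONs from the docstring's <SELECT name=year> example.
first = option_name_and_labels({"value": "0", "label": "2002"}, "current year")
second = option_name_and_labels({"value": "1"}, "2001")
third = option_name_and_labels({}, "2000")
```

The first label of each result ("2002", "2001", "2000") and the names ("0", "1", "2000") match the values listed in the docstring.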
+#---------------------------------------------------
+class SubmitControl(ScalarControl):
+ """
+ Covers:
+
+ INPUT/SUBMIT
+ BUTTON/SUBMIT
+
+ """
+ def __init__(self, type, name, attrs, index=None):
+ ScalarControl.__init__(self, type, name, attrs, index)
+ # IE5 defaults SUBMIT value to "Submit Query"; Firebird 0.6 leaves it
+ # blank, Konqueror 3.1 defaults to "Submit". HTML spec. doesn't seem
+ # to define this.
+ if self.value is None: self.value = ""
+ self.readonly = True
+
+ def get_labels(self):
+ res = []
+ if self.value:
+ res.append(Label({"__text": self.value}))
+ res.extend(ScalarControl.get_labels(self))
+ return res
+
+ def is_of_kind(self, kind): return kind == "clickable"
+
+ def _click(self, form, coord, return_type, request_class=urllib2.Request):
+ self._clicked = coord
+ r = form._switch_click(return_type, request_class)
+ self._clicked = False
+ return r
+
+ def _totally_ordered_pairs(self):
+ if not self._clicked:
+ return []
+ return ScalarControl._totally_ordered_pairs(self)
+
+
+#---------------------------------------------------
+class ImageControl(SubmitControl):
+ """
+ Covers:
+
+ INPUT/IMAGE
+
+ Coordinates are specified using one of the HTMLForm.click* methods.
+
+ """
+ def __init__(self, type, name, attrs, index=None):
+ SubmitControl.__init__(self, type, name, attrs, index)
+ self.readonly = False
+
+ def _totally_ordered_pairs(self):
+ clicked = self._clicked
+ if self.disabled or not clicked:
+ return []
+ name = self.name
+ if
==============================================================================
Diff truncated at 200k characters