[Java] Google App Engine bulk loader upload error “OverflowError: date value out of range”

Dennis Lo

Jun 11, 2011, 10:04:18 AM
to google-a...@googlegroups.com

I'm using the bulk loader to download my Review kind from one GAE server and upload to another GAE server. This Review kind contains about 30,000 entities.

I use this command to upload:

appcfg.py upload_data --kind=Review --filename=review.csv --application=<MY_APP_ID> --url=http://<MY_APP_ID>.appspot.com/remote_api --rps_limit=500 --bandwidth_limit=2500000 --batch_size=100

NOTE: The entire error is pretty long, so I've included it at the bottom of this question.

The upload doesn't start at all and stops at the line: [INFO ] Opening database: bulkloader-progress-20110611.222101.sql3

I see at the bottom of the entire trace: lambda val: _EPOCH + datetime.timedelta(microseconds=val), OverflowError: date value out of range

I suspect that this is caused by badly formatted dates in my publisheddate column (see below for the entire Review kind).


What I've already tried to fix it

I ran the GQL: SELECT * FROM Review where publisheddate = null in the GAE admin portal and found around 15 entities with null values for the publisheddate column. So I deleted those, redownloaded and tried reuploading the Review kind. I still get the same error.

I'm running the upload command without a config file parameter (e.g. --config_file=bulkloader.yaml; note that the upload command above doesn't specify --config_file), so I think I'm relying on the datastore statistics to determine what is downloaded to my review.csv file.

Since I only just deleted the entities with a null publisheddate, I suspect that redownloading and then reuploading immediately doesn't do anything different from my initial download and upload, because the datastore statistics haven't been updated yet.


My Questions

  1. As there are 30,000 entities, I'm unsure how to locate the entities with badly formatted dates. Is there a quick way to locate entities with bad publisheddate values?

  2. Does anyone know how to fix this issue? I'm not sure from the error trace what the real problem is.


Review kind

I'm using the objectify3.0.jar to manipulate my GAE datastore. So my Review kind looks like this:

public class Review {

    @Id private Long id; // This is my key property, auto generated by objectify

    private String reviewlink;

    private Date publisheddate;

    private double rating;

    private Long gameid; // to make querying easier

    // omitting getters and setters
}

Entire error

10:20:48>appcfg.py upload_data --kind=Review --filename=review.csv --application=<MY_APP_ID> --url=http://<MY_APP_ID>.appspot.com/remote_api --rps_limit=500 --bandwidth_limit=2500000 --batch_size=100
Uploading data records.
[INFO    ] Logging to bulkloader-log-20110611.222101
[INFO    ] Throttling transfers:
[INFO    ] Bandwidth: 2500000 bytes/second
[INFO    ] HTTP connections: 8/second
[INFO    ] Entities inserted/fetched/modified: 500/second
[INFO    ] Batch Size: 100
[INFO    ] Opening database: bulkloader-progress-20110611.222101.sql3
Traceback (most recent call last):
  File "/usr/local/bin/appcfg.py", line 76, in <module>
    run_file(__file__, globals())
  File "/usr/local/bin/appcfg.py", line 72, in run_file
    execfile(script_path, globals_)
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/tools/appcfg.py", line 3592, in <module>
    main(sys.argv)
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/tools/appcfg.py", line 3583, in main
    result = AppCfgApp(argv).Run()
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/tools/appcfg.py", line 2229, in Run
    self.action(self)
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/tools/appcfg.py", line 3368, in __call__
    return method()
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/tools/appcfg.py", line 3195, in PerformUpload
    run_fn(args)
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/tools/appcfg.py", line 3087, in RunBulkloader
    sys.exit(bulkloader.Run(arg_dict))
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/tools/bulkloader.py", line 4360, in Run
    return _PerformBulkload(arg_dict)
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/tools/bulkloader.py", line 4204, in _PerformBulkload
    loader.initialize(filename, loader_opts)
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/tools/bulkloader.py", line 2879, in initialize
    self.high_id_table = self._find_high_id(self.generate_records(filename))
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/tools/bulkloader.py", line 2896, in _find_high_id
    for values in record_generator:
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/tools/bulkloader.py", line 2920, in generate_records
    yield datastore.Entity._FromPb(fixed_entity_proto)
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/api/datastore.py", line 958, in _FromPb
    value = datastore_types.FromPropertyPb(prop)
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/api/datastore_types.py", line 1792, in FromPropertyPb
    value = conversion(value)
  File "/Applications/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/api/datastore_types.py", line 1722, in <lambda>
    lambda val: _EPOCH + datetime.timedelta(microseconds=val),
OverflowError: date value out of range

Robert Kluin

Jun 11, 2011, 4:15:14 PM
to google-a...@googlegroups.com
Perhaps you've got a bad datetime value on one of the entities?

Dennis Lo

Jun 19, 2011, 9:41:01 PM
to google-a...@googlegroups.com
Thanks Robert!

Your comment really helped me find a solution to this problem. It got me thinking outside the box to find which row(s) were causing the issue.

Check out my solution here: http://stackoverflow.com/questions/6316096/java-google-app-engine-bulk-loader-upload-error-overflowerror-date-value-out-of