Create a bunch of demo contents "on the fly"

Alejandro Gonzalez

Sep 23, 2015, 6:51:14 AM
to Google App Engine
Hello,

I've been struggling with the proper way to create a bunch of demo contents "on the fly"...

This is the scenario:
  1. I have an application in which users can create an account before purchasing it.
  2. When a user creates an account, they can choose to generate demo content (to see the application in action, filled with data).
  3. When a new demo account is created, a lot of new entities need to be created in the namespace just created for that user.


My first approach was to use the application logic to generate all the demo content in the new namespace, in a task queue, when the user requests the demo. Shortly after implementing it I realized that it was very time- and resource-consuming: when going through the application logic, a single action can trigger further tasks and datastore updates (one user gives a +1 to a publication, and that +1 generates notifications, user activity, statistics, global counters, etc.).


My second approach was to use mapreduce/pipelines to, basically, copy a pre-generated demo namespace into the new one. This sounds much better, and it takes significantly less time and fewer resources to accomplish the task. The problem is that I started seeing this error: "java.lang.IllegalArgumentException: the id allocated for a new entity was already in use, please try again". I just need to re-allocate the IDs of all the entities I'm copying, but most of them rely on automatic ID generation (scattered IDs), and as per this issue I can't reallocate scattered IDs (https://code.google.com/p/googleappengine/issues/detail?id=11541) (http://stackoverflow.com/questions/32652316/exceeded-maximum-allocated-ids-exception-when-allocating-keyrange-appengine-o). For some entities it may make sense to generate the IDs manually (legacy IDs), but for others it makes no sense at all.


My questions are:
- Am I doing something wrong with the second approach? Should I reconsider all of my application's ID generation just to be able to copy entities between namespaces and allocate their scattered IDs?
- Should I focus on generating a demo with legacy IDs to use as a base demo to copy into another namespace?
- Am I missing a third approach? 


Thanks in advance






Nick (Cloud Platform Support)

Sep 23, 2015, 3:00:11 PM
to Google App Engine
For the reason you mentioned, simulating each user action sequentially (incrementing global/sharded counters, triggering the creation of notification entities, etc.) will not be the fastest way. Bulk creation of all the required data in a frozen state is definitely preferable, and a distributed task that reads the data from a permanent, read-only namespace into a namespace created for the demo sounds like a great approach.

At this point, however, we encounter an issue running MapReduce which I'm not sure I understand after reading your second paragraph. Are you storing serialized keys on any of your entities? I don't think there should be a problem saving entities with the same ID so long as they're in a different namespace. Maybe share the code for your MapReduce tasks.
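
To illustrate the point about namespaces, here is a minimal plain-Java model, assuming only that a key's identity is made up of namespace, kind, and ID. It is not the App Engine `Key` class: `ModelKey`, `NamespaceCopyModel`, and the namespace names used with them are hypothetical, and the "store" is just a `HashMap`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

// Hypothetical stand-in for a datastore key; NOT the App Engine Key class.
final class ModelKey {
    final String namespace;
    final String kind;
    final long id;

    ModelKey(String namespace, String kind, long id) {
        this.namespace = namespace;
        this.kind = kind;
        this.id = id;
    }

    // Two keys are equal only if namespace, kind and ID all match.
    @Override
    public boolean equals(Object o) {
        if (!(o instanceof ModelKey)) {
            return false;
        }
        ModelKey other = (ModelKey) o;
        return namespace.equals(other.namespace)
                && kind.equals(other.kind)
                && id == other.id;
    }

    @Override
    public int hashCode() {
        return Objects.hash(namespace, kind, id);
    }
}

final class NamespaceCopyModel {
    // Returns a new store that also contains every entry of the "from"
    // namespace re-keyed into the "to" namespace, keeping kind and ID.
    static Map<ModelKey, String> copy(Map<ModelKey, String> store, String from, String to) {
        Map<ModelKey, String> result = new HashMap<>(store);
        for (Map.Entry<ModelKey, String> entry : store.entrySet()) {
            ModelKey key = entry.getKey();
            if (key.namespace.equals(from)) {
                result.put(new ModelKey(to, key.kind, key.id), entry.getValue());
            }
        }
        return result;
    }
}
```

In this model, copying `("demo-template", "Publication", 42)` into `"user-123"` leaves two separate entries, because the two keys differ in their namespace even though kind and ID are the same.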

Alejandro Gonzalez

Sep 25, 2015, 4:13:41 AM
to Google App Engine
Hello Nick, thanks for your response.

The problem with the second approach is:

- Most of my entities have Long IDs generated by the Datastore. The auto-generated IDs are scattered IDs: https://cloud.google.com/appengine/docs/java/datastore/entities#Java_Assigning_identifiers

- When copying an entity from the frozen namespace to the new namespace I need to allocate its ID in the new namespace; otherwise the datastore may later generate an ID that is already in use by that entity in the new namespace. From the docs:

  System-allocated ID values are guaranteed unique to the entity group. If you copy an entity from one entity group or namespace to another and wish to preserve the ID part of the key, be sure to allocate the ID first to prevent Datastore from selecting that ID for a future assignment.

- When trying to allocate the ID (which is a scattered ID auto-generated by the datastore) in the new namespace, I get this error: java.lang.IllegalArgumentException: Exceeded maximum allocated IDs. From a search, that error seems to be related to this post on stackoverflow: http://stackoverflow.com/questions/32652316/exceeded-maximum-allocated-ids-exception-when-allocating-keyrange-appengine-o (which describes my situation fairly well, except that I'm using the low-level Datastore API for the allocation), and it's related to this issue in the tracker: https://code.google.com/p/googleappengine/issues/detail?id=11541

So the problem is in the DatastoreService.allocateIdRange() function, when trying to allocate scattered IDs. 


Finally I found a workaround that works for me and avoids the bug in DatastoreService.allocateIdRange(). What I'm doing now is basically manipulating all the IDs I find (in entity keys and in properties that reference a Key) to shorten them when copying the entity to the new namespace:


import java.util.Map;

import com.google.appengine.api.NamespaceManager;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.Key;
import com.google.appengine.api.datastore.KeyFactory;
import com.google.appengine.api.datastore.KeyRange;

// Methods of the mapper class; ds, batcher and toNamespace are fields of that class.

private void allocateIdForKey(Key entityK) {
    if (entityK.getId() > 0) {
        // The entity has a numeric ID, which must be allocated
        // to avoid ID collisions.
        KeyRange range = new KeyRange(
                entityK.getParent(),
                entityK.getKind(),
                entityK.getId(),
                entityK.getId());
        // Throws an exception if the Long ID in the Key is too big.
        ds.allocateIdRange(range);
    }
}

// Avoid https://code.google.com/p/googleappengine/issues/detail?id=11541
// by cutting down the Long ID if it is too big: IDs with more than
// 6 digits are halved (integer division).
private long getNewId(long id) {
    return Long.toString(id).length() > 6 ? id / 2 : id;
}

private Key getNewKey(Key key) {
    if (key == null) {
        // Root keys have no parent, so the recursion stops here.
        return null;
    }
    Key newParent = getNewKey(key.getParent());
    String name = key.getName();
    if (name == null || name.isEmpty()) { // Long IDs
        // Shorten the ID if needed; if we don't, we'll get an
        // "Exceeded maximum allocated IDs" exception.
        return KeyFactory.createKey(newParent, key.getKind(), getNewId(key.getId()));
    } else { // String IDs
        return KeyFactory.createKey(newParent, key.getKind(), name);
    }
}

@Override
public void map(Entity entity) {
    // Change to the destination namespace, then create the new key for the entity.
    NamespaceManager.set(toNamespace);
    Key destinationKey = getNewKey(entity.getKey());
    Entity destinationEntity = new Entity(destinationKey);
    destinationEntity.setPropertiesFrom(entity);

    // Check the entity's properties for keys and update them too.
    for (Map.Entry<String, Object> property : entity.getProperties().entrySet()) {
        if (property.getValue() instanceof Key) {
            Key newReference = getNewKey((Key) property.getValue());
            destinationEntity.setProperty(property.getKey(), newReference);
        }
    }

    allocateIdForKey(destinationKey);
    batcher.put(destinationEntity);
}
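
One thing that may be worth checking before relying on the ID-shortening rule in the workaround above (IDs longer than 6 digits are halved): halving is not one-to-one, so two source IDs such as 2k and 2k+1 end up with the same new ID, and a halved 7-digit ID can land on an untouched 6-digit ID. The following is a minimal plain-Java sketch, not part of the original mapper; `ShortenedIdCheck` and its methods are hypothetical helpers that only model the rule so it can be run over a list of source IDs of one kind without the SDK.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: models the shortening rule only, no Datastore calls.
final class ShortenedIdCheck {

    // Same rule as getNewId above: IDs with more than 6 decimal digits are halved.
    static long shorten(long id) {
        return Long.toString(id).length() > 6 ? id / 2 : id;
    }

    // Returns the source IDs whose shortened ID was already produced by a
    // different, earlier ID in the list.
    static List<Long> findCollisions(List<Long> sourceIds) {
        Map<Long, Long> firstSourceByNewId = new HashMap<>();
        List<Long> collisions = new ArrayList<>();
        for (long id : sourceIds) {
            Long previous = firstSourceByNewId.putIfAbsent(shorten(id), id);
            if (previous != null && previous != id) {
                collisions.add(id);
            }
        }
        return collisions;
    }
}
```

If `findCollisions` returns an empty list for the IDs in the frozen demo namespace, the rule is safe for that data set; a non-empty result names the source IDs that would overwrite an earlier entity of the same kind.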


Thanks again for your time and effort to help the community!

Nick (Cloud Platform Support)

Sep 28, 2015, 4:45:30 PM
to Google App Engine
Hey Alejandro,

That was an awesome post. This is very very valuable information and a great way to handle an issue. Thank you for your contribution to the community here!

Best wishes,

Nick