Alternative to MapReduce

Rajesh Gupta

Jul 21, 2017, 04:24:27
To: google-a...@googlegroups.com
Is there any alternative to the MapReduce framework? MapReduce does not support namespaces well, its API is not simple, and it is not well documented.

Cloud Dataflow lies outside our App Engine expertise, and we cannot use Objectify in Dataflow.

Does the Google Cloud Console Datastore UI team have any plans to support making batch data changes through the UI?

Regards,
Rajesh
Accounting/Inventory/Orders/Sales/Purchase on Google Cloud Platform and Mobile
Field Service Software

Jordan (Cloud Platform Support)

Jul 21, 2017, 16:24:05
To: Google App Engine
You can instead use push task queues to replicate a MapReduce job. Have a master method shard the job and 'map' the shards to other instances by enqueuing one task per shard. The instances that accept the shards (i.e., the tasks) perform the work and write their results to the Datastore. The master method then polls the Datastore on a timer until all shards have finished, and finally performs the 'reduce' phase.
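
The shard / enqueue / poll / reduce flow described above can be sketched roughly as follows. This is only an illustration in plain Python, not App Engine code: the `task_queue` deque stands in for a push queue, the `datastore` dict stands in for Datastore entities, and all function names and the per-shard work (summing squares) are made up for the example.

```python
import collections

# In-memory stand-ins (assumptions for illustration only): on App Engine the
# deque would be a push task queue and the dict would be Datastore entities.
task_queue = collections.deque()
datastore = {}


def shard(items, num_shards):
    """Split items into num_shards interleaved slices."""
    return [items[i::num_shards] for i in range(num_shards)]


def master_start(job_id, items, num_shards):
    """'Map' step: enqueue one task per shard and return the shard count."""
    for shard_id, chunk in enumerate(shard(items, num_shards)):
        task_queue.append({"job_id": job_id, "shard_id": shard_id, "items": chunk})
    return num_shards


def worker_handle(task):
    """Runs on whichever instance receives the task; stores a partial result."""
    partial = sum(x * x for x in task["items"])  # hypothetical per-shard work
    datastore[(task["job_id"], task["shard_id"])] = partial


def master_try_reduce(job_id, num_shards):
    """Poll step: return the reduced value, or None while shards are pending."""
    keys = [(job_id, s) for s in range(num_shards)]
    if not all(k in datastore for k in keys):
        return None
    return sum(datastore[k] for k in keys)  # 'Reduce' step


def run_job(job_id, items, num_shards):
    """Drive the whole job locally; a real master would poll on a timer."""
    n = master_start(job_id, items, num_shards)
    result = master_try_reduce(job_id, n)
    while result is None:
        # Stand-in for an instance picking up the next task from the queue.
        worker_handle(task_queue.popleft())
        result = master_try_reduce(job_id, n)
    return result
```

Calling `run_job("job-1", list(range(10)), 3)` splits the ten numbers into three shards, processes each shard as a separate task, and combines the partial sums once every shard has written its result. In a real deployment the worker would be a request handler and the master's poll would run on a delay rather than in a tight loop.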

You can also experiment with deploying several services (separate groups of instances) and sharding (enqueuing tasks) across them, so that no shard waits in a pending queue for an available instance. Alternatively, you can lower the min-pending-latency of a single service to achieve the same goal: a shorter pending queue and earlier creation of new instances, and therefore a faster MapReduce.
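
As a rough sketch of spreading shards across services, one simple option (an assumption on my part, not something App Engine does for you) is to assign shards round-robin. The service names below are hypothetical. On App Engine each pair would then become one enqueued task routed to that service; I believe the Python taskqueue API takes a `target` argument for this, but please check the documentation for your runtime.

```python
import itertools


def assign_services(num_shards, services):
    """Pair each shard id with a service name, cycling through the services."""
    return list(zip(range(num_shards), itertools.cycle(services)))


# Hypothetical service names, for illustration only.
assignments = assign_services(5, ["worker-a", "worker-b"])
for shard_id, service in assignments:
    # On App Engine this is where the task for shard_id would be enqueued
    # and routed to the given service.
    print(shard_id, service)
```

With two services and five shards, shards 0, 2 and 4 go to the first service and shards 1 and 3 go to the second.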

- As for performing Dataflow work via the Console, I am not aware of this happening any time soon. If this is a real showstopper for you, I recommend filing a feature request with them, describing your exact use case in detail.
