How can I query the database inside a firetask?

tifon...@gmail.com

Nov 16, 2015, 9:33:09 AM
to fireworkflows
Dear all,
Assume that I have the following Firetask,

class FooTask(FireTaskBase):

    _fw_name = "footask"    

    def run_task(self, fw_spec):
        var_foo = "var_foo"
        PyTask(func = 'math.exp', args = [3.2] , stored_data_varname = var_foo)
        <more code here>


class BarTask(FireTaskBase):

    _fw_name = "footask"    

    def run_task(self, fw_spec):
          previous_foo = query_db("var_foo")
          <more code>

Does FireWorks have a query_db function? In other words, can I do some work in a task using PyTask and then retrieve the result in a later computation?




   



Anubhav Jain

Nov 16, 2015, 12:00:52 PM
to Felipe Zapata, fireworkflows
Hi,

I have a couple of comments here.

First, you should give BarTask a different _fw_name than footask (or don't set the _fw_name at all, in which case it defaults to package_name.class_name, or use the @explicit_serialize decorator instead).

Second, I am not sure that you need to call PyTask inside of the "run_task" method of footask. The run_task method can contain any Python code, not just FireTasks. Perhaps you are using PyTask so that you can use the stored_data_varname. In that case, I would still suggest just calling the Python routines you want directly, and then storing the desired variables at the end using a FWAction (see example #1 below)

Currently, the stored_data_varname is meant to be more "archival" (look it up later for reference) than a variable that is meant to affect the execution of the workflow. In order to do the latter, I would suggest one of the following:

1) (recommended) don't use PyTask, and just use a FWAction at the end that both stores and passes on the variable:

def run_task(self, fw_spec):
    var_foo = math.exp(3.2)
    return FWAction(update_spec={"foo": var_foo}, stored_data={"foo": var_foo})

Then, the next FireTask will be able to access the variable using:

fw_spec['foo']

This is due to the update_spec part. Actually, you might not need the stored_data at all in this case. The value of foo will be stored in the spec of the next FireWork due to update_spec. A few more details about this are in the following tutorials:


2) (not recommended). Remember, there is no simple way to access stored_data_varname. You will need to query the launches database to get this information. You can get access to a launchpad object by setting the following in your fw_spec:
{"_add_launchpad_and_fw_id": True}

Then, your firetask will have access to two internal variables, "launchpad" and "fw_id", i.e. inside of run_task() you will be able to access self.launchpad and self.fw_id. Then you can do something like:

stored_data = self.launchpad.launches.find_one({"fw_id": self.fw_id}, {"action.stored_data": 1})['action']['stored_data']

*after* storing your data. Again, I don't recommend this technique.

Finally, if there is a certain way you'd *want* things to work, feel free to suggest it. We can always look into adding it if it makes sense in a general way.

Best,
Anubhav





tifon...@gmail.com

Nov 17, 2015, 12:41:05 PM
to fireworkflows, tifon...@gmail.com, AJ...@lbl.gov
Thank you for the nice explanation!

Now, assuming that I pass the arguments as described in solution 1, can I get the fw_spec from the last firework?
Suppose I have a script like,

    launchpad = LaunchPad()
    launchpad.reset('', require_password=False)
    
    fireworks_WF     = create_tasks(workflow)
    launchpad.add_wf(fireworks_WF) 

    rapidfire(launchpad, nlaunches=0)

    fw_spec = query_lpad(launchpad)

Is there a way to query the lpad once the workflow is done, asking for some variables stored in the fw_spec?

Best, 

Felipe 

Anubhav Jain

Nov 18, 2015, 1:40:44 PM
to Felipe Zapata, fireworkflows
Hi Felipe,

Once the workflow is completed, you can query and display things inside the FW spec.

See this link for more info:

The "-d all" shows you the entire spec, and there is also an example that shows you how to query by the spec values.

There is also a web site that you can use to click and browse results:

Finally, if you are familiar with MongoDB, you can directly query the database. The links to the collections are LaunchPad.fireworks, LaunchPad.workflows, and LaunchPad.launches. e.g.

my_launchpad = LaunchPad.from_file(MY_FILE)
my_launchpad.fireworks.find({"fw_id":1}, {"spec":1})

See docs for MongoDB and pymongo for more on how to use the find() command and related commands.
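
For the script in the question, a small helper along these lines might stand in for the hypothetical query_lpad. This is only a sketch: get_spec_values is a made-up name, and it relies on nothing more than LaunchPad.fireworks being a pymongo collection, as in the find() call above.

```python
def get_spec_values(launchpad, fw_id, keys):
    """Return the requested spec entries of one Firework, or None if not found."""
    # Project only the spec fields we care about to keep the result small
    projection = {"spec." + key: 1 for key in keys}
    doc = launchpad.fireworks.find_one({"fw_id": fw_id}, projection)
    if doc is None:
        return None
    spec = doc.get("spec", {})
    # Keys that are absent from the spec come back as None
    return {key: spec.get(key) for key in keys}
```

After rapidfire() returns, something like get_spec_values(launchpad, 2, ["foo"]) would then read "foo" from the spec of the Firework with fw_id 2, assuming that is the Firework that received it via update_spec.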

Best
Anubhav
