Feature request: task max-age


Charlie

May 19, 2011, 7:30:04 AM
to python-doit
Team,

I'm using doit productively to automate sysadmin tasks. It helps me
to break complex processes into simple tasks and then glue them
together. Of course, make can glue tasks also -- I love gnumake. But
doit supports task dependencies and gives me all the features of
Python.

Here I'd like to propose some features to make doit a more powerful
tool. Of all my wishes, only the first is essential.

As a sysadmin, I schedule periodic tasks, for example, cleaning up log
files every day.
It will be very helpful if a doit task can support the following key:

repeat: <timedelta>

After timedelta, the task will be run again when we run doit.

I'd also like to check some conditions before I execute a task, for
example, create a user account only if it does not exist; but in case
that the account exists, the task does not fail, it's bypassed. I
understand that I can check the conditions in the actions; but it's
clean to separate out the condition checking logic. So I'd like a
doit task to support the following key:

conditions: [ ... ]

Conditions are specified like actions.
A cmd-action is true if it returns 0;
a python action is true if it returns True.

Talking about cmd-action, it's specified as a string. It would be nice
if we can also specify tuple cmd actions. Instead of "ls a b c", I
can specify ("ls", ["a", "b", "c"]). For this simple example, the
string version looks better, but for more complex commands, the tuple
version can be cleaner, as I use doit to glue together pretty long
commands.

Best Regards!
Charlie

Tim Diels

May 19, 2011, 7:38:20 AM
to pytho...@googlegroups.com
On Thu, May 19, 2011 at 1:30 PM, Charlie <quanl...@gmail.com> wrote:
> As a sysadmin, I schedule periodic tasks, for example, cleaning up log
> files every day.
> It will be very helpful if a doit task can support the following key:
>
> repeat: <timedelta>
>
> After timedelta, the task will be run again when we run doit.

Hi, I'm not part of the team, but couldn't you use cron for this?

Charlie

May 19, 2011, 7:42:28 AM
to python-doit
Sorry, I gave my post the wrong title/subject. The title should be

Feature requests: task repeat, conditions, tuple cmd-actions

So, maybe someone can change the title, or even delete it so I can
post again.

Charlie

Quanlin Guo

May 19, 2011, 8:13:29 AM
to pytho...@googlegroups.com
Tim:

Thanks for your suggestion. Yes, I can use cron to do this by maintaining many cron entries for many tasks. I'm trying to use doit as a comprehensive configuration management tool, like CFEngine. My plan is to maintain a whole set of system configuration tasks, then schedule doit every 5 minutes or every hour. Inside dodo.py, the tasks can have different repeat intervals. It's easier to maintain one dodo.py compared to maintaining crontab in my complex environments, because I have different Linux distributions and versions, several Solaris versions, HP-UX, AIX, Windows. Python + doit can be packaged together and distributed to all my boxes.

Here are some examples of what I have done with doit: create and configure Linux virtual machines, generate ISOs for Linux kickstart CDs, and query my inventory database to generate sophisticated reports stage by stage.

I love all the traditional unix power tools, like cron, make, perl, ..., I also like new comprehensive configuration management tools like CFEngine and Puppet. But I find doit to be so versatile and so productive. So I desire it to be even better ;-) My "secret" plan is to use doit to replace CFEngine to manage my large and complex computing environment.

Best Regards! 

Charlie


Tim Diels

May 19, 2011, 8:45:57 AM
to pytho...@googlegroups.com
On Thu, May 19, 2011 at 2:13 PM, Quanlin Guo <quanl...@gmail.com> wrote:
> It's easier to maintain one dodo.py compared to maintaining crontab in my complex environments, because I have different Linux distributions and versions, several Solaris versions, HP-UX, AIX, Windows. Python + doit can be packaged together and distributed to all my boxes.

Sounds reasonable :)

To whoever will implement this, this link might be helpful:
http://docs.python.org/library/sched.html

Charlie

May 19, 2011, 8:59:11 AM
to python-doit
The "repeat" feature I asked for does not need doit to wait there and
repeat the task. It really means that after a time period, the task
is marked as out of date. It works exactly like the doit command
"forget" -- doit forgets the task after the specified timedelta, so
the task will be out-of-date and be picked up when doit is run
again.

So, instead of "repeat", maybe "forget-after" is better. Then we can
"forget-after" a specific datetime, or a timedelta.

Best Regards!

Charlie


Eduardo Schettino

May 19, 2011, 11:36:34 AM
to pytho...@googlegroups.com
On Thu, May 19, 2011 at 7:30 PM, Charlie <quanl...@gmail.com> wrote:


> As a sysadmin, I schedule periodic tasks, for example, cleaning up log
> files every day.
> It will be very helpful if a doit task can support the following key:
>
> repeat: <timedelta>
>
> After timedelta, the task will be run again when we run doit.

yeah, currently it is not easy to achieve this... so let's do it :)

what about calling it "timeout" instead of "repeat" or "forget-after"?

i guess it will be simple to add this:

1) in dependency.py:DependencyBase.save_success, include the current timestamp (maybe save this only if "timeout" is active)
2) in dependency.py:DependencyBase.get_status, add one more condition to check if the timeout expired

can you give it a try and implement this?
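
A minimal sketch of the two steps above, assuming a plain dict as the dependency store. `TimestampStore`, its method signatures, and the `now` parameter are hypothetical stand-ins for illustration, not doit's actual `DependencyBase` code:

```python
import time


class TimestampStore:
    """Hypothetical stand-in for a dependency store; not doit's real class."""

    def __init__(self):
        self._db = {}  # task name -> time of last successful run

    def save_success(self, task_name, now=None):
        # step 1: record when the task last succeeded
        self._db[task_name] = time.time() if now is None else now

    def get_status(self, task_name, timeout, now=None):
        # step 2: one more condition, the task expires after `timeout` seconds
        now = time.time() if now is None else now
        last = self._db.get(task_name)
        if last is None:
            return 'run'  # never succeeded before
        return 'up-to-date' if (now - last) < timeout else 'run'
```

The optional `now` argument exists only so the behaviour can be checked without waiting for real time to pass.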

 
> I'd also like to check some conditions before I execute a task, for
> example, create a user account only if it does not exist; but in case
> that the account exists, the task does not fail, it's bypassed.  I
> understand that I can check the conditions in the actions; but it's
> clean to separate out the condition checking logic.   So I'd like a
> doit task to support the following key:
>
> conditions: [ ... ]
>
> Conditions are specified like actions.
> A cmd-action is true if it returns 0;
> a python action is true if it returns True.


"uptodate" is already a "condition" check...

##############################################

def fake_check_user_exist():
    # stand-in for a real check; True means the user already exists
    return True

def task_create_user():

    # target of calc_dep: computes 'uptodate' for the 'create' sub-task
    def user_exist():
        return {'uptodate': [fake_check_user_exist()]}
    yield {'name': 'check_exist',
           'actions': [user_exist],
           }

    def create_user():
        print("creating user...")
    yield {'name': 'create',
           'actions': [create_user],
           'calc_dep': ['create_user:check_exist'],
           }


def task_need_user():
    return {'actions': ['echo hello user'],
            'setup': ['create_user'],
            }

##########################################
would this solve your problem?

 
> Talking about cmd-action, it's specified as a string.  It would be nice
> if we can also specify tuple cmd actions.  Instead of "ls a b c", I
> can specify ("ls", ["a", "b", "c"]).  For this simple example, the
> string version looks better, but for more complex commands, the tuple
> version can be cleaner, as I use doit to glue together pretty long
> commands.

if you pass a tuple as an action it *is* a python-action (callable, args, kwargs). I think it would be confusing if a list meant a cmd-action.
just use "join":

'actions': ["ls " + " ".join(my_cmd_with_many_stuff)]
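
For long commands, a small helper can keep the argument list readable while still producing the string that a cmd-action expects. This is only a sketch: `cmd` is a hypothetical helper, not part of doit, and it assumes Python 3.3+ for `shlex.quote`:

```python
import shlex


def cmd(*args):
    """Join arguments into one shell command string, quoting each one."""
    return " ".join(shlex.quote(str(a)) for a in args)


def task_list_files():
    # the action is still a plain string, so doit treats it as a cmd-action
    return {'actions': [cmd("ls", "-l", "a", "b c")]}
```

Quoting each argument means a value containing spaces stays a single argument when the shell parses the string.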

cheers,
  Eduardo

Eduardo Schettino

May 19, 2011, 11:39:49 AM
to pytho...@googlegroups.com
On Thu, May 19, 2011 at 8:13 PM, Quanlin Guo <quanl...@gmail.com> wrote:

> I love all the traditional unix power tools, like cron, make, perl, ..., I also like new comprehensive configuration management tools like CFEngine and Puppet. But I find doit to be so versatile and so productive. So I desire it to be even better ;-) My "secret" plan is to use doit to replace CFEngine to manage my large and complex computing environment.

wow. thanks. after this I guess I should add a section on doit site with some quotes :)


Mnemnosi

May 19, 2011, 12:43:59 PM
to python-doit
On May 19, 5:59 am, Charlie <quanlin...@gmail.com> wrote:
> So, instead of "repeat", maybe "forget-after" is better.  Then we can
> "forget-after" a specific datetime, or a timedelta.

I had a (vaguely) similar need, and personally implemented it
approximately as:

def should_run(previous):
    ...

def task_whatever():
    return {"actions": ["ls"], "function_dep": should_run}

It may not be the most elegant solution, but basically "function_dep"
is executed the first time and passed None as the value of previous. It
does whatever checking it needs to do, and returns a string.

This string is checked against the previous value, and if different,
the task is run. It's then saved, like task results and similar things, in
doit's db.

Next time things run, this process is repeated. It's basically a little
hack to override the built-in dependency logic.
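
The mechanism described above can be sketched without doit at all. In this illustration the dict `db` stands in for doit's db, and `needs_run` and `version_check` are hypothetical names; none of them is part of doit. To keep it short, the sketch saves the string immediately rather than after the task succeeds:

```python
def needs_run(db, task_name, function_dep):
    """Return True if the string from function_dep differs from the stored one."""
    previous = db.get(task_name)       # None on the very first run
    current = function_dep(previous)   # the check returns a string
    db[task_name] = current            # saved for the next invocation
    return current != previous


def version_check(previous):
    # hypothetical check: the task re-runs whenever this string changes
    return "v1"
```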

For a 'forget-after', I'd do (approximately):

import datetime

def every_other_day(previous):
    now = datetime.datetime.now()
    if previous is None:
        # first run: nothing stored yet, so return a fresh timestamp
        return now.strftime("%c")

    then = datetime.datetime.strptime(previous, "%c")

    # two days or more have passed: return a new string so the task re-runs
    if then + datetime.timedelta(days=2) <= now:
        return now.strftime("%c")
    else:
        return previous

So, it'd look like:

return {"actions": ["ls"], "function_dep": every_other_day}

The above is a bit off the top of my head, because I don't actually
currently have anything I'd only do after X amount of time. But I do
have a couple function_dep's that basically do slightly esoteric
checks in a couple other places. (In my case, it was more "check to
see if external-thing-over-there says I should re-do this, which it'll
say periodically").

I'm pretty sure it's not strictly needed, that a combination of
calc_dep's and uptodate's will let you do anything function_dep could
do: but the logic of setting those up gets unwieldy for my taste.

I could clean up the hack into a proper patch if desired.

--matt

Eduardo Schettino

May 19, 2011, 1:47:37 PM
to pytho...@googlegroups.com
On Fri, May 20, 2011 at 12:43 AM, Mnemnosi <mnem...@lavabit.com> wrote:
> On May 19, 5:59 am, Charlie <quanlin...@gmail.com> wrote:
> > So, instead of "repeat", maybe "forget-after" is better.  Then we can
> > "forget-after" a specific datetime, or a timedelta.
>
> I had a (vaguely) similar need, and personally implemented it
> approximately as:
>
> def should_run(previous):
>    ...
>
> def task_whatever():
>    return {"actions": ["ls"], "function_dep": should_run}
>
> It may not be the most elegant solution, but basically "function_dep"
> is executed the first time and passed None as the value of previous. It
> does whatever checking it needs to do, and returns a string.
>
> This string is checked against the previous value, and if different,
> the task is run. It's then saved, like task results and similar things, in
> doit's db.

if you are just checking if the returned string changed, that's exactly what result_dep does (except result_dep takes a task, not a callable)

> I'm pretty sure it's not strictly needed, that a combination of
> calc_dep's and uptodate's will let you do anything function_dep could
> do:

there is no way for a task to access its own previous results. so maybe it is not possible to achieve the same thing as you did with function_dep.

> but the logic of setting those up gets unwieldy for my taste.

ok. you guys made up my mind. it seems you all want some lightweight way of doing things that does not require using calc_deps.

> I could clean up the hack into a proper patch if desired.

please show me your patch. don't worry about "clean up"...


Charlie

May 19, 2011, 2:14:37 PM
to python-doit
Eduardo,

Thanks! I will try to implement this and send the diffs to you
later.

Charlie


Charlie

May 19, 2011, 5:22:08 PM
to python-doit
Eduardo:

I have implemented the feature on top of the 0.11.0 release.
Here is a description of the doit task timeout feature implemented:

A task can have a timeout attribute.
The attribute value can be an integer or a datetime.timedelta.
When the value is an integer, the unit is seconds.

A task may not have both run_once and timeout.

If a task runs successfully, it will time out after the specified
time interval.
This is one way for the task to become out of date.

If a task has a timeout attribute and no dependencies,
then, once it runs successfully, it will be skipped until it
times out.
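
The rules above for the attribute value can be sketched as a small validation step. This is an illustration under assumptions, not the code that was emailed: `check_timeout` is a hypothetical name, and the task is represented as a plain dict:

```python
import datetime


def check_timeout(task):
    """Validate a task dict and return its timeout in seconds, or None."""
    timeout = task.get('timeout')
    if timeout is None:
        return None
    if task.get('run_once'):
        raise ValueError("a task may not have both run_once and timeout")
    if isinstance(timeout, datetime.timedelta):
        return timeout.total_seconds()
    if isinstance(timeout, int):
        return timeout  # integers are already in seconds
    raise TypeError("timeout must be an int or datetime.timedelta")
```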

I'm not sure what's the best way to send the implementation to you.
I'm going to email the files to you.

Best Regards!

Charlie

Charlie

May 19, 2011, 5:43:24 PM
to python-doit
Matt:

I have implemented the doit task timeout feature and sent all files to
Eduardo and forwarded you a copy also.

Best Regards!

Charlie

Charlie

May 19, 2011, 5:48:29 PM
to python-doit
Eduardo,

> > "uptodate" is already a "condition" check...
I agree and can use that solution.

As for the string cmd-action, I'm fine with it.

As I said, only the timeout feature is essential. Now I have a trial
implementation, and would really appreciate it if you integrate the
feature into a future release.

By the way, the small amount of time it took to add the feature is a
testament to the clarity of your doit implementation.

Best Regards!

Charlie


Eduardo Schettino

May 20, 2011, 2:00:23 PM
to pytho...@googlegroups.com


On Fri, May 20, 2011 at 5:22 AM, Charlie <quanl...@gmail.com> wrote:
> Eduardo:
>
> I have implemented the feature on top of the 0.11.0 release.
> Here is a description of the doit task timeout feature implemented:
>
>    A task can have a timeout attribute.
>    The attribute value can be an integer or a datetime.timedelta.
>    When the value is an integer, the unit is seconds.
>
>    A task may not have both run_once and timeout.
>
>    If a task runs successfully, it will time out after the specified
>    time interval.
>    This is one way for the task to become out of date.
>
>    If a task has a timeout attribute and no dependencies,
>    then, once it runs successfully, it will be skipped until it
>    times out.
>
> I'm not sure what's the best way to send the implementation to you.
> I'm going to email the files to you.
>
> Best Regards!
>
> Charlie

Hi,

I committed this feature to trunk.
(uptodate callables)    http://bazaar.launchpad.net/~schettino72/doit/trunk/revision/394
(timeout)               http://bazaar.launchpad.net/~schettino72/doit/trunk/revision/395

I ended up with a solution based on Matt's ideas...

uptodate callables
=================

 * first I added support for callables in "uptodate" (this is what Charlie was calling "condition" and Matt "function_dep"). this function must return True/False/None...

 * this callable must take at least 2 parameters, "task" and "values". of course you can just ignore them, and you can also add more parameters to the callable.
     -  The "task" parameter gives you access to the task object, so you have access to its metadata and the opportunity to modify the task itself!
     -  "values" is a dictionary with the values saved in the last successful execution of the task.
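
As an illustration of the signature described above, here is a sketch of a user-defined check. `file_exists` and the marker file name are hypothetical, not something shipped with doit; the callable ignores both parameters and reports the task as up to date once the file is present:

```python
import os


def file_exists(path):
    """Build an uptodate callable that is True once `path` exists."""
    def uptodate_file_exists(task, values):
        # `task` and `values` are accepted but not needed for this check
        return os.path.exists(path)
    return uptodate_file_exists


def task_create_marker():
    marker = 'marker.txt'  # hypothetical file name
    return {'actions': ['touch %s' % marker],
            'uptodate': [file_exists(marker)],
            }
```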

run_once
===========

with this new feature I could remove run_once from doit core and implement it as a function to be passed to the 'uptodate' attribute. the implementation is only 5 lines!

def run_once(task, values):
    def save_executed():                      # save a value to indicate this task was executed
        return {'run-once': True} 
    task.insert_action(save_executed)   # add an action to the task that is used only to save a value
    return values.get('run-once', False)   # check if this task was ever run before

def task_xxx():
    ....
    'uptodate': [run_once],


timeout
============

I just adapted Charlie's code here. I put in a wrapper function that returns a callable

import datetime
import time

def timeout(timeout_limit):
    # convert timeout_limit to seconds and reject invalid input
    if isinstance(timeout_limit, datetime.timedelta):
        # .seconds alone ignores the days part of a timedelta
        limit_sec = timeout_limit.days * 24 * 3600 + timeout_limit.seconds
    elif isinstance(timeout_limit, int):
        limit_sec = timeout_limit
    else:
        msg = "timeout should be datetime.timedelta or int got %r "
        raise Exception(msg % timeout_limit)

    def uptodate_timeout(task, values):
        def save_now():
            # save the time of this successful execution
            return {'success-time': time.time()}
        # add an action to the task that only saves the time
        task.insert_action(save_now)

        # retrieve the time of the last successful execution
        last_success = values.get('success-time', None)
        if last_success is None:
            # first execution: never up to date
            return False
        # up to date while the timeout has not expired
        return (time.time() - last_success) < limit_sec
    return uptodate_timeout


def task_test3():
    return {
            'actions': ['echo test 3; date'],
            'uptodate': [timeout(10)],
            'verbosity': 2,
           }

##############################

so both run_once and timeout could be implemented by a user without modifying doit code. for convenience I added these two to doit.tools.
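
The same pattern extends to other checks. The sketch below is an illustration only: `value_changed` and `StubTask` are hypothetical names, and `StubTask` imitates just the `insert_action` method used in the code above so the check can be tried without doit:

```python
def value_changed(current):
    """Build an uptodate callable that is False whenever `current` differs
    from the value saved by the last successful run."""
    def uptodate_value(task, values):
        def save_value():
            return {'saved-value': current}
        task.insert_action(save_value)  # extra action that only saves the value
        return values.get('saved-value') == current
    return uptodate_value


class StubTask:
    """Minimal stand-in for a task object; only records inserted actions."""

    def __init__(self):
        self.actions = []

    def insert_action(self, action):
        self.actions.insert(0, action)
```

In a dodo.py this would be used like the others, for example `'uptodate': [value_changed(some_setting)]`, where `some_setting` is whatever value the task depends on.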

Matt, does this cover your function_dep usage?

cheers,
  Eduardo

