an archive-like task


Michael Gliwinski

Sep 27, 2011, 10:56:02 AM
to pytho...@googlegroups.com
Hi All,

Let me start by saying I'm really loving doit. At first the interface seemed
verbose, but I quickly changed my mind when I started using it and realised the
flexibility. Many thanks for the great software!

A (possibly) newbie question. I'm working on a task that compiles a program and
then needs to add that compiled program to a library, which is like an
archive file (another task actually creates this archive/library). The
problem I'm having is that I can't specify a target on this add_to_lib task (as
multiple files would have the same target), so it keeps executing every time.
The code looks like this:

from glob import iglob
from os import path

def task_compile_p():
    # task that creates the archive/library if it doesn't exist
    lib = 'foo.pl'
    yield {
        'name': 'lib',
        'actions': ['# create lib command'],
        'targets': [lib],
        'uptodate': [True],
        }

    for fn in iglob('src/*.p'):
        target = path.splitext(fn)[0] + '.r'

        # task that compiles the procedure
        yield {
            'name': fn,
            'actions': ['# compile command'],
            'file_dep': [fn],
            'targets': [target],
            }

        # task that adds compiled file to lib
        yield {
            'name': 'lib:' + fn,
            'actions': ['# add to lib command'],
            'file_dep': [lib, target],
            'uptodate': [True],
            # adding 'targets' on lib here can't work
            }

Any ideas how to prevent that last task from running if none of the deps
changed?

Many thanks,
Michael


--
Michael Gliwinski
Henderson Group Information Services
9-11 Hightown Avenue, Newtownabbey, BT36 4RT
Phone: 028 9034 3319


Eduardo Schettino

Sep 27, 2011, 12:09:40 PM
to pytho...@googlegroups.com
On Tue, Sep 27, 2011 at 10:56 PM, Michael Gliwinski <Michael....@henderson-group.com> wrote:
> Hi All,
>
> Let me start by saying I'm really lovin doit, at first the interface seemed
> verbose but quickly changed my mind when I started using it and realised the
> flexibility.  Many thanks for the great software!

great :)

>          # task that adds compiled file to lib
>          yield {
>              'name': 'lib:' + fn,
>              'actions': ['# add to lib command'],
>              'file_dep': [lib, target],
>              'uptodate': [True],
>              # adding 'targets' on lib here can't work
>              }
>
> Any ideas how to prevent that last task from running if none of the deps
> changed?

I didn't understand 100% the logic of what you're trying to do... but I guess the main problem is setting 'lib' as a file_dep on the last task.

When lib doesn't exist, is it required to run all add_to_lib tasks, or can the "create lib command" create and add the initial compiled files?

Assuming that the "create lib command" can create and add the initial compiled files:

1) You just need to make sure the lib file exists; this could be done by adding a dependency on the task that creates it.

2) 'uptodate': [True] is useless in this case. You can remove it.


         # task that adds compiled file to lib
         yield {
             'name': 'lib:' + fn,
             'actions': ['# add to lib command'],
             'file_dep': [target],
             'task_dep': ['compile_p:lib'],
             }

Does this solve your problem?

cheers
  Eduardo

Michael Gliwinski

Sep 28, 2011, 4:24:31 AM
to pytho...@googlegroups.com
On Tuesday 27 Sep 2011 17:09:40 Eduardo Schettino wrote:
> > # task that adds compiled file to lib
> > yield {
> >
> > 'name': 'lib:' + fn,
> > 'actions': ['# add to lib command'],
> > 'file_dep': [lib, target],
> > 'uptodate': [True],
> > # adding 'targets' on lib here can't work
> > }
> >
> > Any ideas how to prevent that last task from running if none of the deps
> > changed?
>
> I didnt understand 100% the logic of what you trying to do... but i guess
> the main problem is setting 'lib' as a file_dep on the last task.
>
> When lib doesnt exist is it required to run all add_to_lib tasks or the
> "create lib command" can create and add the initial compiled files?

Unfortunately the "create lib" command doesn't handle adding files; it just
creates the initial file and fails if it already exists.

...


> 1) you just need to make sure the lib file exist, this could be done by
> adding a dependency to the task that creates it.

Will this be enough to ensure "add to lib" tasks are run when the library file
is deleted?

> 2) 'uptodate': [True] # is useless in this case. you can remove it.

Indeed, that was just one of the things I tried to prevent the task from
executing each time, along with 'uptodate': [run_once]. BTW, is there any
difference?

> # task that adds compiled file to lib
> yield {
>     'name': 'lib:' + fn,
>     'actions': ['# add to lib command'],
>     'file_dep': [target],
>     'task_dep': ['compile_p:lib'],
>     }
>
> this solves your problem?

Almost, it prevents the task from running each time, but if the library file
is deleted only the 'compile_p:lib' task is run (i.e. the individual "add to
lib" tasks for each compiled file are not).

Why is 'lib' a problem as a file_dep? Is it because the 'lib' file changes
after adding each file?

Eduardo Schettino

Sep 28, 2011, 5:21:34 AM
to pytho...@googlegroups.com
On Wed, Sep 28, 2011 at 4:24 PM, Michael Gliwinski <Michael....@henderson-group.com> wrote:
> On Tuesday 27 Sep 2011 17:09:40 Eduardo Schettino wrote:
> >
> > When lib doesnt exist is it required to run all add_to_lib tasks or the
> > "create lib command" can create and add the initial compiled files?
>
> Unfortunately the "create lib" command doesn't handle adding files, it just
> creates the initial file and fails if it already exists.

OK. In this case add_to_lib should have a dependency on the created "lib", but this
dependency is not a file_dep that checks whether the file changed... I added a "result" timestamp
to the create-lib task, so add_to_lib can identify whether it was applied to the most recently created "lib"
by adding a result_dep.

Full example at the bottom (with some fake actions).

You could also solve this problem in a different way: add an 'uptodate' function to add_to_lib that
checks the creation timestamp of the 'lib' file directly from the file system.


> > 2) 'uptodate':  [True] # is useless in this case. you can remove it.
>
> Indeed, that was just one of the things I tried to prevent the task from
> executing each time, along with 'uptodate': [run_once].  BTW, is there any
> difference?
They are different. E.g. if you have a target that was not created by doit, run_once will run the task again (once); 'uptodate': [True] will not run it (not even once).
I guess run_once is the better choice for your 'lib' task, now that you require the task to be executed so its result can be saved and used by other tasks.
 


> Why is 'lib' a problem as a file_dep?  Is it because the 'lib' file changes
> after adding each file?

Yes. Sorry I didn't explain before... the signature of a file_dep is saved after each task executes successfully. They are NOT all saved together at the end of the doit execution. Hmm, not sure the docs mention this; they should!

---------------

from glob import iglob
import time

from doit.tools import run_once

def task_compile_p():
    def created_time():
        return str(time.time())

    # task that creates the archive/library if it doesn't exist
    lib = 'foo.pl'
    yield {
        'name': 'lib',
        'actions': ['touch foo.pl', created_time], # fake
        'targets': [lib],
        'uptodate': [run_once],
        }

    for fn in iglob('*.p'):
        target = fn.split('.')[0] + '.r'


        # task that compiles the procedure
        yield {
            'name': fn,
            'actions': ['cp %(dependencies)s %(targets)s'], # fake

            'file_dep': [fn],
            'targets': [target],
            }

        # task that adds compiled file to lib
        yield {
            'name': 'lib:' + fn,
            'actions': ['echo %(dependencies)s >> foo.pl'], # fake
            'file_dep': [target],
            'result_dep': ['compile_p:lib'],
            }

-----------

cheers,
  Eduardo

Michael Gliwinski

Sep 28, 2011, 11:22:51 AM
to pytho...@googlegroups.com
On Wednesday 28 Sep 2011 10:21:34 Eduardo Schettino wrote:
> Michael....@henderson-group.com> wrote:
> > On Tuesday 27 Sep 2011 17:09:40 Eduardo Schettino wrote:
> > > When lib doesnt exist is it required to run all add_to_lib tasks or the
> > > "create lib command" can create and add the initial compiled files?
> >
> > Unfortunately the "create lib" command doesn't handle adding files, it
> > just creates the initial file and fails if it already exists.
> >
> ok. in this case add_to_lib should have a dependency on the created "lib",
> but this
> dependency is not a file_dep that checks if the file is changed... I added
> "result" timestamp
> to the create lib task, so add_to_lib can identify if it was applied to
> the most recently created "lib"
> by a adding a result_dep.
>
> full example in the bottom (with some fake actions).

This is absolutely brilliant and works like a charm, many thanks.

> you could also solve this problem in a different way by adding a 'uptodate'
> function to add_to_lib that would
> check the created timestamp from 'lib' file directly from the file system.

Sorry, just so I understand what you meant here, what would it compare it to?

> > > 2) 'uptodate': [True] # is useless in this case. you can remove it.
> >
> > Indeed, that was just one of the things I tried to prevent the task from
> > executing each time, along with 'uptodate': [run_once]. BTW, is there
> > any difference?
>
> they are different. i.e. if you have a target that was not created by doit,
> run_once will run the task again (once). uptodate:True will not run (not
> even once).

Yes, just noticed this. Thanks for explanation.


Thanks for the help again, just learned something new :)

Eduardo Schettino

Sep 28, 2011, 11:52:33 AM
to pytho...@googlegroups.com
On Wed, Sep 28, 2011 at 11:22 PM, Michael Gliwinski <Michael....@henderson-group.com> wrote:
> On Wednesday 28 Sep 2011 10:21:34 Eduardo Schettino wrote:
> > you could also solve this problem in a different way by adding a 'uptodate'
> > function to add_to_lib that would
> > check the created timestamp from 'lib' file directly from the file system.
>
> Sorry, just so I understand what you meant here, what would it compare it to?

Compare it to itself :) Take a look at the uptodate -> timeout implementation: http://bazaar.launchpad.net/~schettino72/doit/trunk/view/head:/doit/tools.py#L69

I think this would be a nice addition to doit.tools. Could you give implementing that a try? What should we call it?

regards,
  eduardo


Michael Gliwinski

Sep 29, 2011, 4:23:10 AM
to pytho...@googlegroups.com
On Wednesday 28 Sep 2011 16:52:33 Eduardo Schettino wrote:
> > > also solve this problem in a different way by adding a 'uptodate'
> > > function to add_to_lib that would
> > > check the created timestamp from 'lib' file directly from the file
> > > system.
> >
> > Sorry, just so I understand what you meant here, what would it compare it
> > to?
>
> compare to itself :) take a look at the uptodate-> timeout
>
> implementation...
> http://bazaar.launchpad.net/~schettino72/doit/trunk/view/head:/doit/tools.py#L69
>
> I think this would be a nice addition to doit.tools. could you give a try
> implementing that? how should we call it?

Ah, yes, sorry, just re-read the docs and realised an uptodate callable gets
the values saved from the last run.

So if we called it 'ctime_changed', it would basically be a function with a
signature like `ctime_changed(filename)` which inserts an action that gets
the timestamp from the file, and then compares the last result to the current
timestamp, yes?

Would it make sense to make it more generic, e.g. `timestamp_changed(fn,
time='create')` where time could be one of (atime, access, ctime, create,
mtime, modify) (sort of like ls --time=)? I suppose if you're depending on
mtime you might as well use a file_dep, but it could be helpful in cases where
you depend on a directory, which can't be a file_dep.


Also, a side question: after learning more about uptodate, I get the feeling it
might be more intuitive if it was called 'outofdate'; then you could have e.g.
'outofdate': [on_config_changed] or 'outofdate': [on_timeout], etc. What do
you think? Not sure if it fits all cases. It's another question whether it
would be practical to change now, but... ;)

Eduardo Schettino

Sep 29, 2011, 6:26:28 AM
to pytho...@googlegroups.com
On Thu, Sep 29, 2011 at 4:23 PM, Michael Gliwinski
<Michael....@henderson-group.com> wrote:
>
> On Wednesday 28 Sep 2011 16:52:33 Eduardo Schettino wrote:
>
> Ah, yes, sorry, just re-read the docs and realised uptodate callable gets
> values saved from last run.

I know it is hard to get everything from the docs at once. If you have
suggestions on improving the docs, they are welcome...

> So if we called it 'ctime_changed' it would basically be a function with
> signature like: `ctime_changed(filename)' which inserts an action that gets
> the timestamp from file, and then compares last result to current timestamp,
> yes?

yes

>
> Would it make sense to make it more generic, e.g. `timestamp_changed(fn,
> time='create')' where time could be one of (atime, access, ctime, create,
> mtime, modify) (sort of like ls --time=)?  Suppose if you're depending on
> mtime you might as well use a file_dep, but it could be helpful in cases where
> you depend on directory which can't be a file_dep.

Great idea. file_dep checks whether the timestamp is exactly the same; if not,
it checks whether the file size differs, and then the md5. So mtime would also
be useful if someone doesn't like comparing md5s.

Another possible parameter could control whether the timestamp must be exactly
the same or may be bigger. If you are willing to implement it, you decide
which parameters go in :)

>
> Also, a side-question, after learning more about uptodate I get the feeling it
> might be more intuitive if it was called 'outofdate', then you could have e.g.
> 'outofdate': [on_config_changed] or 'outofdate': [on_timeout], etc. what do
> you think?  Not sure if it fits all cases.  It's another question if it would
> be practical to change now, but... ;)
>

outofdate == not uptodate
I agree 'outofdate': [on_timeout] is better than 'uptodate': [timeout],
but what makes the difference is the "on"... I think
'uptodate': [check_timeout] is as readable as 'outofdate': [on_timeout],
agree?

So IMO I don't think it is worth changing it at this point. Adding aliases
like check_timeout and check_on_config_changed would be much easier
:D

cheers,
  Eduardo

Michael Gliwinski

Sep 29, 2011, 10:56:13 AM
to pytho...@googlegroups.com
On Thursday 29 Sep 2011 11:26:28 Eduardo Schettino wrote:
> > signature like: `ctime_changed(filename)' which inserts an action that
> > gets the timestamp from file, and then compares last result to current
> > timestamp, yes?
>
> yes
>
> > Would it make sense to make it more generic, e.g. `timestamp_changed(fn,
> > time='create')' where time could be one of (atime, access, ctime, create,
> > mtime, modify) (sort of like ls --time=)? Suppose if you're depending on
> > mtime you might as well use a file_dep, but it could be helpful in cases
> > where you depend on directory which can't be a file_dep.
>
> great idea. file_dep checks timestamp is exactly the same, if not it
> checks file size different and than md5. so mtime would also be useful
> if someone doesnt like comparing md5.
>
> another possible parameter would be control the timestamp is exactly
> the same or bigger. if you are willing to implement it you decide
> which parameters goes in :)

OK, I'll have a stab at it later today :)

> > Also, a side-question, after learning more about uptodate I get the
> > feeling it might be more intuitive if it was called 'outofdate', then
> > you could have e.g. 'outofdate': [on_config_changed] or 'outofdate':
> > [on_timeout], etc. what do you think? Not sure if it fits all cases.
> > It's another question if it would be practical to change now, but... ;)
>
> outofdate == not uptodate
> i agree 'outofdate':[on_timeout] is better than 'uptodate':[timeout]
> but what makes the difference is the "on"... I think
> 'uptodate':[check_timeout] is as readable as 'outofdate':[on_timeout],
> agree?
>
> so IMO i dont think it is worth changing it at this point. adding an
> alias for check_timeout, check_on_config_changed would be much easier

Agreed. In that case, maybe the function above should be called
check_timestamp_changed?

Cheers,
M

Michael Gliwinski

Sep 30, 2011, 5:03:20 AM
to pytho...@googlegroups.com
On Thursday 29 Sep 2011 15:56:13 Michael Gliwinski wrote:
> On Thursday 29 Sep 2011 11:26:28 Eduardo Schettino wrote:
> > > signature like: `ctime_changed(filename)' which inserts an action that
> > > gets the timestamp from file, and then compares last result to current
> > > timestamp, yes?
> >
> > yes
> >
> > > Would it make sense to make it more generic, e.g.
> > > `timestamp_changed(fn, time='create')' where time could be one of
> > > (atime, access, ctime, create, mtime, modify) (sort of like ls
> > > --time=)? Suppose if you're depending on mtime you might as well use
> > > a file_dep, but it could be helpful in cases where you depend on
> > > directory which can't be a file_dep.
> >
> > great idea. file_dep checks timestamp is exactly the same, if not it
> > checks file size different and than md5. so mtime would also be useful
> > if someone doesnt like comparing md5.
> >
> > another possible parameter would be control the timestamp is exactly
> > the same or bigger. if you are willing to implement it you decide
> > which parameters goes in :)
>
> OK, I'll have a stab at it later today :)

Filed a bug for tracking and started coding; will let you know when it's
ready for review/merge. Bug: https://bugs.launchpad.net/doit/+bug/862606

BTW, could you set the importance to wishlist? I can't seem to do that as the
reporter.

Regards,
Michael
