question: how to look for files in both a specific directory and in a single directory at the same time?

gggeek

Apr 13, 2011, 4:18:44 AM
to Pake discussions
The use case I have in mind is:
have a list of 'paths', where each path can be either a file name or a
folder name, and can be specified either as a complete path (e.g.
that/dir/file.txt) or as a relative path with wildcards (e.g. */file.txt).

This can be done in ant, and it is pretty useful when you want to have
a list of e.g. files/dirs to remove from a build: some of them you want
to delete in a specific place, others wherever they appear.

gggeek

Apr 13, 2011, 4:19:40 AM
to Pake discussions
of course the topic should have been "how to look for files in both a
specific directory and in any directory at the same time?"

Alexey Zakhlestin

Apr 13, 2011, 4:36:04 AM
to pa...@googlegroups.com

Well, in most cases this should be possible with pakeFinder.

A pakeFinder object is a set of rules which can be applied (using the $obj->in() method) to various directories, so you can create several pakeFinder objects and apply them to any directories you like afterwards.

Sometimes that won't be enough, because you need to merge several sets of rules.
We can try to plan what such a tool would look like.

class pakeMultiFinder implements ArrayAccess
{
    public function in($path);

    /* ArrayAccess methods here */
}

Gaetano Giunta

Apr 13, 2011, 6:14:35 AM
to pa...@googlegroups.com
Alexey Zakhlestin wrote:
> Well, in most cases this should be possible with pakeFinder.
>
> A pakeFinder object is a set of rules which can be applied (using the $obj->in() method) to various directories, so you can create several pakeFinder objects and apply them to any directories you like afterwards.
Pardon my ignorance: what currently happens if I call the ->in() method twice on the same pakeFinder object?

> Sometimes that won't be enough, because you need to merge several sets of rules.
> We can try to plan what such a tool would look like.
>
> class pakeMultiFinder implements ArrayAccess
> {
> public function in($path);
>
> /* ArrayAccess methods here */
> }
I'm trying to allow usage of ANT patterns. Will be back with the findings...

Alexey Zakhlestin

Apr 13, 2011, 6:40:15 AM
to pa...@googlegroups.com

On 13.04.2011, at 14:14, Gaetano Giunta wrote:

> Pardon my ignorance: what currently happens if I call the ->in() method twice on the same pakeFinder object?

->in() returns an array and does NOTHING to the original pakeFinder object,
so you will just get several arrays :)
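
A minimal sketch of what that means in practice: since each in() call hands back a plain array, combining the results of several calls is ordinary array handling. The helper name merge_results() is hypothetical (it is not part of pake), and the two arrays below only stand in for real in() results.

```php
<?php
// Hypothetical helper, not part of pake: merge the arrays returned by
// several in() calls into one list without duplicates.
function merge_results(array $lists)
{
    $merged = array();
    foreach ($lists as $list) {
        $merged = array_merge($merged, $list);
    }
    // array_unique() keeps the first occurrence; array_values() reindexes
    return array_values(array_unique($merged));
}

// Stand-ins for what two in() calls on the same finder might return
$fromFirstDir  = array('build/a/file.txt', 'build/a/notes.txt');
$fromSecondDir = array('build/b/file.txt', 'build/a/file.txt');

$all = merge_results(array($fromFirstDir, $fromSecondDir));
// $all holds three paths: the duplicate 'build/a/file.txt' appears once
```

With pake itself the inputs would come from calls such as $rule->in('dir1') and $rule->in('dir2') on one finder object.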

gggeek

Apr 13, 2011, 9:09:38 AM
to Pake discussions
Here's the code - ugly looking but seems to be working with my limited
testing:

/**
 * Mimics ant pattern matching.
 * New addition (afaict): any pattern ending in '/' will only match directories
 * @see http://ant.apache.org/manual/dirtasks.html#patterns
 * @todo more complete testing
 */
function pake_antpattern( $files, $rootdir )
{
    $results = array();
    foreach( $files as $file )
    {
        //echo " Beginning with $file in dir $rootdir\n";
        $type = 'any';
        // if user set '/' as last char: we look for directories only
        if ( substr( $file, -1 ) == '/' )
        {
            $type = 'dir';
            $file = substr( $file, 0, -1 );
        }
        // managing 'any subdir or file' as last item: trick!
        if ( strlen( $file ) >= 3 && substr( $file, -3 ) == '/**' )
        {
            $file .= '/*';
        }

        $dir = dirname( $file );
        $file = basename( $file );
        if ( strpos( $dir, '**' ) !== false )
        {
            $split = explode( '/', $dir );
            $path = '';
            foreach( $split as $i => $part )
            {
                if ( $part != '**' )
                {
                    $path .= "/$part";
                }
                else
                {
                    //echo " Looking for subdirs in dir $rootdir{$path}\n";
                    $newfile = implode( '/', array_slice( $split, $i + 1 ) ) . "/$file" . ( $type == 'dir' ? '/' : '' );
                    $dirs = pakeFinder::type( 'dir' )->in( $rootdir . $path );

                    foreach( $dirs as $newdir )
                    {
                        //echo " Iterating in $newdir, looking for $newfile\n";
                        $found = pake_antpattern( array( $newfile ), $newdir );
                        $results = array_merge( $results, $found );
                    }
                    break;
                }
            }
        }
        else
        {
            //echo " Looking for $type $file in dir $rootdir/$dir\n";
            $found = pakeFinder::type( $type )->name( $file )->maxdepth( 0 )->in( $rootdir . '/' . $dir );
            //echo " Found: " . count( $found ) . "\n";
            $results = array_merge( $results, $found );
        }
    }
    return $results;
}

Note that I did not run tests comparing results with actual ant - just
based on visual inspection.

Alexey Zakhlestin

Apr 14, 2011, 3:38:30 AM
to pa...@googlegroups.com
On 13.04.2011, at 17:09, gggeek wrote:
> Here's the code - ugly looking but seems to be working with my limited
> testing:


I put the code here: https://gist.github.com/919047
It's your code, but formatted according to the PEAR coding standards, with a bit of tweaking to my personal taste.

I haven't done any tests yet.

After reading this code, I am even more sure that the pakeMultiFinder idea is the way to go. It would allow us to use other similar pattern generators without repeating code over and over again.

I'll try to implement something like that in a branch and will post the result here.

Alexey Zakhlestin

Apr 16, 2011, 2:28:58 AM
to pa...@googlegroups.com

I implemented ant-style matching in pakeFinder.
In the end, it was simpler than I thought, so I didn't actually need to go the "multifinder" way.

The implementation is not based on your code, but is made to match ant's documentation.
I also added some tests.

How to use:

$rule = pakeFinder::type('any')->pattern('*/file.txt')->not_pattern('tmp*/*');
$files = $rule->in($target);

This will match:
1) any file or directory
2) which is named "file.txt"
3) and is located in a first-level subdirectory of $target,
4) where the name of that first-level subdirectory doesn't start with "tmp"

pattern() and not_pattern() accept multiple arguments, or you can just call them several times,
and, of course, you can mix them with pakeFinder's other checks.
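
To make the matching rules above concrete, here is a rough standalone sketch of ant-style include/exclude filtering over a list of relative paths. It is not pake's actual implementation: the function names ant_pattern_to_regex(), matches_any() and ant_filter() are made up for illustration, and the sketch assumes paths that use "/" as the separator.

```php
<?php
// Sketch only: translate an ant-style pattern into a PCRE regex.
// Assumes relative paths that use "/" as separator.
function ant_pattern_to_regex($pattern)
{
    // Per the ant manual, a trailing "/" is shorthand for "/**"
    if (substr($pattern, -1) === '/') {
        $pattern .= '**';
    }

    $segments = explode('/', $pattern);
    $last = count($segments) - 1;
    $regex = '';
    foreach ($segments as $i => $segment) {
        if ($segment === '**') {
            // "**" stands for zero or more whole directories
            $regex .= ($i === $last) ? '.*' : '(?:[^/]+/)*';
            continue;
        }
        // Escape regex metacharacters, then restore the two wildcards:
        // "*" = any run of characters inside one segment, "?" = one character
        $part = preg_quote($segment, '#');
        $part = str_replace(array('\*', '\?'), array('[^/]*', '[^/]'), $part);
        $regex .= $part . ($i === $last ? '' : '/');
    }
    return '#^' . $regex . '$#';
}

// True if $path matches at least one of the given patterns
function matches_any($path, array $patterns)
{
    foreach ($patterns as $pattern) {
        if (preg_match(ant_pattern_to_regex($pattern), $path)) {
            return true;
        }
    }
    return false;
}

// Keep paths matching at least one include pattern and no exclude pattern
function ant_filter(array $paths, array $include, array $exclude)
{
    $kept = array();
    foreach ($paths as $path) {
        if (matches_any($path, $include) && !matches_any($path, $exclude)) {
            $kept[] = $path;
        }
    }
    return $kept;
}

$paths = array('src/file.txt', 'tmp1/file.txt', 'src/sub/file.txt', 'file.txt');
$kept  = ant_filter($paths, array('*/file.txt'), array('tmp*/*'));
// $kept is array('src/file.txt')
```

Only 'src/file.txt' survives: 'tmp1/file.txt' is excluded by 'tmp*/*', while the other two paths are not exactly one directory level deep.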


does this look ok? :)


By the way, here is the order in which pakeFinder applies its rules:

1) check if last piece of path matches "type"
2) check if last piece of path is discarded
3) check if relative path matches ->not_pattern()/->pattern()
4) check if last piece of path matches ->not_name()/->name()
5) check if file size matches ->size() [only for files]
6) check if path is ok with custom checkers set with ->exec()
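
The practical effect of a fixed order is that a path is rejected by the first check it fails, and later checks never run for it. A small sketch of that short-circuit behaviour, assuming nothing about pake's internals: first_failed_check() and the two demo closures are hypothetical and only mirror steps 3 and 4 of the list above.

```php
<?php
// Sketch only: run named checks in a fixed order and stop at the first failure.
function first_failed_check($relativePath, array $checks)
{
    foreach ($checks as $label => $check) {
        if (!call_user_func($check, $relativePath)) {
            return $label; // later checks are never evaluated
        }
    }
    return null; // every check passed
}

// Two demo checks, in the same relative order as steps 3 and 4 above
$checks = array(
    'pattern' => function ($path) {
        // stands in for ->pattern('*/*'): exactly one directory level
        return preg_match('#^[^/]+/[^/]+$#', $path) === 1;
    },
    'name' => function ($path) {
        // stands in for ->name('file.txt')
        return basename($path) === 'file.txt';
    },
);

$ok       = first_failed_check('src/file.txt', $checks);   // null
$tooDeep  = first_failed_check('a/b/file.txt', $checks);   // 'pattern'
$badName  = first_failed_check('src/other.txt', $checks);  // 'name'
```

Note that 'a/b/file.txt' is reported as failing on 'pattern' even though its name would have passed, because the pattern check comes first.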

Gaetano Giunta

Apr 16, 2011, 7:44:02 AM
to pa...@googlegroups.com
Alexey Zakhlestin wrote:
> I implemented ant-style matching in pakeFinder.
> In the end, it was simpler than I thought, so I didn't actually need to go the "multifinder" way.
I was also thinking of this solution as the preferable one.

> does this look ok? :)
>
Will try out!