
List files while adding <a href=...>$file</a> for each


Tuxedo

Aug 11, 2022, 9:30:35 PM
Hello,

To list all MP4 files by extension, e.g. mp4 or MP4, I can run 'locate'
and 'ls -lt' for a detailed and date-ordered list.

locate -i .mp4|xargs ls -tl

-rw-r--r-- 1 tuxedo users 96959180 Dec 8 2018
/home/tuxedo/Desktop/videos2018/MAH0055.MP4
-rw-r--r-- 1 tuxedo users 45972453 Dec 8 2018
/home/tuxedo/Desktop/videos2018/MAH0045.MP4
-rw-r--r-- 1 tuxedo users 10313044 Dec 7 2018
/home/tuxedo/Desktop/videos2018/MAH0020.mp4
-rw-r--r-- 1 tuxedo users 981765973 Dec 7 2018
/home/tuxedo/Desktop/videos2018/MAH0011.MP4
etc.

For local file browsing, I would like the output as follows:

-rw-r--r-- 1 tuxedo users 96959180 Dec 8 2018 <a
href=/home/tuxedo/Desktop/videos2018/MAH0055.MP4>MAH0055</a>
-rw-r--r-- 1 tuxedo users 45972453 Dec 8 2018 <a
href=/home/tuxedo/Desktop/videos2018/MAH0045.MP4>MAH0045</a>
-rw-r--r-- 1 tuxedo users 10313044 Dec 7 2018 <a
href=/home/tuxedo/Desktop/videos2018/MAH0020.mp4>MAH0020</a>
-rw-r--r-- 1 tuxedo users 981765973 Dec 7 2018 <a
href=/home/tuxedo/Desktop/videos2018/MAH0011.MP4>MAH0011</a>
etc.

In other words:

add '<a href=' before the file path
append '>' after it
append the filename for each file found
append '</a>' after the filename

How can this be done using any combination of commands and constructs?

Many thanks,
Tuxedo

Janis Papanagnou

Aug 11, 2022, 10:53:12 PM
On 12.08.2022 03:21, Tuxedo wrote:
> Hello,
>
> To list all MP4 files by extension, e.g. mp4 or MP4, I can run 'locate'
> and 'ls -lt' for a detailed and date ordered list.

Are you sure that this is true?

>
> locate -i .mp4|xargs ls -tl

Locate will not do the sorting, and the ls started by xargs will sort
only those entries of the locate output that fit in one exec buffer. So
in the general case - i.e. if not all files fit in one invocation of
xargs's ls - you will get chunks of sorted subsets only.
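
The chunking effect can be reproduced on a small scale. This is only a
sketch: '-n 2' forces xargs to start a new command after two arguments,
standing in for a full exec buffer, and both function names are made up
for the illustration.

```shell
# Each xargs invocation sorts only its own chunk, as 'xargs ls -t' would.
chunked_sort_demo() {
    printf '%s\n' 3 1 4 2 |
    xargs -n 2 sh -c 'printf "%s\n" "$@" | sort -n' sh
}

# Sorting once, after xargs has finished, covers all entries.
global_sort_demo() {
    printf '%s\n' 3 1 4 2 |
    xargs -n 2 sh -c 'printf "%s\n" "$@"' sh |
    sort -n
}
```

chunked_sort_demo prints 1 3 2 4 (each pair sorted by itself), while
global_sort_demo prints 1 2 3 4.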

Janis

> [...]


Janis Papanagnou

Aug 11, 2022, 11:22:14 PM
On 12.08.2022 03:21, Tuxedo wrote:
> Hello,
>
> To list all MP4 files by extension, e.g. mp4 or MP4, I can run 'locate'
> and 'ls -lt' for a detailed and date ordered list.
>
> locate -i .mp4|xargs ls -tl

In my previous post I wrote that xargs ls -lt would not do what you
expected.

>
> -rw-r--r-- 1 tuxedo users 96959180 Dec 8 2018
> /home/tuxedo/Desktop/videos2018/MAH0055.MP4
> -rw-r--r-- 1 tuxedo users 45972453 Dec 8 2018
> /home/tuxedo/Desktop/videos2018/MAH0045.MP4
> -rw-r--r-- 1 tuxedo users 10313044 Dec 7 2018
> /home/tuxedo/Desktop/videos2018/MAH0020.mp4
> -rw-r--r-- 1 tuxedo users 981765973 Dec 7 2018
> /home/tuxedo/Desktop/videos2018/MAH0011.MP4
> etc.
>
> For local file browsing, I would like the output as follows:
>
> -rw-r--r-- 1 tuxedo users 96959180 Dec 8 2018 <a
> href=/home/tuxedo/Desktop/videos2018/MAH0055.MP4>MAH0055</a>
> -rw-r--r-- 1 tuxedo users 45972453 Dec 8 2018 <a
> href=/home/tuxedo/Desktop/videos2018/MAH0045.MP4>MAH0045</a>
> -rw-r--r-- 1 tuxedo users 10313044 Dec 7 2018 <a
> href=/home/tuxedo/Desktop/videos2018/MAH0020.mp4>MAH0020</a>
> -rw-r--r-- 1 tuxedo users 981765973 Dec 7 2018 <a
> href=/home/tuxedo/Desktop/videos2018/MAH0011.MP4>MAH0011</a>
> etc.
>
> In other words:
>
> add '<a href=' before the file path
> append '>' after it
> append the filename for each file found

The filename is (e.g.) "MAH0011.MP4" but you want "MAH0011", it seems.
Stripping any extension might not lead to sensible filenames, depending
on the naming rule. - Consider how files should be displayed, e.g.,
"f.tar.gz", or ".sh_profile", etc.

> append '</a>' after the filename
>
> How can this be done using any combination of commands and constructs?

One way could be along these lines...

locate -i .mp4 |
xargs stat -c'%Y <a href=%n>%n</a>' |
sort -n |
cut -d' ' -f2- |
sed 's/>[^<]\+\/\(.*\)[.][^.]\+</>\1</'

Here the stat command will add modification time in seconds since Epoch,
sort will do the numerical sorting of the entries, cut strips off that
seconds field again, and sed will remove the unwanted parts from the
file-path that will be visible in the HTML output.

Specifically the sed expression may be adjusted, depending on what the
"filename" actually shall be (see note on top).
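
To see what the stages after stat do without touching any files, the
pipeline can be fed two hand-written lines shaped like the stat output.
A sketch only: the epoch values are invented, the function names are
made up, and the '\+' in the sed expression assumes GNU sed.

```shell
# Two lines shaped like the output of: stat -c'%Y <a href=%n>%n</a>'
# The epoch values are invented for this illustration.
sample_stat_lines() {
    printf '%s\n' \
        '1544227200 <a href=/home/tuxedo/Desktop/videos2018/MAH0055.MP4>/home/tuxedo/Desktop/videos2018/MAH0055.MP4</a>' \
        '1544140800 <a href=/home/tuxedo/Desktop/videos2018/MAH0020.mp4>/home/tuxedo/Desktop/videos2018/MAH0020.mp4</a>'
}

# Sort by epoch (oldest first), drop the epoch field, shorten the link text.
sort_and_strip() {
    sort -n |
    cut -d' ' -f2- |
    sed 's/>[^<]\+\/\(.*\)[.][^.]\+</>\1</'
}

sample_stat_lines | sort_and_strip
```

Note that 'sort -n' lists the oldest file first; 'sort -rn' would give
the newest-first order of 'ls -t'.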

Janis


>
> Many thanks,
> Tuxedo
>

Tuxedo

Aug 12, 2022, 2:33:23 AM
Thanks for these vital bits of information!

With a bit over 2,000 files in the output, maybe it's below the limit of
xargs' ls buffer sorting capability or maybe I'm missing many files in the
output that do not fit in.

>
> Janis
>
>> [...]

Tuxedo

Aug 12, 2022, 3:41:58 AM
Janis Papanagnou wrote:

> On 12.08.2022 03:21, Tuxedo wrote:
[...]

>
> The filename is (e.g.) "MAH0011.MP4" but you want "MAH0011", it seems.
> Stripping any extension might not lead to sensible filenames, depending
> on the naming rule. - Consider how files should be displayed, e.g.,
> "f.tar.gz", or ".sh_profile", etc.

You're right. It's best not to hide the file extensions.

>
>> append '</a>' after the filename
>>
>> How can this be done using any combination of commands and constructs?
>
> One way can be along this principle approach...
>
> locate -i .mp4 |
> xargs stat -c'%Y <a href=%n>%n</a>' |
> sort -n |
> cut -d' ' -f2- |
> sed 's/>[^<]\+\/\(.*\)[.][^.]\+</>\1</'
>
> Here the stat command will add modification time in seconds since Epoch,
> sort will do the numerical sorting of the entries, cut strips off that
> seconds field again, and sed will remove the unwanted parts from the
> file-path that will be visible in the HTML output.
>
> Specifically the sed expression may be adjusted, depending on what the
> "filename" actually shall be (see note on top).

Thanks for the stat, sort and sed tricks!

It outputs a list with just the file name portion as the link text, as in
<a href=linkpath>filename</a>

But, as said, it's better to include the file extensions in the output.
Systems which typically hide file extensions are for users who get confused
by them.

Can anyone here also suggest how to include a part of the file path in the
link name to add a visual indication of the whereabouts of each file in the
HTML list output, something like the following:

<a href=/home/tuxedo/some/dir/MAH0045.MP4>~/some/dir/MAH0045.MP4</a>

This presumes all is somewhere in ~/ although it's not. Replacing the
/home/tuxedo string with ~ is just meant to shorten the output a little.

Before the link, how can a date stamp best be added?

As in:

Dec 14 2018 <a
href=/home/tuxedo/some/dir/MAH0045.MP4>~/some/dir/MAH0045.MP4</a>

Or maybe:

2018-12-14
<a href=/home/tuxedo/some/dir/MAH0045.MP4>~/some/dir/MAH0045.MP4</a>

And after the link, how can the file sizes be added to the final output?

As in "YYYY-MM-DD <a..>file.ext</a> (size)":

2018-12-14
<a href=/home/tuxedo/some/dir/MAH0045.MP4>~/some/dir/MAH0045.MP4</a> (15MB)

Or the size can appear before or between the date and file name. It doesn't
really matter as long as it's somewhere.

Thanks for any additional ideas!

Tuxedo

>
> Janis
>
>
>>
>> Many thanks,
>> Tuxedo
>>

Janis Papanagnou

Aug 12, 2022, 3:53:54 AM
Maybe, maybe not. - Is an overall sorted list necessary or is that
irrelevant? Is the locate database static or does it grow over time?
And how long is a file path on average? 60 characters? 80? You may
have 160,000 characters or more; depending on your OS that might be
too much. Would you want to run a program that works correctly only
by chance? Or take an approach that works independently of the exec
buffer length, like an explicit sort process on the selected data? -
Your choice. (I've provided such an approach elsethread. Although for
my own purposes I'd change it: simplify the sed expression to
something like 's/>[^<]\+\/\(.*\)</>\1</' and put the href file name
argument in double quotes, all easy to adjust.)
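
Whether a given list fits into a single invocation can also be checked
empirically instead of estimated. A rough sketch, with count_invocations
being a made-up helper: it lets xargs run a trivial command once per
batch and counts the runs. The result is only indicative, because the
length of the command itself also counts against the buffer.

```shell
# Print how many times xargs invokes a command for the paths on stdin.
# Any arguments (e.g. -n 2) are passed on to xargs.
count_invocations() {
    xargs "$@" sh -c 'echo run' sh | wc -l
}

# Intended use (needs a locate database, so left commented out here):
# locate -i .mp4 | count_invocations
```

A result greater than 1 means that 'xargs ls -tl' would produce
separately sorted chunks for the same input.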

Janis

Janis Papanagnou

Aug 12, 2022, 4:07:26 AM
Without stripping the extension: sed 's/>[^<]\+\/\(.*\)</>\1</'

>
> Can anyone here also suggest how to include a part of the file path in the
> link name to add a visual indication of the whereabouts of each file in the
> HTML list output, something like follows:

You could adjust the sed regexp by keeping a couple (two?) optional
'/' characters. After the \/ part of the pattern and within the \(
... \) subexpression add a couple to be expected as \/[^/]+\/[^/]+
(untested).

>
> <a href=/home/tuxedo/some/dir/MAH0045.MP4>~/some/dir/MAH0045.MP4</a>

The href argument should IMO be quoted

stat -c'%Y <a href="%n">%n</a>'

>
> This presumes all is somewhere in ~/ although it's not. Replacing the
> /home/tuxedo string with /~ is just meant to shorten the output a little.

You could also use "...", as in

sed 's/>[^<]\+\/\(.*\)</>... \1</'

>
> Before the link, how can a date stamp best be added?
>
> As in:
>
> Dec 14 2018 <a
> href=/home/tuxedo/some/dir/MAH0045.MP4>~/some/dir/MAH0045.MP4</a>
>
> Or maybe:
>
> 2018-12-14
> <a href=/home/tuxedo/some/dir/MAH0045.MP4>~/some/dir/MAH0045.MP4</a>
>
> And after the link, how can the file sizes be added to the final output?
>
> As in "YYYY-MM-DD <a..>file.ext</a> (size)":
>
> 2018-12-14
> <a href=/home/tuxedo/some/dir/MAH0045.MP4>~/some/dir/MAH0045.MP4</a> (15MB)
>
> Or the size can appear before or between the date and file name. It doesn't
> really matter as long as it's somewhere.

Just use the options that stat provides (%y, %s, etc.); see 'man stat'
for all options. For example

stat -c'%Y %y <a href="%n">%n</a> %s'

Adjust as you like, just keep the %Y first because it's used for sorting
and is removed afterward.
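
stat's %s prints the size in bytes. If a form like "(15MB)" is wanted,
one possibility is a small awk filter at the end of the pipeline. This
is a sketch under assumptions: size_in_mb is a made-up name, it expects
the byte count as the last space-separated field, and the sample line
with its size is invented.

```shell
# Rewrite a trailing byte count as "(NMB)", N in units of 1048576 bytes,
# rounded. Expects lines ending in " <bytes>" (a format ending in %s).
size_in_mb() {
    awk '{
        bytes = $NF
        sub(/ [0-9]+$/, "")
        printf "%s (%.0fMB)\n", $0, bytes / 1048576
    }'
}

echo '2018-12-14 <a href="/home/tuxedo/some/dir/MAH0045.MP4">MAH0045.MP4</a> 15728640' |
size_in_mb
```

This prints the sample line with "(15MB)" in place of the byte count.
Small files are rounded down to "(0MB)", so a finer unit may suit them
better.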

>
> Thanks for any additional ideas!

HTH. I suppose you can now play with that and test the best format.

Janis

Janis Papanagnou

Aug 12, 2022, 4:10:58 AM
On 12.08.2022 10:07, Janis Papanagnou wrote:
> On 12.08.2022 09:33, Tuxedo wrote:
>> Janis Papanagnou wrote:
>
> Without stripping the extension: sed 's/>[^<]\+\/\(.*\)</>\1</'


>> Can anyone here also suggest how to include a part of the file path in the
>> link name to add a visual indication of the whereabouts of each file in the
>> HTML list output, something like follows:
>
> You could adjust the sed regexp by keeping a couple (two?) optional
> '/' characters. After the \/ part of the pattern and within the \(
> ... \) subexpression add a couple to be expected as \/[^/]+\/[^/]+
> (untested).

I think the '+' regexp meta-symbol must be escaped '\+' as above;
you'll find out yourself if you try the most appropriate substring.
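
For reference, the two spellings side by side. This is a sketch assuming
GNU sed; the function names are made up, and the sample line reuses the
path from the question. With basic regular expressions '+' has to be
written '\+', while 'sed -E' takes a plain '+' and unescaped
parentheses.

```shell
# Basic regular expressions (GNU sed): '\+' and '\( \)'.
two_components_bre() {
    sed 's/>[^<]\+\(\/[^/]\+\/[^/]\+\)</>...\1</'
}

# Extended regular expressions: plain '+' and '( )'; '#' as the delimiter
# avoids having to escape the slashes.
two_components_ere() {
    sed -E 's#>[^<]+(/[^/]+/[^/]+)<#>...\1<#'
}

echo '<a href="/home/tuxedo/some/dir/MAH0045.MP4">/home/tuxedo/some/dir/MAH0045.MP4</a>' |
two_components_bre
```

Both functions shorten the link text to the last two path components,
here ".../dir/MAH0045.MP4", and leave the href untouched.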

Janis

Tuxedo

Aug 12, 2022, 4:50:46 AM
Thank you for all these magic tricks. It gives me plenty to experiment with
for a while :-)

Tuxedo

Janis Papanagnou

Aug 12, 2022, 5:03:43 AM
On 12.08.2022 10:41, Tuxedo wrote:
> Janis Papanagnou wrote:
>
>> On 12.08.2022 10:07, Janis Papanagnou wrote:
>>> On 12.08.2022 09:33, Tuxedo wrote:
>>>> Janis Papanagnou wrote:
>>>
>>> Without stripping the extension: sed 's/>[^<]\+\/\(.*\)</>\1</'
>>
>>
>>>> Can anyone here also suggest how to include a part of the file path in
>>>> the link name to add a visual indication of the whereabouts of each file
>>>> in the HTML list output, something like follows:
>>>
>>> You could adjust the sed regexp by keeping a couple (two?) optional
>>> '/' characters. After the \/ part of the pattern and within the \(
>>> ... \) subexpression add a couple to be expected as \/[^/]+\/[^/]+
>>> (untested).
>>
>> I think the '+' regexp meta-symbol must be escaped '\+' as above;
>> you'll find out yourself if you try the most appropriate substring.
>>
>> Janis
>
> Thank you for all these magic tricks. It gives me plenty to experiment with
> for a while :-)

You're welcome.

I've just noticed that stat's %y creates a lot of probably undesired
time information (if you want only the date).

$ stat -c%y /var/games/slashem/record
2022-08-11 13:47:33.893058723 +0200

So here are some more hints, with a test file to show how it works...

echo "/var/games/slashem/record" |
xargs stat -c'%Y %y <a href="%n">%n</a> %s' |
sort -n | cut -d' ' -f2- |
sed -e 's/ [0-9][0-9]:[^<]*</ </' \
-e 's/>[^<]\+\(\/[^/]\+\/[^/]\+\)</>...\1</'

...will produce...

2022-08-11 <a href="/var/games/slashem/record">.../slashem/record</a> 10030

The first sed substitution expression removes the spurious time data
and the second one keeps just two path components and adds '...'.

Probably a better iteration to play with.

Janis

>
> Tuxedo
>

Tuxedo

Aug 13, 2022, 11:36:28 AM
Janis Papanagnou wrote:

[...]

>
> I've just noticed that stat's %y creates a lot of probably undesired
> time information (if you want only the date).
>
> $ stat -c%y /var/games/slashem/record
> 2022-08-11 13:47:33.893058723 +0200
>
> So here's some more hints with a test file to show how it works...
>
> echo "/var/games/slashem/record" |
> xargs stat -c'%Y %y <a href="%n">%n</a> %s' |
> sort -n | cut -d' ' -f2- |
> sed -e 's/ [0-9][0-9]:[^<]*</ </' \
> -e 's/>[^<]\+\(\/[^/]\+\/[^/]\+\)</>...\1</'
>
> ...will produce...
>
> 2022-08-11 <a href="/var/games/slashem/record">.../slashem/record</a>
> 10030
>
> The first sed substitution expression removes the spurious time data
> and the second one keeps just two path components and adds '...'.
>
> Probably a better iteration to play with.

Yes, thank you, especially for the explanations. How simple, yet useful.
It works great and will likely be used well beyond my original mp4 file
listing purpose. It's found its place in my /usr/local/bin.

Tuxedo

Alex Bochannek

Aug 13, 2022, 5:20:33 PM
Janis Papanagnou <janis_pa...@hotmail.com> writes:

>
> echo "/var/games/slashem/record" |
> xargs stat -c'%Y %y <a href="%n">%n</a> %s' |
> sort -n | cut -d' ' -f2- |
> sed -e 's/ [0-9][0-9]:[^<]*</ </' \
> -e 's/>[^<]\+\(\/[^/]\+\/[^/]\+\)</>...\1</'
>
> ...will produce...
>
> 2022-08-11 <a href="/var/games/slashem/record">.../slashem/record</a> 10030
>
> The first sed substitution expression removes the spurious time data

The first one is unnecessary if you instead use:

cut -d' ' -f2,5-

> and the second one keeps just two path components and adds '...'.

--
Alex.

Janis Papanagnou

Aug 13, 2022, 5:26:27 PM
On 13.08.2022 23:20, Alex Bochannek wrote:
> Janis Papanagnou <janis_pa...@hotmail.com> writes:
>
>>
>> echo "/var/games/slashem/record" |
>> xargs stat -c'%Y %y <a href="%n">%n</a> %s' |
>> sort -n | cut -d' ' -f2- |
>> sed -e 's/ [0-9][0-9]:[^<]*</ </' \
>> -e 's/>[^<]\+\(\/[^/]\+\/[^/]\+\)</>...\1</'
>>
>> ...will produce...
>>
>> 2022-08-11 <a href="/var/games/slashem/record">.../slashem/record</a> 10030
>>
>> The first sed substitution expression removes the spurious time data
>
> The first one is unnecessary if you instead use:
>
> cut -d' ' -f2,5-

Good point! Simplifies things.
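
Putting the pieces of the thread together, the whole thing might end up
looking like the sketch below. Assumptions: GNU xargs, stat and sed;
link_list is a made-up name; and the xargs options -r (do nothing on
empty input) and -d '\n' (one path per line, so names with spaces
survive) are additions that were not part of the commands posted above.

```shell
# Read file paths from stdin (one per line) and print
#   DATE <a href="PATH">.../dir/file</a> SIZE
# sorted by modification time, oldest first.
link_list() {
    xargs -r -d '\n' stat -c '%Y %y <a href="%n">%n</a> %s' |
    sort -n |
    cut -d' ' -f2,5- |
    sed 's/>[^<]\+\(\/[^/]\+\/[^/]\+\)</>...\1</'
}

# Intended use (needs a locate database, so left commented out here):
# locate -i .mp4 | link_list
```

Because the sort runs after xargs, it does not matter how many times
stat gets invoked.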

Janis
