
bash 3.2.57 versus bash 5.0.17


paris2venice

Aug 7, 2020, 6:25:36 PM
First, I want to thank those who helped me last time with resolving the errors I was getting with a call to my usage function. Your suggestions worked and I was able to work out all of the other bugs in my script.

However, just when I was ready to incorporate the data into the script to make it publicly available (mostly to friends who might find it useful), I realized that I would have to remove the /usr/local/bin/bash she-bang since most of them would not have installed bash 5.0. So I did, and everything in my "Gardiner numbers to Egyptian hieroglyphs" script worked except for one thing: showing an entire group. When Gardiner classified all Egyptian hieroglyphs, he grouped them into 28 different groups, each identified by a letter; e.g., Gardiner numbers for "woman and her occupations" are all prefixed by the letter B.

Anyway, bash 3 (it's the default in macOS ... Apple, of course) gives me only a single hieroglyph rather than the full set that bash 5 produces. When I compare the debug output of the two runs, there is absolutely no difference, so I don't know what to try next.

I can post the entire script if necessary but right now the data is in a separate file and it is 1071 lines long so not sure what to do in that sense.
Anyway, below is the "show_group" function for now but I'm not sure that it alone provides enough for anyone to provide assistance. Thanks a bunch for taking a look.

This is the output from bash 5:
Gardiner sign list: group B = woman & her occupations
B1 𓁐 B2 𓁑 B3 𓁒 B4 𓁓 B5 𓁔 B5a 𓁕

B6 𓁖 B7 𓁗 B8 𓁘 B9 𓁙

and this is the output from bash 3:
Gardiner sign list: group B = woman & her occupations
B1 𓁐

$group is captured two lines before the while loop begins but does not appear until the end of the while loop with "done <<< $group"

$file is, of course, the data file containing the hex numbers for the Unicode necessary to produce the hieroglyphs. For group B, it is only these ten lines:
B1: F0 93 81 90
B2: F0 93 81 91
B3: F0 93 81 92
B4: F0 93 81 93
B5: F0 93 81 94
B5a: F0 93 81 95
B6: F0 93 81 96
B7: F0 93 81 97
B8: F0 93 81 98
B9: F0 93 81 99

show_group()
{
    if [ -f "$aed" ] # just a second set of unnecessary data
    then
        gg=$( grep "Gardiner sign list: group ${1} =" "$aed" )
        printf '\e[1;34m%s\e[m\n' "$gg" # \e[1;34m = blue escape sequence
    else
        echo "$n : $aed not found" # not a fatal error
    fi

    i=0
    array_max=6 # in other words, no more than 6 hieroglyphs per line
    group=$( grep "^${1}[0-9]" $file )
    line_count=$( grep "^${1}[0-9]" $file |wc -l )

    while [ $line_count -ge 0 ]
    do
        IFS='
' read -r line
        last_gn="$gn"
        gn=$( awk -F: '{print $1}' <<< "$line" )
        case ${#gn} in
            2 ) pad="...." ;;
            3 ) pad="..." ;;
            4 ) pad=".." ;;
            5 ) pad="." ;;
        esac

        if [ "$gn" = "" ] && [ $i -gt 0 ]
        then
            printf "${glyph_array[*]}" |sed 's/\./ /g'
            printf "\n"
            break
        else
            glyph="${gn}${pad}$( get_glyph $gn )...." || exit
            glyph_array=( ${glyph_array[@]} $glyph )

            i=$(( $i + 1 )) # increment $i
            if [ $i -eq $array_max ]
            then
                #
                # send array to stdout with dots replaced by spaces
                #
                printf "${glyph_array[*]}" |sed 's/\./ /g'
                printf "\n"
                i=0 # reset i=0
                glyph_array=( ) # re-initialize glyph array
                line_count=$(( $line_count - $array_max ))
            fi
        fi
        [ "$gn" = "$last_gn" ] && i=$(( $array_max - 1 ))
    done <<< $group
}
Thanks for any help.

Ed Morton

Aug 7, 2020, 6:36:25 PM
A shell is a tool to manipulate files and processes and sequence calls to
other tools. The standard UNIX tool to manipulate text is AWK. You're
trying to use a shell to manipulate text - don't do that, just write
whatever it is you're currently trying to do in shell in awk instead
for simplicity, clarity, robustness, efficiency, portability, etc.

Ed.

Keith Thompson

Aug 7, 2020, 6:49:26 PM
paris2venice <paris2...@gmail.com> writes:
> First, I want to thank those who helped me last time with resolving
> the errors I was getting with a call to my usage function. Your
> suggestions worked and I was able to work out all of the other bugs in
> my script.
>
> However, just when I was ready to incorporate the data into the script
> to make it publicly available (mostly to friends who might find it
> useful), I realized that I would have to remove the
> /usr/local/bin/bash she-bang since most of them would not have
> installed bash 5.0.
[...]

I hope you're not assuming that /usr/local/bin/bash is necessarily
bash 5.0.

On my system, for example (Ubuntu 20.04), /bin/bash is version
5.0.17, and /usr/local/bin/bash doesn't exist.

If your script depends on a version of bash that a user might not have,
I'd suggest checking $BASH_VERSINFO and aborting with a clear error
message if it's not the right version. Let the user figure out how to
update the shebang for their own system.
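
A minimal sketch of such a check, assuming the script only needs to compare the major and minor numbers. The helper names version_at_least and require_bash are made up for illustration; BASH_VERSINFO and BASH_VERSION are the standard bash variables.

```bash
#!/bin/bash
# version_at_least HAVE_MAJOR HAVE_MINOR WANT_MAJOR WANT_MINOR
# Succeeds when version HAVE is the same as or newer than version WANT.
version_at_least() {
    [ "$1" -gt "$3" ] || { [ "$1" -eq "$3" ] && [ "$2" -ge "$4" ]; }
}

# require_bash WANT_MAJOR WANT_MINOR   (hypothetical helper)
# Prints a clear message and returns 1 if the running bash is too old.
require_bash() {
    if ! version_at_least "${BASH_VERSINFO[0]}" "${BASH_VERSINFO[1]}" "$1" "$2"
    then
        printf '%s: needs bash %s.%s or newer, but this is bash %s\n' \
            "${0##*/}" "$1" "$2" "$BASH_VERSION" >&2
        return 1
    fi
}

# Example: abort early unless this is at least bash 3.2.
require_bash 3 2 || exit 1
```

Raising the two numbers in the last line (for example to 4 4) turns a silent misbehaviour on an old shell into an explicit error.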

--
Keith Thompson (The_Other_Keith) Keith.S.T...@gmail.com
Working, but not speaking, for Philips Healthcare
void Void(void) { Void(); } /* The recursive call of the void */

paris2venice

Aug 7, 2020, 6:50:04 PM
Hi Ed, The only problem with that is that I only know small portions of awk and not nearly enough to accomplish my goals or even begin such a project whereas I do know a sufficient amount with bash. I know ... probably a lame excuse.

paris2venice

Aug 7, 2020, 7:00:09 PM
On Friday, August 7, 2020 at 3:49:26 PM UTC-7, Keith Thompson wrote:
Thanks for your reply, Keith. I know which version I am using because I installed bash 5 using homebrew and because of this:
$ /bin/bash --version
GNU bash, version 3.2.57(1)-release (x86_64-apple-darwin19)
Copyright (C) 2007 Free Software Foundation, Inc.
$ /usr/local!!
/usr/local/bin/bash --version
GNU bash, version 5.0.17(1)-release (x86_64-apple-darwin19.4.0)
Copyright (C) 2019 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

As to letting other people figure out how to install bash 5.0 ... these are Egyptology people and they would almost certainly not have a clue.

Janis Papanagnou

Aug 7, 2020, 7:21:55 PM
On 08.08.2020 00:50, paris2venice wrote:
> On Friday, August 7, 2020 at 3:36:25 PM UTC-7, Ed Morton wrote:
>> On 8/7/2020 5:25 PM, paris2venice wrote:
>>> [...]

Chances are not that high for you, I suppose, to get on-topic responses
if you post a lot of code and assume that folks here will debug it for you,
or analyse what it actually does, or that someone will test your code by
building a test case (something you did not provide).

(When I saw the output differences I thought a potential locale issue
might cause such effects on the different systems. But this is really
nothing but a wild guess.)

>> [...]
>
> Hi Ed, The only problem with that is that I only know small portions of
> awk and not nearly enough to accomplish my goals or even begin such a
> project whereas I do know a sufficient amount with bash. I know ...
> probably a lame excuse.

I'm not sure switching to Awk would help - to be sure I'd first need to
understand what your code actually does. But if it's just text processing
then Ed is most likely right. Don't overestimate Awk's complexity; if you
have some experience with programming you can learn the basics of Awk in
a few (1..4) hours; for all the gory details, budget one day. Awk does not
have all the quirks and pitfalls that Shell programming has; it is a
small and quite consistent language (not comparable with Shell).

Janis

paris2venice

Aug 7, 2020, 7:44:40 PM
Hi Janis, Thanks for your reply. I appreciate your thoughts. I realize that I posted too much code so I did not give it much chance that someone might see something. As to using awk, there is an additional reason that I cannot use awk ... My script incorporates AppleScript to make it an OOP script and I doubt very much that awk would have any way of accomplishing this.

Cydrome Leader

Aug 8, 2020, 2:39:06 AM
paris2venice <paris2...@gmail.com> wrote:
> First, I want to thank those who helped me last time with resolving the errors I was getting with a call to my usage function. Your suggestions worked and I was able to work out all of the other bugs in my script.
>
> However, just when I was ready to incorporate the data into the script to make it publicly available (mostly to friends who might find it useful), I realized that I would have to remove the /usr/local/bin/bash she-bang since most of them would not have installed bash 5.0. So I did so and everything in my "Gardiner numbers to Egyptian hieroglyphs" script worked except for one thing and that was when attempting to show an entire group. When Gardiner classified all Egyptian hieroglyphs, he grouped them into 28 different groups which are simply a letter e.g. Gardiner numbers for "women and her occupations" are all prefaced by the letter B.
>
> Anyway, the output for bash 3 (it's the default in macOS ... Apple, of course) gives me only a single hieroglyph rather than the full set as from 5 and when I look at the debug output, there is absolutely no difference so I don't know what to try to resolve this.
>
> I can post the entire script if necessary but right now the data is in a separate file and it is 1071 lines long so not sure what to do in that sense.
> Anyway, below is the "show_group" function for now but I'm not sure that it alone provides enough for anyone to provide assistance. Thanks a bunch for taking a look.
>
> This is the output from bash 5:
> Gardiner sign list: group B = woman & her occupations
> B1 ???? B2 ???? B3 ???? B4 ???? B5 ???? B5a ????
>
> B6 ???? B7 ???? B8 ???? B9 ????
>
> and this is the output from bash 3:
> Gardiner sign list: group B = woman & her occupations
> B1 ????

Is bash the only difference between the two runs? What about awk and sed?
I can't say I even understand what this program is supposed to do.

[...]
> IFS='
> ' read -r line
^^^^^^^^^^^^^^^
this line wrapped IFS stuff is sort of scary.


> last_gn="$gn"
> gn=$( awk -F: '{print $1}' <<< "$line" )
> case ${#gn} in
> 2 ) pad="...." ;;
> 3 ) pad="..." ;;
> 4 ) pad=".." ;;
> 5 ) pad="." ;;
> esac
>
> if [ "$gn" = "" ] && [ $i -gt 0 ]
> then
> printf "${glyph_array[*]}" |sed 's/\./ /g'
> printf "\n"
> break
> else
> glyph="${gn}${pad}$( get_glyph $gn )...." || exit
> glyph_array=( ${glyph_array[@]} $glyph )
>
> i=$(( $i + 1 )) # increment $i
^^^^^^^^^^^
this is the most useless comment, on the only line that makes any sense.

Benjamin Esham

Aug 8, 2020, 10:19:14 AM
paris2venice wrote:

> On Friday, August 7, 2020 at 4:21:55 PM UTC-7, Janis Papanagnou wrote:
>
>> I'm not sure switching to Awk would help - to be sure I'd first need to
>> understand what your code actually does. But if it's just text processing
>> then Ed is most likely right. Don't overestimate Awk's complexity; if you
>> have some experience with programming you can learn the basics of Awk in
>> a few (1..4) hours; for all the gory details, budget one day. Awk does not
>> have all the quirks and pitfalls that Shell programming has; it is a
>> small and quite consistent language (not comparable with Shell).
>
> Hi Janis, Thanks for your reply. I appreciate your thoughts. I
> realize that I posted too much code so I did not give it much chance
> that someone might see something. As to using awk, there is an
> additional reason that I cannot use awk ... My script incorporates
> AppleScript to make it an OOP script and I doubt very much that awk
> would have any way of accomplishing this.

I'm very curious to know what you mean by this: how does your script
integrate with AppleScript? Is it being invoked from an AppleScript, or does
it invoke AppleScript scripts itself with the "osascript" command or
something like that? In either case I'd bet you could hook it up with an Awk
script too.

macOS ships with what seems to be a 2007 version of the "one true Awk." It
may not have all the niceties of Gawk but it might still be a lot better
than a shell script for the work you're doing!

Benjamin

paris2venice

Aug 8, 2020, 11:26:58 AM
On Friday, August 7, 2020 at 11:39:06 PM UTC-7, Cydrome Leader wrote:
Hi Cydrome Leader, Yes. The versions of bash are the only thing different. The code is exactly the same in both runs and all that I changed was this line:
diff <other-directory>/utf ./utf
1c1
< #!/usr/local/bin/bash
---
> #!/bin/bash

What the script does is convert what is called a Gardiner number (for example, G1) to its corresponding Egyptian hieroglyph (in the case of G1, a vulture). It also converts a set of Gardiner numbers to their corresponding Egyptian hieroglyphs. Your output shows up as "????" because you likely have an old version of your operating system. To the best of my knowledge, only recent operating systems have the ability to display the Unicode for Egyptian hieroglyphs. Thus, the hieroglyphs show up fine in this newsgroup on my macOS Catalina.

As to the "line-wrapped IFS stuff being sort of scary," I use it all of the time and I'm sure that most bash-writers on this newsgroup do so as well. It is very common. In fact, I learned it on this newsgroup many years ago.
The $line_count variable is just the count of lines in the Gardiner group that was extracted with the command above it ... that is:
group=$( grep "^${1}[0-9]" $file )
so $1 was supplied by the user and $file is the data file.
In this case (for Gardiner numbers beginning with B), the program would find 10 lines from the Gardiner number/ Unicode hex numbers data file.

The conversion is done with another function called 'get_glyph' which is a very simple function used by all of the different functions in the script.

What is actually going wrong is that bash 3 is stopping after processing the first hieroglyph in the array that is built. An array is built with six hieroglyphs per line of output. Thus, there should be two lines for the ten hieroglyphs in Gardiner group B.



paris2venice

Aug 8, 2020, 11:39:43 AM
Hi Benjamin, Yes. My script is using osascript to invoke AppleScript. The bash script itself is being executed from the macOS Desktop with a double-click because it is using the ".command" extension. In the case of my temporary filename, it is called utf.command so even if I did write it in awk, I doubt very much that Apple provides such a way to double-click an awk script. In any case, there is a reason that the script is bailing after the first element of the array is processed and that's all I need to know to solve this problem. There seems to be no sense as to why it would bail after the first element of the array because the code is very simple in reality.

Benjamin Esham

Aug 8, 2020, 1:53:05 PM
> Hi Benjamin, Yes. My script is using osascript to invoke AppleScript. The
> bash script itself is being executed from the macOS Desktop with a
> double-click because it is using the ".command" extension. In the case of
> my temporary filename, it is called utf.command so even if I did write it
> in awk, I doubt very much that Apple provides such a way to double-click
> an awk script.

Apple has provided a facility that allows you to double-click *any* kind of
script. By changing the first line of the script (the "shebang line") from
something like "#!/bin/bash" to something like "#!/usr/bin/awk -f", you can
tell the OS that it should use Awk instead of Bash to run the file. (Shebang
lines have been used in this way for decades in Unix and its derivatives.)
So while I certainly understand that you don't want to rewrite a large,
mostly-working Bash script in another language, the lack of OS support
doesn't seem like it's an issue here :-)

> In any case, there is a reason that the script is bailing after the first
> element of the array is processed and that's all I need to know to solve
> this problem. There seems to be no sense as to why it would bail after
> the first element of the array because the code is very simple in reality.

I'd recommend that you post a minimal example that demonstrates the problem:
a full shell script that can be run as-is, an example data file, and an
explanation of what the output is supposed to look like. Without these
things, it's going to be very difficult for anyone else to debug your
problem.

One very general piece of advice is that in Bash scripts, you almost always
want to double-quote variables and command substitutions unless you
specifically want the shell to split them for you. So, for example, use
"$foo" instead of $foo and "$(example arg1)" instead of $(example arg1).

Another piece of advice is that the standard "column" utility (a separate
program that ships with macOS, not a shell builtin) may be able to do some
of these formatting tasks for you. For example,

$ cat example_file
100001
100002
100003
100004
100005
100006
100007
100008
100009
100010
100011
100012
100013
100014
100015
100016
100017
100018
100019
100020
$ column -x example_file
100001 100002 100003 100004 100005 100006
100007 100008 100009 100010 100011 100012
100013 100014 100015 100016 100017 100018
100019 100020
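
The number of columns that "column" picks depends on the terminal width. If exactly six entries per row are wanted regardless of width, one possible alternative is the standard "paste" utility, sketched below. The function name six_per_line is made up for illustration, and the sample labels stand in for the real output.

```bash
#!/bin/bash
# six_per_line (hypothetical helper): join every six input lines into one
# output row, separated by single spaces. Each "-" makes paste read one
# more line from standard input for the current row.
six_per_line() {
    paste -d ' ' - - - - - -
}

# Ten sample labels, one per line, become two rows.
printf '%s\n' B1 B2 B3 B4 B5 B5a B6 B7 B8 B9 | six_per_line
```

When the last row is short, paste still emits the separators for the missing fields, so that row ends in a few trailing spaces.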

Hope this helps,

Benjamin

paris2venice

Aug 10, 2020, 6:04:59 PM
So I was able to figure it out. Apparently, bash 5 can use a while loop where input is cycled in from a variable as in:
group=$( grep "^${1}[0-9]" $file )
while ...
done <<< "$group"

whereas bash 3 failed on that but can properly use the input if cycled in via a pipe as in:
grep "^${1}[0-9]" $file |
{
while [ $line_count -ge 0 ]

I definitely have used both before in bash 3 so it is still a bit of a mystery.

Benjamin Esham

Aug 10, 2020, 7:43:38 PM
paris2venice wrote:

> On Saturday, August 8, 2020 at 10:53:05 AM UTC-7, Benjamin Esham wrote:
>
> [snip]
>
>> One very general piece of advice is that in Bash scripts, you almost
>> always want to double-quote variables and command substitutions unless
>> you specifically want the shell to split them for you. So, for example,
>> use "$foo" instead of $foo and "$(example arg1)" instead of $(example
>> arg1).
>
> So I was able to figure it out. Apparently, bash 5 can use a while loop
> where input is cycled in from a variable as in:
> group=$( grep "^${1}[0-9]" $file )
> while ...
> done <<< "$group"
>
> where as bash 3 failed on that but can properly use the input if cycled in
> via a pipe [...] I definitely have used both before in bash 3 so it is
> still a bit of a mystery.

Your original post had a line like

done <<< $group

without the double quotes. It's possible this was your culprit.
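
A minimal sketch of the difference the quotes make. It assumes, as I understand the bash 4.4 change notes, that older versions word-split an unquoted here-string and re-join the words with single spaces, which would turn the multi-line $group into one long line. The sample data and function names are made up for illustration.

```bash
#!/bin/bash
# Three sample lines standing in for the grep output stored in $group.
group='B1: F0 93 81 90
B2: F0 93 81 91
B3: F0 93 81 92'

# Count the lines a while/read loop sees from a QUOTED here-string.
count_lines_quoted() {
    local n=0 line
    while IFS= read -r line
    do
        n=$(( n + 1 ))
    done <<< "$1"
    echo "$n"
}

# Same loop with the here-string left UNQUOTED, as in the original post.
count_lines_unquoted() {
    local n=0 line
    while IFS= read -r line
    do
        n=$(( n + 1 ))
    done <<< $1
    echo "$n"
}

count_lines_quoted "$group"     # prints 3
count_lines_unquoted "$group"   # result depends on the bash version
```

If the assumption holds, the second call reports a single line on bash 3.2, which would match the symptom of the loop handling only B1 and then stopping.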

Benjamin

Janis Papanagnou

Aug 11, 2020, 3:31:43 AM
Your script looks overly complicated for what you actually seem to try to do;
it seems there'd be not much more than 10 lines of [modern] shell code necessary.

Building blocks:

read -r group hex ## or: read -r group a b c d

printf "\U${hex// /}" ## or: printf "\U$a$b$c$d" ## Unicode literal

column ## or other tools to build columnar output

I'd probably code the constant identifiers instead of repeatedly grep'ing
them from a file (this requires more than "10 lines" but seems to create
clearer code).

case ${group} in
(A*) ...
(B*) printf "%s\n" "Gardiner sign list: group B = woman & her occupations"
...
esac
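
A rough sketch of how the read and printf building blocks might fit together for the data format shown at the top of the thread ("B1: F0 93 81 90"). Because that file stores UTF-8 bytes rather than code points, the sketch uses \xHH escapes instead of \U. The function name print_glyphs is made up, and this is not the original poster's get_glyph.

```bash
#!/bin/bash
# print_glyphs (hypothetical helper): read "NAME: HH HH HH HH" records on
# standard input and print "NAME glyph" for each record.
print_glyphs() {
    local name hex byte fmt
    while IFS=: read -r name hex
    do
        fmt=''
        for byte in $hex        # unquoted on purpose: split the hex bytes
        do
            case $byte in
                [0-9A-Fa-f][0-9A-Fa-f])
                    fmt="$fmt\\x$byte" ;;   # append a \xHH escape
                *)
                    echo "bad byte '$byte' for $name" >&2
                    return 1 ;;
            esac
        done
        # printf expands the \xHH escapes that were collected in $fmt.
        printf "%s $fmt\n" "$name"
    done
}

printf '%s\n' 'B1: F0 93 81 90' 'B2: F0 93 81 91' | print_glyphs
```

Feeding it the output of a single grep on the data file would replace the per-glyph awk and sed calls with one loop.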

Then I observed that your hex codes above in your file produce no characters
(in my environment). But if I use hieroglyphs from another plane it works,
e.g. printf "\U00013079\n"

I'd really simplify your code and try to avoid all those mostly unnecessary
'grep's and 'sed's etc.

BTW, contrary to bash/ksh/zsh's printf GNU awk's printf seems not to support
unicode literals by printf "\U..." so awk may not help you here.

Janis

paris2venice

Aug 13, 2020, 2:26:19 AM
Hi Benjamin,
While you are correct about that ... that it was the culprit, you have to admit that it does not make a lot of sense that it works that way in bash 5 but not in bash 3 ... except for the fact that it was apparently corrected to be consistent with how other such constructs work. For example, you can't quote every variable. Try it in a for loop and see what results you get in bash 3 or bash 5. They're both consistent in the for loop.
$ echo $BASH_VERSION
3.2.57(1)-release
$ k="1 2 3 4"
$ for j in "$k"
> do
> echo "$j"
> done
1 2 3 4
There would be no point in using such a loop where $k is quoted. Similarly, that's how I looked at the while ... done loop.
$ echo $BASH_VERSION
5.0.17(1)-release
$ k="1 2 3 4"
$ for j in $k
> do
> echo "$j"
> done
1
2
3
4

paris2venice

Aug 13, 2020, 3:00:23 AM
Hi Janis,
Thanks for your suggestions about condensing my code. I was planning to rewrite some of it anyway but first I had to get it working but I like your ideas.

As to your trouble with the UTF-8 that I am using (as opposed to the Unicode numbers that you used), look at the difference between bash 3 and bash 5 again:
$ sw_vers
ProductName: Mac OS X
ProductVersion: 10.15.3
BuildVersion: 19D76
$ echo $BASH_VERSION
5.0.17(1)-release
$ printf "\U00013079\n" # Unicode numbers
𓁹
$ printf "\xF0\x93\x81\xB9\n" # UTF-8
𓁹

$ echo $BASH_VERSION
3.2.57(1)-release
$ printf "\U00013079\n"
\U00013079
$ printf "\xF0\x93\x81\xB9\n"
𓁹

So strange that you apparently have the opposite issue.
By the way, just as we have Ken Thompson to thank for Unix, we also have him to thank for UTF-8, which he designed together with Rob Pike in 1992.

Janis Papanagnou

Aug 13, 2020, 6:14:19 AM
On 13.08.2020 09:00, paris2venice wrote:
>> [...]
>
> Hi Janis,
> Thanks for your suggestions about condensing my code. I was planning to rewrite some of it anyway but first I had to get it working but I like your ideas.
>
> As to your trouble with the UTF-8 that I am using (as opposed to the Unicode numbers that you used), look at the difference between bash 3 and bash 5 again:
> $ sw_vers
> ProductName: Mac OS X
> ProductVersion: 10.15.3
> BuildVersion: 19D76
> $ echo $BASH_VERSION
> 5.0.17(1)-release
> $ printf "\U00013079\n" # Unicode numbers
> 𓁹
> $ printf "\xF0\x93\x81\xB9\n" # UTF-8
> 𓁹
>
> $ echo $BASH_VERSION
> 3.2.57(1)-release
> $ printf "\U00013079\n"
> \U00013079
> $ printf "\xF0\x93\x81\xB9\n"
> 𓁹
>
> So strange that you apparently have the opposite issue.

(Ah, okay, so you have defined and you are printing UTF-8 encoded Unicode.)

Maybe bash-3 does not support \U in its printf. (I had been using bash-4.)
The bash-3 man page should tell.

Janis

Keith Thompson

Aug 13, 2020, 2:57:45 PM
It looks like \U was first supported in bash 4.2.
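
For a bash whose printf does not understand \U (the 3.2.57 transcript above prints it literally), one possible workaround is to compute the UTF-8 bytes from the code point and emit them with \xHH escapes, which that transcript shows working. The sketch below assumes a valid code point given as hex digits; utf8_from_codepoint is a made-up name.

```bash
#!/bin/bash
# utf8_from_codepoint HEX   (hypothetical helper)
# Print the UTF-8 encoding of one Unicode code point given as hex digits,
# e.g. 13079. No checks for surrogates or values above 10FFFF.
utf8_from_codepoint() {
    local cp=$(( 16#$1 )) fmt
    if [ "$cp" -lt 128 ]; then          # 1 byte:  0xxxxxxx
        fmt=$(printf '\\x%02X' "$cp")
    elif [ "$cp" -lt 2048 ]; then       # 2 bytes: 110xxxxx 10xxxxxx
        fmt=$(printf '\\x%02X\\x%02X' \
            $(( 0xC0 | (cp >> 6) )) \
            $(( 0x80 | (cp & 0x3F) )))
    elif [ "$cp" -lt 65536 ]; then      # 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        fmt=$(printf '\\x%02X\\x%02X\\x%02X' \
            $(( 0xE0 | (cp >> 12) )) \
            $(( 0x80 | ((cp >> 6) & 0x3F) )) \
            $(( 0x80 | (cp & 0x3F) )))
    else                                # 4 bytes: 11110xxx + three 10xxxxxx
        fmt=$(printf '\\x%02X\\x%02X\\x%02X\\x%02X' \
            $(( 0xF0 | (cp >> 18) )) \
            $(( 0x80 | ((cp >> 12) & 0x3F) )) \
            $(( 0x80 | ((cp >> 6) & 0x3F) )) \
            $(( 0x80 | (cp & 0x3F) )))
    fi
    printf "$fmt"                       # expand the \xHH escapes
}

utf8_from_codepoint 13079; echo         # emits the bytes F0 93 81 B9
```

Those four bytes are the same ones used in the printf "\xF0\x93\x81\xB9\n" example earlier in the thread.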

medicoma...@gmail.com

Sep 21, 2020, 7:41:50 AM