In the FAQ, there is a posting on this subject:
How do I determine whether a scalar is a number/whole/integer/float?
http://www.perldoc.com/perl5.8.0/pod/perlfaq4.html#How-do-I-determine-whether-a-scalar-is-a-number-whole-integer-float-
The sub is_numeric from that page incorrectly returns true for e.g.
input "NaN". NaN is clearly Not A Number.
The ==, !=, <=> operators and the sprintf sub yield warnings on
non-numeric data, and there is no way to detect whether a scalar is
numeric beforehand. I hereby suggest a sub is_numeric that uses the
same detection code as the above operators. I simply need the
is_numeric to reliably avoid these warnings. The ability to do that
reliably should be in the core. Don't you agree? Here is a suggestion
that uses neither regexps nor POSIX, unlike the suggestions in the
current FAQ.
Could it be included in the above FAQ?
sub is_numeric {
    use warnings;
    my $warned = 0;
    local $SIG{__WARN__} = sub { $warned = 1; };
    # The assignment to $bogus avoids the compile-time warning
    # 'Useless use of numeric eq (==) in void context'.
    my $bogus = $_[0] == 0;
    return ($warned) ? 0 : 1;
}
Any more elegant solutions to the need for "my $bogus"? If run with
perl -w it yields a (compile-time?) warning otherwise..
The POSIX::strtod behaves like C's strtod as documented, but
apparently, that is not what the operators above use internally.
Working around quirks like that makes for error prone code, and I'd
like to avoid that...
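For reference, one way to reuse Perl's own notion of "numeric" without trapping warnings is the looks_like_number function exported by Scalar::Util. The sketch below is only an illustration, not the FAQ's code. The wrapper keeps this thread's name is_numeric, and treating undef as non-numeric is an assumption of the sketch. How strings such as 'nan' or 'inf' are classified depends on the Perl version, so those are worth testing on your own installation.

```perl
use strict;
use warnings;
use Scalar::Util qw(looks_like_number);

# Wrapper named after the thread's sub; the undef handling is an assumption.
sub is_numeric {
    my ($value) = @_;
    return 0 unless defined $value;
    return looks_like_number($value) ? 1 : 0;
}

# Print a verdict for a few sample inputs.
for my $text ('42', '-3.5', '1e5', 'cheese', '12abc', '') {
    printf "%-9s %s\n", "'$text'", is_numeric($text) ? 'numeric' : 'not numeric';
}
```

No warning handler and no auxiliary variable are needed, because nothing is ever compared numerically.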
Peter
--------------- Code that issues warnings ---------------------
#!/usr/bin/perl
use warnings;

sub getnum {
    use POSIX qw(strtod);
    my $str = shift;
    $str =~ s/^\s+//;
    $str =~ s/\s+$//;
    $! = 0;
    # Uncommenting this line makes it work, but I don't
    # know what other similar input would trigger the same
    # bug...
    # return undef if ($str =~ /^nan$/i);
    my($num, $unparsed) = strtod($str);
    if (($str eq '') || ($unparsed != 0) || $!) {
        return undef;
    } else {
        return $num;
    }
}

sub is_numeric { defined getnum($_[0]) }

my $val = 'nan';
# ALL of the below yield warnings.
# If they are all enabled, only the first one yields a warning...
if (is_numeric($val)) {
    # printf "%s %d\n", $val, $val;
    # printf "%d\n", $val +2;
    # printf "%d %s\n", $val, ($val==0) ? "Y" : "N" ;
    # printf "%.2f\n", $val;
    # The ternary is parenthesized so that "." does not bind first.
    print "Higher?" . (($val <=> 0) ? "Y" : "N");
}
Just return $warned (same thing).
> }
>
> Any more elegant solutions to the need for "my $bogus"? If run with
> perl -w it yields a (compile-time?) warning otherwise..
I wouldn't worry about that too much. I would, however, say
    no warnings;
    use warnings 'numeric';
so other warnings that might be triggered don't get in the way.
An alternative:
sub is_numeric {
    use warnings FATAL => 'numeric';
    return defined eval { $_[ 0] == 0 };
}
That also happens to take care of the auxiliary variable.
Anno
sub is_numeric {
    no warnings;
    use warnings FATAL => 'numeric';
    return defined eval { $_[ 0] == 0 };
}
Could we put this in the FAQ? It is so much more elegant than what is
there now.
Regards,
Peter
-------------------------------------
A minor anal comment...
anno...@lublin.zrz.tu-berlin.de (Anno Siegel) wrote in message news:<b1b8vq$pil$1...@mamenchi.zrz.TU-Berlin.DE>...
> > return ($warned) ? 0 : 1;
>
> Just return $warned (same thing).
That would be (!$warned), right?
I don't see the need to disable other warnings, but one should make
sure 'numeric' is the only FATAL one in the scope.
    use warnings NONFATAL => 'all', FATAL => 'numeric'
should do that. If other warnings are triggered by the code, the user
should see them if they are enabled.
> sub is_numeric {
> no warnings;
> use warnings FATAL => 'numeric';
> return defined eval { $_[ 0] == 0 };
> }
>
> Could we put this in the FAQ? It is so much more elegant than what is
> there now.
It has another problem. If "is_numeric( $x)" is called, and $x is
a tied (or otherwise magic) variable, other fatal errors could happen
which would be veiled by eval(). Likewise if $x contains an object
that has "0+" (numification) overloaded.
To catch these cases we'd have to go back to checking $@ after the eval().
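A minimal sketch of that idea, under stated assumptions: the sub treats only an error whose text contains "isn't numeric" as a failed numeric check and rethrows anything else. The Exploding class is hypothetical and exists only to show an unrelated error passing through; treating undef as non-numeric is also an assumption of the sketch.

```perl
use strict;
use warnings;

sub is_numeric {
    my ($value) = @_;
    return 0 unless defined $value;
    use warnings FATAL => 'numeric';
    # The comparison result is assigned so it is not in void context.
    my $ok = eval { my $unused = ($value == 0); 1 };
    return 1 if $ok;
    return 0 if $@ =~ /isn't numeric/;   # the fatal 'numeric' warning
    die $@;                              # anything else is not ours to hide
}

{
    # Hypothetical class whose numification always fails.
    package Exploding;
    use overload '0+' => sub { die "numification failed\n" }, fallback => 1;
    sub new { return bless {}, shift }
}

print "12:     ", is_numeric('12'), "\n";
print "cheese: ", is_numeric('cheese'), "\n";
my $passed_through = eval { is_numeric(Exploding->new); 1 } ? 'nothing' : $@;
print "object: error seen by caller: $passed_through";
```

Matching on the message text is fragile, but it keeps the sub from reporting "not numeric" when the real problem was somewhere else.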
Anno
I'm curious philosophically, why is perl missing a way to find out if a
variable is numeric or string?
Perl must do it internally to find out if a "==" or an "eq" operation is
valid.
Is the type in the symbol table?
If perl can't do something there's usually some fundamental reason.
-ed
On 2/3/03 2:34 AM, in article b1lgj2$afr$1...@mamenchi.zrz.TU-Berlin.DE, "Anno
don't top post!
EK> I'm curious philosophically, why is perl missing a way to find out if a
EK> variable is numeric or string?
because there is (almost) no need for it.
EK> Perl must do it internally to find out if a "==" or an "eq" operation is
EK> valid.
sure, it has access to the *_OK flags in the SV. you can get at them
with some modules.
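As an illustration of getting at those flags, here is a sketch using the core B module. The helper name ok_flags is made up for this example, and it reports only the public IOK, NOK and POK flags. Which flags appear after a scalar has been used in another context varies between Perl versions, so the last line prints the result rather than promising one.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: list which public *_OK flags are set on a scalar.
sub ok_flags {
    my $flags = B::svref_2object(\$_[0])->FLAGS;
    my @set;
    push @set, 'IOK' if $flags & B::SVf_IOK();
    push @set, 'NOK' if $flags & B::SVf_NOK();
    push @set, 'POK' if $flags & B::SVf_POK();
    return join ',', @set;
}

my $text   = '42';
my $number = 42;
print "fresh string: ", ok_flags($text),   "\n";
print "fresh number: ", ok_flags($number), "\n";

my $sum = $text + 0;    # numeric use may cache a number inside $text
print "string after numeric use: ", ok_flags($text), "\n";
```

The flags describe how a scalar has been stored and used so far, which is not the same thing as whether its contents would pass as a number.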
EK> If perl can't do something there's usually some fundamental reason.
like no need for it. perl converts as needed and the coder shouldn't
care.
<snip of full quote>
uri
--
Uri Guttman ------ u...@stemsystems.com -------- http://www.stemsystems.com
On Thu, 06 Feb 2003 17:27:07 -0800,
Ed Kulis <eku...@apple.com> wrote:
> Hi,
>
> I'm curious philosophically, why is perl missing a way to find out if a
> variable is numeric or string?
It isn't missing a way; see the answer to the FAQ. If you're asking
why it doesn't have a builtin operator or function for that; it
doesn't really need one. The Perl data type is a scalar, not an
integer or a string. Perl generally does the right thing depending on
the context you use the scalar in.
> Perl must do it internally to find out if a "==" or an "eq" operation is
> valid.
Yes, but you're certainly not suggesting that Perl exposes all of its
internals to the programmer, are you? As a Perl programmer, the idea
is that you don't worry about it, and you let Perl worry. if you use a
scalar with ==, then Perl will make sure that from then onwards, the
scalar has a numeric component (IV or NV). If you use it with eq, then
perl will make sure that it has a string component (PV).
> Is the type in the symbol table?
No, but if you insist on knowing about it, you can use one of the
modules in the Devel:: name space, for example. However you need to
understand that a Perl scalar does not _have_ a type. It is a bit more
complicated than that. See http://gisle.aas.no/perl/illguts/ and the
perlguts documentation.
In short: A Perl scalar is an SV. An SV can be a SvIV, in which case
it can hold a, possibly unsigned, integer, a SvNV, in which case it
can hold a numeric (double) or SvPV in which case it can hold a
string. As long as an SV is one of these, you can still say that it
has a certain "type".
Apart from these, it can also be a SvRV, in which case it holds a
reference. Then, it can also be a SvPVIV or a SvPVNV, which can hold
more than one type of value at the same time, or a SvPVMG, which
attaches magic to the SV. If an SV is one of these, it no longer has a
simple type, and in fact, it can have multiple valid types at the same
time, depending on its history.
Perl automatically converts between these SV types, and upgrades them,
when necessary. It also maintains a state on the individual values in
the structures to see which are valid or not.
> If perl can't do something there's usually some fundamental reason.
The fundamental reason is that it is the wrong question. From Perl's
point of view there is no such thing as a "numeric" scalar. Internally
in Perl, scalars can have numeric fields, which are currently valid or
not, but that is an implementation detail that should not be exposed
to the Perl programmer.
[snip of TOFU quote on FAQ question about subject]
Martien
--
|
Martien Verbruggen | You can't have everything, where would you
Trading Post Australia | put it?
|
In a script I'm writing, the user enters values for a ping test in a TK GUI.
They can enter a value for the number of pings to perform, the timeout
between pings, etc, etc.
If they put 'cheese' into the entry box for the number of pings they'd like
to perform, I get the following error...
Argument "cheese" isn't numeric in numeric le (<=) at D:\Perl\My Scripts...
Then my script informs them that: "cheese pings completed."
I would much rather have a popup window that tells them they're stupid, and
to enter a proper value into the entry box, wouldn't you?
So, you see, there IS a need for it.
I've had to write a series of regexs for each entry box. My life would have
been much easier if there was a simple check that they've entered a number,
or a whole number, as required.
R.
But then... why the difference in '==' and 'eq'?
--Frank
Because 12 is '== ' to 00012 but it is not eq to '00012'.
This is more obvious for greater/smaller:
234 is '<' than 1234 but it is 'gt' than 1234.
jue
: In a script I'm writing, the user enters values for a ping test in a TK GUI.
: They can enter a value for the number of pings to perform, the timeout
: between pings, etc, etc.
:
: If they put 'cheese' into the entry box for the number of pings they'd like
: to perform, I get the following error...
:
: Argument "cheese" isn't numeric in numeric le (<=) at D:\Perl\My Scripts...
:
: Then my script informs them that: "cheese pings completed."
Most would not look for a distinction between "number" and "string" in
that case. They would look for the distinction between "digit" and
"non-digit."
The difference is subtle, but it's an example of using the right
terminology. Strings and numbers morph into one another seamlessly as
the code requires. The digit-ness of a character, on the other hand,
is never in doubt and will not change unless the program explicitly
alters it.
Another term to become familiar with is "data validation." That's
what you're really after. The string/number question is an X-Y
problem.
: I would much rather have a popup window that tells them they're stupid, and
: to enter a proper value into the entry box, wouldn't you?
Absolutely not. That is just bad UI design.
Either take steps that prevent the user from ever being able to enter
nonsense values (use a list box, for example), or accept anything that
comes in and do a sensible thing when it is nonsensical (falling back
to a default value, for example).
: I've had to write a series of regexs for each entry box.
Are the regexes all the same for each input, or does each input have
its own special set?
: My life would have
: been much easier if there was a simple check that they've entered a number,
: or a whole number, as required.
Make your own simple check and roll it off into a subroutine. How
hard can this be?
sub validate_number {
    my($arg, $default) = @_;
    my($valid) = $arg =~ /^(\d+)/;
    return defined $valid ? $valid : $default;
}
print validate_number('cheese', 3);
> > EK> I'm curious philosophically, why is perl missing a way to find out if a
> > EK> variable is numeric or string?
[...]
> > like no need for it. perl converts as needed and the coder shouldn't
> > care.
>
> In a script I'm writing, the user enters values for a ping test in a TK GUI.
> They can enter a value for the number of pings to perform, the timeout
> between pings, etc, etc.
>
> If they put 'cheese' into the entry box for the number of pings they'd like
> to perform, I get the following error...
>
> Argument "cheese" isn't numeric in numeric le (<=) at D:\Perl\My Scripts...
>
> Then my script informs them that: "cheese pings completed."
>
> I would much rather have a popup window that tells them they're stupid, and
> to enter a proper value into the entry box, wouldn't you?
>
> So, you see, there IS a need for it.
Well, of course the need to tell numbers from strings can come up in
a program, and it can be done as the faq and this thread have shown.
However, since Perl doesn't distinguish them, there is no systematic
need for a Perl primitive (like, for instance, defined()) to tell
them apart.
Anno
Yes, I see your point. My remark was a more philosophical one: there is
a difference between '==' and 'eq', which means that you cannot simply
say 'if ($a == $b)'. So the difference between textual and numerical
values is exposed to the programmer. Not that I really care, but one
could imagine another approach where '==' always means 'identical with
possible typecast to numeric value' and 'eq' always means 'textually
identical'. Thus 5 == 5, 'a' == 'a', 5 == '5', 5 == '005', not 'b' == 'a',
not 5 == 'a', 'a' eq 'a', 5 eq '5', not 5 eq '05' and so on. Then the
exposure of the underlying type is really gone.
I'll go and hide for the flames now ;-)
--Frank
> Because 12 is '== ' to 00012
No it isn't.
You must have meant to say:
Because 10 is '== ' to 00012
or
Because 12 is '== ' to '00012'
heh.
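The difference is easy to check. This sketch evaluates the comparisons mentioned above; a bare 00012 is an octal literal (decimal 10), while the string '00012' numifies to 12.

```perl
use strict;
use warnings;

# Each entry pairs the source text of a comparison with its result.
my @checks = (
    [ q{12 == 00012},   12 == 00012   ],   # octal literal: false
    [ q{10 == 00012},   10 == 00012   ],   # true
    [ q{12 == '00012'}, 12 == '00012' ],   # string numifies to 12: true
    [ q{12 eq '00012'}, 12 eq '00012' ],   # "12" vs "00012": false
    [ q{234 < 1234},    234 < 1234    ],   # numeric order: true
    [ q{234 gt 1234},   234 gt 1234   ],   # string order, "2" after "1": true
);

printf "%-16s %s\n", $_->[0], $_->[1] ? 'true' : 'false' for @checks;
```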
--
Tad McClellan SGML consulting
ta...@augustmail.com Perl programming
Fort Worth, Texas
Hmmm, I like that idea. You still need to check whether they've typed in
bolox, though.
>
> : I've had to write a series of regexs for each entry box.
>
> Are the regexes all the same for each input, or does each input have
> its own special set?
>
> : My life would have
> : been much easier if there was a simple check that they've entered a
> : number, or a whole number, as required.
>
> Make your own simple check and roll it off into a subroutine. How
> hard can this be?
>
> sub validate_number {
> my($arg, $default) = @_;
> my($valid) = $arg =~ /^(\d+)/;
> return defined $valid ? $valid : $default;
> }
> print validate_number('cheese', 3);
It's not _that_ hard. One of my entry boxes can accept any positive number,
so that would include 2.5, another easier one can only accept positive whole
numbers greater than 1, which is nice, another can accept numbers like
1.25e+25, so I need a different regex for each.
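One way to keep a different rule per entry box in one place is a table of named checks. This is only a sketch: the field kinds and the patterns are assumptions based on the three boxes described above, not the regexes from the actual script.

```perl
use strict;
use warnings;

# Hypothetical field kinds, one per entry box described above.
my %CHECK_FOR = (
    # any positive number, e.g. 2.5
    positive_number => sub { $_[0] =~ /^\+?(?:\d+\.?\d*|\.\d+)\z/ && $_[0] > 0 },
    # whole numbers greater than 1
    whole_above_one => sub { $_[0] =~ /^\+?\d+\z/ && $_[0] > 1 },
    # plain or exponent notation, e.g. 1.25e+25
    scientific      => sub { $_[0] =~ /^[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?\z/ },
);

# Returns 1 if $text is acceptable for the given kind of field, else 0.
sub field_ok {
    my ($kind, $text) = @_;
    my $check = $CHECK_FOR{$kind} or die "unknown field kind '$kind'\n";
    return $check->($text) ? 1 : 0;
}

print "pings 'cheese': ", field_ok('whole_above_one', 'cheese'), "\n";
print "pings '5':      ", field_ok('whole_above_one', '5'), "\n";
print "timeout '2.5':  ", field_ok('positive_number', '2.5'), "\n";
```

Each pattern runs before the numeric comparison, so the comparison never sees text that would trigger the "isn't numeric" warning.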
Just the regex for accepting numbers like 2.5 is:
(/^[+]?\d+$/) || (/^[+]?\d*\.?\d+$/ ) || (/^[+-]?\d+\.?\d*$/ )
Still, I suppose once everyone has written their own subroutine once, they
won't have to bother again :-/
R.
RSB> If they put 'cheese' into the entry box for the number of pings
RSB> they'd like to perform, I get the following error...
RSB> Argument "cheese" isn't numeric in numeric le (<=) at D:\Perl\My
RSB> Scripts...
RSB> Then my script informs them that: "cheese pings completed."
all input typed by a user is text. in fact perl can't read numbers
using any form of i/o. you aren't getting that fact.
if you write a program such as you have, then you have to check the
validity of the input. if you want a number then check for one before
you use it. perl doesn't know anything about what you wanted. so it
happily converts 'cheese' to 0 and generates a warning since you fed it
a bad number. it is your problem, not perl's.
RSB> So, you see, there IS a need for it.
no, there isn't. perl will convert the number anytime you use it as a
number. you still have to check that the text makes a valid number.
entered text NE number
RSB> I've had to write a series of regexs for each entry box. My life
RSB> would have been much easier if there was a simple check that
RSB> they've entered a number, or a whole number, as required.
write a number check sub then. reusing the regex is the way to go. a
series of regexes makes no sense.
and validating input is standard procedure in any language. nothing
special about perl there.
how can a language validate unless you specify a validation format?
writing a validation sub is trivial and useful.
RSB> Just the regex for accepting numbers like 2.5 is:
RSB> (/^[+]?\d+$/) || (/^[+]?\d*\.?\d+$/ ) || (/^[+-]?\d+\.?\d*$/ )
RSB> Still, I suppose once everyone has written their own subroute once, they
RSB> won't have to bother again :-/
FAQ. if you had bothered to look, you wouldn't have had to invent your
own. and the FAQ has better regexes. i see several ways to improve
yours.
why don't you seem to allow negative integers? why is [+] in a char
class? the second regex seems to be able to match anything the first one
matches (the leading digits and decimal point are optional so it will
match positive integers). the third regex also can match integers (which
handles the negative integer issue). you also don't allow floating point
formatted (with a +/-E00 suffix).
so yes, number validation is not new and you didn't bother to find out
how others do it so you rolled your own weaker version.
The first regex matches a proper subset of the third regex, and hence
is redundant.
I guess you have a reason for not allowing -.5 but allowing +.5?
If -.5 is acceptable you could use something like:
/\d/ && /^[+-]?\d*\.?\d*$/
The second regex merges your second and third ones, but in doing so
allows "." and "", which aren't wanted. The first regex exists purely
to disallow those two degenerate cases, by requiring a digit.
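To see the difference, this sketch runs a few inputs through both versions. The sub names are invented for the example; the patterns are copied from the posts above.

```perl
use strict;
use warnings;

# The two-part test suggested above: at least one digit, plus the merged pattern.
sub merged_accepts {
    my ($text) = @_;
    return ($text =~ /\d/ && $text =~ /^[+-]?\d*\.?\d*$/) ? 1 : 0;
}

# The three original alternatives, combined with ||.
sub original_accepts {
    my ($text) = @_;
    return (   $text =~ /^[+]?\d+$/
            || $text =~ /^[+]?\d*\.?\d+$/
            || $text =~ /^[+-]?\d+\.?\d*$/ ) ? 1 : 0;
}

# Compare both versions on a few inputs, including the degenerate ones.
for my $text ('2.5', '+.5', '-.5', '-5', '.', '') {
    printf "%-6s original=%d merged=%d\n",
        "'$text'", original_accepts($text), merged_accepts($text);
}
```

The only input in this list on which the two disagree is '-.5', which the original patterns reject and the merged test accepts.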
--
Sam Holden
See my comments embedded.
On 2/7/03 4:37 AM, in article
3e43a8ee$0$118$e4fe...@dreader4.news.xs4all.nl, "Frank Maas"
<spamf...@cheiron-it.nl> wrote:
"==" with typecast is a really good idea. Of course, there would be serious
backward compatibility issues. Maybe a new operator "===" :-)
And if the error is exposed to the programmer, then a way to prevent it should
be simpler than, say, redesigning a UI or using the Devel module.
I write interfaces and I run into the problem of distinguishing a number
from a string all the time. If say the sender has put the numbers in the
file in the right place then all is OK.
But when the input file is out of spec, having non-numeric characters in
columns designated for numbers, then it would be nicer to catch it with a
simple operator than with ad hoc regular expressions, calls to the Devel
module, or a Perl script crash with an invalid type comparison.
Perl classes, subg, overlays, calls to Devel, regular expressions might all
solve this problem in an ad hoc way but I think that there's a middle ground
where a new perl feature using a simple technique with an operator or
function is justified. The simple technique would also make the code more
maintainable for less experienced developers who inherit the code.
>
> I'll go and hide for the flames now ;-)
>
> --Frank
>
Flame away!!!
-ed
: In a script I'm writing, the user enters values for a ping test in a TK GUI.
: They can enter a value for the number of pings to perform, the timeout
: between pings, etc, etc.
: If they put 'cheese' into the entry box for the number of pings they'd like
: to perform, I get the following error...
: Argument "cheese" isn't numeric in numeric le (<=) at D:\Perl\My Scripts...
: Then my script informs them that: "cheese pings completed."
: I would much rather have a popup window that tells them they're stupid, and
: to enter a proper value into the entry box, wouldn't you?
: So, you see, there IS a need for it.
I would disagree. This is just one example of the problem of validating
input. That is a common problem, but is rarely as simple as checking that
something is a number. Even when it is that simple it's not that simple.
First of all, what does a number look like? I would write one million as
1,000,000, but some people would write it as 1.000.000 which simply looks
nonsensical to me. What if someone entered 1,000,00 as the number? How
should that be interpreted?
So, I do not see that this should be a core function inside perl.
: I've had to write a series of regexs for each entry box. My life would have
: been much easier if there was a simple check that they've entered a number,
: or a whole number, as required.
I suspect there is a module that has numerous "canned" routines for
validating inputs of various sorts. I'm pretty sure there are also
numerous "canned" examples of re's to check many formats, perhaps in the
faqs.
Also, cannot the form itself be designed to force validated input in the
first place? I don't know the TK GUI at all, but many other dialog
systems have options in the GUI system to force the user to enter data in
many predetermined formats, such as all digits, or digits with a decimal,
or only x number of characters long, etc etc etc.
Because you need to be able to compare things in a numerical or string
context.
92 eq "092" # false
92 == "092" # true
0 eq "foo" # false
0 == "foo" # true
While Perl doesn't have types for variables, it does care about context,
and in many contexts the variables operated on are coerced into the
correct type. Note that Perl _changes_ the scalar it operates on when
needed. The "type" belongs to the operation, not to the scalars.
You can keep asking questions about this for weeks, but that is simply
how Perl works. It is a _fundamental_ part of the design of Perl.
Variables are not typed as integer, string, float or one of those
things. Variables are scalars, arrays or hashes (and there are a few
others). And scalars have multiple faces, depending on the context in
which they are used. That's how it is. It is deliberately not like C or
languages like that. And because of this design philosophy, the question
"Is this variable an integer or string?" is not a valid question. Perl
does not have those types for variables. The content of the variable can
be evaluated in an integer/numeric or string context, but it has no such
type itself.
Martien
--
|
Martien Verbruggen | I'm just very selective about what I accept
| as reality - Calvin
|
Except for the bitwise logical operators.
{
    my($one,$two,$three1,$three2,$ignored);
    $one="1";
    $two="2";
    $three1=$one ^ $two;
    $ignored=$one+0;
    $three2=$one ^ $two;
    if ($three1 eq $three2) {
        print "'String or numeric?' is an easy question.\n";
    } else {
        print "'String or numeric?' is a complicated question.\n";
    }
}
Anyone know if separate string and numeric versions of these are planned
for Perl 6?
/Bo Lindbergh (e-mail address is invalidated and ignored)
Well, some operators have multiple contexts, and not all operators force
a certain context. But that still does not mean the scalar has a type
like integer or string, or that the operator doesn't determine the
context.
While perl scalars do not have _a_ type, one can always ask the question
whether the PV (string) part of an SV, or the NV part are valid. Note
that this is not at all equivalent to asking whether the scalar "is a
number". It just asks the question whether the scalar has been used in a
context that required the NV or PV slot to be filled.
my $foo = "foo";
if ($foo == 1)
{
# Do something
}
Is $foo in the above numeric or not? If you ask Perl whether the NV slot
contains a valid value, it'll tell you yes (check with Devel::Peek).
However, most people would not say that that makes it numeric.
> {
> my($one,$two,$three1,$three2,$ignored);
>
> $one="1";
> $two="2";
> $three1=$one ^ $two;
> $ignored=$one+0;
> $three2=$one ^ $two;
Note that this behaviour is documented in perlop.
> if ($three1 eq $three2) {
> print "'String or numeric?' is an easy question.\n";
> } else {
> print "'String or numeric?' is a complicated question.\n";
Indeed. And as I have said before, it is complicated to answer it,
because the question is wrong. Perl does not have types like that. Type
information is entirely determined by the operators at run time, based
on what the variables look like, or how they've been used.
Adding to the confusion, I think, is that in Perl and the Perl
documentation, people still freely talk about numbers, integers and
strings. It is important to understand that that is not the same as when
people talk about these things in typed languages. In Perl, we mostly
talk about the data contained by the variable, intent, and sometimes
history of the variable.
Since variables do not have fixed types, it is almost impossible to
judge what people mean when they say numeric. Perl doesn't, and
shouldn't try to guess what you mean on this.
> Anyone know if separate string and numeric versions of these are planned
> for Perl 6?
Not me. I've only very sporadically read up on Perl 6. I wish I had more
time to keep up with it, but alas.
Martien
--
|
Martien Verbruggen | Useful Statistic: 75% of the people make up
| 3/4 of the population.
|
Because + is a regex metacharacter. Sure, one could write \+, but I've
been bitten by double interpolation/evaluation so often, that I often
use [+] and such over \+.
Abigail
--
# Perl 5.6.0 broke this.
%0=map{reverse+chop,$_}ABC,ACB,BAC,BCA,CAB,CBA;$_=shift().AC;1while+s/(\d+)((.)
(.))/($0=$1-1)?"$0$3$0{$2}1$2$0$0{$2}$4":"$3 => $4\n"/xeg;print#Towers of Hanoi