
Validate SSN regex issue


bubunia...@gmail.com

Jul 29, 2015, 6:50:29 AM
Hi all,
I am a newbie working on a regex problem in Perl. I am getting junk output, apparently because the command-line input is parsed improperly. Can anyone please help me with this?

Regards
Pradeep

Problem Statement

Validate SSN:
<char><char><char><char><char><digit><digit><digit><digit><char>
The task is to figure out if the SSN is valid or not.

Output Format

For every SSN listed from command line, print YES if it is valid and NO if it isn't.

Sample Input

3
ABCDS1234Y
ABAB12345Y
avCDS1234Y


My Code :

#!/usr/bin/perl -w
use strict;
use warnings;

sub isvalidpan {
    my $number_of_pan = shift @ARGV;
    print "Initial Number of PAN: $number_of_pan\n";

    foreach(@ARGV) {
        if($number_of_pan) {
            print "$_\n";
            chomp(@ARGV);
            validatePAN($_);
        }
    }
}

sub validatePAN {
    my $str = shift;

    #my @out = split(/""/,$str);
    #print "OUT: @out\n";

    print " Validating the PAN inside function: $str\n";
    #while (@out) {
    if ($str =~m/[A-Z][A-Z][A-Z][A-Z][A-Z]\d\d\d\d[A-Z]{10}/) {
        print STDOUT "YES" ;
    }
    else {
        print STDOUT "NO" ;
    }
    #}
}

&isvalidpan(@ARGV);


Rainer Weikusat

Jul 29, 2015, 7:51:31 AM
bubunia...@gmail.com writes:
> I am a newbie and was trying one regex problem in perl. I am
> getting junk output due to improper parsing of command line
> input. Can anyone please help me in this regard?

[...]


> Validate SSN:
> <char><char><char><char><char><digit><digit><digit><digit><char>
> The task is to figure out if the SSN is valid or not.

[...]


> if ($str =~m/[A-Z][A-Z][A-Z][A-Z][A-Z]\d\d\d\d[A-Z]{10}/) {

This regex matches any string containing five uppercase letters followed
by 4 digits followed by 10 uppercase letters. You should probably use

/^[A-Z]{5}\d{4}[A-Z]$/

instead (which matches a string composed of 5 upper-case letters
followed by 4 digits followed by another upper-case letter).
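
A minimal sketch of the difference between the two patterns, run against the sample inputs from the original post plus one padded string. The helper names `matches_original` and `matches_anchored` are hypothetical and only serve the illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: the regex from the original post.
# Unanchored, and the trailing {10} applies to the last [A-Z] only.
sub matches_original {
    my ($str) = @_;
    return $str =~ /[A-Z][A-Z][A-Z][A-Z][A-Z]\d\d\d\d[A-Z]{10}/ ? 1 : 0;
}

# Hypothetical helper: the anchored regex suggested above.
sub matches_anchored {
    my ($str) = @_;
    return $str =~ /^[A-Z]{5}\d{4}[A-Z]$/ ? 1 : 0;
}

for my $candidate (qw(ABCDS1234Y ABAB12345Y avCDS1234Y xxABCDS1234Yxx)) {
    printf "%-16s original=%d anchored=%d\n",
        $candidate, matches_original($candidate), matches_anchored($candidate);
}
```

Of these four strings only ABCDS1234Y should pass the anchored pattern. None should pass the original one, because it insists on ten uppercase letters after the digits.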

Jens Thoms Toerring

Jul 29, 2015, 8:10:33 AM
bubunia...@gmail.com wrote:
> Hi all,
> I am a newbie and was trying one regex problem in perl. I am getting junk output due to improper parsing of command line input. Can anyone please help me in this regard?

> Regards
> Pradeep
>
> Problem Statement

> Validate SSN:
> <char><char><char><char><char><digit><digit><digit><digit><char>
> The task is to figure out if the SSN is valid or not.

If it's that simple (I don't know anything about SSNs, we don't
have them over here) the proper regex would be

/^[A-Z]{5}\d{4}[A-Z]$/i

I.e. it starts with 5 letters, followed by 4 digits and
another letter. The 'i' modifier makes the match case-insensitive.
The '^' at the start and the '$' at the end of the
regex make sure that no additional characters can be present.

So your whole program could be done as

#!/usr/bin/perl
use strict;
use warnings;

while ( my $ssn = shift @ARGV ) {
    print $ssn, ": ", $ssn =~ /^[A-Z]{5}\d{4}[A-Z]$/i ? "yes" : "no", "\n";
}

If you want to read from stdin instead of having to pass the
SSNs as arguments, replace the 'shift @ARGV' bit with '<>'
(and chomp each line, since it arrives with its newline).
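
A minimal sketch of the read-from-a-stream variant. The helper name `check_lines` is hypothetical, and the in-memory filehandle only stands in for real input so the example is self-contained; passing `\*STDIN` instead would read standard input.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: read one candidate per line from a filehandle
# and return a "candidate: yes|no" string for each non-empty line.
sub check_lines {
    my ($fh) = @_;
    my @results;
    while ( defined( my $ssn = <$fh> ) ) {
        chomp $ssn;             # drop the trailing newline
        next if $ssn eq '';     # ignore empty lines
        push @results,
            "$ssn: " . ( $ssn =~ /^[A-Z]{5}\d{4}[A-Z]$/i ? "yes" : "no" );
    }
    return @results;
}

# Sample data from the original post, served from an in-memory filehandle.
my $input = "ABCDS1234Y\nABAB12345Y\navCDS1234Y\n";
open my $fh, '<', \$input or die "cannot open in-memory input: $!";
print "$_\n" for check_lines($fh);
close $fh;
```

The explicit `defined` keeps the loop going even for a line that Perl would treat as false, such as a lone "0".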

Regards, Jens
--
\ Jens Thoms Toerring ___ j...@toerring.de
\__________________________ http://toerring.de

bubunia...@gmail.com

Jul 29, 2015, 8:19:51 AM
Thanks for your help. The problem is that when the code is run with the following arguments, the YES result for the first PAN is printed on the same line as the second argument (YESAVMNP6543P), which is not what I intend.

1. C:\Perl\bin>perl test.pl 2 AVDPS4325N AVMNP6543P
Initial Number of PAN: 2
AVDPS4325N
Validating the PAN inside function: AVDPS4325N
YESAVMNP6543P
Validating the PAN inside function: AVMNP6543P
YES
C:\Perl\bin>

2. I want the number of arguments to be controlled by the number of SSN strings entered (the first argument). I am wondering how I can do that.

C:\Perl\bin>perl test.pl 3 AVDPS4325N AVMNP6543P
Initial Number of PAN: 3
AVDPS4325N
Validating the PAN inside function: AVDPS4325N
YESAVMNP6543P
Validating the PAN inside function: AVMNP6543P
YES
C:\Perl\bin>

George Mpouras

Jul 29, 2015, 8:34:49 AM
On 29/7/2015 1:50 PM, bubunia...@gmail.com wrote:
> ABCDS1234Y
> ABAB12345Y
> avCDS1234Y
>


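use strict; use warnings; use feature 'say';

# 5 letters, 4 digits, 1 letter; anchored at both ends, case-insensitive
my $regex = qr/^[a-z]{5}\d{4}[a-z]$/i;

while (my $line = <DATA>) {
    next if $line =~ /^\s*(#.*)?$/;   # skip blank lines and comments
    chomp $line;
    say "$line : " . ( $line =~ $regex ? 'ok' : 'no' );
}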

__DATA__

# some sample data

ABCDS1234Y
ABAB12345Y
avCDS1234Y

Jens Thoms Toerring

Jul 29, 2015, 8:41:16 AM
bubunia...@gmail.com wrote:
> Thanks for your help. But the problem is when the code is run with the following arguments the response of validation of ARGV[1]=YES is concatenating with the second argument(YESAVMNP6543P) which is what I dont intend to do.

Well, if you want a line break, output a "\n" after the "YES"
or "NO". The print function doesn't add one unless you
explicitly tell it to. BTW, you don't need to tell print
to print to STDOUT; that's what it does by default. So

print "YES\n";

prints to STDOUT and that with a line-break after the "YES".
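
A short sketch of three ways to get the line break. The helper name `verdict` is hypothetical; `say` assumes Perl 5.10 or later.

```perl
use strict;
use warnings;
use feature 'say';    # available since Perl 5.10

# Hypothetical helper: turn a match result into the answer text.
sub verdict {
    my ($is_valid) = @_;
    return $is_valid ? "YES" : "NO";
}

print verdict(1), "\n";    # explicit newline passed to print
say verdict(0);            # say appends the newline itself

{
    local $\ = "\n";       # output record separator, appended by each print
    print verdict(1);
}
```

The `local $\` form only affects `print` calls made while the block is executing, which keeps the change from leaking into the rest of the program.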

George Mpouras

Jul 29, 2015, 8:42:46 AM
The number of arguments is scalar @ARGV.

To iterate over the arguments:


foreach (@ARGV)
{
    print "$_\n";
}


or


for (my $i=0; $i<@ARGV; $i++)
{
    print "arg $i is $ARGV[$i]\n"
}
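
One possible way to let the leading count control how many values are processed, sketched under the assumption that the first argument is the declared count and that extra values should be ignored. The helper name `select_candidates` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return only as many candidates as the declared
# count allows, and die when fewer were supplied than the count promises.
sub select_candidates {
    my ($count, @candidates) = @_;
    die "count must be a non-negative integer\n"
        unless defined $count && $count =~ /^\d+$/;
    die "expected $count values, got " . scalar(@candidates) . "\n"
        if @candidates < $count;
    return @candidates[ 0 .. $count - 1 ];    # ignore any extras
}

# In a real script this would be select_candidates(@ARGV).
my @selected = select_candidates( 2, qw(AVDPS4325N AVMNP6543P EXTRA0000X) );
print "$_\n" for @selected;
```

With this check, a call such as `perl test.pl 3 AVDPS4325N AVMNP6543P` would stop with an error instead of silently validating two values.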




gamo

Jul 29, 2015, 9:21:27 AM
On 29/07/15 at 14:19, bubunia...@gmail.com wrote:
Does the number of arguments passed count the possible concatenations?

If not, then it could be handled as

perl -E '$a="YESabcd1234a"; if ($a=~/(YES|NO|)?\D{4}\d{4}\D/){
say 1;}else{say 0;};'
1

If yes, then I don't know yet.

--
http://www.telecable.es/personales/gamo/
The generation of random numbers is too important to be left to chance

bubunia...@gmail.com

Jul 29, 2015, 11:48:41 AM
On Wednesday, July 29, 2015 at 6:51:27 PM UTC+5:30, gamo wrote:
> On 29/07/15 at 14:19, bubunia...@gmail.com wrote:
> > On Wednesday, July 29, 2015 at 5:21:31 PM UTC+5:30, Rainer Weikusat wrote:
> >> bubunia...@gmail.com writes:
> >>> I am a newbie and was trying one regex problem in perl. I am
> >>> getting junk output due to improper parsing of command line
> >>> input. Can anyone please help me in this regard?
> >>
> >> [...]
> >>
> >>
> >>> Validate SSN:
> >>> <char><char><char><char><char><digit><digit><digit><digit><char>
> >>> The task is to figure out if the SSN is valid or not.
> >>
> >> [...]
> >>
> >>
> >>> if ($str =~m/[A-Z][A-Z][A-Z][A-Z][A-Z]\d\d\d\d[A-Z]{10}/) {
> >>
> >> This regex matches any string containing five uppercase letters followed
> >> by 4 digits followed by 10 uppercase letters. You should probably use
> >>
> >> /^[A-Z]{5}\d{4}[A-Z]$/
> >>
> >> instead (which matches a string composed of 5 upper-case letters
> >> followed by 4 digits followed by another upper-case letter).
> >
> > Thanks for your help. But the problem is when the code is run with the following arguments the response
>
> of validation of ARGV[1]=YES is concatenating with the second
> argument(YESAVMNP6543P)
>
> which is what I dont intend to do.

Thanks all for your help.
> >
> > 1. C:\Perl\bin>perl test.pl 2 AVDPS4325N AVMNP6543P
> > Initial Number of PAN: 2
> > AVDPS4325N
> > Validating the PAN inside function: AVDPS4325N
> > YESAVMNP6543P
> > Validating the PAN inside function: AVMNP6543P
> > YES
> > C:\Perl\bin>
> >
> > 2. I want the number of arguments gets controlled by the number of SSN strings entered. Wondering how can I do it.
> >
> > C:\Perl\bin>perl test.pl 3 AVDPS4325N AVMNP6543P
> > Initial Number of PAN: 3
> > AVDPS4325N
> > Validating the PAN inside function: AVDPS4325N
> > YESAVMNP6543P
> > Validating the PAN inside function: AVMNP6543P
> > YES
> > C:\Perl\bin>
> >
>
> Does the number of arguments passed count the possible concatenations?
>
> If no, then could be handled as
>
> perl -E '$a="YESabcd1234a"; if ($a=~/(YES|NO|)?\D{4}\d{4}\D/){
> say 1;}else{say 0;};'
> 1
>
> If yes, then I don't know yet.
>


Good point. No, it does not count the number of characters, which should not be more than 10. That is the reason I tried passing {10} in the regex in my original code, but it was not working. I also tried your code, but it does not count the number of characters properly either; I am not sure what is wrong. I think the best way to fix the character count is to group each item, collect the groups in $1, $2 and $3, and check that their combined length is 10. Let me try that. Any suggestions or comments?

if (($str = ~ m/^[A-Z]{5}\d{4}[A-Z]$/) && ($str = ~/\D{5}\d{4}\D/)) {
print "YES\n" ;
}
else {
print "NO\n" ;
}

I passed more than 10 characters in the SSN, but it still prints YES.

C:\Perl\bin>perl test.pl 2 AVDPS4325NXaa AVMNP6543P
Initial Number of PAN: 2
AVDPS4325NXaa
Validating the PAN inside function: AVDPS4325NXaa
YES
AVMNP6543P
Validating the PAN inside function: AVMNP6543P
YES

C:\Perl\bin>





Jens Thoms Toerring

Jul 29, 2015, 12:27:53 PM
bubunia...@gmail.com wrote:
> Good point..No it does not count the number of characters which should not
> more than 10. That is the reason I tried something passing {10} in my
> original code in the regex. But it was not working. I tried your code also
> but it also doesnot count the number of characters properly not sure what is
> wrong. I think best way to fix counting the number of characters is grouping
> each items and collect in $1,$2,$3 and sum of $1+$2+$3 should be 10. Let me
> try that. Any Suggestion/Comments ?

> if (($str = ~ m/^[A-Z]{5}\d{4}[A-Z]$/) && ($str = ~/\D{5}\d{4}\D/)) {

There is no space allowed between the '=' and the '~'. '=~'
is an operator and if you insert a space it becomes something
completely different. With the space you're matching against
'$_' and then assigning the bitwise complement of the result of
the match to '$str' (which is then checked for being true or
false).

All you need is

if ( $str =~ /^[A-Z]{5}\d{4}[A-Z]$/ ) {

which is only ever true if '$str' has five letters between A
and Z coming first, then 4 decimal digits and then again a
single character between A and Z. It won't be true under any
other circumstances. And it won't be true if '$str' contains
more than 10 characters (with the only exception of a trailing
newline). So there's absolutely no need to count characters.
And the part beginning with '&&' in your line of
code is superfluous, it's just a less restrictive match than
the first, so if the first match was successful, the second
also is.
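
A small sketch of why the version with the space always printed YES. The helper names `check_with_space` and `check_bound` are hypothetical, and `$_` is set to an unrelated string only to make visible which variable the match really tests. It assumes the default (non-`bitwise` feature) behaviour of `~`.

```perl
use strict;
use warnings;

my $pattern = qr/^[A-Z]{5}\d{4}[A-Z]$/;

# Hypothetical helper: the accidental form with a space.
# The match runs against $_, its result is complemented with ~,
# and that (non-zero) number is assigned to $str.
sub check_with_space {
    my ($str) = @_;
    local $_ = 'unrelated';
    return ( $str = ~ m/$pattern/ ) ? "YES" : "NO";
}

# Hypothetical helper: the binding operator as intended.
sub check_bound {
    my ($str) = @_;
    return $str =~ $pattern ? "YES" : "NO";
}

for my $candidate (qw(AVDPS4325N AVDPS4325NXaa)) {
    printf "%-14s with space: %-3s  bound: %s\n",
        $candidate, check_with_space($candidate), check_bound($candidate);
}
```

Whether the match succeeds or fails, its numeric complement is non-zero, so the spaced form is true for every input; only the `=~` form rejects the 13-character string.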

bubunia...@gmail.com

Jul 29, 2015, 2:09:55 PM
Excellent, it works. Thanks a lot. You are a real genius :-)

shar...@hotmail.com

Jul 29, 2015, 2:29:37 PM
When you have "use warnings" enabled, don't put "-w" on the #! line.

You can use the code below as a sort of boilerplate for your validation
activities, as and when you need to scale up the checks on your
PANs/SSNs.

Also, it's not clear what purpose is served by the first parameter, which
is a number and is not used anywhere.


#!/usr/bin/perl

use v5.8.8; # or later
use strict;
use warnings;

use Carp qw( croak );
use File::Basename qw( basename );
use List::Util qw( max );

local $\ = qq{\n}; # auto-print newlines
local $, = qq{};

my $PN = basename $0; # program name

my $USAGE_MSG = <<"USAGE";
Usage: $PN Num_of_PANs PAN1 PAN2 ... PANn

Ex: $PN 3 AVDPS4325N AVMNP6543P DF45FGH7U
USAGE

my $VALID_PAN_REGEX = qr{
    ^           # Begin the pan string
    [A-Z]{5}    # 5 uppercase characters, followed by
    [0-9]{4}    # 4 numeric digits, and ending with a
    [A-Z]       # single uppercase character.
    $           # End of the pan string
}x;

my %show_pan = (
    gud => sub { "\33[32m[$_[0]]\33[0m" },
    bad => sub { "\33[31m[$_[0]]\33[0m" },
);

use vars qw( *red );
local *red = $show_pan{bad};

my %Constraints = (
    usage => sub {
        croak
            join "\n", (
                red(qq{[ERROR] $PN: Incorrect Number of parameters.}),
                qq{\tSupplied: '$_[0]'},
                qq{\tRequired: at least '2'.},
                q{},
                qq{$USAGE_MSG},
            )},

    non_empty => sub {
        length($_[0]) > 0 or
            croak
                join "\n", (
                    red(qq{[ERROR] $PN: Empty first parameter.}),
                    qq{\tSupplied: '$_[0]'.},
                    qq{\tRequired: numeric.},
                    qq{},
                    qq{$USAGE_MSG},
                )},

    is_numeric => sub {
        $_[0] =~ m/\D/ and
            croak
                join "\n", (
                    red(qq{[ERROR] $PN: Nonnumeric first parameter.}),
                    qq{\tSupplied: '$_[0]'.},
                    qq{\tRequired: numeric.},
                    qq{},
                    qq{$USAGE_MSG},
                )},
);

sub validate_PANs {
    my ($number_of_PANs, @PANs) = @_;

    print "Initial number of PANs: $number_of_PANs.";

    my $argc = @PANs;
    my $fmt = max map { length() } @PANs;
    my @dots = map { "." x ($fmt - length() + 3) } @PANs;

    while ( defined(my $current_pan = shift @PANs) ) {
        printf '%s: %s%s',
            q{Validating the PAN inside the function},
            qq{<$current_pan>},
            $dots[$argc-$#PANs-2]
        ;

        my $is_a_valid_PAN = $current_pan =~ m/$VALID_PAN_REGEX/;

        my $result = $is_a_valid_PAN ? "YES" : "NO";

        print $show_pan{ $is_a_valid_PAN ? "gud" : "bad" }->($result);
    }
}

# validate constraints on the number of params supplied
$Constraints{usage}->(0+@ARGV)
    if @ARGV < 2;

# validate constraints on the 1st param
$Constraints{$_}->($ARGV[0])
    for qw( non_empty is_numeric );

validate_PANs( @ARGV );

__END__