On 05.03.2016 at 15:46, Charlie Hoffpauir wrote:
> Thanks Marcel, that's getting really, really close, and I can probably
> work with it as-is. Operating on the original text, the code was
> missing any capitalized surname that had a Parenthesis attached, but I
> can easily edit the text and remove all parentheses before running the
> code. But the code is picking up two surnames for each line number...
> the first surname that appears, and the next one... (see the listing
> for line number 100002 below)
>
> Example:
>
> 100001 GOODEN GREEN
> 100002 JOSEPH NORMAND
[...]
>
> any easy modification to fix that?
>
You don't necessarily have to edit your original file; you can just as well
add more .split() calls after the ones I put into the foreach-loop. Just be
careful with escaping the " correctly: "`"" (backtick, not backslash) or '"'
For the duplicates I suspect the names aren't separated by spaces, but
by a tab or something else. You might want to check with an editor that
can display those special characters; you can then either replace them
or add an appropriate .split(). Using such an editor you can also
make sure that all lines start with your numbers and that
there are no carriage returns / line feeds in between.
(I use Notepad++, which can easily display all those special characters;
but feel free to use anything else.)
To check the line-splitting you can add a "write-host $line[$i]" into
the for-loop. I did that when I wrote up the script to see whether the
split worked well.
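If you'd rather check from within the script, here is a minimal sketch of
the idea. The sample line and the variable names ($visible, $tokens) are
made up for illustration and are not taken from the script I posted; it
also uses the -split operator with a regex instead of chained .split()
calls, so tabs and repeated spaces are handled in one go.

```powershell
# Hypothetical sample line: a tab (`t) hides between the number and the name.
$line = "100002`tJOSEPH (NORMAND)"

# Make tabs and carriage returns visible before splitting.
$visible = $line -replace "`t", '<TAB>' -replace "`r", '<CR>'
Write-Host $visible

# Drop parentheses and double quotes, then split on any run of whitespace
# (spaces, tabs, stray line breaks).
$tokens = ($line -replace '[()"]', '').Trim() -split '\s+'

# Print each token with its index to verify the split.
for ($i = 0; $i -lt $tokens.Count; $i++) {
    Write-Host "$i : $($tokens[$i])"
}
```

If every token shows up on its own line with the index you expect, the
separators are being handled; if two names still end up in one token,
there is some other character between them.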
HTH
--
Marcel