
Print only unique lines of various files.


David Kirkby

Jun 15, 2014, 6:05:41 AM
I have a number of files that are about 95% identical but differ in a few lines. The comments in each file follow a logical order (for example, the time follows the date) rather than a purely alphabetical one.

For example, file 1 is j.s11.s2p:

! This is a touchstone format file.
! It should be saved with .s2p extension
! HEWLETT PACKARD,8720D,0,7.74
! Date = 6 Jun 2014
! Time = 13:24:07
! Start frequency = 0.050000000 GHz
! Stop frequency = 7.000000000 GHz


File 2 is j.s22.s2p:

! This is a touchstone format file.
! It should be saved with .s2p extension
! HEWLETT PACKARD,8720D,0,7.74
! Date = 6 Jun 2014
! Time = 13:24:28
! Start frequency = 0.050000000 GHz
! Stop frequency = 7.000000000 GHz


The "Time" line is slightly different in these two files, but in principle there could be other changes.

I know one can find the unique lines by sorting them first, so

cat j.s11.s2p j.s22.s2p | sort | uniq

would print only the "Time" line twice, but the lines are then sorted alphabetically, which is not what I want. I get this:


drkirkby@buzzard:~/VNA-6/src/test$ cat j.s11.s2p j.s22.s2p | sort | uniq | more
! Angles are in degrees
! Averaging = OFF
! Averaging factor = 16
! Calibration = Full 2-port
! Calibration kit = 3.5 mm D
! Date = 6 Jun 2014
<snip lots of irrelevant lines>
! Time = 13:24:07
! Time = 13:24:28


Is there any way to print the file, copying the lines that are different (in this case those starting with "! Time"), but without totally screwing up the order of the file?

I want something that looks like this:

! This is a touchstone format file.
! It should be saved with .s2p extension
! HEWLETT PACKARD,8720D,0,7.74
! Date = 6 Jun 2014
! Time = 13:24:28
! Time = 13:24:28
! Start frequency = 0.050000000 GHz
! Stop frequency = 7.000000000 GHz

I am looking for a PORTABLE solution, using a basic Bourne shell, without any GNUisms.

Dave

Janis Papanagnou

Jun 15, 2014, 7:25:27 AM
(I assume this should be different time stamps.)

> ! Start frequency = 0.050000000 GHz
> ! Stop frequency = 7.000000000 GHz
>
> I am looking for a PORTABLE solution, using a basic Bourne shell, without any GNUisms.

Here is one possible sketch for the given samples (i.e. assuming equal
number of lines in the files)...

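# Open both files on separate descriptors so they can be read in step.
exec 3< file1 4< file2
while IFS= read -r line1 <&3 && IFS= read -r line2 <&4
do
    printf "%s\n" "$line1"
    # Print the second file's line only where it differs.
    [[ "$line2" != "$line1" ]] &&
        printf "%s\n" "$line2"
done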

If you mean Bourne shell compatibility literally, then replace the test
operator with something like [ X"$line2" != X"$line1" ] and replace
printf with echo (not that echo would be more portable, though).
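
A minimal sketch of that adjustment, assuming a POSIX sh rather than a pre-POSIX Bourne shell (it still relies on read -r and printf). The function name merge_pair is made up for illustration, and the equal-line-count assumption from above still applies.

```shell
# merge_pair FILE1 FILE2   (hypothetical helper name)
# Print FILE1 line by line; wherever the corresponding line of FILE2
# differs, print that line directly afterwards.
# Assumes both files have the same number of lines.
merge_pair() {
    while IFS= read -r line1 <&3 && IFS= read -r line2 <&4
    do
        printf '%s\n' "$line1"
        # The X prefix guards against lines that look like test operators.
        if [ "X$line2" != "X$line1" ]; then
            printf '%s\n' "$line2"
        fi
    done 3< "$1" 4< "$2"
}

# Usage: merge_pair j.s11.s2p j.s22.s2p
```

Attaching the redirections to the loop instead of using exec keeps descriptors 3 and 4 from staying open in the calling shell.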

Janis

>
> Dave
>

Frank P. Westlake

Jun 15, 2014, 9:04:03 AM
On 06/15/2014 03:05 AM, David Kirkby wrote:
> Is there any way to print the file, copying the lines that are different (in this case those starting with "! Time"), but without totally screwing up the order of the file?
>
> I want something that looks like this:
>
> ! This is a touchstone format file.
> ! It should be saved with .s2p extension
> ! HEWLETT PACKARD,8720D,0,7.74
> ! Date = 6 Jun 2014
> ! Time = 13:24:28
> ! Time = 13:24:28
> ! Start frequency = 0.050000000 GHz
> ! Stop frequency = 7.000000000 GHz

You could read
    $(diff -U 4294967295 j.s11.s2p j.s22.s2p|tail -n +4)
in a loop and output ${line:1}. The output of 'diff' alone is:

--- j.s11.s2p 2014-06-15 05:45:08.800210040 -0700
+++ j.s22.s2p 2014-06-15 05:17:08.131217317 -0700
@@ -1,7 +1,7 @@
 ! This is a touchstone format file.
 ! It should be saved with .s2p extension
 ! HEWLETT PACKARD,8720D,0,7.74
 ! Date = 6 Jun 2014
-! Time = 13:24:07
+! Time = 13:24:28
 ! Start frequency = 0.050000000 GHz
 ! Stop frequency = 7.000000000 GHz

So you would be skipping the first three lines and the first column of
each line, producing:

! This is a touchstone format file.
! It should be saved with .s2p extension
! HEWLETT PACKARD,8720D,0,7.74
! Date = 6 Jun 2014
! Time = 13:24:07
! Time = 13:24:28
! Start frequency = 0.050000000 GHz
! Stop frequency = 7.000000000 GHz
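
The same idea can be sketched with sed doing both the header skipping and the first-column stripping, which avoids the ${line:1} expansion (a bash/ksh feature, not plain sh). This is only a sketch under some assumptions: the local diff accepts -U n (GNU diff does; older systems may not), the context count 1000000 exceeds the number of lines in the files so that diff emits a single hunk, and the function name merge_diff is hypothetical.

```shell
# merge_diff FILE1 FILE2   (hypothetical helper name)
# Print FILE1 with the differing lines of FILE2 interleaved, using a
# unified diff whose context is large enough to cover the whole file.
merge_diff() {
    if cmp -s "$1" "$2"; then
        # diff prints nothing for identical files, so just show one copy.
        cat "$1"
    else
        # Drop the three header lines (---, +++, @@) and then the
        # one-character marker column (" ", "-" or "+") from each line.
        diff -U 1000000 "$1" "$2" | sed -e '1,3d' -e 's/^.//'
    fi
}

# Usage: merge_diff j.s11.s2p j.s22.s2p
```

Unlike a line-by-line loop, this does not need the two files to have the same number of lines, since diff aligns them.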

Frank

Frank P. Westlake

Jun 15, 2014, 9:38:46 AM
On 06/15/2014 06:04 AM, Frank P. Westlake wrote:
> You could read
> $(diff -U 4294967295 j.s11.s2p j.s22.s2p|tail -n +4)
> in a loop and output ${line:1}. The output of 'diff' alone is

A couple more options:

diff -D "" j.s11.s2p j.s22.s2p|grep -v '^#'
diff -D "Skip this line" j.s11.s2p j.s22.s2p|grep -v 'Skip this line'

Frank

Janis Papanagnou

Jun 15, 2014, 9:46:16 AM
I can't see that all those diff options would match the OP's "portability"
requirement; they are not even POSIX.

Janis

Frank P. Westlake

Jun 15, 2014, 9:57:56 AM
On 06/15/2014 06:46 AM, Janis Papanagnou wrote:
> I can't see that all those diff options would match the OP's "portability"
> requirement...

I can't either.

Frank



Luuk

Jun 15, 2014, 11:48:37 AM
Should the 'Time' lines be grouped?

If not:
~/tmp> awk '{ a[$0]++ }END{ for (i in a){ print i }}' j.s11.s2p j.s22.s2p
! It should be saved with .s2p extension
! Time = 13:24:28
! Stop frequency = 7.000000000 GHz
! Start frequency = 0.050000000 GHz
! This is a touchstone format file.
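
The shuffled order in that output comes from awk's for (i in a) loop, whose traversal order is unspecified. A small sketch that prints each distinct line once in first-seen order, using only standard awk; the function name uniq_keep_order is made up for illustration. Note that it does not group the "Time" lines: a line found only in the second file is printed when that file is read, so it comes after everything from the first file.

```shell
# uniq_keep_order FILE...   (hypothetical helper name)
# Print every distinct line once, in the order it is first encountered
# across all the files given.
uniq_keep_order() {
    # seen[$0]++ is 0 (false) the first time a line appears, so the
    # negated pattern is true and awk's default action prints the line.
    awk '!seen[$0]++' "$@"
}

# Usage: uniq_keep_order j.s11.s2p j.s22.s2p
```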