You can also get close with daff [1]:
$ daff --context 0 --act insert --id COLUMN1 --id COLUMN2 \
    --id COLUMN3 lsh.csv rsh.csv
@@,COLUMN1,COLUMN2,COLUMN3
+++,2222,4444,1
+++,7777,7777,1
There's an extra column at the beginning though, which you could strip
with cut:
$ daff --context 0 --act insert --id COLUMN1 --id COLUMN2 \
    --id COLUMN3 lsh.csv rsh.csv | cut -d, -f 2-
COLUMN1,COLUMN2,COLUMN3
2222,4444,1
7777,7777,1
The daff flags used are as follows:
* --context 0: suppresses the unchanged context rows that daff prints
around each change by default, as a regular diff does.
* --act insert: reports insertions only, ignoring deletions and
modifications.
* --id COLUMN: adds COLUMN to the primary key used to match rows
between the two files. Repeat the flag for a multi-column key.
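If it helps to see what that combination of flags amounts to on this particular example, here is a rough equivalent in plain awk. It is only a sketch and not how daff works internally: it assumes the sample files from the original question, fields without quotes or embedded commas, and an output name (inserted.csv) that I made up for illustration.

```shell
# Sample data copied from the original question quoted below.
cat > lsh.csv <<'EOF'
COLUMN1,COLUMN2,COLUMN3
1111,1111,1
1111,1111,2
4444,4444,1
4444,4444,2
6666,6666,1
6666,6666,2
EOF

cat > rsh.csv <<'EOF'
COLUMN1,COLUMN2,COLUMN3
1111,1111,1
1111,1111,2
2222,4444,1
4444,4444,2
6666,6666,1
6666,6666,2
7777,7777,1
EOF

# First pass (lsh.csv): remember every COLUMN1,COLUMN2,COLUMN3 key.
# Second pass (rsh.csv): print the header line, then every row whose
# key was never seen in lsh.csv, i.e. the "insertions".
awk -F, '
  NR == FNR { seen[$1 FS $2 FS $3] = 1; next }
  FNR == 1 || !(($1 FS $2 FS $3) in seen)
' lsh.csv rsh.csv > inserted.csv

cat inserted.csv
```

Because all three columns make up the key here, a row that differs in any column counts as an insertion, which is why 2222,4444,1 shows up even though 4444,4444,1 was in lsh.csv.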
Cheers,
Paul
[1] https://github.com/paulfitz/daff
On 02/04/2015 03:31 PM, Aaron Schumacher wrote:
> For the example you give, the unix `comm` utility can be used.
> Assuming you do know that the structures (and headers) of the two
> files are the same, and the input files are named `left` and `right`,
> this should do it:
>
> ```
> head -n 1 left > output # to maintain the header row
> comm -13 left right >> output
> ```
>
> It is important that the input files be sorted, as shown in your example.
>
> - Aaron
>
>
> On Monday, October 6, 2014 at 8:21:21 AM UTC-4, Sri wrote:
>
> +1 for this feature. If it already exists, please let us know.
>
> On Thursday, 16 May 2013 11:56:58 UTC-4, joachi...@googlemail.com wrote:
>
> Hi all,
>
> I would like to compare two csv files lsh and rsh with the
> same structure and with a header line.
>
> Is it possible for diffkit to generate a csv file of the same
> structure containing the lines from rsh file which are not in
> lsh file?
>
> Is there an example how to specify a plan file for that job?
>
> Or can I use diffkit for that with sed or any other tool
> processing the diffkit output?
>
> Many thanks!
>
> Example
>
> *lsh file:*
> COLUMN1,COLUMN2,COLUMN3
> 1111,1111,1
> 1111,1111,2
> 4444,4444,1
> 4444,4444,2
> 6666,6666,1
> 6666,6666,2
>
> *rsh file:*
> COLUMN1,COLUMN2,COLUMN3
> 1111,1111,1
> 1111,1111,2
> 2222,4444,1
> 4444,4444,2
> 6666,6666,1
> 6666,6666,2
> 7777,7777,1
>
> *Output file:*
> COLUMN1,COLUMN2,COLUMN3
> 2222,4444,1
> 7777,7777,1
>