
How to compare two JSON files line by line using Python?


Avnesh Shakya

May 27, 2013, 12:32:40 AM
Hi,
How can I compare two JSON files line by line using Python? This is what I am doing at the moment:

import simplejson as json
def compare():
    newJsonFile= open('newData.json')
    lastJsonFile= open('version1.json')
    newLines = newJsonFile.readlines()
    print newLines
    sortedNew = sorted([repr(x) for x in newJsonFile])
    sortedLast = sorted([repr(x) for x in lastJsonFile])
    print(sortedNew == sortedLast)

compare()

But I want to compare line by line and value by value. I found that JSON data is unordered, so how can I compare the files without sorting them? Please give me some ideas; I am new to this.
I want to check every value line by line.

Thanks

rusi

May 27, 2013, 12:51:59 AM
It really depends on what you mean by the two files being the same
or not.

For example, does added or removed insignificant whitespace matter?

By and large there are two approaches:
1. Treat JSON as serialized Python data structures: read the
data structures into Python and compare them there.

2. Ignore the fact that the JSON file is a JSON file; just treat it as
text and use string comparison operations.

Naturally there could be other considerations: the files could be huge,
so you might want some hybrid of the JSON and text approaches,
and so on.
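
One possible hybrid of the two approaches, as a rough sketch only: decode each file, write it back out in a canonical text form (sorted keys, fixed separators), and compare digests of that text. The function name canonical_digest and the sample strings are made up for illustration, and the code is written for Python 3.

```python
import hashlib
import json


def canonical_digest(json_text):
    """Return a SHA-256 hex digest of a canonical re-serialization of json_text."""
    data = json.loads(json_text)
    # sort_keys and fixed separators remove differences in key order and whitespace.
    canonical = json.dumps(data, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Hypothetical sample documents: same data, different key order and layout.
old_text = '{"name": "a", "tags": [1, 2]}'
new_text = '{\n  "tags": [1, 2],\n  "name": "a"\n}'

same_text = old_text == new_text
same_data = canonical_digest(old_text) == canonical_digest(new_text)
print(same_text, same_data)
```

Storing only the digest of the previous file would make the "has anything changed?" check cheap, though it cannot say what changed. Note that list order still counts as a difference here.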

Avnesh Shakya

May 27, 2013, 1:05:39 AM
to rusi, pytho...@python.org

Actually, I am extracting data in JSON format from another site and storing it in my database. When I extract the data again, I want to compare it with the last JSON file: if they are the same, there is nothing to do; otherwise I will add the new data to the database. The data may or may not change each time, so I thought sorting was required, but comparing line by line would also be good. That is my thinking so far.
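
For the workflow described above, a minimal sketch might look like the following. It rests on assumptions the thread does not confirm: that each extract is a JSON array of objects and that every object carries a unique "id" field. The function name find_new_records and the sample data are hypothetical, and the code is written for Python 3.

```python
import json


def find_new_records(old_records, new_records, key="id"):
    """Return records from new_records that are absent from, or differ from, old_records."""
    # Index the previous extract by its (assumed) unique key for fast lookup.
    old_by_key = {rec[key]: rec for rec in old_records}
    return [rec for rec in new_records if old_by_key.get(rec[key]) != rec]


# Hypothetical extracts; with real files, use json.load() on an open file instead.
old_records = json.loads('[{"id": 1, "price": 10}, {"id": 2, "price": 20}]')
new_records = json.loads(
    '[{"id": 2, "price": 25}, {"id": 1, "price": 10}, {"id": 3, "price": 5}]'
)

to_store = find_new_records(old_records, new_records)
print(to_store)
```

Because records are matched by key rather than by position, the order in which the site returns them does not matter and no sorting is needed.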


Steven D'Aprano

May 27, 2013, 1:33:11 AM
On Sun, 26 May 2013 21:32:40 -0700, Avnesh Shakya wrote:

> But I want to compare line by line and value by value. I found that
> JSON data is unordered, so how can I compare the files without sorting
> them? Please give me some ideas; I am new to this. I want to check
> every value line by line.

Why do you care about checking every value line by line? As you say
yourself, JSON data is unordered, so "line by line" is the wrong way to
compare it.


The right way is to decode the JSON data, and then compare whether it
gives you the result you expect:

import json

with open("file-a") as fa, open("file-b") as fb:
    a = json.load(fa)  # json.load() takes an open file, not a file name
    b = json.load(fb)
if a == b:
    print("file-a and file-b contain the same JSON data")

If what you care about is the *data* stored in the JSON file, this is the
correct way to check it.
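
If a plain equal/not-equal answer is not enough and you want to see which values differ, one option is to walk the two decoded structures together. The following is only a sketch: diff_json, the MISSING marker and the "$.key[index]" path notation are invented for this example and are not part of the json module. It is written for Python 3.

```python
import json

MISSING = "<missing>"  # marker for a key or index present on one side only


def diff_json(a, b, path="$"):
    """Yield (path, a_value, b_value) for every place where a and b differ."""
    if isinstance(a, dict) and isinstance(b, dict):
        # Visit the union of keys so additions and removals are both reported.
        for key in sorted(set(a) | set(b)):
            child = f"{path}.{key}"
            if key not in a:
                yield child, MISSING, b[key]
            elif key not in b:
                yield child, a[key], MISSING
            else:
                yield from diff_json(a[key], b[key], child)
    elif isinstance(a, list) and isinstance(b, list):
        # Lists are ordered in JSON, so compare them position by position.
        for i in range(max(len(a), len(b))):
            child = f"{path}[{i}]"
            if i >= len(a):
                yield child, MISSING, b[i]
            elif i >= len(b):
                yield child, a[i], MISSING
            else:
                yield from diff_json(a[i], b[i], child)
    elif a != b:
        yield path, a, b


a = json.loads('{"name": "x", "tags": ["p", "q"], "size": 1}')
b = json.loads('{"size": 2, "name": "x", "tags": ["p"], "extra": null}')

diffs = list(diff_json(a, b))
for entry in diffs:
    print(entry)
```

The key order in the two inputs is different, yet only the real differences are reported. One caveat: Python treats True == 1 as true, so a JSON true on one side and the number 1 on the other would not be flagged by this sketch (nor by a plain == on the decoded data).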

On the other hand, if you don't care about the data, but you want to
detect changes to whitespace, blank lines, or other changes that make no
difference to the JSON data, then there is no need to care that this is
JSON data. Just treat it as text, and use the difflib library.

http://docs.python.org/2/library/difflib.html
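
As a small illustration of the text-only route, difflib.unified_diff can compare two files as lists of lines. The file names are borrowed from the original post, but the contents below are invented so the sketch runs on its own; it is written for Python 3.

```python
import difflib

# Invented file contents; in practice, read them with open(...).read().
old_text = '{\n  "name": "a",\n  "size": 1\n}\n'
new_text = '{\n  "name": "a",\n  "size": 2\n}\n'

diff_lines = list(
    difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile="version1.json",
        tofile="newData.json",
    )
)
print("".join(diff_lines))
```

This reports every textual change, including ones such as re-indentation or reordered keys that leave the decoded data unchanged.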


--
Steven

Avnesh Shakya

May 27, 2013, 2:28:42 AM
to Steven D'Aprano, pytho...@python.org
Thanks a lot, I got it.


Denis McMahon

May 27, 2013, 6:50:16 AM
On Sun, 26 May 2013 21:32:40 -0700, Avnesh Shakya wrote:

> How can I compare two JSON files line by line using Python? This is
> what I am doing at the moment:

Oh what a lot of homework you have today.

Did you ever stop to think what the easiest way to compare two json
datasets is?

--
Denis McMahon, denismf...@gmail.com

Grant Edwards

May 28, 2013, 12:00:14 PM
On 2013-05-27, Steven D'Aprano <steve+comp....@pearwood.info> wrote:
> On Sun, 26 May 2013 21:32:40 -0700, Avnesh Shakya wrote:
>
>> But I want to compare line by line and value by value. I found that
>> JSON data is unordered, so how can I compare the files without sorting
>> them? Please give me some ideas; I am new to this. I want to check
>> every value line by line.
>
> Why do you care about checking every value line by line? As you say
> yourself, JSON data is unordered, so "line by line" is the wrong way to
> compare it.

There's no such thing as "lines" in JSON anyway. Outside of string
literals, all whitespace is equivalent, so replacing all newlines with
space characters results in equivalent blobs of JSON -- but one is a
single line, and the other is multiple lines.
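
A quick way to check this for yourself, as a sketch with a made-up document (Python 3). The "\\n" inside the string value is an escaped newline within a JSON string literal, so it is not touched by the replacement:

```python
import json

# Made-up document: real newlines between tokens, an escaped one inside a string.
multi_line = '{\n  "a": [1, 2],\n  "b": "two\\nlines"\n}'
one_line = multi_line.replace("\n", " ")

print(len(multi_line.splitlines()), len(one_line.splitlines()))
print(multi_line == one_line)                          # text comparison
print(json.loads(multi_line) == json.loads(one_line))  # data comparison
```

The two strings differ as text, and one has several lines while the other has a single line, but both decode to the same Python object.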

> The right way is to decode the JSON data, and then compare whether it
> gives you the result you expect:
>
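> import json
>
> with open("file-a") as fa, open("file-b") as fb:
>     a = json.load(fa)  # json.load() takes an open file, not a file name
>     b = json.load(fb)
> if a == b:
>     print("file-a and file-b contain the same JSON data")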
>
> If what you care about is the *data* stored in the JSON file, this is
> the correct way to check it.

Yup.

--
Grant Edwards (grant.b.edwards at gmail.com)
Yow! Are we laid back yet?