syncIQ progress


Dan Pritts

Jun 22, 2018, 4:36:21 PM
to isilon-u...@googlegroups.com
Hi all,

Is there any way to get detailed status out of a syncIQ job while it is running?  

Is it common for jobs to get "stuck"?

Source cluster is 7.2.1.2, target is 8.1.0.2.

I don't normally use SyncIQ; we're using it for a migration, so I'm no expert. The initial sync went fine; it took quite a while, but that was OK. Unfortunately, I have to re-sync most everything because of permission-mapping foul-ups. For this particular SyncIQ policy, I created a fresh one. I turned on differential mode via "target compare initial sync", and it isn't hammering the bandwidth between the clusters, but it is just taking bloody forever: 2 days so far, on a directory with ~300k files totalling 1.8TB. The target cluster is totally unloaded; the source cluster is in production use but has reasonable amounts of CPU and disk I/O bandwidth available.

I sniffed the wire between the two and it doesn't appear to be doing anything obviously useful. There are lots of 1-data-byte packets going back and forth, and a lot with the following payload. Not much else.

000000580000001c00000000000 [ followed by a bunch more null bytes, 88 bytes total payload ]

I'm off to open a ticket on this, but if anyone has any suggestions I'd love to hear it.

--
Dan Pritts
ICPSR Computing & Network Services
University of Michigan

Dan Pritts

Jun 23, 2018, 1:31:42 AM
to isilon-u...@googlegroups.com


Dan Pritts wrote on 6/22/18 4:36 PM:
> Hi all,
>
> Is there any way to get detailed status out of a syncIQ job while it
> is running?
...answer: not on a running job, but yes, by modifying the log level on
the policy.
> Is it common for jobs to get "stuck"?
Dunno how common but it sure seemed to have happened here.

I ended up killing the active jobs, whacking the policies, creating new,
more granular policies, and it is working much better.

danno

Adam Carrgilson

Jun 25, 2018, 4:10:17 AM
to Isilon Technical User Group
Hi Dan,

I've got the following which tells me how much my running SyncIQ jobs have worked through:

isi sync jobs list | grep running | awk {' print $1 '} | while read -r line; do Z1=$(isi sync jobs list | grep ${line}); Z2=$( isi sync jobs reports view --policy=${line} | grep "Total Network Bytes:" | awk '{ print $4 }' | awk '{ split( "B KiB MiB GiB TiB PiB" , v ); s=1; while( $1>1024 ){ $1/=1024; s++ } print int($1) v[s] }'); echo -e "${Z1} \t${Z2}"; done

I hope that helps someone!

Adam.
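
The densest part of the one-liner above is the final awk stage, which turns the raw "Total Network Bytes:" value into a rounded-down unit. A minimal sketch of that step on its own follows. The function name human_bytes is hypothetical, and it differs slightly from the original in that it uses >= 1024 and stops at PiB rather than indexing past the unit list.

```shell
#!/bin/sh
# human_bytes: convert a raw byte count into a rounded-down binary unit.
# Hypothetical helper name; the logic mirrors the awk stage of the one-liner.
human_bytes() {
    printf '%s\n' "$1" | awk '{
        n = split("B KiB MiB GiB TiB PiB", unit, " ")
        b = $1
        i = 1
        # Divide down until the value fits the current unit (capped at PiB).
        while (b >= 1024 && i < n) { b /= 1024; i++ }
        # %d truncates, matching the int() call in the original.
        printf "%d%s\n", b, unit[i]
    }'
}
```

For example, `human_bytes 1536` prints 1KiB and `human_bytes 500` prints 500B.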

Dan Pritts

Jun 26, 2018, 12:45:35 PM
to isilon-u...@googlegroups.com, Adam Carrgilson


Adam Carrgilson wrote on 6/25/18 4:10 AM:
> I've got the following which tells me how much my running SyncIQ jobs have worked through:
> [...]
> I hope that helps someone!

It does help, thanks.  The "isi sync jobs reports view" command is not exactly intuitively named, but it shows the status of a running job.

Your script has a minor error:

    Z1=$(isi sync jobs list | grep ${line});

If you have two running policies, named so that one name is a substring of the other, it'll throw off the logic. This would be better, but I'm not sure it will always work:

    Z1=$(isi sync jobs list | grep "^${line} ");
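
One way to sidestep both the substring match and any regex metacharacters in the policy name is to compare the first column as a plain string. This is only a sketch: the function name job_line is hypothetical, and it assumes (as the script already does with awk $1) that the policy name is the first whitespace-delimited column of the listing and contains no spaces.

```shell
#!/bin/sh
# job_line: read a job listing on stdin and print only the row whose
# first column is exactly the given policy name.
# Hypothetical helper; assumes the policy name is column 1 and has no spaces.
job_line() {
    # Appending "" forces a string comparison even if the name looks numeric.
    awk -v name="$1" '($1 "") == (name "")'
}
```

It would slot into the loop as `Z1=$(isi sync jobs list | job_line "${line}")`.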



Regarding my original problem, I discovered I was missing a SyncIQ-related patch on the source cluster. Since I applied it, jobs that formerly got stuck seem to be running OK, although the number of files they say they're changing seems high.

thanks
danno

Adam Carrgilson

Jun 28, 2018, 11:59:36 AM
to Isilon Technical User Group
As well as my previous CLI method, I realised I also pull the same information out through the web API:

If you request the following path:
https://[Isilon Host]:8080/platform/1/sync/jobs

you should be able to gather the SyncIQ job states, policy names, running duration and bytes transferred in JSON format.
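
A minimal sketch of fetching that path from a shell with curl. The function names and the example host are hypothetical. It assumes the cluster accepts HTTP basic auth for this endpoint and that -k is wanted because the cluster presents a self-signed certificate; drop -k if the certificate is trusted. The JSON is printed as-is, since the field names are not given in this thread.

```shell
#!/bin/sh
# sync_jobs_url: build the jobs endpoint URL for a cluster host.
# Hypothetical helper; the path and port are the ones quoted above.
sync_jobs_url() {
    printf 'https://%s:8080/platform/1/sync/jobs\n' "$1"
}

# fetch_sync_jobs: GET the endpoint and print the raw JSON response.
# Assumptions: basic auth is accepted; curl prompts for the password
# because only the user name is passed to -u.
fetch_sync_jobs() {
    host=$1
    user=$2
    # -s: no progress meter; -k: skip certificate verification (assumption).
    curl -s -k -u "$user" "$(sync_jobs_url "$host")"
}
```

Usage would look like `fetch_sync_jobs isilon.example.com admin`, with both arguments replaced by real values.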

Dan Pritts

Jun 28, 2018, 2:02:45 PM
to isilon-u...@googlegroups.com, Adam Carrgilson
I need to spend more time with the API.  Interesting.

Meanwhile, I spent an hour on WebEx with a nice engineer last night. She had several errors she was looking for, but none of them panned out. Jobs still seem to "hang". We did discover that sometimes if you cancel a hanging job, then re-run it, it will finish immediately. So whatever was stuck just needs to be inspected. Unfortunately that didn't always work.

She suggested I upgrade the old cluster to 7.2.1.5, which I didn't want to muck with, but I guess it shouldn't be a big deal.

I learned about the isi_repstate_mod command - it is a low-level interface to syncing. The --help is useful. The pol_ids it refers to are hex strings; they are shown when you view a policy with isi sync policy view.

Meanwhile, while shuffling some network interfaces around so I could get more bandwidth for syncing, one of my current cluster's nodes panicked (luckily, failover worked properly and the node came back up OK after a power cycle).  Nothing is ever easy.


Adam Carrgilson wrote on 6/28/18 11:59 AM:
--
You received this message because you are subscribed to the Google Groups "Isilon Technical User Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to isilon-user-gr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.

Scott Hunter

Jun 29, 2018, 8:39:23 PM
to Isilon Technical User Group
I'm curious what patch you found. Was it on the source cluster or on the target cluster? I'm getting very poor performance syncing from v7.2.1.4 to v8.1.0.2.

Dan Pritts

Jun 30, 2018, 4:57:55 PM
to isilon-u...@googlegroups.com, Scott Hunter


Scott Hunter wrote on 6/29/18 8:39 PM:
> I'm curious what patch you found. Was it on the source cluster or on
> the target cluster? I'm getting very poor performance syncing
> from v7.2.1.4 to v8.1.0.2.
> --

There was a very specific SyncIQ patch for 7.2.1.2; I think the bug had
been fixed by 7.2.1.4. I don't think it helped me at all - there were
some specific log messages.

There is a jumbo rollup SyncIQ patch for 8.1.0.2. I had installed that
before I got started, so no idea if it helped or hurt.

Google for the Isilon recommended patches document; they were in there.

I have ended up breaking up my syncs into many, many small jobs. It is
annoying as !@$#(&*^ but it is working. Most of the jobs eventually
complete OK, and for the ones that don't I'm just whacking the
destination and starting from scratch.

thanks
danno

John Beranek

Jul 4, 2018, 10:39:17 AM
to isilon-u...@googlegroups.com

I’ve written various tools that use the API and produce much more (to our minds) useful information than the Isilon GUI, like:

[screenshot not preserved]

Cheers,

John

 

--

John Beranek

Operations Infrastructure Architect

www.pressassociation.com



Scott Hunter

Jul 11, 2018, 6:07:37 PM
to Isilon Technical User Group


On Wednesday, July 4, 2018 at 7:39:17 AM UTC-7, John Beranek - PA wrote:

I’ve written various tools that use the API and produce much more (to our minds) useful information than the Isilon GUI


This looks very nice.  If I can beta test, let me know!

-Scott