SyncIQ intra-cluster copy?


dynamox

May 28, 2013, 1:59:04 PM
to isilon-u...@googlegroups.com
Hello folks,

I have a requirement to copy data on a regular basis from /ifs/app1 to /ifs/app2 within the same cluster. It's about 3 TB of data, millions of small files, used by Linux servers. Preferably I would like to use internal tools to copy the data rather than host-based tools like rsync. The other day I decided to play with SyncIQ to see if I could get it to copy the data, but no matter what (include) options I tried, it would always complain about "SyncIQ policy target path overlaps source path on same cluster". So is it safe to assume I won't be able to use SyncIQ for an intra-cluster copy? I tried running rsync on the cluster itself and it was just as slow as using an external Linux server.

OneFS 6.5.5.12

Thanks

Saker Klippsten

May 28, 2013, 2:34:04 PM
to isilon-u...@googlegroups.com
SyncIQ is only cluster to cluster, as you found out. Though I have not tested whether you could create another set of IPs/subnet on the same cluster for another interface and sync back to the same cluster, specifying that IP as the target cluster name.

A few questions. Maybe you can talk more about your requirements.

1. Can /ifs/app2 be read-only, or will the Linux servers that access /ifs/app2 need to write to it?
2. Will it be 3 TB every time, or 3 TB on the first copy and then just changed data? Or will it be a new set of 3 TB each time, with the old 3 TB removed?

Maybe there is a way to snapshot the /ifs/app1 directory and share the snapshot directory (it will be read-only), which is why I asked. If /ifs/app2 can be read-only, then maybe you could just use snapshots and share that specific snapshot directory.


--
You received this message because you are subscribed to the Google Groups "Isilon Technical User Group" group.
To unsubscribe from this group and stop receiving emails from it, send an email to isilon-user-gr...@googlegroups.com.
For more options, visit https://groups.google.com/groups/opt_out.
 
 

LinuxRox

May 28, 2013, 2:39:50 PM
to isilon-u...@googlegroups.com
/ifs/app1 and /ifs/app2 both need to be read-write. It's OK to have /ifs/app2 read-only during the "refresh" process (something you would typically have with a SyncIQ target), but after that another set of applications will need to write to that directory. Think of it as refreshing a QA instance from Prod.

Preferably an incremental sync, but if there is a more efficient way of doing a full intra-cluster copy, I am game for that as well.

Chris Pepper

May 28, 2013, 2:49:23 PM
to isilon-u...@googlegroups.com
dynamox,

Ah, that is a sensitive topic. Yes, SyncIQ can run with both client and server on the same cluster (you can specify 127.0.0.1 as the network target, for instance), but I strongly recommend against this. Our performance was in the 10 Mbit/s range, and I spoke to more than a dozen Isilon Support reps about this 3-4 years ago. Most of them were convinced local SyncIQ was simply impossible, while others assured me it was indeed a supported configuration. But nobody was able to actually help with the local configuration, and we never fixed the performance problem. Instead we switched to a set of local rsync jobs to parallelize the work.

SyncIQ is somewhat fragile and Isilon's support for it is poor in general. Your scenario is far outside the realm where you can expect useful assistance.
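For anyone going the same route, here is a minimal sketch of the parallel-rsync idea Chris describes: fan out one copy job per top-level source directory. Everything in it is an assumption, not Chris's actual setup; the paths, the worker count, and the use of cp -a as a portable stand-in for the rsync -a --delete you would more likely run per directory on a real cluster.

```shell
#!/bin/sh
# Sketch: parallelize an intra-cluster copy by running one copy job per
# top-level source directory, $workers at a time. "cp -a" stands in for
# "rsync -a --delete"; nothing here is Isilon-specific.
parallel_copy() {
    src=$1; dst=$2; workers=${3:-8}
    mkdir -p "$dst"
    # Fan out: one job per immediate subdirectory of $src.
    find "$src" -mindepth 1 -maxdepth 1 -type d -print \
        | xargs -P "$workers" -I {} cp -a {} "$dst"/
}

# Example (hypothetical paths):
# parallel_copy /ifs/app1 /ifs/app2 8
```

The per-directory split only balances well when top-level directories are similar in size; for very skewed trees you would split one level deeper.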

Chris

LinuxRox

May 28, 2013, 3:43:32 PM
to isilon-u...@googlegroups.com
Thank you Chris. I have a virtual instance of Isilon, so I went ahead and tried with 127.0.0.1, but it complained with the message "SyncIQ policy target path overlaps source path on same cluster". I guess they added something to the code to prohibit you from even attempting that. I guess I have no choice but to use rsync for the moment.


Keith Nargi

May 28, 2013, 4:42:08 PM
to isilon-u...@googlegroups.com
Put an IP address of one of the interfaces on your cluster as the target for your SIQ job. You can definitely do SIQ within the same cluster. What you end up doing is one of the following: send data out one or more interfaces of the cluster, all directed back to a single interface of the cluster, or send data out and back on the same interface. Obviously, having only one interface as your target is going to be a bottleneck, and you are going to have workers doing both reads and writes for SIQ. Will it work if you do this? Yes. Performance is going to be the biggest question mark.

Sent from my iPhone

LinuxRox

May 28, 2013, 4:53:51 PM
to isilon-u...@googlegroups.com
Keith,

I just tried that and it's giving me the same error message. Do you have an example where it works (what did you specify for the root directory, include directory and target directory)?

Keith Nargi

May 28, 2013, 4:58:19 PM
to isilon-u...@googlegroups.com
I'll send my config when I can get in front of a cluster.

Sent from my iPhone

Keith Nargi

May 29, 2013, 3:32:00 PM
to isilon-u...@googlegroups.com
LinuxRox,
I went in and did a quick little loop: I created 10 directories and recursively built files that were 128k, 256k, 384k ... in each directory of ap1.
I created the ap2 directory as my target for SIQ.
Total data created was 100 files, about 67 MB.
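The loop itself wasn't posted; a sketch that generates an equivalent layout (sizes and names mirror the listing below, the base path is an assumption) might look like this:

```shell
#!/bin/sh
# Recreate the test layout: directories 1..10, each holding
# Testfile-1..10 sized in 128 KB multiples (Testfile-N = N * 131072
# bytes), written from /dev/zero. 100 files, ~67 MB in total.
make_testdata() {
    base=$1
    for d in 1 2 3 4 5 6 7 8 9 10; do
        mkdir -p "$base/$d"
        for n in 1 2 3 4 5 6 7 8 9 10; do
            dd if=/dev/zero of="$base/$d/Testfile-$n" \
               bs=131072 count="$n" 2>/dev/null
        done
    done
}

# e.g. make_testdata /ifs/data/ap1
```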

tester-1# isi version
Isilon OneFS v6.5.5.12 B_6_5_5_164(RELEASE): 0x605050000C000A4:Wed Oct 31 17:44:54 PDT 2012    ro...@fastbuild-03.west.isilon.com:/build/mnt/obj.RELEASE/build/mnt/src/sys/IQ.amd64.release  (I believe this is what you said you were using for a cluster OS)

tester-1# cd /ifs/data/
tester-1# ls
ap1     ap2
tester-1# cd ap1
tester-1# ls -l
total 20
drwxr-xr-x    2 root  wheel  201 May 29 10:31 1
drwxr-xr-x    2 root  wheel  201 May 29 10:33 10
drwxr-xr-x    2 root  wheel  201 May 29 10:31 2
drwxr-xr-x    2 root  wheel  201 May 29 10:31 3
drwxr-xr-x    2 root  wheel  201 May 29 10:32 4
drwxr-xr-x    2 root  wheel  201 May 29 10:32 5
drwxr-xr-x    2 root  wheel  201 May 29 10:32 6
drwxr-xr-x    2 root  wheel  201 May 29 10:32 7
drwxr-xr-x    2 root  wheel  201 May 29 10:33 8
drwxr-xr-x    2 root  wheel  201 May 29 10:33 9
tester-1# cd 1
tester-1# ls -l
total 11855
-rw-r--r--    1 root  wheel   131072 May 29 10:31 Testfile-1
-rw-r--r--    1 root  wheel  1310720 May 29 10:31 Testfile-10
-rw-r--r--    1 root  wheel   262144 May 29 10:31 Testfile-2
-rw-r--r--    1 root  wheel   393216 May 29 10:31 Testfile-3
-rw-r--r--    1 root  wheel   524288 May 29 10:31 Testfile-4
-rw-r--r--    1 root  wheel   655360 May 29 10:31 Testfile-5
-rw-r--r--    1 root  wheel   786432 May 29 10:31 Testfile-6
-rw-r--r--    1 root  wheel   917504 May 29 10:31 Testfile-7
-rw-r--r--    1 root  wheel  1048576 May 29 10:31 Testfile-8
-rw-r--r--    1 root  wheel  1179648 May 29 10:31 Testfile-9

tester-1# cd /ifs/data/ap2
tester-1# ls
tester-1#

Here is my network info

tester-2:isi networks ls p -v
tester-2: subnet0:pool0 - Default pool
tester-2:           In Subnet: subnet0
tester-2:          Allocation: Static
tester-2:              Ranges: 1
tester-2:                      192.168.0.50-192.168.0.59
tester-2:     Pool Membership: 3
tester-2:                      1:ext-1 (up)
tester-2:                      2:ext-1 (up)
tester-2:                      3:ext-1 (up)
tester-2:    Aggregation Mode: Link Aggregation Control Protocol (LACP)
tester-2:        SmartConnect:
tester-2:                      Suspended Nodes  : None
tester-2:                      Auto Unsuspend ... 0
tester-2:                      Zone             : N/A
tester-2:                      Time to Live     : 0
tester-2:                      Service Subnet   : N/A
tester-2:                      Connection Policy: Round Robin
tester-2:
here are my interfaces
tester-1# isi networks ls ifaces | grep ext
1:ext-1         up          subnet0:pool0             192.168.0.50
2:ext-1         up          subnet0:pool0             192.168.0.51
3:ext-1         up          subnet0:pool0             192.168.0.52

Here is my SIQ policy
tester-1# isi sync policy list ap1-ap2 --verbose
Id: fa1f7b2d8b823a971aada0fc443b64ba
    Spec:
        Type: user
        Name: ap1-ap2
        Description:
        Source paths:
            Root Path: /ifs/data/ap1
        Source node restriction:
        Destination:
            Cluster: 192.168.0.51
            Password is present: no
            Path: /ifs/data/ap2
            Make snapshot: off
            Restrict target by zone name: off
            Force use of interface in pool: off
        Predicate: None
        Check integrity: yes
        Skip source/target file hashing: no
        Disable stf syncing: no
        Log level: notice
        Maximum failure errors: 1
        Target content aware initial sync (diff_sync): no
        Log removed files: no
        Rotate report period (sec): 31536000
        Max number of reports: 2000
        Coordinator performance settings:
            Workers per node: 3
    Task: sync manually
    State: on

Run policy
tester-1# isi sync policy run ap1-ap2
Running ap1-ap2
tester-1# isi sync job list
Name    | Action | State   | Started/Last run  | Next Run/Duration
--------+--------+---------+-------------------+------------------
ap1-ap2 | sync   | Running | 05/29/13 11:16:45 |           37 secs

tester-1# isi sync job report
Name    | Act  | St      | Duration | Transfer | Throughput
--------+------+---------+----------+----------+-----------
ap1-ap2 | sync | Running | 3:00     |    34 MB |   1.5 Mb/s

tester-1# isi sync job report
Name    | Act  | St      | Duration | Transfer | Throughput
--------+------+---------+----------+----------+-----------
ap1-ap2 | sync | Running | 5:07     |    59 MB |   1.5 Mb/s


tester-1# isi sync job list
Name    | Action | State | Started/Last run  | Next Run/Duration
--------+--------+-------+-------------------+------------------
ap1-ap2 | sync   | on    | 05/29/13 11:16:45 |     Not scheduled

Report details

tester-1# isi sync policy report ap1-ap2 --verbose
Id: fa1f7b2d8b823a971aada0fc443b64ba
Name: ap1-ap2
   Action: sync
   Sync Type: initial
   Job ID: 1
   Started: Wed May 29 11:16:45 GMT 2013
   Run time: 6:18
   Ended: Wed May 29 11:23:03 GMT 2013
   Status: Success
   Details:
      Directories:
         Visited on source: 11
         Deleted on destination: 0
      Files:
         Total Files: 100
         Actually transferred: 100
         New files: 100
         Updated files: 0
         Automatically retransmitted files: 0
         Deleted on destination: 0
         Skipped for some reason:
            Up-to-date (already replicated): 0
            Modified while being replicated: 0
            IO errors occurred: 0
            Network errors occurred: 0
            Integrity errors occurred: 0
      Bytes:
         Total Network Traffic: 69 MB (72187573 bytes)
         Total Data: 69 MB (72089620 bytes)
         File Data: 69 MB (72089620 bytes)
         Sparse Data: 0B
      Phases (2/2):
         Treewalk (STF_PHASE_TW)
            Start: Wed May 29 11:16:54 GMT 2013
            End: Wed May 29 11:22:50 GMT 2013
         ID map backup (STF_PHASE_IDMAP_SEND)
            Start: Wed May 29 11:22:50 GMT 2013
            End: Wed May 29 11:22:56 GMT 2013

Hope this helps
--
Keith 

LinuxRox

May 29, 2013, 4:01:27 PM
to isilon-u...@googlegroups.com
Keith,

Thank you very much for posting the detailed information; your example worked perfectly. Here is my job that was failing before with the error message "SyncIQ policy target path overlaps source path on same cluster. Policy testpolicyincl syncs to local cluster and target path /ifs/data/target overlaps source base path /ifs/data (unrunnable)". I guess it's something about the include statement that it does not like:

isilonpoc-1# isi sync policy list testpolicyincl --verbose
Id: c7de2d543321c7933a87cfd8844e6bbb
    Spec:
        Type: user
        Name: testpolicyincl

        Description:
        Source paths:
            Root Path: /ifs/data
            Include: /ifs/data/source
        Source node restriction:
        Destination:
            Cluster: 10.140.13.41
            Password is present: no
            Path: /ifs/data/target
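For what it's worth, a spec arranged like Keith's working policy above (root path pointed directly at the source directory, no include under /ifs/data) should avoid the overlap, since /ifs/data/target would no longer fall under the source root. This is a hypothetical rearrangement of the same policy, not a tested config:

```
Source paths:
    Root Path: /ifs/data/source
Destination:
    Cluster: 10.140.13.41
    Path: /ifs/data/target
```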

Chris Pepper

May 29, 2013, 4:02:31 PM
to isilon-u...@googlegroups.com
I believe the include should be relative, so just 'source' in this case.

Chris

LinuxRox

May 29, 2013, 4:17:29 PM
to isilon-u...@googlegroups.com
So what do you guys think in terms of performance? I know I will most likely read and write through the same network interface, but I have 10G interfaces and my back end is 7x 108NL nodes. I will overrun my disks before I saturate my NICs. I just think this would be much speedier than doing rsync. What are your thoughts?

Chris Pepper

May 29, 2013, 4:25:19 PM
to isilon-u...@googlegroups.com
Run over 127.0.0.1 and don't worry about network bandwidth.

SyncIQ *should* be more efficient than rsync, especially if it can use a snapshot to avoid traversing large directories to find small changes (rsync must read through both source & target files), but you will need to see whether it actually runs well.

Chris

Keith Nargi

May 29, 2013, 4:53:17 PM
to isilon-u...@googlegroups.com
I just retested my config. I added an 11th folder with 7 MB of data across 10 files. I reran my SIQ job using loopback vs. a physical interface. The job completed in 59 secs with 1.5 Mbps of throughput. Rsync on the same dir-to-dir copy took 9 mins 37 secs, and it looks, based on mtime, like it rewrote all the data that was in /ifs/data/ap2, not just the new data. Don't forget SIQ will mark the ap2 dir as RO automatically.

I say use SIQ.
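A side note on Keith's rsync result: rsync's default quick-check compares size and mtime, so an earlier copy made without preserving timestamps makes every file look changed on the next pass. The sketch below is a generic illustration of that effect using cp (nothing rsync- or Isilon-specific); file names and the pinned date are made up.

```shell
#!/bin/sh
# Why mtime preservation matters for incremental copies: a plain cp
# gives the copy a fresh mtime, so a size+mtime quick-check flags it
# as changed; cp -p preserves the timestamp and it looks unchanged.
demo_mtime() {
    dir=$(mktemp -d)
    echo data > "$dir/src"
    touch -t 202001010000 "$dir/src"       # pin a known old mtime
    cp "$dir/src" "$dir/plain"             # mtime = now
    cp -p "$dir/src" "$dir/preserved"      # mtime preserved
    # -nt: true if the copy's mtime is newer than the source's
    [ "$dir/plain" -nt "$dir/src" ] && echo "plain: looks changed"
    [ "$dir/preserved" -nt "$dir/src" ] || echo "preserved: looks unchanged"
}
demo_mtime
```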

--
Keith 

Luc Simard

May 29, 2013, 5:05:48 PM
to isilon-u...@googlegroups.com
SyncIQ will dedicate one worker per file, which typically beats rsync flat out because all connected nodes can contribute to the transfer: say 4 workers per node times 10 nodes = 40 workers.

Rsync = 1 job per node, maybe a few more, and not policy based.

I also say use SyncIQ

Luc Simard - 415-793-0989
Messages may contain confidential information.
Sent from my iPhone

Saker Klippsten

May 29, 2013, 5:56:08 PM
to isilon-u...@googlegroups.com
Also note SyncIQ is block level, not file level like rsync, so subsequent syncs are way, way faster.

Blake Golliher

May 29, 2013, 6:16:00 PM
to isilon-u...@googlegroups.com
SyncIQ is block based to find file differences, but the transport is per file. The difference is moving 100 TB of 1k files vs. 100 TB of 10g files: moving the 10g files will be much faster even though it's the same amount of space. Other vendors' block-based replication moves every block with data in it to the remote filesystem, so 100 TB of 1k files or 100 TB of 10g files transfers in the same time.

-Blake

Saker Klippsten

May 29, 2013, 7:01:28 PM
to isilon-u...@googlegroups.com
Here is some more info on the process:

http://www.emc.com/collateral/hardware/white-papers/h8224-replication-isilon-synciq-wp.pdf

Efficient block-based deltas

The initial replication of a new policy or a changed policy will perform a full baseline replication of the entire dataset based on the directory and file selection policy criteria. This baseline replication is necessary to ensure all original data is replicated to the remote location. However, every incremental job execution of that policy will transfer only the bytes which have changed since the previous run (on a per-file basis). SyncIQ uses internal file system structures to identify changed blocks and, along with parallel data transfer across the cluster, minimizes the replication time window and network use. This is critical in cases where only a small fraction of the dataset has changed, as in the case of virtual machine VMDK files, in which only a block may have changed in a multi-gigabyte virtual disk file. Another example is where an application changed only the file metadata (ACLs, Windows ADS). In these cases, only a fraction of the dataset is scanned and subsequently transferred to update the target cluster dataset.



LinuxRox

May 29, 2013, 8:52:44 PM
to isilon-u...@googlegroups.com
Keith,

Did it make any difference using loopback vs. a physical interface in terms of performance?

Keith Nargi

May 29, 2013, 9:25:47 PM
to isilon-u...@googlegroups.com
To be honest, I didn't really test that. I can if you'd like, but please remember this is being tested inside a virtual cluster. Performance is better tested on physical gear, and preferably in your own environment. I could test on physical gear in three different sites with three different network topologies and see similar results, or drastically different ones, depending on the switched networks.


Sent from my iPhone

LinuxRox

May 29, 2013, 9:33:31 PM
to isilon-u...@googlegroups.com
Fair enough, I realize each environment is very different, but if you have a chance to test those two scenarios, that would be great.

LinuxRox

Jun 24, 2013, 3:30:15 PM
to isilon-u...@googlegroups.com
Hi guys, sorry to bring this up again, but I was re-reading the SyncIQ best practices WP and this sentence stood out: "SyncIQ is also able to use the same cluster as a target in order to create local replicas. In this scenario, efficient data transfer occurs across the cluster's Infiniband back-end network." In my testing, if I use 127.0.0.1 as my target cluster name, it always uses my external NIC interfaces to copy the data (as seen in InsightIQ). Thoughts?

Luc Simard

Jun 25, 2013, 6:02:07 PM
to isilon-u...@googlegroups.com
You can see the internal IP range using isi networks list ifaces -v.

But simply point the policy to the FQDN of your cluster and a target path.


Luc Simard - 415-793-0989
Messages may contain confidential information.
Sent from my iPhone

LinuxRox

Jun 25, 2013, 9:08:13 PM
to isilon-u...@googlegroups.com
Luc,

I am not following you. Are you referring to the SmartZone name? How do I force SyncIQ to use the IB connection for an intra-cluster copy?