How to Organize Your Isilon File System


gtjones

Jun 26, 2013, 8:16 AM
to isilon-u...@googlegroups.com
I'm researching how best to organize my file system and balance some of the limitations of Isilon. I have two clusters: one is the primary and the other is the backup. We use SyncIQ to back up the primary and create 35 days' worth of snaps on the backup cluster. We also have 3-5 days' worth of snaps on the primary for quick restores. I'm running 6.5.5.22.

I would like to have it organized by major business area, then minor business area. Each request for storage would initiate a new directory (aka data area) in the minor business folders. On that data area folder I would place a CIFS share or NFS export, a quota, security, and replication for backups. Hundreds of these data areas are likely, with each one requiring a 35-day backup (i.e. 35 snaps). Isilon limits the number of snaps to 2048 per array, and we'll easily exceed that. That means I'll need to push the replication job higher in the tree, thereby reducing the number of folders that are snapped. If I do this, the file system structure can get a little ugly, or I lose the ability to offer my customers that flexibility.

Potential ways to solve the problem:
1. Create backup and no-backup branches higher in the tree. Not ideal, since customers often request multiple data areas in the same folder with different backup requirements, and customers don't want to see backup/no-backup in their directory path.
2. Same as above, but use soft links to make each data area visible under a single folder (see the sketch after this list). This may solve it; I'm looking for pitfalls on this one.
3. Replicate everything high in the tree and change the exclusion list on the sync job for any new data areas that don't need to be backed up. Not good, because every exclusion change requires a full sync.
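
A rough sketch of option 2 (paths and names are invented for illustration; whether CIFS clients follow the links cleanly is exactly the open question):

  # hypothetical layout: real data areas live under branches that are
  # either included in or excluded from the SyncIQ job
  mkdir -p /ifs/finance/accounting           # customer-facing minor-business folder
  mkdir -p /ifs/finance/backup/projectX      # covered by SyncIQ + snaps
  mkdir -p /ifs/finance/no-backup/scratchY   # excluded from SyncIQ

  # soft links expose both data areas under the single customer-facing folder
  ln -s /ifs/finance/backup/projectX    /ifs/finance/accounting/projectX
  ln -s /ifs/finance/no-backup/scratchY /ifs/finance/accounting/scratchY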

Any experience out there with organizing your file system around this limitation? Are there other things I should be concerned about?

Here's a picture: [image attachment not included]


Thanks,
Greg


Peter Serocka

Jun 26, 2013, 9:51 AM
to isilon-u...@googlegroups.com
Greg,

2048 per cluster is a soft limit, but performance might suffer
starting around "4000 or so".
(page 22 of http://www.emc.com/collateral/hardware/white-papers/h10588-isilon-data-availability-protection-wp.pdf)
You might carefully explore what your backup(!) cluster can bear...

Different snapshot schedules can be "layered".
This allows for a scheme as follows:

The deeper in the dir hierarchy you get, the more
dir-specific the schedules become, with lower
frequency but longer retention time, as in this sketch:

Upper dir level: hourly, keep 50 hours, 1 dir => 50 snap instances

Middle dir level: daily, keep 15 days, 10 dirs => 150 snap instances

Deep dir level: weekly, keep 8 weeks, 100 dirs => 800 snap instances

That gives a pretty flexible scheme with well under 1500 snapshot
instances in this example. The cost is reduced granularity (in time)
for older snapshots, and reduced granularity (in dir space) for younger
snapshots.
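
A quick back-of-the-envelope check of the instance counts above (the directory counts are just the made-up numbers from this sketch):

  # instances per level = number of dirs x snapshots retained per dir
  upper=$((   1 * 50 ))   # 1 dir, hourly, keep 50 hours
  middle=$(( 10 * 15 ))   # 10 dirs, daily, keep 15 days
  deep=$((  100 *  8 ))   # 100 dirs, weekly, keep 8 weeks
  echo "total snapshot instances: $(( upper + middle + deep ))"   # 1000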

The hourly snapshots are provided to all clients in this sketch, but
kept for only two days. That can make it acceptable to give them "for
free" even when not requested.

You are using a backup cluster and might not be interested in hourly
snapshots on either cluster, but I hope you get the idea of how layering
and correlating dir depth, frequency, and retention can help reduce
the number of snap instances.

That said, the snapshot logic is still deeply interwoven with the file
system itself, and you might end up with a layout like:

/ifs / major business / backup-policies (hidden) / data...
/ifs / major business / minor business / links or aliases to the data

I'm not sure whether one kind of link or alias will work
for both NFS and CIFS...

You can't set a quota per minor business in this scheme though, only per
data folder (= minor business x backup).

Peter

Peter Serocka

Jun 26, 2013, 9:59 AM
to isilon-u...@googlegroups.com

On Wed, 26 Jun 2013, at 21:51, Peter Serocka wrote:

> Greg,
>
> 2048 per cluster is a soft limit, but performance might suffer
> starting around "4000 or so".

Just for the record -- how beautiful advertising language can be:

"SnapshotIQ supports a highly scalable number of snapshots throughout
an Isilon storage cluster and up to 1,024 snapshots per directory."

from http://www.emc.com/storage/isilon/snapshotiq.htm

Cheers

Peter

Luc Simard

Jun 26, 2013, 10:55 AM
to isilon-u...@googlegroups.com, isilon-u...@googlegroups.com
You should balance Snapshot usage against using SyncIQ from your source to target.

Under v7, you will get additional tools to simplify the restore of snaps, in addition to getting access to failover/failback.

Luc Simard - 415-793-0989
simard.j...@gmail.com

gtjones

Jun 26, 2013, 11:41 AM
to isilon-u...@googlegroups.com
Pete,

Thanks for the helpful reply.

What about something like this:

/ifs/Major Business/Minor Business/hidden directory ("no-backup")/Data Area

The Minor Business directory would have a soft link to the Data Area folder. My SyncIQ job would be set at the Major Business level and exclude all directories with the name ".no-backup". The soft link in the Minor Business folder gives them access to the Data Area folder.

I'm not sure about the implications of doing this (such as CIFS clients following soft links), but I can still get useful quota info at the major and minor business levels.
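
On the file-system side that would look roughly like this (names invented for illustration; the ".no-backup" exclusion itself is configured on the SyncIQ policy, which is not shown here):

  # hypothetical major business "Sales", minor business "EMEA"
  mkdir -p "/ifs/Sales/EMEA/.no-backup/DataArea1"

  # soft link so clients browsing the Minor Business folder still see the data area
  ln -s "/ifs/Sales/EMEA/.no-backup/DataArea1" "/ifs/Sales/EMEA/DataArea1"

  # the SyncIQ policy is rooted at /ifs/Sales and excludes any dir named .no-backup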

Thoughts?

Thanks Again

Andrew Stack

Jun 26, 2013, 12:42 PM
to isilon-u...@googlegroups.com
Hi Greg,

Instead of business units, you may want to consider organizing your data by policy with an associated SLA (see below). This is basically what you are striving for, as far as I can tell from a quick read. Snap policies are set at a higher level, which gets around your 2048 limitation, and replication again is done at a high-level directory. Any directory you create subsequently under each respective path simply inherits the associated SLA. SyncIQ is very straightforward at this point, as you just replicate your Prod directories to your DR cluster. Simple and clean.


[Inline image: example directory layout by policy, not included]
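
In shell terms, a hypothetical version of that policy-based top level might look like this (directory names are invented; the snapshot and SyncIQ policies themselves would be attached to these roots):

  # one root per SLA / policy combination
  mkdir -p /ifs/prod-daily35   # daily snaps kept 35 days, replicated to DR
  mkdir -p /ifs/prod-nosnap    # replicated to DR, no snapshots
  mkdir -p /ifs/scratch        # no snapshots, no replication

  # a new storage request just gets a directory under the matching policy root
  # and inherits that root's snapshot/replication SLA
  mkdir -p /ifs/prod-daily35/sales/emea-reporting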

Cheers,

- Andrew Stack





--
Andrew Stack
Sr. Storage Administrator
Genentech

Peter Serocka

Jun 27, 2013, 2:15 AM
to gtjones, isilon-u...@googlegroups.com
Greg,

it will probably depend mostly on the soft-links question;
let us know when you try it out in your context.

By creating that single .no-backup folder per Minor Business,
do you feel you have enough flexibility for the future?
Or is it likely you will come up with .backup1, .backup2, ...
after some time?

In contrast to Andrew's suggestion, you are starting
a kind of storage virtualisation from the clients' side here
(different policies or service levels transparently
within one share). It clearly shows both the power and the
current limitations of OneFS; I find it highly fascinating.

Actually, we are doing a somewhat hybrid approach over here
(considering disk/node pools, snapshots, NDMP tape backup, no SyncIQ).

Our top-level is, in fact, split by policy: 

Low latency, frequent but global snapshots, daily tape => homes/desktops/documents/e-mails
Large capacity, daily/weekly snapshots per share, daily tape => valuable research data
Medium capacity, daily/weekly snapshots per share, no tape => transient research data, "scratch"

The lower levels are differently organized for these three branches.

For example, the "large capacity" branch is subdivided by
research groups (~ major business) and then by projects (~ minor business).
At the project level we have quotas, snapshot, and tape backup policies,
roughly following the layered approach from my previous post.
This helps us react to large unforeseen changes in
volume size (per research project) on short notice!

The low-latency branch ("homes"), in contrast,
is organised more flatly -- no disruptive fluctuations are seen here.

No soft links are needed -- from the clients' view,
the users' home dirs are on different paths/shares
than the research data anyway. Same for scratch data.
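
In path terms (directory names are ours and purely illustrative), the split looks roughly like:

  mkdir -p /ifs/homes/alice                   # low latency, global snaps, daily tape
  mkdir -p /ifs/research/neuro-lab/projectA   # large capacity, per-project quota/snaps/tape
  mkdir -p /ifs/scratch/neuro-lab             # medium capacity, snaps only, no tape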

I am curious to learn whether the announced ViPR product(s)
will further help streamline and unify
these complementary approaches.

Peter




Peter Serocka
CAS-MPG Partner Institute for Computational Biology (PICB)
Shanghai Institutes for Biological Sciences (SIBS)
Chinese Academy of Sciences (CAS)
320 Yue Yang Rd, Shanghai 200031, China





Peter Serocka

Jun 27, 2013, 2:36 AM
to Andrew Stack, isilon-u...@googlegroups.com
Andrew,

this looks very much like the approach of designing LUNs with
different characteristics in the first place
and then provisioning storage from those to meet the SLAs.

Luckily, when doing this on a single OneFS, one clearly benefits
from Isilon's inherent storage-side "virtualisation" (no LUNs!).

But what do you do when the snapshots for, say, /p01
get too large due to a single client? Manually removing
snaps will affect all clients in that production branch.
That might be an issue if large fluctuations in volume size are permitted
or data is modified at high rates.

It is exciting to see how both ends (storage view and clients' view)
are moving towards each other. (See my recent post to Greg.) 

Peter



On 27 Jun 2013, at 00:42, Andrew Stack wrote:

> Hi Greg,
>
> Instead of business units you may want to consider organizing your data by policy with associated SLA (see below).
[...]

LinuxRox

Jun 27, 2013, 9:12 AM
to isilon-u...@googlegroups.com
Isilon support gave me hell when they saw that I had snapshots that were overlapping each other and causing out-of-sequence snapshot deletes. Something to consider.

Peter Serocka

Jun 28, 2013, 1:07 AM
to LinuxRox, isilon-u...@googlegroups.com
Interesting, can you elaborate or send pointers?

I know that when manually deleting a series of snapshots
one by one, it is more efficient to start with the
youngest (the one with its expiry date farthest in the future),
instead of proceeding chronologically. Proceeding chronologically,
the covered data is sort of pushed (or handed over) to the next
snapshot, while in reverse order the "salami gets sliced
from one end" (= the far-future end).
  
(Of course usually one would delete all snaps in question with
a single isi snap delete call and let OneFS sort things out.)
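
For illustration only (snapshot names are invented, and the exact isi snap delete arguments differ between OneFS versions, so treat this as pseudo-shell and check isi snap delete --help on your release):

  # one-by-one, youngest first: "slice the salami from the far end"
  for snap in nightly_2013-06-27 nightly_2013-06-26 nightly_2013-06-25; do
      isi snap delete "$snap"
  done

  # or hand the whole series to OneFS in one call and let it sort out the order
  isi snap delete nightly_2013-06-25 nightly_2013-06-26 nightly_2013-06-27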
 
So I wonder what makes Isilon support scared of "out-of-sequence" deletes?

Peter

Peter Serocka

Jul 1, 2013, 12:37 AM
to LinuxRox, isilon-u...@googlegroups.com
Thanks - as snapshots can be created manually on any
irregular schedule, support's notion of overlapping
remains unclear to me. Scheduling is just automation,
and as long as snapshots cover the same dir, I can't
see any rationale for the "origin" (scheduling-wise)
of a snapshot instance having any effect on the delete.

We have 4-hourly, daily, and weekly snapshots
heavily overlapping on both our X200 and 108NL nodes.

I quickly reviewed the past SnapshotDelete jobs
and found most of them finishing within just 0 seconds,
with the largest of them freeing 2-5 million LINs.
Within 0 seconds, as isi job hist -v says.

A few dozen deletes took 100-200 seconds though,
and three deletes within the past four months took
more than 1000 seconds, with 0.5-1.5 million LINs deleted per run.

The deleted sizes in bytes or blocks are not
reported, but the 1000-second deletes probably match
the few multi-TB deletes that have occurred.

Otherwise, if occasional 1000-second deletes are the price
for "overlapping" schedules, well, I am willing to
pay that price. (Deletes are scheduled for the night hours
anyway...)


Any deeper insight from Isilon is highly welcome!


Peter




On 1 Jul 2013, at 10:34, LinuxRox wrote:

> i had a couple of directories that had multiple schedules with different retention
[...]

Andrew Stack

Jul 1, 2013, 12:28 PM
to isilon-u...@googlegroups.com, LinuxRox
Overlapping snapshots are an inherent problem with OneFS and become particularly noticeable/problematic in larger clusters with millions of LINs.  Basically, what ends up occurring is that the snapshot deletion jobs take much longer to process.  This in turn makes other operations wait, and you end up with a cascade effect of backend operations taking long intervals to complete.

The reason (short version) is that if you have daily and hourly snapshots scheduled for the same path, your cluster must do a comparison between the hourly and the daily snapshots to see which blocks it can recover and which blocks to keep.  This constitutes an overlap condition.

To solve this, Isilon recommends that you keep either hourly, daily, or weekly snaps, but not any combination of the three, for any particular path.  What this means in practical terms is that you either:

A.  Store your data by snapshot policy (e.g. "I have a directory that keeps 7 daily snapshots").  This is OK if you are deploying a cluster for the first time.

B.  If you have already deployed your cluster, then you will likely have to revisit your policies and flatten them to either hourly, daily, or weekly (e.g. "I have a directory with 4 hourly snaps per day and 7 dailies; I would change this to 4 hourlies per day with a retention of a week").

Option B has its own penalty in that you are going to store more snaps, thus taking up more space.  However, it does speed up snapshot deletion and is one more best practice from Isilon to keep your cluster running optimally.
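
Rough numbers for the example in option B (the hourly retention before flattening is not stated, so one day is assumed here; the real space cost depends entirely on your change rate):

  # before: two overlapping schedules on the same path
  overlapping=$(( 4 + 7 ))    # 4 hourlies kept ~1 day + 7 dailies = 11 instances
  # after: one flattened schedule
  flattened=$(( 4 * 7 ))      # 4 per day, kept for a week = 28 instances
  echo "extra instances per path: $(( flattened - overlapping ))"   # 17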

Long term, the hope/expectation is that Isilon works through this issue, but for now these are your options.

I hope this helps clarify matters.

Cheers,

Andrew Stack




Luc Simard

Jul 2, 2013, 4:44 AM
to isilon-u...@googlegroups.com, isilon-u...@googlegroups.com, LinuxRox
Peter

While I don't have a lot of data on your cluster environment, Andrew has provided a few good strategies in this thread. This is not a problem specifically, but under 6.5.5.x it is likely that every LIN must be updated in overlapping snapshots, forcing OneFS to walk the longer path, and deleting out of order may not meet your expectations in terms of disk-space recovery.

Always delete from oldest to newest. 

Andrew's clusters are mostly running 6.5.5.x. Under v7.0.x and beyond, the data and disk layout has changed and you should see some gains there; if you tend to snapshot very large files, then I strongly recommend you go to 7.0.1.6 or newer. Otherwise, move to 6.5.5.23, which was released just yesterday.

Both are available on support.emc.com

Cheers 

Luc Simard - 415-793-0989

Peter Serocka

Jul 2, 2013, 5:41 AM
to Luc Simard, isilon-u...@googlegroups.com, LinuxRox
Luc and Andrew,

thanks a lot for your input. 

The other thread, "Isilon small files performance",
has a perfect example of SnapshotDelete being just
one more member of a cascade of stowing/stowed jobs,
just what you (Andrew) have been warning about.

Under such conditions, it is perfect advice to keep
SnapshotDelete's activities minimal by avoiding
overlapping schedules. (NDMP snapshots will not
follow this. Ouch.) I'd find a deeper understanding of
the snapshot mechanism highly fascinating,
and beneficial for operating OneFS, but it's probably
beyond what can be discussed here.

Still, for our 4-node NL108 cluster with
100+ million files, after having sorted out many issues
with the job priorities/impacts/schedules,
and with careful monitoring of the job engine,
I am happy to say that the overlapping
snapshots are working fine for us. I am aware
we might be stretching OneFS to its out-of-the-box
limits, but it has been beneficial, and it's fun.

Glad to read OneFS 7 has more improvements!
 
Cheers

Peter

Richard Kunert

Jul 8, 2013, 1:18 PM
to isilon-u...@googlegroups.com
Out-of-order snapshot deletions have been a big issue for us. When deleting snapshots of very large files (virtual machines), an out-of-order deletion can take long enough to literally crash a VM. The VM files are locked during the process, and the VM host (Citrix XenServer in this case) logs that the Isilon is not responding. After 120 seconds (yes, that long) Linux gives up on the volume and marks it read-only. We are very careful now to avoid out-of-order deletions, especially on those directories, and we have specific Nagios checks to see if any root volumes have gone read-only. We have had the same issues adding new nodes: the AutoBalance jobs lock files as they're being shifted around, and VMs crash left and right.

This has been helped considerably by changing the NFS connection parameters that XenServer uses, which by default are fairly bizarre (soft mounts, short timeouts, etc.). VMware's default NFS settings are better; we haven't had nearly as many issues with our vSphere cluster.
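
For reference, the sort of difference involved (hostname, export, and mount point are invented; soft/hard, timeo, and retrans are standard Linux NFS mount options, but where XenServer lets you set them for a storage repository varies by version, so check your own setup):

  # fragile: a soft mount with short timeouts returns I/O errors to the guest,
  # so a long SnapshotDelete can look like a dead disk to the VM
  mount -t nfs -o soft,timeo=100,retrans=2 isilon:/ifs/vm_store /mnt/vm_store

  # more forgiving: a hard mount blocks and retries instead of erroring out
  mount -t nfs -o hard,timeo=600,retrans=2 isilon:/ifs/vm_store /mnt/vm_store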

NDMP backups also create snapshots unless you point them at an existing snapshot. That was an unexpected source of out-of-order deletions for us.

--
Richard

Peter Serocka

Jul 9, 2013, 2:21 AM
to Richard Kunert, isilon-u...@googlegroups.com
Richard 

thanks a lot! I will double check for undetected impact
on the workflows over here; traditional NFS3 file access, no VMs...

Peter

