File Search


sc...@tacomadata.com

Jan 10, 2023, 4:16:46 PM
to Isilon Technical User Group
Hello 

I have a few Isilon clusters that store vast numbers of medical images, studies, etc. that are accessed by PACS and CPACS systems. Sometimes our imaging people lose track of a study in their system and ask me whether I have a file named (for example) "1.2.234.0.1.234567.6.123123.12345.20180426061923.13.1106". Attempting to traverse the inode tree using 'find' would take months.

I have FSA (File System Analytics) running and capturing info for InsightIQ. Aren't these just SQLite databases?

Is there a way that is faster than 'find & grep' to check for the existence and location of a file on Isilon? 
 

Jerry

Jan 10, 2023, 4:22:07 PM
to isilon-u...@googlegroups.com
I don't know if FSA is a SQLite database, but if you have this many files, cataloging filenames into some type of DB might be faster for finding files later.
If your files are only ever added and sit in some structure that you can catalog by day, month, or similar, I would take the hit of a few months to catalog it, then have some way to add new files to that catalog.
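
To make the catalog idea concrete, here is a minimal sketch in Python using SQLite. Everything in it is an assumption for illustration: the table layout, the function names (open_catalog, catalog_tree, find_by_name), and the choice of SQLite itself. It is not how FSA or any Isilon tool stores its data. The index on the name column is what makes a lookup by file name fast, and INSERT OR IGNORE lets you re-run the walk on a subtree to pick up new files without duplicating old ones.

```python
import os
import sqlite3


def open_catalog(db_path):
    """Open (or create) the catalog. Schema is hypothetical, for illustration."""
    conn = sqlite3.connect(db_path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS files ("
        " path TEXT PRIMARY KEY,"
        " name TEXT NOT NULL)"
    )
    # Index on the bare file name so name lookups avoid a full table scan.
    conn.execute("CREATE INDEX IF NOT EXISTS idx_files_name ON files(name)")
    return conn


def catalog_tree(conn, root):
    """Walk root and record every file; paths already cataloged are skipped."""
    rows = (
        (os.path.join(dirpath, name), name)
        for dirpath, _dirs, filenames in os.walk(root)
        for name in filenames
    )
    with conn:  # one transaction for the whole walk
        conn.executemany(
            "INSERT OR IGNORE INTO files (path, name) VALUES (?, ?)", rows
        )


def find_by_name(conn, name):
    """Return every cataloged path whose file name matches exactly."""
    cur = conn.execute(
        "SELECT path FROM files WHERE name = ? ORDER BY path", (name,)
    )
    return [row[0] for row in cur]
```

Usage would be something like catalog_tree(conn, "/ifs/some/dir") once per subtree, then find_by_name(conn, "1.2.234.0.1...") whenever someone asks for a study. Note that this sketch never removes rows, so it only fits the case where files are added and not moved or deleted.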

Remember you can break up the find into multiple threads, split by directory, to build that list, but it sounds like you need some kind of metadata store that's not the filesystem.
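
As a rough sketch of splitting the walk by directory, something like the following Python would hand each top-level subdirectory to its own worker thread. The function names (walk_subtree, parallel_list) and the worker count are made up for the example, and it assumes the tree is spread reasonably evenly across the top-level directories; one giant subdirectory would still be walked by a single thread.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def walk_subtree(root):
    """List every file under root (serial walk of one subtree)."""
    return [
        os.path.join(dirpath, name)
        for dirpath, _dirs, filenames in os.walk(root)
        for name in filenames
    ]


def parallel_list(root, workers=8):
    """Walk each top-level subdirectory of root in its own thread."""
    subdirs, paths = [], []
    with os.scandir(root) as entries:
        for entry in entries:
            if entry.is_dir(follow_symlinks=False):
                subdirs.append(entry.path)
            else:
                # Anything that is not a real directory is recorded directly.
                paths.append(entry.path)
    # The walk is I/O bound, so threads can overlap the waits on the filesystem.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for found in pool.map(walk_subtree, subdirs):
            paths.extend(found)
    return paths
```

The returned list can then be fed into whatever store you pick. On a real cluster you would probably also want to point different workers at different nodes, which this sketch does not attempt.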
Let us know what you end up doing for this; I'm curious.



Jason Davis

Jan 10, 2023, 7:31:28 PM
to isilon-u...@googlegroups.com
You might want to have a look at a third-party tool to file-walk the clusters. In essence, the idea is to run "find" and store the file system metadata in a database for later querying.

In the free category there is the following, which I've used with success on multi-billion-file Isilon deployments:

Diskover

This will walk a file system and store the collected data in Elasticsearch/OpenSearch. A UI is provided as well, which is nice for analytics and file searches, though their commercial offering is more full-featured.

Dell/EMC/Isilon have had offerings in this space at various points (outside of FSA), but I'm not sure whether they are still supported, or whether they were a free value-add or a paid licensing option.

- Jason

Saker Klippsten

Jan 10, 2023, 9:17:37 PM
to isilon-u...@googlegroups.com
I would check out Diskover. It started out as a little project for VFX, media, and entertainment folks. It's grown a lot since then.





bob flynn

Jan 11, 2023, 3:43:20 AM
to isilon-u...@googlegroups.com
Yes and no!

Yes, FSA gathers the data and puts it in a database.

No, because by default it tracks names only down to a directory depth of 5. So if your file is in a directory more than 5 levels deep, a search will fail.

The nesting level can be increased, but there is a tradeoff in the size of the FSA data and in how long processing takes.


John Beranek - PA

Jan 11, 2023, 5:47:37 AM
to Isilon Technical User Group
Dell offer DataIQ, which should come as a "free, but licensed" product. In our experience this means that getting the license when you're not ordering anything else is possible, but a bit long-winded.


John
