scrub as a library


Colomban Wendling

Apr 24, 2012, 4:25:35 PM
to diskscru...@googlegroups.com
Hi,

I am the author of nautilus-wipe[1], an extension for Nautilus that
provides wiping integration. I currently use secure-delete as a
backend. However, it is not really maintained anymore and lacks
features that scrub has. I am thus considering switching from secure-delete
to scrub.

To achieve that, it would be really convenient to be able to link
against a scrub library rather than to spawn and control subprocesses.
Would you welcome a scrub library? If yes, would you consider doing the
job? Or reviewing and merging a git branch?

Thank you in advance, and best regards,
Colomban


[1]. http://wipetools.tuxfamily.org/nautilus-wipe.html

Jim Garlick

Apr 24, 2012, 4:36:32 PM
to diskscru...@googlegroups.com
Hi Colomban,

Can we keep the API pretty simple?  What would you expect it
to look like?

Jim



--
You received this message because you are subscribed to the Google Groups "diskscrub-discuss" group.
To post to this group, send email to diskscru...@googlegroups.com.
To unsubscribe from this group, send email to diskscrub-disc...@googlegroups.com.
For more options, visit this group at http://groups.google.com/group/diskscrub-discuss?hl=en.


Colomban Wendling

Apr 24, 2012, 7:09:34 PM
to diskscru...@googlegroups.com
Hi,

> Can we keep the API pretty simple? What would you expect it
> to look like?

Yes, a simple API would be very nice. I actually thought about a
high-level API, quite like what Scrub provides on the command line.

I haven't really tried to draft an API, but there are a few things
that I would basically need:

* high-level versions of the scrub_*() functions, to either remove a
file/directory or to fill free space and/or inodes;
* progress reporting support.


Maybe something like this:

/* lists available sequences */
list_t *scrub_sequence_list (void);
/* creates/gets a sequence corresponding to @description */
scrub_sequence_t *scrub_sequence_parse (const char *description);

/* signature of the progress callback */
typedef void (*scrub_progress_callback_t) (double progress,
                                           void *data);

/* checks scrub signature */
bool scrub_check_signature (const char *filename);

/* if disabling hardware random generator is important */
void scrub_use_hwrand (bool enable);

/* scrubs a file/character device, etc */
bool scrub_file (const char *filename,
                 const scrub_sequence_t *seq,
                 bool no_sign,
                 bool force,
                 bool remove,
                 bool follow_links,
                 scrub_progress_callback_t progress_callback,
                 void *progress_data);
/* scrubs free space */
bool scrub_free (const char *basedir,
                 const scrub_sequence_t *seq,
                 bool no_sign,
                 bool force,
                 scrub_progress_callback_t progress_callback,
                 void *progress_data);
/* scrubs inode */
bool scrub_dirent (const char *filename,
                   scrub_progress_callback_t progress_callback,
                   void *progress_data);
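
A minimal sketch of how a caller might use a progress callback of this shape. It assumes progress is reported as a fraction from 0.0 to 1.0, which the draft above does not specify. fake_scrub_file(), struct progress_state and on_progress() are hypothetical names for illustration only; nothing is actually scrubbed.

```c
#include <stdbool.h>
#include <stddef.h>

/* Same shape as the callback type proposed above. */
typedef void (*scrub_progress_callback_t) (double progress, void *data);

/* Hypothetical stand-in for scrub_file(): it writes nothing and only
 * reports progress in `steps` equal increments, ending at 1.0. */
static bool fake_scrub_file (int steps,
                             scrub_progress_callback_t progress_callback,
                             void *progress_data)
{
    if (steps <= 0)
        return false;
    for (int i = 1; i <= steps; i++) {
        if (progress_callback != NULL)
            progress_callback ((double) i / steps, progress_data);
    }
    return true;
}

/* Caller-side state passed through the opaque data pointer. */
struct progress_state {
    int    calls;   /* number of times the callback ran */
    double last;    /* most recent progress value (assumed 0.0..1.0) */
};

/* Example callback: records what it was told; a UI would update a bar. */
static void on_progress (double progress, void *data)
{
    struct progress_state *state = data;

    state->calls++;
    state->last = progress;
}
```

The opaque data pointer is what lets a GUI caller such as a Nautilus extension reach its own widget state without global variables.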


I didn't include the block size and device size parameters in there
because I'm not completely sure what they are useful for, but they
should probably be somewhere there (either as parameters of the
scrub_{file,free}() functions or as global state like hwrand).

What do you think?

Regards,
Colomban

Colomban Wendling

Apr 25, 2012, 9:59:25 AM
to diskscru...@googlegroups.com
Hi again,

On 25/04/2012 01:09, Colomban Wendling wrote:
> [...]
>
>
> Maybe something like this:
>
> /* lists available sequences */
> list_t *scrub_sequence_list (void);
> /* creates/gets a sequence corresponding to @description */
> scrub_sequence_t *scrub_sequence_parse (const char *description);
>
> /* signature of the progress callback */
> typedef void (*scrub_progress_callback_t) (double progress,
> void *data);
>
> /* checks scrub signature */
> bool scrub_check_signature (const char *filename);
>
> /* if disabling hardware random generator is important */
> void scrub_use_hwrand (bool enable);
>
> /* scrubs a file/character device, etc */
> bool scrub_file (const char *filename,
> const scrub_sequence_t *seq,
> bool no_sign,
> bool force,

Actually the force parameter is probably not a good idea, and the caller
should simply deal with signed files itself by calling
scrub_check_signature(). The reason is that the caller needs to be able
to check whether the signature is there or not anyway, and having
scrub_file() or scrub_free() take this parameter means that they do
the check too, which is redundant.

Moreover, it would probably require those functions to be able to tell
their caller that they didn't do anything because the signature was
there, and then it becomes a headache. So thinking about it a little
further, I'd remove those "force" parameters.


Regards,
Colomban

Jim Garlick

Apr 25, 2012, 1:31:01 PM
to diskscru...@googlegroups.com
Hi,

Couple of thoughts:

- Drop 'force' and 'no-sign' and make the default be no-sign.
I would consider scrub's "signatures" (incidentally not a cryptographic
signature) to be more useful in a command line environment, to communicate
"you just scrubbed that file, do you really want to do it again?"

- The scrub dirent (not inode) thing really just does successive renames
on the target file so that if the file name itself was secret, you've
written your patterns on it. It is of dubious usefulness because
we have no way to be sure that patterns actually make it to the storage
media. I'd say drop the interface to that.

- My guess is blocksize (a performance variable) need not be settable here

- The device size is for when scrub cannot determine the size of a
device using OS-specific ioctls. Probably not an issue on linux.

- Can you support only "scrub file" where file is a special file or
a regular file, and skip the "scrub free", which fills up your
file system with tmp files, scrubs them, then removes them?

- Drop support for custom sequences. We support a large variety of
canned sequences and can add more as needed.

Assuming all of the above is acceptable, here's a simpler api that
has a higher probability of remaining stable as scrub evolves, IMHO.

/* init/finalize scrub session */
scrub_handle_t scrub_init (const char *path);
void scrub_fini (scrub_handle_t handle);

/* get available scrub methods, function return value is count */
/* Note: methods[] filled in (up to len) with descriptions of methods.
 * methods[] index can be used in scrub_attr_set(h, SCRUB_METHOD, index)
 * Caller must determine these indices dynamically, not hardwire them.
 */
int scrub_methods_get (scrub_handle_t h, char **methods, int len);

/* get/set value of session attributes */
/* SCRUB_METHOD - (index) get/set scrub method
 * SCRUB_REMOVE - (1=remove, 0=leave) remove reg file after scrubbing
 * SCRUB_HWRAND_DISABLE - (1=disable, 0=use if available) hw rand
 */
int scrub_attr_set (scrub_handle_t h, int attr, int val);
int scrub_attr_get (scrub_handle_t h, int attr, int *val);

/* do the work (progress_cb can be NULL) */
int scrub_write (scrub_handle_t h,
                 void (*progress_cb)(void *arg, double pct_complete),
                 void *arg);
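
To illustrate the handle-plus-attributes pattern, here is a small mock, not the scrub implementation. The mock_* names, the handle layout, the enum values and the 0 / -1 return convention are all assumptions made for this sketch; only the attribute names come from the proposal above.

```c
#include <stdlib.h>
#include <string.h>

/* Attribute names from the proposal; the numeric values are placeholders. */
enum { SCRUB_METHOD, SCRUB_REMOVE, SCRUB_HWRAND_DISABLE, SCRUB_ATTR_COUNT };

/* Hypothetical handle layout: a real one would be opaque to callers. */
struct mock_handle {
    char *path;
    int   attrs[SCRUB_ATTR_COUNT];
};
typedef struct mock_handle *mock_handle_t;

static mock_handle_t mock_init (const char *path)
{
    mock_handle_t h;

    if (path == NULL)
        return NULL;
    h = calloc (1, sizeof *h);          /* all attributes start at 0 */
    if (h == NULL)
        return NULL;
    h->path = malloc (strlen (path) + 1);
    if (h->path == NULL) {
        free (h);
        return NULL;
    }
    strcpy (h->path, path);
    return h;
}

static void mock_fini (mock_handle_t h)
{
    if (h != NULL) {
        free (h->path);
        free (h);
    }
}

/* Both accessors return 0 on success and -1 on a bad argument. */
static int mock_attr_set (mock_handle_t h, int attr, int val)
{
    if (h == NULL || attr < 0 || attr >= SCRUB_ATTR_COUNT)
        return -1;
    h->attrs[attr] = val;
    return 0;
}

static int mock_attr_get (mock_handle_t h, int attr, int *val)
{
    if (h == NULL || val == NULL || attr < 0 || attr >= SCRUB_ATTR_COUNT)
        return -1;
    *val = h->attrs[attr];
    return 0;
}
```

Keeping every option behind a generic integer attribute means new options can be added later without changing any function signature, which fits the stated goal of a stable API.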

I would be willing to implement something along these lines.
What do you think?

Jim

Colomban Wendling

Apr 29, 2012, 9:35:03 AM
to diskscru...@googlegroups.com
Hi,

On 25/04/2012 19:31, Jim Garlick wrote:
> Hi,
>
> Couple of thoughts:
>
> - Drop 'force' and 'no-sign' and make the default be no-sign.
> I would consider scrub's "signatures" (incidentally not a cryptographic
> signature) to be more useful in a command line environment, to communicate
> "you just scrubbed that file, do you really want to do it again?"

Having the signature may give the same benefit in a UI environment,
telling the user she doesn't need to run it again here. However I agree
it's not a really important feature.

> - The scrub dirent (not inode) thing really just does successive renames
> on the target file so that if the file name itself was secret, you've
> written your patterns on it. It is of dubious usefulness because
> we have no way to be sure that patterns actually make it to the storage
> media. I'd say drop the interface to that.

I would think fsync() is enough after each rename, but maybe I'm
being naive. And yes, the filename is far less sensitive than the
data itself anyway. So again, it's probably OK not to include it.

> - My guess is blocksize (a performance variable) need not be settable here

Agreed. The only use I really see for this is that if the library doesn't
have a way to guess the best value, maybe the calling code *could* do
it on its own, for instance by querying storage media attributes such as
cache size. But that's not that important and would need a
lot of care from the caller to actually improve performance anyway.

> - The device size is for when scrub cannot determine the size of a
> device using OS-specific ioctls. Probably not an issue on linux.

Ok. I don't really need non-Linux support myself, although *BSD & co
would be cool, but I also guess these would provide appropriate ioctls
anyway. So your call, but I won't miss it.

> - Can you support only "scrub file" where file is a special file or
> a regular file, and skip the "scrub free", which fills up your
> file system with tmp files, scrubs them, then removes them?

No, I need both. Actually one of the (many) reasons I would like to move
away from secure-delete and use Scrub instead is that secure-delete is
dumb about filling free space and doesn't cope with the underlying FS
having a file size limit (especially problematic on FAT).
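
To make the file size limit concrete, here is a small sketch of the arithmetic involved in covering free space with several capped files. It is not how scrub or secure-delete does it; fill_file_count() and fill_file_size() are hypothetical helpers. On FAT32, for example, the cap would be 4 GiB minus one byte.

```c
#include <stdint.h>

/* Number of fill files needed to cover free_bytes when no single file may
 * exceed max_file_bytes. Returns 0 if either argument is 0. */
static uint64_t fill_file_count (uint64_t free_bytes, uint64_t max_file_bytes)
{
    if (free_bytes == 0 || max_file_bytes == 0)
        return 0;
    return (free_bytes - 1) / max_file_bytes + 1;   /* ceiling division */
}

/* Size of the index-th fill file (0-based): every file is full-size
 * except possibly the last one. Returns 0 for an out-of-range index. */
static uint64_t fill_file_size (uint64_t free_bytes,
                                uint64_t max_file_bytes,
                                uint64_t index)
{
    uint64_t count = fill_file_count (free_bytes, max_file_bytes);

    if (index >= count)
        return 0;
    if (index + 1 < count)
        return max_file_bytes;
    return free_bytes - index * max_file_bytes;
}
```

A real implementation would also have to react to the file system filling up earlier or later than planned, since free space can change while the fill is running.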

> - Drop support for custom sequences. We support a large variety of
> canned sequences and can add more as needed.

Sounds sensible to me. The user probably won't know what sequence to
give anyway, and I guess if she knows enough to want a particular
sequence she can probably ask for it to be added.

> Assuming all of the above is acceptable, here's a simpler api that
> has a higher probability of remaining stable as scrub evolves, IMHO.
>
> /* init/finalize scrub session */
> scrub_handle_t scrub_init (const char *path);
> void scrub_fini (scrub_handle_t handle);
>
> /* get available scrub methods, fun return is count */
> /* Note: methods[] filled in (up to len) with descriptions of methods.
> * methods[] index can be used in scrub_attr_set(h, SCRUB_METHOD, index)
> * Caller must determine these indices dynamically not hardwire them.
> */
> int scrub_methods_get (scrub_handle_t h, char **methods, int len)

IIUC, to be sure to get all the available methods, the caller would do:


char **list;
int len;

len = scrub_methods_get(handle, NULL, 0);
list = malloc(len * sizeof *list);
scrub_methods_get(handle, list, len);
/* use list ... */
free(list);


Right? If so, is it really better than returning an allocated list and
a length, and letting the caller free the list? I don't really care, but I'm
more used to this style of API. Like:


char ** scrub_methods_get(scrub_handle_t, int *len);


and used like:


int len;
char ** list;

list = scrub_methods_get(handle, &len);
/* use list ... */
free(list);


Maybe that's only style though, so you can just ignore my remark if you
want.

> /* get/set value of session attributes */
> /* SCRUB_METHOD - (index) get/set scrub method
> /* SCRUB_REMOVE - (1=remove, 0=leave) remove reg file after scrubbing
> /* SCRUB_HWRAND_DISABLE - (1=disable, 0=use if available) hw rand
> */
> int scrub_attr_set (scrub_handle_t h, int attr, int val);
> int scrub_attr_get (scrub_handle_t h, int attr, int *val);
>
> /* do the work (progress_cb can be NULL) */
> int scrub_write (scrub_handle_t h,
> void (*progress_cb)(void *arg, double pct_complete),
> void *arg);
>
> I would be willing to implement something along these lines.
> What do you think?

Looks OK to me, but as said above I would like to have a way to scrub
free space too. Maybe simply adding an attribute, say SCRUB_OPERATION
or something would be fine?

Thanks a lot for working on this :)


Off topic, but I was wondering: AFAICT Scrub cannot (yet?) recursively
scrub directories. Could this be added, or does the caller have to deal
with it if it wants to?
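
If the caller has to handle recursion itself, it could look roughly like the sketch below, assuming a POSIX system. walk_tree(), visit_file_fn and count_file() are hypothetical names; the example visitor only counts regular files where a real caller would scrub each one.

```c
#include <dirent.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/* Hypothetical per-file hook; a real caller would scrub the file here. */
typedef int (*visit_file_fn) (const char *path, void *data);

/* Example visitor: counts regular files instead of scrubbing them. */
static int count_file (const char *path, void *data)
{
    (void) path;
    (*(int *) data)++;
    return 0;
}

/* Walks `path` depth-first without following symlinks and calls `visit`
 * on every regular file. Returns 0 on success, or a nonzero value
 * (-1 or the visitor's return value) on the first error. */
static int walk_tree (const char *path, visit_file_fn visit, void *data)
{
    struct stat st;
    struct dirent *entry;
    DIR *dir;
    int rc = 0;

    if (lstat (path, &st) != 0)
        return -1;
    if (S_ISREG (st.st_mode))
        return visit (path, data);
    if (!S_ISDIR (st.st_mode))
        return 0;                       /* skip symlinks, devices, etc. */

    dir = opendir (path);
    if (dir == NULL)
        return -1;
    while (rc == 0 && (entry = readdir (dir)) != NULL) {
        char child[4096];
        int n;

        if (strcmp (entry->d_name, ".") == 0
            || strcmp (entry->d_name, "..") == 0)
            continue;
        n = snprintf (child, sizeof child, "%s/%s", path, entry->d_name);
        if (n < 0 || (size_t) n >= sizeof child)
            rc = -1;                    /* path too long for the buffer */
        else
            rc = walk_tree (child, visit, data);
    }
    closedir (dir);
    return rc;
}
```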


Best regards,
Colomban

Jim Garlick

Apr 30, 2012, 12:13:23 PM
to diskscru...@googlegroups.com
Hi Colomban,

OK, I'll make accommodations for
- dynamically allocating the methods array
- SCRUB_OPERATION attr
- recursive scrub

Will follow up with a revised API after I get through some other
things I need to do this week.

Regards,

Jim

Jim Garlick

May 7, 2012, 1:37:43 PM
to diskscru...@googlegroups.com
Colomban,

Just wanted to update that I didn't get to this last week and likely
won't be able to this week as one of my other projects just got some
test time on a big cluster.

Do you have a time frame for integrating this API into your package?

Jim

Colomban Wendling

May 7, 2012, 2:04:00 PM
to diskscru...@googlegroups.com
Hi Jim,

On 07/05/2012 19:37, Jim Garlick wrote:
> Colomban,
>
> Just wanted to update that I didn't get to this last week and likely
> won't be able to this week as one of my other projects just got some
> test time on a big cluster.

First, thanks a lot for working on it at all :)

> Do you have a time frame for integrating this API into your package?

Yes; ideally I should make a release integrating it at the end of the
month (like the 29th or something). So a great timeline would be like:

May 15th-22nd: (or before of course)
    libscrub integration
May 22nd-29th:
    testing & debugging
May 29th:
    nautilus-wipe release

But I also guess that I don't need a whole week to integrate your API,
so having it around the 18th should be enough. Moreover it's not a
problem for the testers to require a development version of scrub, so
the API could very well be tested 'til like the 25th.

And of course, ideally for me, you would make a release integrating the lib
before May 29th.


I understand very well that it's quite short, and if that can't be reached
I'll try to postpone the release a little, or give my beta-testers a
shorter time frame and force them to work harder :)


Again, thanks a lot for your responsiveness and your past and future
work :) And if I can help, just tell me, I'll be happy to.

Best regards,
Colomban

Jim Garlick

May 14, 2012, 4:37:15 PM
to diskscru...@googlegroups.com
Colomban,

My apologies again for being somewhat unresponsive.
Status is I still don't have much time to work on this as another project is taking up most of my time, but I did just push the following to git:

- Fix all support functions to not exit or leak memory on error.

- Add automake/libtool infrastructure:
  Run ./autogen.sh, then ./configure --enable-libscrub

- in ./libscrub, support functions are rebuilt as position-independent code.
  Prototype for api is in scrub.c, stubs in libscrub.c.

I made libscrub not build by default so that for now I only have to make this work on linux.
(I don't know how to control symbol visibility in AIX or solaris and I don't want to know :-)

I am not sure that I will be able to meet your schedule but I will do my best as time permits.
I almost certainly will not get recursive scrubbing or -X (fill up file system) mode in that time frame.

Any feedback on what's there so far is welcome!

Regards,

Jim


Colomban Wendling

May 16, 2012, 9:15:47 PM
to diskscru...@googlegroups.com
Hi Jim,

On 14/05/2012 22:37, Jim Garlick wrote:
> Colomban,
>
> My apologies again for being somewhat unresponsive.
> Status is I still don't have much time to work on this as another
> project is taking up most of my time, but I did just push the
> following to git:
>
> - Fix all support functions to not exit or leak memory on error.

Great, that's an important part of the story :)

> - Add automake/libtool infrastructure:
> Run ./autogen.sh, then ./configure --enable-libscrub
>
> - in ./libscrub, support functions are rebuilt as position-independent code.
> Prototype for api is in scrub.c, stubs in libscrub.c.
>
> I made libscrub not build by default so that for now I only have to make
> this work on linux.
> (I don't know how to control symbol visibility in AIX or solaris and I
> don't want to know :-)

As I already said, although portability on non-Linux would be cool, it's
not a must-have for me :)

> I am not sure that I will be able to meet your schedule but I will do my
> best as time permits.
> I almost certainly will not get recursive scrubbing or -X (fill up file
> system) mode in that time frame.

Well, we'll see what we can get. Unfortunately without -X I couldn't
entirely rely on scrub, but I'll still be happy with what I can have by
that time. However, I'm wondering what makes -X difficult to support,
and whether I can help in any way to support it? Anyway, thank you again
for working on this :)

> Any feedback on what's there so far is welcome!

A first glance, plus a basic usage test in my application, shows that
it looks good, but:

*) is it useful that scrub_methods_get() takes a context? It doesn't
seem convenient, since one could want to ask the user what method she
wants before creating the context (and even before knowing what to
actually delete).

*) do you think that it would make sense to allow changing the path
bound to the context, with something like scrub_path_set()? It would
allow re-using the context with another path without re-setting the
attributes. However, maybe it breaks the context-driven design a little,
so it's just a thought.


Best regards,
Colomban

Jim Garlick

May 16, 2012, 9:58:17 PM
to diskscru...@googlegroups.com
On Wed, May 16, 2012 at 6:15 PM, Colomban Wendling <list...@herbesfolles.org> wrote:
> Hi Jim,
>
> On 14/05/2012 22:37, Jim Garlick wrote:
> > I almost certainly will not get recursive scrubbing or -X (fill up file
> > system) mode in that time frame.
>
> Well, we'll see what we can get. Unfortunately without -X I couldn't
> entirely rely on scrub, but I'll still be happy with what I can have by
> that time. However, I'm wondering what makes -X difficult to support,
> and whether I can help in any way to support it? Anyway, thank you again
> for working on this :)

I am thinking mainly about callbacks for progress.
Right now progress is calculated per file. I would need to overhaul that for -X.
Maybe I could give you an inaccurate progress without too much pain :-)

> > Any feedback on what's there so far is welcome!
>
> A first glance, plus a basic usage test in my application, shows that
> it looks good, but:
>
> *) is it useful that scrub_methods_get() takes a context? It doesn't
> seem convenient, since one could want to ask the user what method she
> wants before creating the context (and even before knowing what to
> actually delete).

I think we could do without the context here.

> *) do you think that it would make sense to allow changing the path
> bound to the context, with something like scrub_path_set()? It would
> allow re-using the context with another path without re-setting the
> attributes. However, maybe it breaks the context-driven design a little,
> so it's just a thought.

Maybe that's OK - I can't think of why not right now.

Jim
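
One way the "inaccurate progress" idea for -X could work is sketched below. This is an assumption, not what scrub does: per-file progress is folded into an overall fraction using an estimate of the total bytes to write, such as the free space measured before starting. overall_progress() is a hypothetical helper and reports a fraction from 0.0 to 1.0.

```c
#include <stdint.h>

/* Combines per-file progress into one overall fraction in 0.0..1.0.
 *   bytes_finished: total size of the files already completed
 *   current_size:   size of the file being written now
 *   current_frac:   per-file progress of that file, 0.0..1.0
 *   total_estimate: estimated bytes to write overall; may be inaccurate,
 *                   so the result is clamped to 1.0 */
static double overall_progress (uint64_t bytes_finished,
                                uint64_t current_size,
                                double   current_frac,
                                uint64_t total_estimate)
{
    double done;

    if (total_estimate == 0)
        return 0.0;
    if (current_frac < 0.0)
        current_frac = 0.0;
    if (current_frac > 1.0)
        current_frac = 1.0;
    done = (double) bytes_finished + current_frac * (double) current_size;
    done /= (double) total_estimate;
    return done > 1.0 ? 1.0 : done;
}
```

Because free space can change while the fill runs, the estimate can be off in either direction; clamping keeps a progress bar from overshooting.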


Jim Garlick

Jun 7, 2012, 1:35:53 PM
to diskscru...@googlegroups.com
Hi Colomban,

I have further interrupts going on that are preventing me from getting
this done on your schedule (sorry!). I've committed a checkpoint of what
I have so far and tried to capture our emails in 'issue 17' on google
code. Let's take our discussion off the diskscrub-discuss list and into
the issue tracker if you don't mind.

Jim

