binary object solution pack appending ".bin" to OBJ downloads


Alex Garnett

Jul 13, 2016, 7:49:53 PM
to islandora
Hi folks,

Does anyone have any idea why the binary object solution pack would be appending ".bin" to all object downloads? This isn't exactly helpful, as we generally want to serve binary objects with the same extension that they were uploaded with, but I suspect it might not actually be the intended behavior, as I can't find anything in the module code that would plausibly be doing this.

Thanks!

bgil...@pitt.edu

Jul 14, 2016, 11:44:21 AM
to islandora
Hi Alex -- 

The default code simply glues together the datastream label and a file extension derived from the mimetype (if the label already ends with an extension, that supersedes the mimetype value). It is possible that your specific datastream has an incorrect mimetype OR that the label value ends with ".bin".

We wrote a module that allows datastream filename configuration using tokens.  These tokens can create filenames based on the datastream, the fedora object, and even other system tokens like current date.  

Datastream tokens: tokens for Islandora relating to datastream objects.

Name | Token | Description
Datastream's ID | [dsfilename:id] | Datastream ID.
Datastream's label | [dsfilename:ds-label] | Datastream label.
Extension from Datastream's Mimetype | [dsfilename:fileextension] | File extension derived from mimetype.
Fedora object PID | [dsfilename:pid] | Full PID of object in Fedora repository.
Fedora object label | [dsfilename:label] | Fedora object label.
Fedora object namespace | [dsfilename:namespace] | Fedora object namespace.
Fedora short PID | [dsfilename:shortpid] | Fedora object pid without namespace.

With these tokens the filename pattern can be set up to a very descriptive file such as "[current-date:short]_[dsfilename:id]_[dsfilename:label].[dsfilename:fileextension]".  You could even set up the pattern to not use any tokens and name all datastream downloads the same with a pattern like  "download.abc".
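
As a rough illustration of how such a pattern expands, here is a minimal Python sketch. It is not the module's code (the module relies on Drupal's token system); the `expand_pattern` helper and every sample value below are hypothetical, chosen only to show the substitution.

```python
import re

def expand_pattern(pattern, values):
    """Replace each [group:name] token with its value; unknown tokens are left as-is."""
    def substitute(match):
        token = match.group(0)
        return values.get(token, token)
    return re.sub(r"\[[a-z-]+:[a-z-]+\]", substitute, pattern)

# Hypothetical values for one datastream of one object.
values = {
    "[dsfilename:id]": "OBJ",
    "[dsfilename:ds-label]": "survey_data.zip",
    "[dsfilename:label]": "Survey data 2016",
    "[dsfilename:fileextension]": "zip",
    "[dsfilename:pid]": "demo:42",
    "[dsfilename:namespace]": "demo",
    "[dsfilename:shortpid]": "42",
}

# A descriptive pattern built from several tokens.
print(expand_pattern("[dsfilename:id]_[dsfilename:label].[dsfilename:fileextension]", values))
# A pattern that uses only the datastream label.
print(expand_pattern("[dsfilename:ds-label]", values))
# A pattern without tokens gives every download the same name.
print(expand_pattern("download.abc", values))
```

A pattern made of only the datastream label never consults the mimetype-derived extension, so a label that already carries the right extension is served unchanged.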


I hope this helps.

Brian Gillingham

University of Pittsburgh | University Library System

Brad Spry

Jul 14, 2016, 12:42:05 PM
to islandora
Alex,

Islandora leans on Drupal's includes/file.mimetypes.inc, which ultimately associates the mime-type 'application/octet-stream' with the .bin extension.

Honestly, mime-type resolution isn't designed to work in reverse (mime-type -> extension).   Therein lies the issue...
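
To make the reverse-lookup problem concrete, here is a small Python sketch. The table is a hypothetical excerpt written for illustration, not Drupal's actual data, and `extension_for_mimetype` is an invented helper: when several extensions share one mime-type, going from mime-type back to extension can only pick one of them.

```python
# Hypothetical excerpt of an extension -> mime-type table (not Drupal's actual data).
EXT_TO_MIME = {
    "bin": "application/octet-stream",
    "exe": "application/octet-stream",
    "sav": "application/octet-stream",
    "zip": "application/zip",
}

def extension_for_mimetype(mimetype, table=EXT_TO_MIME):
    """Return the first extension registered for a mime-type, or None if there is none."""
    for ext, mime in table.items():
        if mime == mimetype:
            return ext
    return None

# Forward lookup is unambiguous: each extension has exactly one mime-type.
print(EXT_TO_MIME["sav"])
# Reverse lookup has to choose among bin, exe and sav; here the first entry wins.
print(extension_for_mimetype("application/octet-stream"))
```

Under these assumptions a file uploaded as .sav and stored with the generic mime-type comes back with "bin", because the original extension is not recoverable from the mime-type alone.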

Please review ISLANDORA-1612; it serves as catch-all ticket for mime-type issues like you and I are experiencing.    Please add your particular issue and experience to the ticket.


Brad

Diego Pino

Jul 14, 2016, 1:26:07 PM
to islandora
PRONOM, PRONOM is our friend. Sadly (or for the better), the effort to improve file format detection is going into CLAW, but you can hack around the current 7.x-1.x importers to add an external identification software step for binaries.

Alex Garnett

Jul 15, 2016, 12:08:55 PM
to islandora
Huh. OK, thanks folks. In this case, I really don't have any need of PRONOM (or at least, that would add more complications than it resolves); I have .zip OBJ datastreams that wind up being served as .zip.bin, which is purely inconvenient.

Alex Garnett

Jul 15, 2016, 12:17:40 PM
to islandora
Also, the mimetype appears to actually be correct in the Manage tab:


This happens to .sav files too, so I don't think it's just an issue with .zip misidentification.

Alex Garnett

Jul 15, 2016, 12:34:21 PM
to islandora
Tried out your filenamer module too and sadly Fedora seems convinced that [dsfilename:fileextension] is "bin" -- it's not even clear how I'd change includes/file.mimetypes.inc. Hopefully a problem just for me!

Brandon Weigel

Nov 22, 2016, 3:01:29 PM
to islandora
Hi Alex,

We've run into this ourselves just today - with a SAV file - so not just a problem for you. Did you manage to find a solution, or are you just accepting it?

- Brandon

Brandon Weigel

Nov 22, 2016, 5:13:22 PM
to islandora
Update on this -- I tried ingesting the file as a Zip archive, and .bin was not appended to my filename. I'd be curious to know why this happened to your zip file and not to mine.

Alex Garnett

Nov 23, 2016, 10:54:02 AM
to islandora
I did eventually solve it with the Filenamer module, by configuring the filename pattern to just "[dsfilename:ds-label]"

Alex Kent

Mar 11, 2019, 2:18:30 PM
to islandora
Has anyone else run into this issue recently? What is the current recommended fix? 

Thank you in advance!

Alex 

dbs...@uncc.edu

Mar 12, 2019, 7:35:59 AM
to islandora
Alex,

Here is a roughed out methodology using DROID to sniff the files and subsequent PRONOM query.  It is totally dependent on PRONOM actually having all the datapoints you need (official file.ext & mime-type):

1. Use DROID to evaluate file

2. DROID returns PUID, ex: fmt/803

3. Access http://www.nationalarchives.gov.uk/pronom/fmt/803

4. Scrape: <input type='hidden' name='strFileFormatID' value='1603' />

5. Save as XML:

curl -F strAction='Save As XML' -F strFileFormatID=1603 http://www.nationalarchives.gov.uk/PRONOM/Format/proFormatDetailListAction.aspx

6. Get file extension:

<ExternalSignature>
  <ExternalSignatureID>1696</ExternalSignatureID>
  <Signature>e01</Signature>
  <SignatureType>File extension</SignatureType>
</ExternalSignature>

7. Get mime-type:

<FileFormatIdentifier>
  <Identifier>application/encase</Identifier>
  <IdentifierType>MIME</IdentifierType>
</FileFormatIdentifier>
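
Steps 6 and 7 could be sketched in Python roughly as follows. This is only a sketch under assumptions: the `FileFormat` wrapper element and the `extension_and_mimetype` helper are hypothetical, the sample contains just the two fragments quoted above, and a real PRONOM export may use XML namespaces, in which case the tag names below would need adjusting.

```python
import xml.etree.ElementTree as ET

# The two fragments from steps 6 and 7, wrapped in a hypothetical root element.
SAMPLE = """
<FileFormat>
  <ExternalSignature>
    <ExternalSignatureID>1696</ExternalSignatureID>
    <Signature>e01</Signature>
    <SignatureType>File extension</SignatureType>
  </ExternalSignature>
  <FileFormatIdentifier>
    <Identifier>application/encase</Identifier>
    <IdentifierType>MIME</IdentifierType>
  </FileFormatIdentifier>
</FileFormat>
"""

def extension_and_mimetype(xml_text):
    """Return (extension, mimetype) from the XML; either value is None when absent."""
    root = ET.fromstring(xml_text)
    extension = None
    mimetype = None
    # First external signature whose type is "File extension".
    for signature in root.iter("ExternalSignature"):
        if signature.findtext("SignatureType") == "File extension":
            extension = signature.findtext("Signature")
            break
    # First format identifier whose type is "MIME".
    for identifier in root.iter("FileFormatIdentifier"):
        if identifier.findtext("IdentifierType") == "MIME":
            mimetype = identifier.findtext("Identifier")
            break
    return extension, mimetype

print(extension_and_mimetype(SAMPLE))  # ('e01', 'application/encase')
```

Returning None for a missing value keeps the caller aware of the dependency noted above: the approach only works when PRONOM actually records both data points.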



Brad


dp...@metro.org

Mar 12, 2019, 10:58:44 AM
to islandora
Alex,

In a private message you told me that the mime type recorded in the datastream properties of those objects is 'application/x-zip'.
You can see that there is no mapping for that mime type in core,
but Islandora extends the mime detection to more, less common types here,
where x-zip is actually defined, which means you should get a .zip extension on download.

So, the actual download happens here

And the code that adds the extension is this one

$extension = '.' . islandora_get_extension_for_mimetype($datastream->mimetype);

// Prevent adding on a duplicate extension.
$label = $datastream->label;
$extension_length = strlen($extension);
$duplicate_extension_position = strlen($label) > $extension_length ?
  strripos($label, $extension, -$extension_length) :
  FALSE;
$filename = $label;

if ($duplicate_extension_position === FALSE) {
  $filename .= $extension;
}


As you can see, the only time the extension is not added is when the datastream label already has the calculated extension (to be honest, the code is a bit convoluted, but it works).
So how you end up with application/octet-stream (key 15 in that Drupal mapping, which maps to .bin) is a mystery to me.
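
For readers who want to try the behavior outside Drupal, here is a simplified Python model of the check quoted above. It is an assumption-laden sketch, not a port of the Islandora code: the `download_filename` helper is hypothetical and only tests whether the label ends with the extension, ignoring case.

```python
def download_filename(label, extension):
    """Append '.extension' unless the label already ends with it (case-insensitive)."""
    suffix = "." + extension
    if len(label) > len(suffix) and label.lower().endswith(suffix.lower()):
        return label
    return label + suffix

# The mimetype resolves to "zip" and the label already ends with .zip: unchanged.
print(download_filename("survey_data.zip", "zip"))  # survey_data.zip
# The mimetype resolves to "bin": the label has no .bin suffix, so one is appended.
print(download_filename("survey_data.zip", "bin"))  # survey_data.zip.bin
```

In this model the .zip.bin filenames reported earlier in the thread appear whenever the extension lookup returns "bin", regardless of what the label ends with.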

Can you double check your islandora version and put some dpm() messages to check that mime type calculation for that datastream download is kicking in?

Cheers

Diego

Alex Garnett

Mar 12, 2019, 11:14:44 AM
to islandora
I still use https://github.com/ulsdevteam/islandora_datastream_filenamer, as far as I can tell from my site configuration...