ALA Imaging Process    

The life of a herbarium sheet during the imaging process at the ALA 

 

Data Entry


First, a barcode is applied to a folder containing one or more herbarium sheets. Then, through the imaging interface, the Folder Barcode is scanned into Arctos and an Identification is entered. (Herbarium folders are organized by taxonomy.) A barcode is then applied to each herbarium sheet in the folder and scanned into the application using a laser scanner, which acts as an input device. A unique identifier (generally the ALA Accession Number) is entered for each sheet. After all sheets have been scanned, a button on the form saves the data to a dedicated table within Arctos.

 

 

Data Checking and Loading


Each evening, a script runs to confirm taxonomy (from the Folder Identification), check for potential duplicates in Arctos, and verify that pre-determined data quality checks have been met. Those records that do not pass the tests are flagged, and an email is automatically sent to the imaging Google group. Arctos is then queried to find possible preexisting records (based on the identifier entered at initial entry). Those records' container locations are updated.
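
The nightly triage described above can be sketched roughly as follows. This is a minimal illustration in Python, not the actual Arctos script: the record fields, the `triage` function, and the specific quality checks (known taxon, non-empty values, no repeated barcode within a batch) are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class SheetRecord:
    # All field names here are hypothetical, chosen for illustration.
    barcode: str        # barcode affixed to the herbarium sheet
    identifier: str     # e.g. the ALA Accession Number typed at entry
    folder_taxon: str   # the Folder Identification


def triage(records, known_taxa, existing_identifiers):
    """Split a batch into (flagged, preexisting, loadable) records."""
    flagged, preexisting, loadable = [], [], []
    seen_barcodes = set()
    for rec in records:
        problems = []
        if rec.folder_taxon not in known_taxa:
            problems.append("unknown taxon")
        if not rec.barcode or not rec.identifier:
            problems.append("missing value")
        if rec.barcode in seen_barcodes:
            problems.append("duplicate barcode in batch")
        seen_barcodes.add(rec.barcode)

        if problems:
            # Would be reported by email to the imaging group.
            flagged.append((rec, problems))
        elif rec.identifier in existing_identifiers:
            # Already in the database: only the container location changes.
            preexisting.append(rec)
        else:
            # New record: a candidate for the bulkloader.
            loadable.append(rec)
    return flagged, preexisting, loadable
```

Each record ends up in exactly one of the three lists, which mirrors the three outcomes in the text: flagged for review, location updated, or passed on for loading.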


The records that do pass all checks and are not found in Arctos are moved into the Arctos Bulkloader and subsequently processed into Arctos. Arctos uses a hierarchical container model to track physical object locations. The specimens themselves are collection objects, herbarium sheets contain specimens, and folders contain sheets. Folder containers are labeled according to the Folder Identification created at the initial data entry step.
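
As a rough picture of the container hierarchy, the sketch below models folders, sheets, and specimens as nested objects in Python. The `Container` class and the labels in the usage example are hypothetical and only illustrate the parent/child idea; they do not reflect the Arctos schema.

```python
class Container:
    """A labeled physical object that may sit inside another container."""

    def __init__(self, label, kind):
        self.label = label      # e.g. a barcode or a Folder Identification
        self.kind = kind        # "folder", "herbarium sheet", "collection object"
        self.parent = None
        self.children = []

    def add(self, child):
        """Place child inside this container and return it."""
        child.parent = self
        self.children.append(child)
        return child

    def path(self):
        """Labels from the outermost container down to this one."""
        parts = []
        node = self
        while node is not None:
            parts.append(node.label)
            node = node.parent
        return list(reversed(parts))
```

For example, with made-up labels:

```python
folder = Container("Salix alaxensis", "folder")
sheet = folder.add(Container("SHEET-0001", "herbarium sheet"))
specimen = sheet.add(Container("ALA 12345", "collection object"))
print(specimen.path())  # ['Salix alaxensis', 'SHEET-0001', 'ALA 12345']
```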


Photography

 

Sheets that have a barcode affixed are ready for photography. Sheets may be photographed at any point after the initial entry step; it is not necessary for data to enter Arctos first.

 

A Canon EOS-1Ds Mark II, several external hard drives, and a laser barcode reader are attached to an iMac. The camera is mounted in a fixed position over a fenced bed, which also contains a scale and a color standard, and is triggered through the computer. Images are automatically transferred to the computer and saved to a folder on which an AppleScript folder action is enabled. When a file enters the folder, the user is prompted for a file name, which is supplied by scanning the barcode previously attached to the herbarium sheet. A second AppleScript then moves the image to a temporary directory. A shell script runs every 30 minutes to move images from the temporary directory to two other physical hard drives.
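
The periodic move to two drives is done with a shell script in the actual setup. Purely as an illustration of the idea, here is a Python sketch; the function name, the directory arguments, and the size check before deleting the original are assumptions and not a description of the real script.

```python
import shutil
from pathlib import Path


def mirror_and_clear(temp_dir, destinations):
    """Copy each raw image to every destination, then delete the original."""
    moved = []
    for image in sorted(Path(temp_dir).iterdir()):
        if image.suffix.lower() != ".cr2":
            continue
        size = image.stat().st_size
        for dest in destinations:
            Path(dest).mkdir(parents=True, exist_ok=True)
            shutil.copy2(image, Path(dest) / image.name)
        # Delete the source only once every copy exists at full size.
        if all((Path(d) / image.name).stat().st_size == size
               for d in destinations):
            image.unlink()
            moved.append(image.name)
    return moved
```

Copying to every destination before deleting means an interrupted run leaves the original in the temporary directory, where the next run can pick it up again.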

Images can be taken at a rate of approximately 1500 per day depending on folder sizes and available staff.  


Image Transformation


Images from the camera are in Canon's proprietary raw format (.cr2). A shell script uses Adobe DNG Converter to transform them into DNG, an openly documented, lossless raw image format.


File Transfer

 

A script runs continuously on the Imager to transfer converted DNG files to the Ranger supercomputer at the Texas Advanced Computing Center (TACC) via SCP. HPN-SSH is used to maximize throughput. Transferred images are logged, and duplicates are not transferred.
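
One way to picture the "log and skip duplicates" behavior is the Python sketch below. It is an assumption-laden illustration, not the script in use: the log format (one file name per line), the function names, and the `remote` target string are all hypothetical. The `run` parameter exists only so the logic can be exercised without a network connection.

```python
import subprocess
from pathlib import Path


def pending_transfers(dng_dir, log_path):
    """DNG files in dng_dir whose names are not yet in the transfer log."""
    log = Path(log_path)
    done = set(log.read_text().splitlines()) if log.exists() else set()
    return [p for p in sorted(Path(dng_dir).glob("*.dng"))
            if p.name not in done]


def transfer(dng_dir, log_path, remote, run=subprocess.run):
    """Send each pending file with scp and log it on success."""
    sent = []
    for path in pending_transfers(dng_dir, log_path):
        result = run(["scp", "-q", str(path), remote])
        if result.returncode == 0:
            # Record the name so the file is never sent twice.
            with open(log_path, "a") as log:
                log.write(path.name + "\n")
            sent.append(path.name)
    return sent
```

A file is written to the log only after `scp` exits successfully, so a failed copy stays pending and is retried on the next pass.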

 

Once per day, another script compares local and transferred files by name and file size. Any images that did not transfer correctly are removed so that the transfer can be attempted again. Once local and remote files match by name, file size, and count, the images are moved to a "confirmed" directory.
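
The daily comparison amounts to checking two listings against each other. A minimal Python sketch, assuming each listing has already been gathered into a dictionary mapping file name to size in bytes (how the remote listing is obtained is left out, and the function names are hypothetical):

```python
def mismatched(local, remote):
    """Names of local files that are missing remotely or differ in size."""
    return sorted(name for name, size in local.items()
                  if remote.get(name) != size)


def batch_confirmed(local, remote):
    """True only when both sides agree on names, sizes, and file count."""
    # Dictionary equality covers all three conditions at once.
    return local == remote
```

Files returned by `mismatched` are the candidates for another transfer attempt; the batch as a whole is ready for the "confirmed" directory only when `batch_confirmed` is true.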


Meanwhile, at TACC...

 

Images in "confirmed" directories are periodically moved into the iRODS system. iRODS generates checksums and transfers images to Ranch at TACC and to the San Diego Supercomputer Center. This is automated and handled by the TACC staff.


High-resolution and thumbnail JPGs are created for every DNG on Ranger using ImageMagick. These will be transferred to a new computer at TACC beginning approximately 1 November 2008.


Linking Images to Specimens

 

Arctos queries TACC's iRODS directory structure nightly. Any new images are located and processed into Media. JPGs, when available, are also linked to TACC, and a Media Relationship is created between the high resolution JPG and the original DNG. Thumbnail JPGs are utilized as a preview for both the DNG and the JPG.
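
The relationships described above can be sketched as a small grouping step in Python. Everything specific here is an assumption for illustration: the file naming scheme (`<barcode>.dng`, `<barcode>.jpg`, `<barcode>_tn.jpg`), the `build_media` function, and the dictionary keys are hypothetical and do not describe how Arctos stores Media.

```python
from pathlib import PurePosixPath


def build_media(filenames):
    """Group image files by barcode and describe how they relate."""
    by_barcode = {}
    for name in filenames:
        path = PurePosixPath(name)
        stem, ext = path.stem, path.suffix.lower()
        if ext == ".dng":
            role = "dng"
        elif ext == ".jpg" and stem.endswith("_tn"):
            role, stem = "thumbnail", stem[:-3]   # strip the assumed suffix
        elif ext == ".jpg":
            role = "jpg"
        else:
            continue
        by_barcode.setdefault(stem, {})[role] = name

    media = []
    for barcode, files in sorted(by_barcode.items()):
        preview = files.get("thumbnail")  # shared by the DNG and the JPG
        if "dng" in files:
            media.append({"uri": files["dng"], "barcode": barcode,
                          "preview": preview, "derived_from": None})
        if "jpg" in files:
            media.append({"uri": files["jpg"], "barcode": barcode,
                          "preview": preview,
                          "derived_from": files.get("dng")})
    return media
```

Because the file name is the sheet barcode, the barcode recovered from each name is what would tie a media record back to its specimen.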





2 messages about this page
Oct 22 2008 by Steffi Ickert-Bond
Great,
I like it a lot and for our upcoming plans this is exactly what we are
looking for...
Keep on ticking, in Seattle almost made it back, five hour flight from DC...
Best, Steffi.
On Wed, Oct 22, 2008 at 2:20 PM, ZJMEYERS85@gmail.com
Oct 22 2008 by ZJMEYERS85@gmail.com
I updated the camera model and added the pic. It's looking good

Click on http://groups.google.com/group/ALA_Imaging/web/ala-imaging-process?hl=en
- or copy & paste it into your browser's address bar if that doesn't
work.