I came across someone who said he bought one of these for about 250 USD.
http://www.teleread.org/blog/2006/06/03/700-page-an-hour-scanner-to-help-digitize-books/
Liviu Says:
June 3rd, 2006 at 5:19 pm
It is realistic to get 600 pages/hr with a regular scanner. I use an
OpticBook3600 (~250$) and at 300 dpi, b@w, jpg/tif landscape output, I
do 10 pages (5 dp sheets) for hc/tp and 14 pages for pb per minute, so
approximative 300 pages per 1/2 hour and I also watch a movie on another
screen or on my portable dvd when scanning.
In my experience 600 pages/hr is not achievable; I get about 200 - 300
pages/hr.
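
The rates in the quoted comment can be checked with a few lines of
arithmetic. This is only a sketch: the function names are made up for
illustration, and the 400-page book is a hypothetical example, not a
figure from the comment.

```python
def pages_per_hour(pages_per_minute: float) -> float:
    """Convert a per-minute scanning rate to pages per hour."""
    return pages_per_minute * 60


def hours_for_book(pages: int, rate_per_hour: float) -> float:
    """Time needed to scan a book of `pages` pages at a given hourly rate."""
    return pages / rate_per_hour


hardcover_rate = pages_per_hour(10)  # 10 pages/min quoted for hc/tp
paperback_rate = pages_per_hour(14)  # 14 pages/min quoted for pb

# Hypothetical 400-page book: quoted rate versus 200 and 300 pages/hr.
for rate in (hardcover_rate, 300, 200):
    print(rate, "pages/hr ->", round(hours_for_book(400, rate), 2), "hours")
```

The quoted 10 pages per minute is where the 600 pages/hr figure comes
from; at 200 - 300 pages/hr the same book takes two to three times as long.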
The Atiz scanner is a lot like the Internet Archive’s Scribe scanner, in
that both use digital cameras. In the case of Scribe, the page scan
resolutions vary depending upon the size of the page, but for typical
books the resolution is around 500 dpi, and the full color depth is
preserved.
Oh, Plustek is a Taiwanese company, so maybe all of these are made by
Taiwanese companies and were invented by people in Taiwan. Hehe.
http://www.plustek.com/about_us/about_us.asp No wonder.
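http://www.youtube.com/watch?v=9QC3zG5bpTY Atiz BookDrive, a scanner
with automatic page turning. I guess it turns the pages by holding them
with vacuum suction. It is not fast either.
There are also more complicated but very slow ones: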
http://www.youtube.com/watch?v=KkwERsAyDzU&feature=related
http://www.youtube.com/watch?v=u8vKFz09ric&feature=related automatic
book-reading machine
http://www.youtube.com/watch?v=XheLKg3FVVQ&NR=1
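Atiz is a company run by Thais that makes book scanners. As far as I can
tell, it recently copied the principle of the Scribe scanner invented by
the Internet Archive, made some improvements, and built a fast book
scanner with some accompanying software. It uses Canon cameras. With
automatic page turning added it would be ideal. However, photos taken
with a camera are not as good as flatbed scans.
http://diy.atiz.com/specs/ BookDrive DIY, priced at 3500 USD
http://booksnap.atiz.com/ This one is a bit cheaper and uses a Canon
G-series camera.
Acceptable for a small organization, too expensive for an individual.
I found the Atiz BookDrive while searching for how to use an XnView
plugin to deal with the black borders of scanned book pages.
The specific term for personal copying: bootleg
http://en.wikipedia.org/wiki/Bootleg_recording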
Internet Archive Scribe book scanner
Scribe Software
http://scribesw.sourceforge.net/
Photos: Internet Archives' book project | ZDNet Photo Gallery
http://content.zdnet.com/2346-9595_22-13911-5.html
http://www.youtube.com/watch?v=hlOQuuLYavY&feature=related The music
and shooting style of this ad resemble a porn film, and so does the
motion. This scanner scans two pages at once, a clever design.
http://www.youtube.com/watch?v=8KCaLwbrlqU&feature=related
EOS 450D Large books digitizing
http://www.youtube.com/watch?v=vlgWNGKTR2A This Kirtas robotic scanner
is fast.
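The autocrop provided in XnView achieves part of the desired effect. For
black borders, use Image > automatic crop, set the background color to
black, and experiment to find a suitable tolerance value; this removes
the black borders partially. See the attachment book scan crop black
border.ppt.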
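
To make the roles of "background color" and "tolerance" concrete, here
is a minimal sketch of a tolerance-based auto crop on a grayscale image
stored as a list of rows. It is an assumption about how such a crop can
work, not XnView's actual implementation; the names autocrop_box and
autocrop are hypothetical.

```python
from typing import List, Tuple

Gray = List[List[int]]  # rows of grayscale values, 0 (black) .. 255 (white)


def autocrop_box(img: Gray, background: int = 0,
                 tolerance: int = 32) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom), right/bottom exclusive, of the
    smallest rectangle holding every pixel that differs from `background`
    by more than `tolerance`."""
    rows = [y for y, row in enumerate(img)
            if any(abs(p - background) > tolerance for p in row)]
    if not rows:
        return (0, 0, 0, 0)  # the whole image counts as background
    width = len(img[0])
    cols = [x for x in range(width)
            if any(abs(row[x] - background) > tolerance for row in img)]
    return (cols[0], rows[0], cols[-1] + 1, rows[-1] + 1)


def autocrop(img: Gray, background: int = 0, tolerance: int = 32) -> Gray:
    """Crop `img` to the box found by autocrop_box."""
    left, top, right, bottom = autocrop_box(img, background, tolerance)
    return [row[left:right] for row in img[top:bottom]]


# Tiny synthetic scan: a bright page inside a slightly noisy dark border.
scan = [
    [0,   5,   3,   0, 2],
    [4, 255, 250, 255, 0],
    [0, 255,  20, 255, 6],   # the 20 is dark "text" inside the page
    [2,   0,   7,   1, 0],
]
page = autocrop(scan, background=0, tolerance=32)
print(page)
```

A tolerance that is too low keeps noisy border pixels; one that is too
high starts to treat dark text near the page edge as border. That is why
the value has to be found by experiment.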
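I tried Imaging Express [4]. Its black border removal is incomplete and
does not meet my needs, but its automatic straightening (deskew,
straighten) is acceptable. I have not checked whether it supports batch
processing.
The description of BordersHelper [3] says it supports only monochrome
images; judging from its documentation it seems to be based on the
algorithm given in [6]. However, the download page is dead.
Recogniform [1] is a development library. I tried Recogniform Image
Processor 5.0 [7], which is based on the same technology; the results
are good and it works in batch mode.
However, the trial version is not really usable because it draws a big
cross on the output images. The purchase price is too high, about 900
euros.
Recogniform can be called as a DLL or through ActiveX. It supports black
border removal, deskew, despeckle, line removal (removing table lines),
and dynamic thresholding (conversion to black and white with a dynamic
threshold). It is made by Italians and costs 600 euros. I tried the
SDK's Evaluation download, but it requires a password and cannot be used
normally; it is for demonstration only. The following is excerpted from
the SDK's demo readme, DemoReco.txt.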
------
Deskew
------
This function estimates the image inclination angle and correct it
using a very fast and accurate fine-rotation method.
You can select to estimate only or to estimate and correct the skew.
You have to supply the maximum angle to check, the angle resolution
and you can balance the speed/accuracy
Works on monochrome, grayscale and color images.
---------
Despeckle
---------
This function allows to remove noise points from the image.
You have to supply the max width and height of the isolate points
to be removed.
Works on monochrome images.
------------
Line Removal
------------
This function finds and remove horizontal and vertical lines.
You can choice to remove only horizontal or vertical lines or
both.
You can supply the minimum size for horizontal and vertical
lines as well as the maximum size of "holes" in lines.
As option you can choice to reconnect crossed character and to
clean lines border after remotion.
NOTE: To have best results you need to deskew the image before to
use this function.
Works on monochrome images.
------------
Thresholding
------------
This function allows to convert a 256 grayscale image in a monochrome
image.
You can choice the kind of algo to use.
Here you will find our "state of art" algorithms.
Works on color and grayscale images.
============================================
www.recongniform.com - in...@recogniform.com
============================================
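
The despeckle description above (remove isolated points up to a maximum
width and height) can be sketched as a connected-component filter. This
is my own guess at one way to do it, not Recogniform's code; the function
name, the 0/1 image representation, and the use of 8-connectivity are all
assumptions.

```python
from collections import deque
from typing import List

Mono = List[List[int]]  # rows of 0 (white) or 1 (black)


def despeckle(img: Mono, max_w: int, max_h: int) -> Mono:
    """Return a copy of `img` in which every 8-connected group of black
    pixels whose bounding box fits in max_w x max_h is erased."""
    height = len(img)
    width = len(img[0]) if height else 0
    out = [row[:] for row in img]
    seen = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if img[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill collects one connected component.
            queue = deque([(y, x)])
            seen[y][x] = True
            component = []
            while queue:
                cy, cx = queue.popleft()
                component.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and img[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            ys = [p[0] for p in component]
            xs = [p[1] for p in component]
            if (max(xs) - min(xs) + 1 <= max_w
                    and max(ys) - min(ys) + 1 <= max_h):
                for cy, cx in component:
                    out[cy][cx] = 0  # small enough: treat as noise
    return out


page = [
    [0, 0, 0, 0, 0, 0],
    [0, 1, 0, 0, 0, 0],   # isolated one-pixel speck
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 1, 1, 1],   # part of a character stroke
    [0, 0, 0, 1, 1, 1],
]
clean = despeckle(page, max_w=1, max_h=1)
print(clean)
```

The size limits have to stay below the size of the smallest real mark
that must be kept, such as the dot of an "i" or a period.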
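Adobe Photoshop CS3 is said to have a Crop and Straighten feature that
can cut off black borders [5]. From the description, though, it is meant
for scanning several photos at once and then automatically splitting
them into individual photos.
Many scanned-document processing packages even support automatic removal
of the shadows left by staples and punch holes [9][10][11], as well as
automatic deskew, background removal, speckle removal, and many other
functions. Paper [6] compares the results of its proposed algorithm with
those of other well-known commercial software.
I found an open-source program, unpaper [12], which is the most
promising basis for improvement. It supports many sophisticated
functions: detection of the content area for single or double pages
(page mask, i.e. the bounding box), page margin (border) detection,
removal of large shadow areas (black filter), speckle removal (noise
filter, blur filter), background removal (gray filter), straightening of
the content area (deskew), and re-centering of the content area (border
aligning). The first step is to test whether its algorithms work for my
needs. I suspect its shadow detection will not do very well in my case,
where the edge shadows are very narrow because I now scan with a narrow
edge margin.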
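The open-source ImageMagick supports -fuzz and -trim [13], which work
much like XnView's auto crop and can cut off black borders of uniform
width. If the border is skewed and its width is not uniform, it cannot
be removed completely.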
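
A minimal sketch of how such a trim could be driven from a script. The
-fuzz, -trim and +repage options are real ImageMagick options; the file
names, the 15% tolerance and the helper names are placeholders chosen
for illustration, and the command name is an assumption ("convert" in
ImageMagick 6, "magick" in version 7).

```python
import shutil
import subprocess
from typing import List


def trim_command(src: str, dst: str, fuzz_percent: int = 15,
                 program: str = "convert") -> List[str]:
    """Build an ImageMagick command line that trims a near-uniform border."""
    return [
        program, src,
        "-fuzz", f"{fuzz_percent}%",  # colour tolerance for "same as border"
        "-trim",                      # cut edges that match the corner colour
        "+repage",                    # drop the canvas offset left by -trim
        dst,
    ]


def trim_file(src: str, dst: str, fuzz_percent: int = 15) -> bool:
    """Run the trim if ImageMagick is installed; return True on success."""
    cmd = trim_command(src, dst, fuzz_percent)
    if shutil.which(cmd[0]) is None:
        return False  # ImageMagick not found on PATH
    return subprocess.run(cmd).returncode == 0


# Only print the command here; nothing is executed.
print(" ".join(trim_command("page001.tif", "page001_trimmed.tif")))
```

Because -trim always cuts a rectangle, a skewed border leaves wedge-shaped
remnants along the edges, which is the limitation described above.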
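Since version 10.35 the open-source NetPBM package provides a
pgmdeshadow command [15], which corrects the brightness of gradual
shadows at the edges of a grayscale image. This suits the shadows at the
edges of book pages captured with an ordinary flatbed scanner or a
camera, especially when the shadow contains text that must be kept and
so cannot simply be cropped away. This algorithm should be added to
unpaper. The algorithm of [6] is described in a published paper and
could be added to unpaper as well. I would also like to add the
algorithm I have in mind. See the attachment book scan crop black
border.ppt.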
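
To illustrate what "brightness correction of an edge shadow" means, here
is a deliberately simple sketch. It is NOT the pgmdeshadow algorithm:
it assumes the shadow varies only from column to column (as near a book
gutter), estimates the paper brightness of each column as a high
percentile, and rescales the column so that the paper becomes white. The
function name and the 90th-percentile choice are my own assumptions.

```python
from typing import List

Gray = List[List[int]]  # rows of grayscale values, 0 .. 255


def flatten_column_shading(img: Gray, white: int = 255) -> Gray:
    """Rescale each column so its estimated paper brightness maps to
    `white`, keeping darker pixels (text) proportionally dark."""
    height, width = len(img), len(img[0])
    out = [[0] * width for _ in range(height)]
    for x in range(width):
        column = sorted(img[y][x] for y in range(height))
        paper = column[int(0.9 * (height - 1))]  # ~90th percentile
        for y in range(height):
            if paper == 0:
                out[y][x] = img[y][x]  # nothing to scale against
            else:
                # Integer rounding of img * white / paper, clipped to white.
                scaled = (img[y][x] * white + paper // 2) // paper
                out[y][x] = min(white, scaled)
    return out


# Paper gets darker toward the left edge; one text line crosses the shadow.
scan = [
    [100, 200, 250],
    [100, 200, 250],
    [20,   40,  50],
    [100, 200, 250],
]
flat = flatten_column_shading(scan)
print(flat)
```

Unlike cropping, this keeps the text that lies inside the shadow, which
is the case where simply cutting off the dark edge is not an option.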
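Image processing libraries that can be used at present: the open-source
ones are ImageMagick / GraphicsMagick [16], NetPBM, and GD [17].
XnView's GFL SDK [2] is freeware but not open source, so it is a second
choice.
With ImageMagick, image formats such as TIFF, JPG, and PNG can be
processed directly, whereas NetPBM requires conversion to intermediate
formats such as pbm, pgm, ppm, and pam; unless data is exchanged through
pipes, this means files on disk. To speed things up, the temporary files
can be kept on a RAM disk.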
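
The pipe approach can be sketched as follows: build the intermediate PGM
in memory and hand it to a NetPBM program on standard input instead of
writing a temporary file. The helper names are hypothetical, and the
parser is a simplification that assumes a binary (P5) file with no
comment lines and a maxval below 256.

```python
import subprocess
from typing import List

Gray = List[List[int]]  # rows of grayscale values, 0 .. 255


def to_pgm(img: Gray) -> bytes:
    """Serialise the image as a binary PGM (magic number P5)."""
    height, width = len(img), len(img[0])
    header = b"P5\n%d %d\n255\n" % (width, height)
    return header + b"".join(bytes(row) for row in img)


def from_pgm(data: bytes) -> Gray:
    """Parse a binary PGM produced by to_pgm or a similar writer."""
    tokens, pos = [], 0
    while len(tokens) < 4:  # magic, width, height, maxval
        while pos < len(data) and data[pos:pos + 1].isspace():
            pos += 1
        start = pos
        while pos < len(data) and not data[pos:pos + 1].isspace():
            pos += 1
        tokens.append(data[start:pos])
    pos += 1  # exactly one whitespace byte precedes the raster
    if tokens[0] != b"P5":
        raise ValueError("not a binary PGM")
    width, height = int(tokens[1]), int(tokens[2])
    raster = data[pos:pos + width * height]
    return [list(raster[y * width:(y + 1) * width]) for y in range(height)]


def pipe_through(command: List[str], pgm: bytes) -> bytes:
    """Feed PGM bytes to a program's stdin and return its stdout."""
    result = subprocess.run(command, input=pgm,
                            stdout=subprocess.PIPE, check=True)
    return result.stdout


image = [[10, 32, 255], [0, 9, 128]]
encoded = to_pgm(image)
print(encoded[:11], from_pgm(encoded) == image)
```

Assuming pgmdeshadow reads standard input and writes standard output, as
NetPBM programs conventionally do, a call such as
from_pgm(pipe_through(["pgmdeshadow"], to_pgm(image))) would avoid disk
files entirely; I have not verified this against the tool itself.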
[1] Black Border Removal Library (Royalties Free)
http://www.recogniform.com/black-border-removal.htm
[2] GFL SDK http://pagesperso-orange.fr/pierre.g/xnview/engfl.html
GFL SDK is a free library (used by XnView) for developers who would like
to support graphics image formats easily.
[3] BordersHelper
http://3d2f.com/programs/40-861-bordershelper-download.shtml
[4] Imaging Express http://www.vixelsoft.com/imaging.htm
[5] Scripting and Crop and Straighten photos
http://www.adobeforums.com/webx/.59b54905
[6] Efficient Removal of Noisy Borders from Monochromatic Documents
http://www.springerlink.com/content/r7jwxc7lxnd39bwh/
[7] Recogniform Image Processor 5.0
http://www.recogniform.com/image-processing.htm
[8] demo application http://www.recogniform.com/demoreco.zip
[9] Punch Hole Removal - Document Imaging Toolkit
http://www.blackice.com/docimagingPUNCHHOLE.htm
[10] LEADTOOLS Image Processing Functions, Hole Punch Remove
http://leadtools.com/SDK/Image%20Processing/functions/Hole%20Punch%20Remove/
[11] Inlite Research Products
http://www.inliteresearch.com/homepage/products/products.html
ClearImage Repair, ClearImage Tools, iRondo Imaging Station
[12] Unpaper http://unpaper.berlios.de/doc/unpaper.html
[13] ImageMagick command line options
http://imagemagick.org/script/command-line-options.php#fuzz
[14] Book 'scan' improvement software
http://www.teleread.org/blog/2007/03/10/book-scan-improvement-software/
[15] pgmdeshadow http://netpbm.sourceforge.net/doc/pgmdeshadow.html
[16] GraphicsMagick http://www.graphicsmagick.org
[17] GD Graphics Library http://en.wikipedia.org/wiki/GD_Graphics_Library