HTML Scraping


Brian @ Elsewares

Oct 18, 2012, 1:33:20 PM
to jo...@d20pfsrd.com
John ~ First of all, this email is a Google-backed account, so it should be OK for adding to Drive.

Second, I've created HTML scraping and automation tools for clients of mine, and I'd be happy to build one for you for the Equipment Guide, or any other projects you may have. I'm working on a more generic version of the code to deploy as a web service, and this would be a good test.

Let me know!

Brian ~ Chief Dishwasher, Game Developer & Web Guru
http://elsewar.es
We build stories.

john reyst

Oct 18, 2012, 2:07:58 PM
to Brian @ Elsewares, d20pfsrd.com Collaborators
Hey Brian!

Right now we have a web-based Ruby on Rails app that I use to extract text from PDF files and add links to it. However, it's not perfect, often because of the quality or formatting of the PDF source document. I'm constantly looking to add features or improvements to that app to make our processes faster or easier. Do you have any experience in that area, or suggestions for alternative solutions? The app uses a combination of Perl, Ruby, Rails, and a couple of open source PDF manipulation tools (PDFBox and pdftohtml) to operate.
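
For readers unfamiliar with this kind of tool, here is a minimal sketch of the "add links" step only. It is not the app described above (which is Ruby/Perl and is not shown in this thread). It is an illustration in Python, and it assumes the text has already been extracted from the PDF. The function name `add_links`, the `link_map` argument, and the example URLs are all hypothetical.

```python
import html
import re


def add_links(text, link_map):
    """Return HTML in which each known term in `text` is wrapped in a link.

    `link_map` is a hypothetical dict of term -> URL; the real app's
    data model is not described in the thread.
    """
    if not link_map:
        return html.escape(text)

    # Try longer terms first so "magic missile" wins over "magic".
    terms = sorted(link_map, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(t) for t in terms) + r")\b",
        re.IGNORECASE,
    )
    lookup = {term.lower(): url for term, url in link_map.items()}

    pieces = []
    pos = 0
    for match in pattern.finditer(text):
        # Escape the plain text between matches so it is safe as HTML.
        pieces.append(html.escape(text[pos:match.start()]))
        url = lookup[match.group(0).lower()]
        pieces.append(
            '<a href="{}">{}</a>'.format(
                html.escape(url, quote=True), html.escape(match.group(0))
            )
        )
        pos = match.end()
    pieces.append(html.escape(text[pos:]))
    return "".join(pieces)
```

Calling `add_links("A longsword and a dagger.", {"longsword": "/equipment/longsword", "dagger": "/equipment/dagger"})` would wrap both item names in anchors. Poorly formatted PDF text (broken words, stray line breaks) would defeat simple matching like this, which is the kind of source-quality problem mentioned above.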

--
John Reyst
jreyst (gtalk)
248-635-9432 Cell


--
--
You received this message because you are a collaborator on d20pfsrd.com and as a result are subscribed to the "d20pfsrd.com-collaborators" Google Group.
 
To unsubscribe from this group, send email to
d20pfsrdcom-collab...@googlegroups.com
 
 
 

John.L...@flagstar.com

Oct 19, 2012, 11:50:01 AM
to Brian @ Elsewares, d20pfsrd.com Collaborators, john reyst

I can grant you access to the server where the app runs, and then you can
just poke at it a bit to see if it's something you'd be willing to mess with
at all. I have a couple of user interface tweaks I'd like to make to it, as
well as some minor feature changes and additions, if you decide you'd like
to give it a whirl.


John Reyst
Systems Analyst 4
IT Services - Infrastructure
5151 Corporate Dr.
Troy, MI 48098
Office: (248) 312-6118
Cell: (248) 635-9432
John.L...@flagstar.com
www.flagstar.com







From: "Brian @ Elsewares" <br...@elsewares.org>
To: john reyst <jre...@gmail.com>
Cc: "d20pfsrd.com Collaborators"
<d20pfsrdcom-...@googlegroups.com>
Date: 10/19/2012 11:39 AM
Subject: Re: [d20pfsrdcom] HTML Scraping



I don't think I can offer much on top of that - I imagine it's the Perl
that does the grunt work?  I work with Rails and Ruby, however, so if
there's something that you want to do there, I think I could be of help.

Brian ~ Chief Dishwasher, Game Developer & Web Guru
http://elsewar.es
We build stories.


