PDF/UA support?

Ashish Sethi

Nov 17, 2020, 1:54:15 PM
to PDF::Reader
Hi,

I am wondering if the gem offers PDF/UA support?

- To read PDF/UA
- To extract PDF/UA tags and values
etc.

What I am really looking for is a speech-to-text log of the complete PDF/UA doc.

If it is supported, how could I go about using it in Ruby code?

Thx,
Ashish

James Healy

Nov 19, 2020, 10:53:02 PM
to pdf-r...@googlegroups.com
I'm not super familiar with PDF/UA, but I suspect the answer is: yes,
it's possible to read and extract the data you're after, but it might
not be neat.

I'm reasonably confident pdf-reader can parse all content in PDFs.
However, there are no helper methods for many features.

If you loop over the document and print some of the data on each page,
you might find what you're after, possibly deeply nested in hashes and
arrays.

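require 'pdf/reader'

# Dump the raw per-page data so you can look for the structures you need.
pdf = PDF::Reader.new(ARGV[0])
pdf.pages.each do |page|
  puts page.attributes.inspect  # entries from the page dictionary
  puts page.xobjects.inspect    # XObjects (images, forms) the page refers to
  puts page.raw_content         # the unparsed content stream
end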
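
Since the data is likely to be nested in hashes and arrays, here is a rough sketch of walking that nesting to list the tags. It assumes the file is a tagged PDF whose catalog has a /StructTreeRoot entry, as PDF/UA files should. collect_tags is a made-up helper, not part of pdf-reader. The only pdf-reader calls used are reader.objects, its trailer, and deref. I haven't tried this against a real PDF/UA file, so treat it as a starting point.

```ruby
# Hypothetical helper (not part of pdf-reader): collect [depth, tag] pairs
# from a tagged-PDF structure tree. `node` is whatever sits under a /K entry;
# `deref` is a callable that resolves indirect references to real objects.
def collect_tags(node, deref, depth = 0, out = [])
  node = deref.call(node)
  case node
  when Array
    node.each { |kid| collect_tags(kid, deref, depth, out) }
  when Hash
    # Structure elements carry a /S (structure type) entry. Marked-content
    # references (/Type /MCR) and object references (/Type /OBJR) do not,
    # so they are skipped here.
    if node[:S]
      out << [depth, deref.call(node[:S])]
      collect_tags(node[:K], deref, depth + 1, out) if node.key?(:K)
    end
  end
  # Integers (marked-content IDs) and nil fall through untouched.
  out
end

if __FILE__ == $PROGRAM_NAME && ARGV[0]
  require 'pdf/reader'

  reader  = PDF::Reader.new(ARGV[0])
  deref   = ->(obj) { reader.objects.deref(obj) }
  catalog = deref.call(reader.objects.trailer[:Root])
  root    = catalog[:StructTreeRoot] && deref.call(catalog[:StructTreeRoot])

  if root
    # Only /K (kids) is followed, never /P (parent), so the walk cannot loop.
    collect_tags(root[:K], deref).each do |depth, tag|
      puts "#{'  ' * depth}#{tag}"
    end
  else
    puts 'No /StructTreeRoot found; the file does not appear to be tagged.'
  end
end
```

This only prints the tag names as an indented outline. Getting the text for each tag would mean matching the marked-content IDs under /K against the page content streams, which the sketch does not attempt.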

James