I found that both the "export selection as" and the "copy with formatting" functions were able to go directly to Excel without using Word, though that also works. The key in Excel is to simply use ctrl+v to paste, not to right-click and try to use a special paste or the default, which went to 'Keep text only'.
To save a specific table from a PDF document, draw a selection box around it, right-click the selection and choose Export Selection As, then choose Excel Workbook from the file type menu. You can also run OCR on demand to convert a bitmap image of a table into a real table. All the formatting of the cells will be transferred across, where supported.
That doesn't work for me! What used to copy and paste as a nice table is now a wreck. I have to reconcile several credit card statements in Excel at my job, and it used to be so easy to copy and paste them as a table. Now some of the rows paste into separate columns but most don't. I ran the OCR and it made no difference. I've tried saving the file as an Excel spreadsheet and as a Word doc, but the columns don't come out correctly. Is there a fix for this?
This is a major pain in the neck. I'm working with X and 9 Pro, and whilst the latter wasn't perfect, it was easy to extract info from tables. X exports data imperfectly to Excel. The formatting looks great, but what use is that when it crams multiple pieces of data into individual cells? Painful painful painful. I have to downgrade to be productive. This should be a bug or feature fix if anyone at Adobe is listening.
I am using "Copy with formatting". I've tried every different way I can see and nothing works. It's unfortunate because this feature is the reason I upgraded from Reader after doing a trial with version 9. Is there any way to downgrade to Acrobat 9 standard using the same reg key?
Yes, I too found the copy and paste in Acrobat 9 functional, and upgraded to X, which has been a huge waste of money. I started with Acrobat 4.0 and usually each version provided better features, until now. Hopefully someone at Adobe monitors these posts and will provide a fix in the future. I will probably have to uninstall version X and reinstall 9.
Had I known that at the time, it would have been worth it. As it was, when I went to install X it told me to uninstall the old version. When I did and then tried to run X, it said I couldn't upgrade because it couldn't find a current version on my computer, so I had to dig out the old copy so that X would install and give me credit for owning a previous version. So far I haven't seen X do anything better than version 9 for the purposes I use it for. I'll try installing 9 back over the top of X, thanks for the tip.
It's almost certain that the PDF file isn't tagged (either at all, or properly). Accessible, tagged files will export to spreadsheets as Acrobat can understand what's a table, what's a header cell, etc. whereas in an untagged file it has to make guesses based on the separation of each text block (it can't see the page visually as you do, so what looks like a perfectly-obvious table to a human is actually very difficult to detect in code).
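To illustrate why that guessing is fragile, here is a rough Python sketch of the kind of heuristic involved. It is not Acrobat's actual algorithm; the function name guess_table, the (x, y, text) block format, and the two tolerance values are all made up for illustration. It groups text blocks into rows by vertical position and into columns by horizontal gaps.

```python
def guess_table(blocks, row_tol=3.0, col_gap=15.0):
    """Guess a table from (x, y, text) blocks. Hypothetical heuristic only."""
    # 1. Cluster blocks into rows: blocks whose y differs by <= row_tol share a row.
    rows = []
    for x, y, text in sorted(blocks, key=lambda b: (b[1], b[0])):
        if rows and abs(rows[-1]["y"] - y) <= row_tol:
            rows[-1]["cells"].append((x, text))
        else:
            rows.append({"y": y, "cells": [(x, text)]})

    # 2. Guess column starts: a new column begins wherever the left edge of a
    #    block is more than col_gap to the right of the previous column start.
    col_starts = []
    for x in sorted({x for r in rows for x, _ in r["cells"]}):
        if not col_starts or x - col_starts[-1] > col_gap:
            col_starts.append(x)

    # 3. Put each block into the column whose start is nearest to its left edge.
    table = []
    for r in rows:
        line = [""] * len(col_starts)
        for x, text in sorted(r["cells"]):
            idx = min(range(len(col_starts)), key=lambda i: abs(col_starts[i] - x))
            line[idx] = (line[idx] + " " + text).strip()
        table.append(line)
    return table


blocks = [
    (10, 100, "Date"), (100, 100, "Description"), (200, 100, "Amount"),
    (10, 112, "01/02"), (101, 112, "Coffee"), (200.5, 112, "3.50"),
]
for row in guess_table(blocks):
    print(row)
```

With tidy input like this the columns come out right, but a right-aligned amount column or a long description that runs into the next column is enough to push blocks into the wrong cell or merge them, which matches the symptoms described above.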
The alternative is to do column or block copies. In the past this was done by using the alt key when selecting the area of interest (works in my AA9). That allows you to copy column by column and generally the lines separations will still be used. Hope that does it.
Thanks for the response, however I've tried this and it just doesn't work right. After having worked with Acrobat X even more, it just isn't as good as 9.0. Documents I used to be able to OCR now have more unrecognized or misrecognized items than in 9.0, and the only thing I have changed is going from 9.0 to 10.0. This appears to have been nothing but a profit move from Adobe, with just a few bells and whistles. This is the first time Adobe has failed to deliver a significant improvement by introducing a new version, and I started with 4.0.
I have discovered a workaround. Select the table you are trying to copy, right-click and select "export selection as", and save it as a .docx file. Open the Word document, copy the table again and paste it into an open Excel file. The columns will be maintained and your data will be in separate cells instead of all of it in one cell. It's not one easy step like it used to be, but it does work.
I have the same problem. I'm running Acrobat Pro X. I'm trying to extract or copy with formatting a beautiful table from a PDF document into Excel OR Word, and the table does not export or paste into Word or Excel with the proper formatting. OCR doesn't read it for some reason. In your instructions you say:
What Acrobat Pro version are you referencing? I'm not sure where you access "Advanced". Properties, settings, preferences, what? I try Preferences, then Accessibility, and don't see what you're seeing. I see this. No idea where this Touchup Reading order is.
We just had some of our users get new computers. Instead of Acrobat 9 standard, they now have Acrobat XI Standard. While they used to use "Copy as Table" and paste directly into Word or Excel with good results, they are unable to do so. This feature seems to be completely removed. Now we are using "Export selection as..." instead. However, the result is that there are many more steps involved. We want to paste into an existing document. Before, we just did the selection, right-click, copy as table, paste into existing document. Now, we do selection, right-click, Export selection as..., create a new file each time, open file, select from new file, paste in new document, delete unneeded file. This is much more burdensome than procedure in version 9, and it is much more confusing to the users who now have to create and understand extra files that are being created just to handle stuff that used to be handled through copying to and pasting from the clipboard.
You can use the Copy with formatting option to copy tables from a PDF. You can right-click on a selection and then choose Copy with formatting from the menu. You can also do a rectangular selection, by holding down the Ctrl key and then dragging the mouse to make a rectangular selection over the PDF content, and then right-click on the selection to choose Copy with formatting from the options.
Thank you, Apoorv, but Copy with Formatting does not perform the same. There was a feature in 9 called Copy as Table as well. It is now missing, and it performed differently from Copy with Formatting, which brings a lot of the superficial formatting with it when you paste but doesn't maintain the tables as version 9 did. The export works, but takes many more steps.
Thanks for checking it out. Ideally, Copy with Formatting *should* work for most of the cases. So, does it 'not work' for some/one particular case or do you face the issue with other PDFs as well?
Thanks Apoorv, the specific instance we have is quite confidential. Although I know we've had other examples that aren't, the one we're working on is. The Copy as Table feature worked well in these cases.
How can I export my PDF to Excel?
I tried exporting to Word and opening in Excel, but it took only the first record on each page. And the format is not perfect either.
The following method worked on only 1 page. It didn't capture all pages.
Copy text in Acrobat as a Table and get the table data in Excel. If a Paste in Excel doesn't work, you may need to temporarily paste in Word, then open it in Excel.
When I copy text out of a PDF file and into a text editor, it ends up mangled in a variety of ways. Formatting like bold and italics are lost; soft line breaks within a paragraph of text are converted to hard line breaks; dashes to break a word over two lines are preserved even when they shouldn't be; and single and double quotes are replaced with ? signs.
Firstly, you have to understand what a PDF is. PDFs are designed to mimic a printed page, and they are designed only as an output format, not an input format. A PDF is basically a map containing the exact location of characters (individual letters or punctuation, etc.) or images. In most cases, a PDF does not even store information about where one word ends and another begins, much less things like soft breaks vs. hard breaks for paragraph endings.
Anyway, it's up to your software to implement some kind of "artificial intelligence" to extract merely from the locations of individual characters what is a word, what is a paragraph, and so on. Different software is going to do this better than others, and it's also going to depend on how the PDF was made. In any case, you should never expect perfect results. Having the output PDF is not the same as having the source document. Far better to try to obtain that if you can.
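To make that concrete, here is a minimal Python sketch of what "working out the words from character positions" can look like. Everything in it is an assumption for illustration: the (x, y, width, char) tuples, the function name chars_to_lines, and the two thresholds. Real extractors are far more elaborate, but the basic idea of inserting a space wherever the horizontal gap is big enough is the same.

```python
def chars_to_lines(chars, line_tol=2.0, space_gap=1.5):
    """Rebuild text lines from positioned characters.

    chars: iterable of (x, y, width, char) tuples (hypothetical format).
    """
    # Group characters into lines: similar y means same line.
    lines = []  # each entry is [y_of_first_char, [chars on that line]]
    for c in sorted(chars, key=lambda c: c[1]):
        if lines and abs(lines[-1][0] - c[1]) <= line_tol:
            lines[-1][1].append(c)
        else:
            lines.append([c[1], [c]])

    out = []
    for _, members in lines:
        members.sort(key=lambda c: c[0])  # left to right
        text, end = "", None
        for x, _y, w, ch in members:
            # A gap wider than space_gap after the previous glyph is taken
            # to be a word boundary; the PDF itself never says so.
            if end is not None and x - end > space_gap:
                text += " "
            text += ch
            end = x + w
        out.append(text)
    return out


page = [
    (0, 50, 5, "H"), (5, 50, 2, "i"),
    (10, 50, 5, "y"), (15, 50, 5, "o"), (20, 50, 5, "u"),
    (0, 62, 5, "o"), (5, 62, 5, "k"),
]
print(chars_to_lines(page))
```

Choose space_gap too small and words get split; too large and they run together, which is why different programs give different results on the same file.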
There is free software that can be used to extract text from PDFs with some of the formatting intact, but again, don't expect perfect results. See, e.g., calibre (which can convert to RTF format), pdftohtml/pdfreflow or the AbiWord word processor (with all import/export plugins enabled). There's also a PDF import plugin for OpenOffice.
Another option is to download and start using the free PDF viewer, Foxit (it's good). Then you can 'Save As' and choose .txt to convert it to a text file. That will preserve all the formatting. Dunno whether you can do the same in Adobe because I stopped using it a while ago when I converted to Foxit.
There is a very good online tool called Sej-da. It deals with Advanced PDF Manipulation. There is no software to download. As it is a new online tool it is currently still in Beta. It allows you to extract text from a PDF, as well as providing a myriad of other PDF functionalities.
For tables: With Acrobat 9/10 there was a select tables feature. With Acrobat X you can just click Save As > Spreadsheet > Excel. It even concatenates pages into one long spreadsheet. Awesome feature.
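Whichever tool does the extraction, the hard line breaks and end-of-line hyphens mentioned in the question can often be tidied up afterwards. Below is a small Python sketch of one way to do it; the function name unwrap is just for illustration, and it assumes paragraphs are separated by blank lines. Note that it will also wrongly join genuine compounds that happen to break at the hyphen (e.g. "well-" / "known"), so check the result.

```python
import re


def unwrap(text):
    """Undo line wrapping in text pasted from a PDF (rough heuristic)."""
    # Rejoin words split by a hyphen at the end of a line: "out-\nput" -> "output".
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", text)
    # Turn single newlines into spaces; leave blank lines as paragraph breaks.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    return text


pasted = "PDFs are an out-\nput format,\nnot an input format.\n\nNext paragraph."
print(unwrap(pasted))
```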
Foxit will toggle between displaying the original file as normal PDF or as text by pressing Ctrl + 6 (With a little fiddling with the zoom level of the text mode there's not much jump in position back and forth between reading and copying)