Hi,
A pdf is a document format (like odt, doc, docx, rtf), whereas tesseract processes images.
You did not mention which programming language(s) you plan to use, but there are plenty of tools for pdf text extraction, e.g.
textract (python) [1].
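
For illustration, a minimal sketch of the text-extraction route with textract could look like this. The helper name extract_text is made up for the example, and it assumes textract and its system dependencies are installed.

```python
def extract_text(pdf_path):
    """Return the text layer of a pdf as a str (hypothetical helper)."""
    # Imported lazily so this module still loads where textract is missing.
    import textract

    # textract.process returns bytes, so decode them before further use.
    raw = textract.process(pdf_path)
    return raw.decode("utf-8", errors="replace")
```

Calling extract_text("document.pdf") (a placeholder file name) should give you the text if the pdf has a text layer. If the result is empty or nearly empty, the pdf probably contains only scanned images.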
If you have a "stupid pdf" (somebody just embedded scanned images in a pdf), simply extract the images from the pdf and feed them to tesseract.
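
One possible way to pull the embedded images out is the "mutool extract" command from mupdf, driven from python. This is only a sketch: the function names and the list of image suffixes are my assumptions, and mutool must be on your PATH.

```python
import subprocess
from pathlib import Path

# Assumed set of image suffixes; mutool may also write other formats.
IMAGE_SUFFIXES = {".png", ".jpg", ".jpeg"}


def build_extract_command(pdf_path):
    """Build the `mutool extract` call that dumps embedded images and fonts."""
    return ["mutool", "extract", str(pdf_path)]


def extract_images(pdf_path, out_dir):
    """Run mutool inside out_dir and return the image files found there."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    # mutool extract writes into the current directory, so run it in out_dir
    # and pass the pdf as an absolute path.
    pdf = Path(pdf_path).resolve()
    subprocess.run(build_extract_command(pdf), cwd=out, check=True)
    return sorted(p for p in out.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES)
```

The returned files can then be passed to tesseract one by one.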
Another option is to convert the pdf to images (so you can process them with tesseract). I have had very good experience with mupdf, but people also use
ghostscript. There are plenty of examples of how to do it on the internet (e.g. in python [2]).
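
A rough sketch of the convert-then-OCR route, assuming the mutool and tesseract command line tools are installed. The function names, the 300 dpi default and the "eng" language are illustrative choices of mine, not something prescribed by either tool.

```python
import subprocess
from pathlib import Path


def build_render_command(pdf_path, out_dir, dpi=300):
    """Build a `mutool draw` call that writes one PNG per page."""
    pattern = str(Path(out_dir) / "page-%03d.png")
    return ["mutool", "draw", "-r", str(dpi), "-o", pattern, str(pdf_path)]


def build_ocr_command(image_path, lang="eng"):
    """Build a tesseract call; the text ends up in <image name without suffix>.txt."""
    image = Path(image_path)
    return ["tesseract", str(image), str(image.with_suffix("")), "-l", lang]


def ocr_pdf(pdf_path, out_dir, dpi=300, lang="eng"):
    """Render every page to PNG, OCR each page, and return the joined text."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_render_command(pdf_path, out, dpi), check=True)
    texts = []
    # Zero-padded page numbers keep the pages in order when sorted by name.
    for image in sorted(out.glob("page-*.png")):
        subprocess.run(build_ocr_command(image, lang), check=True)
        texts.append(image.with_suffix(".txt").read_text(encoding="utf-8"))
    return "\n".join(texts)
```

Something like ocr_pdf("scan.pdf", "pages") would then return the recognized text. If you prefer ghostscript, only the render command needs to change.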
A few days ago I found tesseract-ocr-wrapper [3], which focuses on OCRing "stupid pdfs". So maybe this can help you.
Just use the already available tools.