OK! I just figured out how to search for a string! In my document, I am searching for the string "Funnel of Doubt". You need to declare an Acrobat object of type AVDoc; this is what contains the search capability.

    Set gAvDoc = CreateObject("AcroExch.AVDoc")
    gPDFPath = "C:\." ' FILL IN PATH TO YOUR PDF HERE

Nice to meet you, Silver Lion! I don't have the Acrobat developer's environment, so don't worry. I am using Excel 2007 to develop my code, and I only have Acrobat Reader, so don't despair. However, you do need the reference to Acrobat, which should already be available with Office 2007. Make sure these are checked in your Excel Visual Basic environment under the Tools/References menu. Then paste the code below and see if you can at least open the PDF. Update gPDFPath to point to your own PDF and let me know what happens.

    'Initialize Acrobat by creating App object
    gPDFPath = "C:\Documents and Settings\." ' FILL IN YOUR PATH HERE
    'hard coding for a PDF to open, it can be changed when needed.

    Private Sub AcrobatFindText(aryTexttoFind() As String, strfileout As String)
    Dim i As Integer, j As Integer, m As Integer
    Dim sText As String 'String to search for
    Dim foundText As Integer 'Holds return value from "FindText" method

    'print information between "date:" and "re:" to file, as this is the report date
    Print #ioutfile, jso.getPageNthWord(0, m) & ","
    Print #ioutfile, jso.getPageNthWord(0, m) & " "
    If LCase(word) = "re" Then lMarkRight = i - 1
    If LCase(word) = "date" Then lMarkLeft = i + 1
    Set gPDDoc = CreateObject("AcroExch.PDDoc")
    count = jso.getPageNumWords(j)
    'if argument is 0, searches current page, else searches all

You could modify it to just report your boolean instead of writing to the file. This is an extract of the code I am using. This code runs in Excel right now, and I have specified the path to the PDF file previously in the string variable gPDFPath. I have to figure out the punctuation next.
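For the "just report your boolean" variant, a minimal sketch might look like the following. It assumes Acrobat's "AcroExch.AVDoc" COM object is available on the machine; the function name PdfContainsText and its parameters are made up for illustration and are not part of the code quoted above. It relies only on the AVDoc methods Open, FindText and Close.

```vb
Option Explicit

' Hypothetical helper (not from the thread): returns True when searchText
' occurs somewhere in the PDF at pdfPath, False otherwise.
Public Function PdfContainsText(ByVal pdfPath As String, _
                                ByVal searchText As String) As Boolean
    Dim avDoc As Object

    ' Late binding, so no Tools/References entry is needed for this sketch.
    Set avDoc = CreateObject("AcroExch.AVDoc")

    ' Open returns a true value when the document could be opened.
    If avDoc.Open(pdfPath, "") Then
        ' FindText(text, caseSensitive, wholeWordsOnly, reset)
        ' reset = True starts the search from the first page.
        PdfContainsText = avDoc.FindText(searchText, False, False, True)
        avDoc.Close True   ' True = close without saving
    End If

    Set avDoc = Nothing
End Function
```

From the Immediate window this could be tried with something like `? PdfContainsText(gPDFPath, "Funnel of Doubt")`, where gPDFPath holds the path to your own PDF.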
I have Office 2013 and Adobe Acrobat XI Pro. This is for work, so any other software is a non-starter.

I am 95% of the way to triumph! I have created a program that lists thousands of new PDFs from a vendor, downloads them, converts them to text, and runs a list of keywords against them to see if they are of interest to us. Pretty neat, but it can be faster.

Current process: the vendor gives an ID, which I figured out how to throw into Internet Explorer to make the PDF appear in IE. From there I found code that downloads and saves the PDF into an actual PDF file (not just an IE cache file). After ALL the PDFs for the day are downloaded, another loop comes along and converts the files to text.

If there was a way to SAVE THE PDF AS TEXT from Internet Explorer, like if I could manipulate the Internet Explorer/Adobe API to allow me to download as text, that would do the trick. The problem is, this is tough to google (I'll be ecstatic if someone can prove me wrong).

It is absolutely possible. I also only use Reader and yesterday wrote code to search for strings. The only issue that I have found is that the document you are searching has to be a legitimate PDF document (as opposed to something scanned to PDF from a copier, say).

I am still learning this, and right now I am searching for text between the words "DATE" and "RE" in documents to capture the report date. I want to pick out the "January 20, 2010" from the PDF, though right now the code just searches for WORDS and does not pick up punctuation.
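One possible way to keep the punctuation when capturing the text between "DATE" and "RE" is sketched below. In the Acrobat JavaScript API, getPageNthWord takes an optional third argument, bStrip; passing False asks it not to strip punctuation and white space from the word. The function names (LettersOnly, TextBetweenMarkers, ReportDateFromPdf) are hypothetical, and exactly what trailing white space Acrobat returns for each word is an assumption here, so treat this as a starting point rather than a finished routine.

```vb
Option Explicit

' Hypothetical helper: keeps only the letters A-Z/a-z of s, so that a raw
' word such as "DATE: " can still be compared against the marker "date".
Private Function LettersOnly(ByVal s As String) As String
    Dim i As Long, ch As String
    For i = 1 To Len(s)
        ch = Mid$(s, i, 1)
        If ch Like "[A-Za-z]" Then LettersOnly = LettersOnly & ch
    Next i
End Function

' Joins the raw words found strictly between leftMarker and rightMarker.
' Returns "" when either marker is missing.
Public Function TextBetweenMarkers(words As Variant, ByVal leftMarker As String, _
                                   ByVal rightMarker As String) As String
    Dim i As Long, lMarkLeft As Long, lMarkRight As Long
    Dim foundLeft As Boolean, foundRight As Boolean
    Dim key As String, result As String

    For i = LBound(words) To UBound(words)
        key = LCase$(LettersOnly(CStr(words(i))))
        If Not foundLeft Then
            If key = LCase$(leftMarker) Then
                foundLeft = True
                lMarkLeft = i + 1
            End If
        ElseIf key = LCase$(rightMarker) Then
            foundRight = True
            lMarkRight = i - 1
            Exit For
        End If
    Next i

    If Not foundRight Then Exit Function
    For i = lMarkLeft To lMarkRight
        result = result & CStr(words(i))   ' raw words keep their punctuation
    Next i
    TextBetweenMarkers = Trim$(result)
End Function

' Reads the words of the first page (page 0) with bStrip = False and
' returns whatever lies between "date" and "re".
Public Function ReportDateFromPdf(ByVal pdfPath As String) As String
    Dim avDoc As Object, jso As Object
    Dim words() As String, n As Long, i As Long

    Set avDoc = CreateObject("AcroExch.AVDoc")
    If avDoc.Open(pdfPath, "") Then
        Set jso = avDoc.GetPDDoc.GetJSObject
        n = jso.getPageNumWords(0)
        If n > 0 Then
            ReDim words(0 To n - 1)
            For i = 0 To n - 1
                words(i) = jso.getPageNthWord(0, i, False)
            Next i
            ReportDateFromPdf = TextBetweenMarkers(words, "date", "re")
        End If
        avDoc.Close True   ' True = close without saving
    End If
End Function
```

The marker matching is kept separate from the Acrobat calls so that it can be checked in the Immediate window with a plain array of words before pointing it at a real PDF.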