Hack 94 Script Acrobat Using Visual Basic on Windows
Drive Acrobat using VB or Microsoft Word's Visual Basic for Applications (VBA).

Adobe Acrobat's OLE interface enables you to access or manipulate PDFs from a freestanding Visual Basic script or from another application, such as Word. You can also use Acrobat's OLE interface to render a PDF inside your own program's window.

The Acrobat SDK [Hack #98] comes with a number of Visual Basic examples under the InterAppCommunicationSupport directory. The SDK also includes OLE interface documentation. Look for IACOverview.pdf and IACReference.pdf. These OLE features do not work with the free Reader; you must own Acrobat.
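As a minimal sketch of what a freestanding OLE call can look like, the macro below asks Acrobat for a PDF's page count without displaying the document. It assumes Acrobat (not Reader) is installed; the procedure name ShowPageCount and the file path are hypothetical placeholders, not part of the original hack.

```vb
' Hypothetical example: report the page count of a PDF via Acrobat OLE.
Sub ShowPageCount()
    ' Assumed path; replace it with a PDF that exists on your machine.
    Const PDF_PATH As String = "C:\temp\example.pdf"

    Dim pddoc As Object
    Set pddoc = CreateObject("AcroExch.PDDoc")

    ' Open returns a nonzero value on success and 0 on failure
    If pddoc.Open(PDF_PATH) Then
        MsgBox PDF_PATH & " has " & pddoc.GetNumPages & " page(s)."
        pddoc.Close
    Else
        MsgBox "Acrobat could not open " & PDF_PATH
    End If
End Sub
```

Late binding through CreateObject keeps the sketch free of any project reference to the Acrobat type library, which matches the style of the macro in this hack.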
The following example shows how easily you can work with PDFs using Acrobat OLE. It is a Word macro that scans the currently open PDF document for readers' annotations (e.g., sticky notes). It creates a new Word document and then builds a summary of these annotation comments.

7.3.1 The Code

To add this macro to Word, select Tools

Example 7-1. VBA code for summarizing comments

Sub SummarizeComments()
    Dim app As Object
    Set app = CreateObject("AcroExch.App")
    If (0 < app.GetNumAVDocs) Then
        ' a PDF is open in Acrobat
        ' create a new Word doc to hold the summary
        Dim NewDoc As Document
        Dim NewDocRange As Range
        Set NewDoc = Documents.Add(DocumentType:=wdNewBlankDocument)
        Set NewDocRange = NewDoc.Range
        Dim found_notes_b As Boolean
        found_notes_b = False
        ' get the active doc and drill down to its PDDoc
        ' (declare each variable's type explicitly; "Dim a, b As Object"
        ' would leave a as a Variant)
        Dim avdoc As Object, pddoc As Object
        Set avdoc = app.GetActiveDoc
        Set pddoc = avdoc.GetPDDoc
        ' iterate over pages (page indexes are zero-based)
        Dim num_pages As Long
        num_pages = pddoc.GetNumPages
        Dim ii As Long, jj As Long ' loop counters, declared for Option Explicit
        For ii = 0 To num_pages - 1
            Dim pdpage As Object
            Set pdpage = pddoc.AcquirePage(ii)
            If (Not pdpage Is Nothing) Then
                ' iterate over annotations (e.g., sticky notes)
                Dim page_head_b As Boolean
                page_head_b = False
                Dim num_annots As Long
                num_annots = pdpage.GetNumAnnots
                For jj = 0 To num_annots - 1
                    Dim annot As Object
                    Set annot = pdpage.GetAnnot(jj)
                    ' Popup annots give us duplicate contents
                    If (annot.GetContents <> "" And _
                        annot.GetSubtype <> "Popup") Then
                        If (page_head_b = False) Then ' output the page number
                            NewDocRange.Collapse wdCollapseEnd
                            NewDocRange.Text = "Page: " & (ii + 1) & vbCr
                            NewDocRange.Bold = True
                            NewDocRange.ParagraphFormat.LineUnitBefore = 1
                            page_head_b = True
                        End If
                        ' output the annotation title and format it a little
                        NewDocRange.Collapse wdCollapseEnd
                        NewDocRange.Text = annot.GetTitle & vbCr
                        NewDocRange.Italic = True
                        NewDocRange.Font.Size = NewDocRange.Font.Size - 1
                        NewDocRange.ParagraphFormat.LineUnitBefore = 0.6
                        ' output the note text and format it a little
                        NewDocRange.Collapse wdCollapseEnd
                        NewDocRange.Text = annot.GetContents & vbCr
                        NewDocRange.Font.Size = NewDocRange.Font.Size - 2
                        found_notes_b = True
                    End If
                Next jj
            End If
        Next ii
        If (Not found_notes_b) Then
            NewDocRange.Collapse wdCollapseEnd
            NewDocRange.Text = "No Notes Found in PDF" & vbCr
            NewDocRange.Bold = True
        End If
    End If
End Sub

7.3.2 Running the Code

Open a PDF in Acrobat, as shown in Figure 7-6. In Word, run the macro by selecting Tools

Figure 7-6. PDF Comments displayed in Acrobat

Figure 7-7. The PDF Comments in Word after extraction via SummarizeComments

7.3.3 Hacking the Hack

This script demonstrates the typical process of drilling down through layers of PDF objects to find desired information. Here is a simplified sketch of the layers, as the macro traverses them:

app (AcroExch.App)
    avdoc, from app.GetActiveDoc
        pddoc, from avdoc.GetPDDoc
            pdpage, from pddoc.AcquirePage
                annot, from pdpage.GetAnnot
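The same drill-down can be reused for other tasks. As a minimal sketch, the function below walks the same objects with the same OLE calls as SummarizeComments, but only counts the comments instead of writing them to a Word document. The name CountComments is a hypothetical helper introduced here for illustration; it is not part of the original hack.

```vb
' Hypothetical helper: counts the non-empty, non-Popup annotations
' in the PDF that is currently active in Acrobat.
Function CountComments() As Long
    Dim app As Object, avdoc As Object, pddoc As Object
    Dim pdpage As Object, annot As Object
    Dim ii As Long, jj As Long

    CountComments = 0
    Set app = CreateObject("AcroExch.App")
    If app.GetNumAVDocs = 0 Then Exit Function ' no PDF open in Acrobat

    ' app -> avdoc -> pddoc
    Set avdoc = app.GetActiveDoc
    Set pddoc = avdoc.GetPDDoc

    ' pddoc -> pdpage -> annot
    For ii = 0 To pddoc.GetNumPages - 1
        Set pdpage = pddoc.AcquirePage(ii)
        If Not pdpage Is Nothing Then
            For jj = 0 To pdpage.GetNumAnnots - 1
                Set annot = pdpage.GetAnnot(jj)
                ' skip Popup annots, which duplicate their parent's contents
                If annot.GetContents <> "" And annot.GetSubtype <> "Popup" Then
                    CountComments = CountComments + 1
                End If
            Next jj
        End If
    Next ii
End Function
```

With a PDF open in Acrobat, you could try it from the Visual Basic editor's Immediate window with `Debug.Print CountComments()`.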
These OLE objects closely resemble the objects exposed by the Acrobat API [Hack #97]. The API gives you much more power, however.