Date: May 13, 2012.
Pdfselect Pro is a tool for converting pdf, image and text files. It allows
The commands for the tasks described in 1 and 2 can be triggered by pushing the corresponding buttons; see the image below.
The other tasks are run by choosing the corresponding menu items.
Let us first describe tasks 1 and 2. There are two ways to use these particular capabilities of Pdfselect Pro. The first mode is called pdfselect; it applies when the source is a single pdf document, which has to be opened in Preview. Note that the document has to be the front document in Preview.
Clicking either the button „run pdfselect“ or „run pdfselect + open pdf“ in Pdfselect Pro’s main window then opens a dialogue window in which you specify the pages to be extracted by entering the page numbers as a list, e.g., the list
entered in a dialogue window, will produce a pdf file containing the pages , , and page in that order. The paper size will be that of the original file. The string „-“ will insert all pages, „ “ all pages starting with page 4.
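The page selection rules can be illustrated with a short Python sketch. It is not taken from Pdfselect Pro; the exact syntax is an assumption: a comma-separated list whose items are single pages, closed ranges such as 2-5, an open range such as 4- (page 4 to the end), or - for all pages. The function name parse_page_selection is hypothetical.

```python
def parse_page_selection(spec, page_count):
    """Expand a page selection string into a list of 1-based page numbers.

    Assumed syntax (illustrative only, not confirmed for Pdfselect Pro):
    comma-separated items, each a single page "7", a closed range "2-5",
    an open range "4-" (page 4 to the end), or "-" (all pages).
    Pages are returned in the order in which they were listed.
    """
    pages = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            continue
        if "-" in item:
            start_text, end_text = item.split("-", 1)
            # A missing bound defaults to the first or the last page.
            start = int(start_text) if start_text else 1
            end = int(end_text) if end_text else page_count
        else:
            start = end = int(item)
        if not (1 <= start <= end <= page_count):
            raise ValueError(f"page selection {item!r} is outside 1..{page_count}")
        pages.extend(range(start, end + 1))
    return pages
```

For a ten-page document, parse_page_selection("3,1,5-7", 10) yields the pages 3, 1, 5, 6, 7 in that order, mirroring the rule that extracted pages keep the order of the list.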
Getting the path of a displayed pdf file from Preview requires „UI Scripting“, i.e., in „System Preferences/Universal Access“ the option „Enable access for assistive devices“ has to be checked.
The new pdf file formed by combining the extracted pages will be named by adding a suffix to the name of the original file (original name-suffix). The default suffix is , but you are given the option to choose an arbitrary suffix.
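The naming scheme (original name-suffix) can be sketched in Python as follows. The function name extracted_pdf_path is hypothetical, and two details are assumptions rather than documented behaviour: that the file extension is kept after the suffix, and that the new file is placed next to the original.

```python
from pathlib import Path


def extracted_pdf_path(source, suffix):
    """Return the path "original name-suffix" with the original extension kept.

    Assumptions for illustration: the extension follows the suffix and the
    result lives in the same folder as the source file.
    """
    source = Path(source)
    return source.with_name(f"{source.stem}-{suffix}{source.suffix}")
```

With a suffix of your choice, say pages, the file report.pdf would be mapped to report-pages.pdf.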
The second mode can be used with a single source file or with multiple source files. Pdfselect Pro then merges the selected pages into a common pdf file named according to your wishes. The pdf files are chosen with the help of standard open file dialogues, and the page selection follows the same rules as explained in the previous section. Note that image files can also be chosen; they will be converted to pdf files before being included in the combined pdf file.
The merged pdf file will be saved in
When a pdf file is the front document in Preview, a single page of it can be converted to another format by choosing the menu item „Convert pdf in Preview to Image“. You will then be asked to specify which page should be converted and to which image format. If you don’t specify a page, the first page will be converted.
Possible image formats are „jpeg, tiff, png, gif, jp2, bmp, psd, tga“.
When choosing „Convert a single file“, an open file dialog lets you pick a pdf or an image file to be converted to another format. The possible formats are „pdf, jpeg, tiff, png, gif, jp2, bmp, psd, tga“. Again, only a single page of a pdf file will be converted; it can either be specified or will be the first page of the document.
The last item in this menu is a batch command for converting pdf or image files in a folder (including subfolders) to a different format. The sources can have different formats, which are specified in a dialog as a list of suffixes. For example, after entering the list „jpg,png,tiff“ in the dialog, all files whose full names end in one of these suffixes and which are contained in the chosen folder, or in a subfolder, will be converted to the selected format. Note that the list should not contain any spaces. A list containing a single suffix is also fine.
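The file selection step of such a batch run can be sketched in Python. This is only an illustration of the rule described above, not the code used by Pdfselect Pro; the function name find_files_by_suffix is hypothetical, and matching a leading dot and ignoring upper or lower case are assumptions.

```python
import os


def find_files_by_suffix(folder, suffix_list):
    """Return sorted paths under folder (subfolders included) ending in a listed suffix.

    suffix_list is a comma-separated string without spaces, e.g. "jpg,png,tiff".
    Assumptions for illustration: a suffix matches ".suffix" at the end of the
    file name, and the comparison ignores case.
    """
    suffixes = tuple("." + s.lower() for s in suffix_list.split(",") if s)
    matches = []
    for root, _dirs, files in os.walk(folder):
        for name in files:
            if name.lower().endswith(suffixes):
                matches.append(os.path.join(root, name))
    return sorted(matches)
```

Each returned path would then be handed to the actual converter; that step is deliberately left out here.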
A message will be shown when the batch conversion is done.
The first two menu items let you convert single pdf files or selected pages to a plain text file or to one of the formats „doc, docx, html, rtfd, wordml, odt“. Note that the first conversion is always a conversion to plain text; when one of the other formats is specified, the text file is then converted to that format. In this case there will always be two converted files, a text file and a file in the specified format. The plain text conversion and any additional conversions are very good, fast and stable, even when the original pdf files contain images or complex mathematical formulas. The text files will all be encoded in UTF-8, hence umlauts, accents, and special letters will be properly converted.
There is also a command for a batch conversion of pdf files to plain text or to one of the other formats. The batch conversion is very fast, and the resulting files have the same quality as in the case of a single conversion.
When choosing the menu item „Convert text file to pdf“, a text file or an image file can be picked for conversion to a pdf file. Note that this conversion to pdf is different from the previous conversion in the menu „Convert Images“, even when an image file is chosen as source file, since the pdf file will have a fixed page size, which can be specified in the Preferences. The default page size is „Letter“.
The menu item „Convert text file to another format“ lets you convert a text file to one of the formats „txt, rtf, doc, docx, html, rtfd, wordml, odt“.
There are also batch versions of the above commands.
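A comparable text conversion can be run from a script with the macOS command line tool textutil, which supports the formats listed above. Whether Pdfselect Pro uses textutil internally is an assumption; the following Python sketch, with the hypothetical helper names textutil_command and convert_text_file, only shows how such a call could look.

```python
import subprocess

# Formats accepted by "textutil -convert" on macOS.
TEXTUTIL_FORMATS = ("txt", "html", "rtf", "rtfd", "doc", "docx",
                    "wordml", "odt", "webarchive")


def textutil_command(text_file, target_format):
    """Build the argument list for converting text_file with textutil."""
    if target_format not in TEXTUTIL_FORMATS:
        raise ValueError(f"unsupported format: {target_format!r}")
    return ["textutil", "-convert", target_format, text_file]


def convert_text_file(text_file, target_format):
    """Run the conversion (macOS only); the output is written next to the source."""
    subprocess.run(textutil_command(text_file, target_format), check=True)
```

Calling convert_text_file("notes.txt", "docx") on a Mac would create notes.docx beside notes.txt; a batch version would simply loop over a list of files.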
The commands in this menu are similar to the commands described in the previous section; the only difference is that a conversion to a rich text format is performed instead of a plain text conversion. However, converting a pdf file to an rtf file is rather precarious and will in general fail when the pdf file contains complex graphics, tables or a lot of mathematical formulas. The conversion will work properly when the pdf files contain mostly text. You should consider the rich text conversion as an option which may or may not work, depending on the complexity of the content of the pdf file. This is in sharp contrast to the plain text conversion, which will always work.
This caveat especially applies to the batch conversion of many pdf files to rich text.
However, after upgrading to OS 10.7 the conversion to rtf, and hence, to the formats doc and docx (MS Word), has improved tremendously and is comparable to the conversion to plain text.
Word files are encoded in a proprietary format and hence cannot be converted properly by other software. We therefore use Word itself to convert doc or docx files to pdf, html or xml. These tasks are automated so that batch conversion is possible: you choose which formats should be converted (doc, docx, or both) and the folder to be searched. Recall that a search includes subfolders; hence selecting your home folder will result in all your Word files being converted or, more precisely, in converted copies being created.
The application MS Word should be running before a conversion takes place. A dialogue will indicate the end of the conversions.
In the Preferences the paper size for the text to pdf conversion can be defined; the possible options are A4, Letter, or Legal.
Install Pdfselect Pro in the Applications folder.
Pdfselect Pro requires Mac OS 10.6 or later.
Pdfselect Pro in the Mac App Store: Pdfselect Pro
E-mail address: gerhardt@math.uni-heidelberg.de