Benjamin Woolley 

An open book

Benjamin Woolley explains why PDF files are the parchment of our age
  
  


Papyrus, parchment, vellum: these are the media upon which the history of past civilisations was written. Ours, it seems, will be recorded in PDF. The spread of the "portable document format", developed by Adobe to provide a universal method for viewing and printing documents on different computers, has been remarkable. Adobe claims that Acrobat Reader, the software required to read PDF documents, has been downloaded nearly 500m times, and the format has become the most widely used on the web after HTML (hypertext mark-up language).

This popularity has been further consolidated by the launch of a new version of PDF, 1.5. Among a number of technical enhancements, this adds support for multimedia content such as Flash and QuickTime movies.

There is also a new version of Acrobat (6.0), the software used to create, edit and publish PDF documents, which promises to take the format even further. This provides more powerful tools for adding layers (useful for engineering documents), annotations and digital signatures. Particular effort has gone into making it easy to convert documents into the format. Microsoft Office files, images, scanned items and a host of other sources can now be turned into PDFs more or less with a mouse click.

Thanks largely to PDF, the spread of digital documents across the internet has now engulfed our PCs. There are tools for alleviating this problem, though most are aimed at (and priced for) the corporate market. DTSearch and Isys are two of the best, allowing you to index all the data on a hard drive, irrespective of its format. They provide sophisticated methods for indexing groups of files, and allow Boolean and "fuzzy" searches, to find combinations of words, or words with variant spellings. Isys also supports "synonym rings", to locate documents containing not just a particular word, but any with a similar meaning. Another program, AskSam, is popular with academics and researchers, allowing them to collate documents of all sorts into a single text database.
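To make those three search styles concrete, here is a minimal Python sketch of a toy index that supports Boolean AND queries, "fuzzy" matching of variant spellings, and a synonym ring. It is an illustration only and says nothing about how DTSearch, Isys or AskSam work internally. The sample documents, the synonym ring and the function names (`build_index`, `expand`, `lookup`, `search_all`) are all invented for the example, and the fuzzy matching simply leans on the standard library's `difflib`.

```python
import difflib
import re
from collections import defaultdict

# Hypothetical sample documents, invented purely for illustration.
DOCS = {
    "memo.txt": "The parchment archive holds civilisation records",
    "notes.txt": "Vellum and parchment were costly media",
    "report.txt": "The civilization left a large archive",
}

# Hypothetical synonym ring: words treated as interchangeable in a search.
SYNONYM_RINGS = [{"archive", "repository", "store"}]


def build_index(docs):
    """Map each lower-cased word to the set of documents containing it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            index[word].add(name)
    return index


def expand(term, index, fuzzy=False, synonyms=False):
    """Return the set of index terms that a single query term should match."""
    terms = {term.lower()}
    if synonyms:
        # Add every word that shares a ring with the query term.
        for ring in SYNONYM_RINGS:
            if term.lower() in ring:
                terms |= ring
    if fuzzy:
        # Add indexed words whose spelling is close to any term so far.
        for t in list(terms):
            terms.update(
                difflib.get_close_matches(t, list(index), n=5, cutoff=0.8)
            )
    return terms


def lookup(term, index, **options):
    """Find the documents matching one term, after optional expansion."""
    hits = set()
    for t in expand(term, index, **options):
        hits |= index.get(t, set())
    return hits


def search_all(terms, index, **options):
    """Boolean AND: documents that match every one of the given terms."""
    results = [lookup(t, index, **options) for t in terms]
    return set.intersection(*results) if results else set()


if __name__ == "__main__":
    index = build_index(DOCS)
    print(search_all(["parchment", "archive"], index))
    print(lookup("civilisation", index, fuzzy=True))
    print(lookup("repository", index, synonyms=True))
```

In this sketch an exact search for "civilisation" finds only the memo, while the fuzzy search also picks up the report's American spelling. A search for "repository" finds nothing unless the synonym ring is switched on. A commercial tool would add phrase handling, ranking and format-specific text extraction, none of which is attempted here.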

The new version of PDF does little to ease the information overload that these tools are designed to deal with. On the plus side, it continues the tradition of providing a common, secure and accessible format for documents. All the text retrieval tools mentioned above can handle PDF. But it does them no particular favours. Used in conjunction with the free Reader, a PDF document is really a closed book. You cannot edit it, or add bookmarks or other elements. Adobe, for good commercial reasons, reserves those privileges for owners of Acrobat, which costs from £235.

More crucially, you cannot link it to other documents that may give it relevance, as you can with HTML. For example, if a document features a person's name, PDF cannot easily create links to that name elsewhere on the local hard drive or the internet. Even with Acrobat, it is hard to link one piece of information to another, particularly when other document formats are involved. This is what makes it the parchment of our age: a stable, accessible but almost inert medium, good for preserving a record of our age but perhaps, in its present form, not ideal for the demands of the next one.

 
