Portable Document Format - Wikipedia

The Portable Document Format (PDF) is a file format used to present documents in a manner independent of application software, hardware, and operating systems.[3] Each PDF file encapsulates a complete description of a fixed-layout flat document, including the text, fonts, graphics, and other information needed to display it.

History and standardization

PDF was developed in the early 1. It was among a number of competing formats such as DjVu, Envoy, Common Ground Digital Paper, Farallon Replica and even Adobe's own PostScript format. In those early years before the rise of the World Wide Web and HTML documents, PDF was popular mainly in desktop publishing workflows.

Adobe Systems made the PDF specification available free of charge in 1. PDF was a proprietary format controlled by Adobe, until it was officially released as an open standard on July 1, 2. International Organization for Standardization as ISO 3. ISO Committee of volunteer industry experts. In 2008, Adobe published a Public Patent License to ISO 3. Adobe that are necessary to make, use, sell, and distribute PDF-compliant implementations.[8]

However, some proprietary technologies are still defined only by Adobe, such as Adobe XML Forms Architecture (XFA) and the JavaScript extension for Acrobat, which are referenced by ISO 3. ISO 32000-1 specification. These proprietary technologies are not standardized, and their specifications are published only on Adobe's website.[9] Many of them are also not supported by popular third-party implementations of PDF. When organizations publish PDFs that use these proprietary technologies, the files present accessibility issues for some users.

On July 28, 201. ISO 32000-2 was published by the ISO.
Technical foundations

PDF combines three technologies:

- A subset of the PostScript page description programming language, for generating the layout and graphics.
- A font-embedding/replacement system to allow fonts to travel with the documents.
- A structured storage system to bundle these elements and any associated content into a single file, with data compression where appropriate.

PostScript

PostScript is a page description language run in an interpreter to generate an image, a process requiring many resources. It can handle graphics and standard features of programming languages such as if and loop commands. PDF is largely based on PostScript but simplified to remove flow control features like these, while graphics commands such as lineto remain.

Often, the PostScript-like PDF code is generated from a source PostScript file. The graphics commands that are output by the PostScript code are collected and tokenized. Any files, graphics, or fonts to which the document refers are also collected. Then, everything is compressed to a single file. Therefore, the entire PostScript world (fonts, layout, measurements) remains intact.

As a document format, PDF has several advantages over PostScript:

- PDF contains tokenized and interpreted results of the PostScript source code, for direct correspondence between changes to items in the PDF page description and changes to the resulting page appearance.
- PDF (from version 1. PostScript does not.
- PostScript is an interpreted programming language with an implicit global state, so instructions accompanying the description of one page can affect the appearance of any following page. Therefore, all preceding pages in a PostScript document must be processed to determine the correct appearance of a given page, whereas each page in a PDF document is unaffected by the others. As a result, PDF viewers allow the user to quickly jump to the final pages of a long document, whereas a PostScript viewer needs to process all pages sequentially before being able to display the destination page (unless the optional PostScript Document Structuring Conventions have been carefully complied with).

Technical overview

File structure

A PDF file is a 7-bit ASCII file, except for certain elements that may have binary content. A PDF file starts with a header containing the magic number and the version of the format such as %PDF-1. The format is a subset of a COS ("Carousel" Object Structure) format. A COS tree file consists primarily of objects, of which there are eight types:

- Boolean values, representing true or false.
- Numbers.
- Strings, enclosed within parentheses, (...), which may contain 8-bit characters.
- Names, starting with a forward slash (/).
- Arrays, ordered collections of objects enclosed within square brackets ([...]).
- Dictionaries, collections of objects indexed by Names, enclosed within double angle brackets (<<...>>).
- Streams, usually containing large amounts of data, which can be compressed and binary.
- The null object.

Furthermore, there may be comments, introduced with the percent sign (%). Comments may contain 8-bit characters.

Objects may be either direct (embedded in another object) or indirect. Indirect objects are numbered with an object number and a generation number and defined between the obj and endobj keywords. An index table, also called the cross-reference table and marked with the xref keyword, follows the main body and gives the byte offset of each indirect object from the start of the file. This design allows for efficient random access to the objects in the file, and also allows for small changes to be made without rewriting the entire file (incremental update). Beginning with PDF version 1. This technique reduces the size of files that have large numbers of small indirect objects and is especially useful for Tagged PDF.

At the end of a PDF file is a trailer introduced with the trailer keyword. It contains:
- A dictionary.
- An offset to the start of the cross-reference table (the table starting with the xref keyword).
- The %%EOF end-of-file marker.

The dictionary contains:

- A reference to the root object of the tree structure, also known as the catalog.
- The count of indirect objects in the cross-reference table.
- Other optional information.

There are two layouts for PDF files: non-linear (not "optimized") and linear ("optimized"). Non-linear PDF files consume less disk space than their linear counterparts, though they are slower to access because portions of the data required to assemble pages of the document are scattered throughout the PDF file. Linear PDF files (also called "optimized" or "web optimized" PDF files) are constructed in a manner that enables them to be read in a Web browser plugin without waiting for the entire file to download, since they are written to disk in a linear (as in page order) fashion. PDF files may be optimized using Adobe Acrobat software or QPDF.

Imaging model

The basic design of how graphics are represented in PDF is very similar to that of PostScript, except for the use of transparency, which was added in PDF 1. PDF graphics use a device-independent Cartesian coordinate system to describe the surface of a page. A PDF page description can use a matrix to scale, rotate, or skew graphical elements. A key concept in PDF is that of the graphics state, which is a collection of graphical parameters that may be changed, saved, and restored by a page description. PDF has (as of version 1.

Vector graphics

As in PostScript, vector graphics in PDF are constructed with paths. Paths are usually composed of lines and cubic Bézier curves, but can also be constructed from the outlines of text. Unlike PostScript, PDF does not allow a single path to mix text outlines with lines and curves. Paths can be stroked, filled, or used for clipping. Strokes and fills can use any color set in the graphics state, including patterns.
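The file layout described under "File structure" (header, indirect objects, cross-reference table, trailer) can be illustrated with a small Python sketch. It is not taken from the article: the function names are hypothetical, the version string `%PDF-1.4` is an assumed value, and the page contents are a single stroked line chosen only to show a path being built and painted. Treat it as a minimal illustration rather than a complete or validated PDF writer.

```python
def build_minimal_pdf() -> bytes:
    """Assemble a one-page PDF by hand: header, body, xref table, trailer."""
    # Page content: move to (100, 100), add a line to (300, 300), stroke it.
    content = b"100 100 m\n300 300 l\nS"
    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",            # object 1: catalog (root)
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",    # object 2: page tree
        b"<< /Type /Page /Parent 2 0 R "
        b"/MediaBox [0 0 400 400] /Contents 4 0 R >>",   # object 3: the page
        b"<< /Length %d >>\nstream\n" % len(content)
        + content + b"\nendstream",                      # object 4: content stream
    ]
    out = bytearray(b"%PDF-1.4\n")  # header; the version is an assumption
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of this indirect object
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"
    xref_offset = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # entry 0 is the head of the free list
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset  # fixed-width, 20-byte entries
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_offset
    return bytes(out)


def read_startxref(data: bytes) -> int:
    """Return the byte offset of the xref table recorded near the end of the file."""
    marker = data.rfind(b"startxref")
    if marker < 0:
        raise ValueError("no startxref keyword found")
    return int(data[marker + len(b"startxref"):].split()[0])


def read_xref_offsets(data: bytes) -> dict:
    """Map object number to byte offset for in-use entries of a single-section table."""
    lines = data[read_startxref(data):].split(b"\n")
    first, count = map(int, lines[1].split())
    entries = lines[2:2 + count]
    return {
        first + i: int(entry.split()[0])
        for i, entry in enumerate(entries)
        if entry.split()[2] == b"n"
    }
```

A reader following this layout starts at the end of the file, reads the offset after `startxref`, and then jumps straight to any object through the table, which is the random access the cross-reference table is meant to provide. The sketch handles only one cross-reference section and ignores incremental updates and linearized files.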
PDF supports several types of patterns. The simplest is the tiling pattern, in which a piece of artwork is specified to be drawn repeatedly. This may be a colored tiling pattern, with the colors specified in the pattern object, or an uncolored tiling pattern, which defers color specification to the time the pattern is drawn. Beginning with PDF 1. There are seven types of shading pattern, of which the simplest are the axial shade (Type 2) and radial shade (Type 3).

Raster images

Raster images in PDF (called Image XObjects) are represented by dictionaries with an associated stream. The dictionary describes properties of the image, and the stream contains the image data. (Less commonly, a raster image may be embedded directly in a page description as an inline image.) Images are typically filtered for compression purposes.
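The pairing of a dictionary with a filtered stream can be sketched in Python as follows. The helper names are hypothetical, and the choice of an 8-bit grayscale image compressed with the Flate (zlib) filter is an assumption made for illustration; the text above only says that images are typically filtered, and real files may use other filters and color spaces.

```python
import zlib


def make_image_xobject(width: int, height: int, gray_pixels: bytes) -> bytes:
    """Build the dictionary and stream for an 8-bit grayscale Image XObject."""
    if len(gray_pixels) != width * height:
        raise ValueError("expected exactly one byte per pixel")
    compressed = zlib.compress(gray_pixels)  # Flate filter (assumed choice)
    dictionary = (
        b"<< /Type /XObject /Subtype /Image /Width %d /Height %d "
        b"/ColorSpace /DeviceGray /BitsPerComponent 8 "
        b"/Filter /FlateDecode /Length %d >>"
        % (width, height, len(compressed))
    )
    # The dictionary describes the image; the stream carries the filtered data.
    return dictionary + b"\nstream\n" + compressed + b"\nendstream"


def decode_image_stream(xobject: bytes) -> bytes:
    """Recover the raw pixel bytes from an object built by make_image_xobject."""
    _, rest = xobject.split(b"\nstream\n", 1)
    compressed = rest[: -len(b"\nendstream")]
    return zlib.decompress(compressed)
```

For example, `make_image_xobject(2, 2, bytes([0, 255, 255, 0]))` describes a 2 by 2 checkerboard. A viewer would read `/Width`, `/Height`, and `/Filter` from the dictionary before decoding the stream, which is why the properties live outside the image data.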