PDF (Portable Document Format) is a widely used, cross-platform, open format for electronic documents. It is convenient because many programs support it, so a PDF document can be viewed in different software and its page layout stays the same regardless of the device. That is why PDF is often chosen for reports, including reports generated by programs written in Delphi. Various components, both free and commercial, exist for working with this format in Delphi. Because the format is flexible and feature-rich, these components tend to be complex, and they often add redundant information to the files they create. In some cases, however, a minimal set of functionality is enough to create a PDF file. Although a detailed description of the PDF file structure takes several hundred pages, we will focus on the key points.
A PDF file consists of interconnected objects of a limited set of types. Each object has a unique number by which other objects can refer to it (these are called indirect links). There are also direct links to objects: byte offsets from the beginning of the file. The end of the file usually contains a table of direct links to the objects (the cross-reference table), a link to the beginning of this table, the number of objects, and an indirect link to the first object. A viewer starts reading a document from the end. Names and values of object parameters are separated from each other by spaces and the #10 (“new line”) character. Graphical objects are positioned on the page in a two-dimensional coordinate system whose origin is the lower-left corner of the page. The unit is the point, equal to 1/72 inch, and page dimensions are given in these units. The simplest file contains a single page.
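As an illustration, a one-page file can be assembled by hand. The following is only a sketch under stated assumptions: the function name BuildMinimalPdf and the file name minimal.pdf are hypothetical, the page size is A4 in points, and the cross-reference table is left out, so a reader has to locate the objects itself.

```delphi
program MinimalPdfDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

// Returns the text of a one-page PDF without a cross-reference table
function BuildMinimalPdf: AnsiString;
const
  LF = #10;
begin
  Result := AnsiString(
    '%PDF-1.5' + LF +
    // Object 1: the catalog, which points to the page tree
    '1 0 obj' + LF + '<<' + LF + '/Type /Catalog' + LF + '/Pages 2 0 R' + LF +
    '>>' + LF + 'endobj' + LF +
    // Object 2: the page tree with a single child page
    '2 0 obj' + LF + '<<' + LF + '/Type /Pages' + LF + '/Count 1' + LF +
    '/Kids [ 3 0 R ]' + LF + '>>' + LF + 'endobj' + LF +
    // Object 3: an empty page, 595.3 x 841.9 points
    '3 0 obj' + LF + '<<' + LF + '/Type /Page' + LF + '/Parent 2 0 R' + LF +
    '/MediaBox [0 0 595.300 841.900]' + LF + '>>' + LF + 'endobj' + LF +
    // Trailer with an indirect link to the first object
    'trailer' + LF + '<<' + LF + '/Root 1 0 R' + LF + '>>' + LF);
end;

var
  LPdf: AnsiString;
  LStream: TFileStream;
begin
  LPdf := BuildMinimalPdf;
  LStream := TFileStream.Create('minimal.pdf', fmCreate);
  try
    LStream.WriteBuffer(LPdf[1], Length(LPdf));
  finally
    LStream.Free;
  end;
end.
```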
The table of direct links can be omitted, and most programs will still be able to read the file. The order of entries in the table must follow the object numbering, but the objects themselves can be placed in the file in any order. This is convenient when you create a file without knowing the number of pages in advance: the parent Pages object with the page list can be placed at the end of the file. You only need to reserve a number for it.
Let’s look at what we need to create the simplest PDF document with an unknown number of pages. Invariant lines can be copied from a finished file, so that only the direct and indirect links have to be managed. It is convenient to use TStream for the document body because, for example, images can save their data into a stream. TList is suitable for storing the direct links. A minimal set of declared variables can look like this:
// uses System.Classes, System.Generics.Collections
var
  Directs: TList<integer>;     // List of direct links for the cross-reference table
  PageObjects: TList<integer>; // List of indirect page links
  PDFStream: TStream;          // Main stream for storing the PDF body
The Pages object with the list of pages will be placed at the end of the file, and we need to take this into account when we assign indirect links to new objects. For convenience, let’s declare a constant (cParamCount in the code below) and add it to the counter.
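The value of the constant is not shown in the text. The sketch below assumes it is 2, because the listings hard-code the Catalog as object 1 and Pages as object 2 ('/Parent 2 0 R'). The helper CurrentObjectNumber is hypothetical and only illustrates the arithmetic.

```delphi
program ObjectNumberDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

const
  // Assumed value: objects 1 (Catalog) and 2 (Pages) are reserved
  // and written at the end of the file
  cParamCount = 2;

// Number of the object whose offset was just appended to the list of
// direct links; ADirectsCount is the list length after that append
function CurrentObjectNumber(ADirectsCount: Integer): Integer;
begin
  Result := ADirectsCount + cParamCount;
end;

begin
  Writeln(CurrentObjectNumber(1)); // the first page becomes object 3
  Writeln(CurrentObjectNumber(2)); // the next object becomes object 4
end.
```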
Here is an example of creating the beginning of a document:
procedure PDF_Init(const AFileName: string);
var
  s: AnsiString;
begin
  Directs := TList<integer>.Create;
  PageObjects := TList<integer>.Create;
  PDFStream := TFileStream.Create(AFileName, fmCreate); // Create the output file
  s := '%PDF-1.5' + #10; // File header with the PDF version
  PDFStream.Write(s[1], Length(s));
end;
How to add an empty page with the given dimensions:
procedure PDF_AddEmptyPage(APageWidth, APageHeight: double);
var
  s: AnsiString;
  LFmt: TFormatSettings;
begin
  LFmt := TFormatSettings.Invariant; // PDF numbers always use '.' as the decimal separator
  Directs.Add(PDFStream.Position); // Add direct link for the cross-reference table
  PageObjects.Add(Directs.Count + cParamCount); // Add indirect link
  s := (Directs.Count + cParamCount).ToString + ' 0 obj' + #10 + '<<' + #10 +
    '/Type /Page' + #10 + '/Parent 2 0 R' + #10 + '/MediaBox [0 0 ' +
    FloatToStrF(APageWidth, ffFixed, 5, 3, LFmt) + ' ' + FloatToStrF(APageHeight, ffFixed, 5, 3, LFmt)
    + ']' + #10 + '>>' + #10 + 'endobj' + #10;
  PDFStream.Write(s[1], Length(s));
end;
And we can finish the document as follows:
procedure PDF_SaveClose(ASaveRefTable: boolean);
var
  LStartXref: integer;
  s: AnsiString;
begin
  if PageObjects.Count = 0 then // The document must have at least one page
    PDF_AddEmptyPage(200, 300);
  // Catalog (object 1) and Pages (object 2) are written last, but their offsets
  // go to the head of the list so that it follows the object numbering
  Directs.Insert(0, PDFStream.Position); // Insert link to the first object
  s := '1 0 obj' + #10 + '<<' + #10 + '/Type /Catalog' + #10 + '/Pages 2 0 R>>' + #10 + 'endobj' + #10;
  PDFStream.Write(s[1], Length(s));
  Directs.Insert(1, PDFStream.Position); // Insert link to the second object
  s := '2 0 obj' + #10 + '<<' + #10 + '/Type /Pages' + #10 + '/Count ' + PageObjects.Count.ToString + #10 + '/Kids [ ';
  for var i: integer := 0 to PageObjects.Count - 1 do
    s := s + PageObjects[i].ToString + ' 0 R' + #10;
  s := s + ']' + #10 + '>>' + #10 + 'endobj' + #10;
  PDFStream.Write(s[1], Length(s));
  // Save the cross-reference table if needed
  if ASaveRefTable then
  begin
    LStartXref := PDFStream.Position;
    // Each table entry must be exactly 20 bytes long, hence the space before #10
    s := 'xref' + #10 + '0 ' + (Directs.Count + 1).ToString + #10 + '0000000000 65535 f ' + #10;
    PDFStream.Write(s[1], Length(s));
    for var L: integer in Directs do
    begin
      s := (10000000000 + L).ToString.Substring(1) + ' 00000 n ' + #10; // Offset padded to 10 digits
      PDFStream.Write(s[1], Length(s));
    end;
    s := 'trailer' + #10 + '<<' + #10 + ' /Size ' + (Directs.Count + 1).ToString + #10 + ' /Root 1 0 R' + #10 + '>>'
      + #10 + 'startxref' + #10 + LStartXref.ToString + #10 + '%%EOF';
  end
  else
    // or save the trailer only
    s := 'trailer' + #10 + '<<' + #10 + ' /Root 1 0 R' + #10 + '>>';
  PDFStream.Write(s[1], Length(s));
  // Close the file and release the lists
  PDFStream.Free;
  Directs.Free;
  PageObjects.Free;
end;
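The PDF specification requires every cross-reference entry to occupy exactly 20 bytes, including the end-of-line marker; with a single #10 as the line end, this means a space has to precede it. A small helper makes that layout explicit. The name XrefEntry is hypothetical, and Format is used here as an alternative to the string arithmetic in the listing above.

```delphi
program XrefEntryDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Builds one in-use entry of the cross-reference table:
// 10-digit byte offset, space, 5-digit generation number, space, 'n', space, LF
function XrefEntry(AOffset: Int64): AnsiString;
begin
  Result := AnsiString(Format('%.10d 00000 n ', [AOffset])) + #10;
end;

begin
  Write(XrefEntry(9));
  Writeln(Length(XrefEntry(9))); // length of one entry in bytes
end.
```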
A document with empty pages makes little sense, so let’s look at how to add images to the pages. If a page has content, its description must include the keyword “Contents” together with links to the content objects. To insert an image, add the keyword “XObject” and links to the image data. Each image gets its own internal name.
Example of a page object with three images:
3 0 obj
<<
/Type /Page
/Parent 2 0 R
/MediaBox [0 0 595.300 841.900]
/Resources << /ProcSet [/PDF /ImageC]
/XObject
<<
/Img1 5 0 R
/Img2 7 0 R
/Img3 9 0 R
>>
>>
/Contents 4 0 R
>>
endobj
Example of drawing images on a page. The six numbers before each cm operator are the image width in page units, 0, 0, the image height, X, and Y; the coordinates refer to the lower-left corner of the image:
4 0 obj
<</Length 160
>>
stream
q
297.650 0 0 198.680 10.000 831.900 cm
/Img1 Do
Q
q
297.650 0 0 198.680 67.530 705.260 cm
/Img2 Do
Q
q
297.650 0 0 198.680 125.060 578.610 cm
/Img3 Do
Q
endstream
endobj
The graphics output is described with the operators q, Q, cm, and Do: q and Q save and restore the graphics state, cm modifies the current transformation matrix, and Do draws the named object. Their detailed description can be found in the documentation of the PDF format.
The same object can be drawn on different pages, in different places and at different sizes. For example, to place a logo on every page, it is enough to add one copy of the image to the file and then, for each page, specify the same indirect link in the /XObject section and the corresponding image name in the description of the graphics output.
The data (“bodies”) of JPEG images are stored in the document almost exactly as they would be saved in a separate file, which makes them convenient to work with. A description is placed before the image data, for example:
5 0 obj
<<
/Type /XObject
/Subtype /Image
/Name /Img1
/Width 800 /Height 534 /Length 6 0 R
/Filter /DCTDecode
/ColorSpace /DeviceRGB
/BitsPerComponent 8
The /Width and /Height parameters are the image dimensions in pixels. /Length points to the object that stores the size of the image data (the length of the block in bytes). The length can be given directly if you know it before writing the image data into the stream. The image data must be placed between the keywords “stream” and “endstream”.
To create a PDF document with images, we need to declare a data type that describes the position and size of an image on the page (TImagePosition in the code below).
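The declaration of this type is not shown in the text. Judging by the fields that the later procedures assign (X, Y, Width, Height), it could be a simple record like the one below; the exact layout is an assumption.

```delphi
program ImagePositionDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Generics.Collections;

type
  // Assumed layout, inferred from the fields used in AddImagePosition
  TImagePosition = record
    X, Y: double;          // Lower-left corner of the image, in page units
    Width, Height: double; // Size of the image on the page, in page units
  end;

var
  LPositions: TList<TImagePosition>;
  LPosition: TImagePosition;
begin
  LPositions := TList<TImagePosition>.Create;
  try
    LPosition.X := 10;
    LPosition.Y := 20;
    LPosition.Width := 297.65;
    LPosition.Height := 198.68;
    LPositions.Add(LPosition); // Records are stored in the list by value
    Writeln(LPositions.Count);
  finally
    LPositions.Free;
  end;
end.
```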
In addition to the variables we already have, we need to declare new ones (the TList objects must be created before use, and the ImageCount variable must be initialized):
var
  ImageCount: integer;                 // Image counter for the whole document
  ImgPositions: TList<TImagePosition>; // Positions of images on the current page
  ImgNames: TList<AnsiString>;         // Names of images on the current page
The procedure that adds a page to the document has to be extended to take the number of images into account:
procedure PDF_AddPage(AWidth, AHeight: double; AImagesCount: integer);
var
  i: integer;
  s: AnsiString;
  LImgName: AnsiString;
begin
  ImgPositions.Clear;
  ImgNames.Clear;
  …
  if AImagesCount > 0 then
  begin
    s := s + '/Resources << /ProcSet [/PDF /ImageC]' + #10 + '/XObject' + #10 + '<<' + #10;
    for i := 1 to AImagesCount do
    begin
      LImgName := 'Img' + (ImageCount + i).ToString;
      ImgNames.Add(LImgName);
      // Each image takes two objects: the image itself and the length of its data
      s := s + '/' + LImgName + ' ' + (Directs.Count + i * 2 + cParamCount).ToString + ' 0 R' + #10;
    end;
    // The content stream is the object that follows the page object
    s := s + '>>' + #10 + '>>' + #10 + '/Contents [ ' + (Directs.Count + 1 + cParamCount).ToString + ' 0 R ]' + #10;
  end;
  s := s + '>>' + #10 + 'endobj' + #10;
  PDFStream.Write(s[1], Length(s));
end;
After creating a page, we need to build the list of image positions and then add the description of the graphics output to the PDF stream.
procedure AddImagePosition(AX, AY, AWidth, AHeight: double);
var
  LImagePosition: TImagePosition;
begin
  LImagePosition.X := AX;
  LImagePosition.Y := AY;
  LImagePosition.Width := AWidth;
  LImagePosition.Height := AHeight;
  ImgPositions.Add(LImagePosition);
end;

procedure SaveImagePositions;
var
  s, LImgS: AnsiString;
  LFmt: TFormatSettings;
begin
  LFmt := TFormatSettings.Invariant; // PDF numbers always use '.' as the decimal separator
  LImgS := '';
  // One "q ... cm /Name Do Q" group per image on the current page
  for var i: integer := 0 to ImgPositions.Count - 1 do
    LImgS := LImgS + 'q' + #10 + FloatToStrF(ImgPositions[i].Width, ffFixed, 5, 3, LFmt) + ' 0 0 ' +
      FloatToStrF(ImgPositions[i].Height, ffFixed, 5, 3, LFmt) + ' ' + FloatToStrF(ImgPositions[i].X, ffFixed, 5, 3, LFmt) + ' ' +
      FloatToStrF(ImgPositions[i].Y, ffFixed, 5, 3, LFmt) + ' cm' + #10 + '/' + ImgNames[i] + ' Do' + #10 + 'Q' + #10;
  Directs.Add(PDFStream.Position);
  s := (Directs.Count + cParamCount).ToString + ' 0 obj' + #10 + '<</Length ' + Length(LImgS).ToString + #10 + '>>' +
    #10 + 'stream' + #10 + LImgS + 'endstream' + #10 + 'endobj' + #10;
  PDFStream.Write(s[1], Length(s));
end;
After this preparation, we can add the image data to the stream. Let’s look at how to write a JPEG image (or an image of any other format, converted into JPEG) in a VCL application (a fragment of the code is shown below):
// uses Vcl.Graphics, Vcl.Imaging.jpeg, Vcl.Imaging.pngimage
function PDF_SaveImage(AGraphic: TGraphic; AImageIndex, AJpegQuality: integer): boolean;
var
  LImageSize: integer;
  LJpegImage: TJPEGImage;
  LBitmap: TBitmap;
  s: AnsiString;
begin
  …
  LJpegImage := TJPEGImage.Create;
  try
    if AGraphic is TPngImage then
    begin
      // A PNG image is converted through an intermediate bitmap
      LBitmap := TBitmap.Create;
      try
        LBitmap.Assign(AGraphic);
        LJpegImage.Assign(LBitmap);
      finally
        LBitmap.Free;
      end;
    end
    else
    begin
      LJpegImage.Assign(AGraphic);
    end;
    LJpegImage.CompressionQuality := AJpegQuality; // 100 = maximum quality, but larger size
    inc(ImageCount);
    // Image properties
    Directs.Add(PDFStream.Position);
    s := (Directs.Count + cParamCount).ToString + ' 0 obj' + #10 + '<<' + #10 + '/Type /XObject' + #10 +
      '/Subtype /Image' + #10 + '/Name /' + ImgNames[AImageIndex] + #10 + '/Width ' + AGraphic.Width.ToString +
      ' /Height ' + AGraphic.Height.ToString + ' /Length ' + (Directs.Count + 1 + cParamCount).ToString + ' 0 R' + #10
      + '/Filter /DCTDecode' + #10 + '/ColorSpace /DeviceRGB' + #10 + '/BitsPerComponent 8' + #10 + '>>' + #10 +
      'stream' + #10;
    PDFStream.Write(s[1], Length(s));
    // Save the image and calculate the size of its data
    LImageSize := PDFStream.Position;
    LJpegImage.SaveToStream(PDFStream);
    LImageSize := PDFStream.Position - LImageSize;
    Result := true;
  finally
    LJpegImage.Free;
  end;
  s := #10 + 'endstream' + #10 + 'endobj' + #10;
  PDFStream.Write(s[1], Length(s));
  // Save the image size as a separate object referenced by /Length
  Directs.Add(PDFStream.Position);
  s := (Directs.Count + cParamCount).ToString + ' 0 obj' + #10 + LImageSize.ToString + #10 + 'endobj' + #10;
  PDFStream.Write(s[1], Length(s));
  …
end;
With this method, the contents of a standard TImage component can be added to PDF documents by passing TImage.Picture.Graphic as the input parameter. JPEG images can be written to the stream without any conversion by copying the data from the image file.
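Copying a JPEG file as is could be sketched as follows. The function AppendJpegFile is hypothetical; it only transfers the bytes and returns their count, which is the value needed for /Length. The image width and height in pixels still have to be obtained separately for the object description.

```delphi
program CopyJpegDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

// Appends the bytes of a JPEG file to ADest unchanged and returns
// the number of bytes copied
function AppendJpegFile(const AFileName: string; ADest: TStream): Int64;
var
  LSource: TFileStream;
begin
  LSource := TFileStream.Create(AFileName, fmOpenRead or fmShareDenyWrite);
  try
    Result := ADest.CopyFrom(LSource, 0); // Count = 0 copies the whole source stream
  finally
    LSource.Free;
  end;
end;

var
  LBody: TMemoryStream;
begin
  // Usage: CopyJpegDemo <path to a JPEG file>
  if ParamCount < 1 then
    Exit;
  LBody := TMemoryStream.Create;
  try
    Writeln(AppendJpegFile(ParamStr(1), LBody), ' bytes copied');
  finally
    LBody.Free;
  end;
end.
```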
When developing an FMX application, it is more convenient to use FMX.Graphics.TBitmap as the parameter. Have a look at the following code fragment:
// uses FMX.Graphics, FMX.Surfaces
function PDF_SaveImage(ABitmap: FMX.Graphics.TBitmap; AImageIndex: integer; AJpegQuality: integer): boolean;
var
  LImageSize: integer;
  LBitmapSurface: TBitmapSurface;
  LCodecParams: TBitmapCodecSaveParams;
  s: AnsiString;
begin
  …
  LBitmapSurface := TBitmapSurface.Create;
  try
    LBitmapSurface.Assign(ABitmap);
    LCodecParams.Quality := AJpegQuality; // 100 = maximum quality, but larger size
    // Save the image and calculate the size of its data
    LImageSize := PDFStream.Position;
    TBitmapCodecManager.SaveToStream(PDFStream, LBitmapSurface, '.jpg', @LCodecParams);
    LImageSize := PDFStream.Position - LImageSize;
    Result := true;
  finally
    LBitmapSurface.Free;
  end;
  …
end;
Quite often a PDF document has to be created from images (for example, from scans of a paper document). The examples above are enough to solve this task. The algorithm is as follows:
- Create a list of images.
- Calculate the required page sizes for the chosen resolution by loading the images one by one into a Vcl.Graphics.TGraphic or FMX.Graphics.TBitmap object. If an image occupies the whole page, each page dimension in points equals the image dimension in pixels multiplied by 72 and divided by the resolution in DPI.
- Add a page of the calculated size.
- Add the image to the page at coordinates 0,0 with a size equal to that of the page.
- After adding all objects, save the file.
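The page size calculation from the second step can be checked on a typical case. The sketch below assumes an A4 sheet scanned at 300 DPI, which gives an image of 2480 x 3508 pixels; the function name PixelsToPoints is hypothetical. The result is 595.2 by 841.92 points, close to the A4 MediaBox used in the earlier listings.

```delphi
program PageSizeDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils;

// Converts an image dimension in pixels to page units (1/72 inch)
// for the given resolution in dots per inch
function PixelsToPoints(APixels, ADPI: Integer): Double;
begin
  Result := APixels * 72 / ADPI;
end;

begin
  // A4 sheet scanned at 300 DPI
  Writeln('Page width:  ', PixelsToPoints(2480, 300):0:3);
  Writeln('Page height: ', PixelsToPoints(3508, 300):0:3);
end.
```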
Example of a function that creates a page from an image (using the code written earlier):
function PDF_AddImagePage(const AFileName: string; ADPI: integer): boolean;
var
  LImageWidth, LImageHeight: integer;
  LPageWidth, LPageHeight: double;
  LBitmap: FMX.Graphics.TBitmap;
begin
  LBitmap := FMX.Graphics.TBitmap.Create;
  try
    LBitmap.LoadFromFile(AFileName);
    LImageWidth := LBitmap.Width;
    LImageHeight := LBitmap.Height;
    // Page size in points: pixels * 72 / DPI
    LPageWidth := LImageWidth * 72 / ADPI;
    LPageHeight := LImageHeight * 72 / ADPI;
    PDF_AddPage(LPageWidth, LPageHeight, 1);
    AddImagePosition(0, 0, LPageWidth, LPageHeight); // The image fills the whole page
    SaveImagePositions;
    PDF_SaveImage(LBitmap, 0, 85);
    Result := true;
  finally
    FreeAndNil(LBitmap);
  end;
end;
So, in this article, we’ve considered the simplest way to create a PDF document with images. It’s worth noting that in real projects it is more convenient to use a class with a corresponding set of fields and methods rather than separate procedures and variables.
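Such a class might start out like the sketch below. It is not the article's implementation: the class name TSimplePdfWriter and its methods are hypothetical, only empty pages are supported, and the cross-reference table is omitted for brevity, so the file ends with the trailer alone.

```delphi
program SimplePdfClassDemo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes, System.Generics.Collections;

type
  // Hypothetical wrapper around the procedures and variables shown earlier
  TSimplePdfWriter = class
  private
    const cParamCount = 2; // Objects 1 (Catalog) and 2 (Pages) are reserved
  private
    FDirects: TList<Int64>;       // Byte offsets of objects 3, 4, ...
    FPageObjects: TList<Integer>; // Object numbers of the pages
    FStream: TStream;
    procedure WriteText(const AText: string);
  public
    constructor Create(const AFileName: string);
    destructor Destroy; override;
    procedure AddEmptyPage(AWidth, AHeight: Double);
    procedure Finish;
  end;

constructor TSimplePdfWriter.Create(const AFileName: string);
begin
  inherited Create;
  FDirects := TList<Int64>.Create;
  FPageObjects := TList<Integer>.Create;
  FStream := TFileStream.Create(AFileName, fmCreate);
  WriteText('%PDF-1.5'#10);
end;

destructor TSimplePdfWriter.Destroy;
begin
  FStream.Free;
  FPageObjects.Free;
  FDirects.Free;
  inherited;
end;

procedure TSimplePdfWriter.WriteText(const AText: string);
var
  LBytes: TBytes;
begin
  LBytes := TEncoding.ASCII.GetBytes(AText);
  if Length(LBytes) > 0 then
    FStream.WriteBuffer(LBytes[0], Length(LBytes));
end;

procedure TSimplePdfWriter.AddEmptyPage(AWidth, AHeight: Double);
var
  LNumber: Integer;
begin
  FDirects.Add(FStream.Position);
  LNumber := FDirects.Count + cParamCount;
  FPageObjects.Add(LNumber);
  // Invariant settings keep '.' as the decimal separator in any locale
  WriteText(Format('%d 0 obj'#10'<<'#10'/Type /Page'#10'/Parent 2 0 R'#10 +
    '/MediaBox [0 0 %.3f %.3f]'#10'>>'#10'endobj'#10,
    [LNumber, AWidth, AHeight], TFormatSettings.Invariant));
end;

procedure TSimplePdfWriter.Finish;
var
  LKids: string;
  LPage: Integer;
begin
  if FPageObjects.Count = 0 then // The document must have at least one page
    AddEmptyPage(200, 300);
  LKids := '';
  for LPage in FPageObjects do
    LKids := LKids + LPage.ToString + ' 0 R ';
  WriteText('1 0 obj'#10'<<'#10'/Type /Catalog'#10'/Pages 2 0 R'#10'>>'#10'endobj'#10);
  WriteText(Format('2 0 obj'#10'<<'#10'/Type /Pages'#10'/Count %d'#10 +
    '/Kids [ %s]'#10'>>'#10'endobj'#10, [FPageObjects.Count, LKids]));
  WriteText('trailer'#10'<<'#10'/Root 1 0 R'#10'>>'#10);
end;

var
  LPdf: TSimplePdfWriter;
begin
  LPdf := TSimplePdfWriter.Create('demo.pdf');
  try
    LPdf.AddEmptyPage(595.3, 841.9);
    LPdf.Finish;
  finally
    LPdf.Free;
  end;
end.
```

Keeping the stream and the lists as private fields means the destructor can release them in one place, and several documents can be written at the same time without sharing global state.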