[nycphp-talk] Compressing PDF's
Hans Zaunere
hans at nyphp.org
Thu Jul 10 14:26:33 EDT 2003
Jerry Kapron wrote:
> >Hans Zaunere wrote:
> >Since PDF is just text, why not gzip, bzip2 or even zip?
>
> I know I wasn't specific enough when I said "compress".
> Raw PDF format is just text. However, the contents of a PDF file can be
> optimized (compressed). I'm not looking to create a .zip or .gz file
> (that would be a no-brainer). I want to compress the PDF file
> "internally". Most PDFs created with Acrobat/Distiller are already
> compressed.
> If you download this PDF:
> http://www.tax.state.ny.us/pdf/2000/wt/nys45mn_100.pdf
> and open it in a text editor, you'll see that some parts are binary.
> Those are FlateDecode-compressed content streams.
>
> The file I'm working with was created with Adobe Illustrator and saved
> as raw PDF (text only). I need raw PDF to use it as a template (by
> preg_replacing some "variable text"). The problem is that the file is
> 700 KB (way too big for this web app). When I open it and save it optimized
> in Adobe Distiller, the size is reduced to 195 KB, but the compressed
> file cannot be used directly as a template anymore.
> I could take two different routes:
> 1) use the raw PDF file as a template -> preg_replace some text ->
> compress the new PDF -> send it to the client
>
> 2) use an already compressed PDF file as a template -> fetch and
> uncompress the FlateDecode streams -> preg_replace some text -> recompress
> the modified content -> send the new PDF to the client.
>
> I know I could also use PDF4PHP to create a compressed PDF file from
> scratch but for performance reasons I really wanted to stick to using a
> template file. I searched the web but could not find any ready code
> specifically for what I want to do. I'm looking under the hood of the
> PDF4PHP class (it supports FlateDecode compression) to get an idea how to
> uncompress the compressed streams. Any suggestions, pointers or code
> would be greatly appreciated.
Ahh, I knew it seemed too easy. I'm sure you've been over http://us2.php.net/pdf
Wish I could help more; to me, it's better to receive a PDF :)
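
For what it's worth, route 2 above (inflate the FlateDecode streams, replace the text, deflate again) can be sketched roughly as below. This is only an illustration under several assumptions, not tested against real Illustrator or Distiller output: it is written in Python for brevity (PHP's gzuncompress() and gzcompress() read and write the same zlib format), the names STREAM_RE, inflate and edit_flate_streams are made up for this sketch, stream dictionaries are assumed to contain no nested << >>, and only a direct integer /Length is updated. It does not rebuild the xref table, so the byte offsets there would still need fixing before a strict reader accepts the result.

```python
import re
import zlib

# One stream object whose dictionary names /FlateDecode.
# Assumption: the dictionary holds no nested << >> pairs.
STREAM_RE = re.compile(
    rb"<<((?:(?!>>).)*?/FlateDecode(?:(?!>>).)*?)>>\s*stream\r?\n(.*?)endstream",
    re.DOTALL,
)


def inflate(body: bytes):
    """Return the inflated stream data, or None if it is not a complete zlib stream."""
    d = zlib.decompressobj()
    try:
        text = d.decompress(body)  # a trailing end-of-line lands in d.unused_data
    except zlib.error:
        return None
    return text if d.eof else None


def edit_flate_streams(pdf: bytes, old: bytes, new: bytes) -> bytes:
    """Inflate each FlateDecode stream, replace old with new, and deflate it again."""

    def rewrite(match):
        dictionary, body = match.group(1), match.group(2)
        text = inflate(body)
        if text is None:
            return match.group(0)  # leave anything we cannot inflate untouched
        packed = zlib.compress(text.replace(old, new), 9)
        # Update a direct /Length only; an indirect one ("/Length 12 0 R") is skipped.
        dictionary = re.sub(
            rb"/Length\s+\d+\b(?!\s+\d+\s+R)",
            b"/Length %d" % len(packed),
            dictionary,
        )
        return b"<<" + dictionary + b">>\nstream\n" + packed + b"\nendstream"

    return STREAM_RE.sub(rewrite, pdf)
```

Calling edit_flate_streams(pdf_bytes, b"{NAME}", b"Jerry") would return the PDF bytes with that placeholder replaced inside every stream it managed to inflate. Route 1 is essentially the second half of the same idea: compress each raw stream and add /Filter /FlateDecode to its dictionary.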
H