Extracting images from a PDF

This discussion is connected to the gimp-user-list.gnome.org mailing list which is provided by the GIMP developers and not related to gimpusers.com.

This is a read-only list on gimpusers.com so this discussion thread is read-only, too.

20 of 20 messages available

Extracting images from a PDF CaLy 14 Sep 20:35
  Extracting images from a PDF fa-flyingalone 15 Sep 06:27
   Extracting images from a PDF fa-flyingalone 15 Sep 06:31
  Extracting images from a PDF rich404 15 Sep 10:11
   Extracting images from a PDF Dwain Alford via gimp-user-list 15 Sep 15:52
    Extracting images from a PDF CaLy 16 Sep 10:41
     Extracting images from a PDF fa-flyingalone 17 Sep 07:41
      Extracting images from a PDF Dwain Alford via gimp-user-list 17 Sep 19:06
  Extracting images from a PDF Joel Rees via gimp-user-list 18 Sep 18:27
   Extracting images from a PDF fa-flyingalone 20 Sep 04:22
   Extracting images from a PDF CaLy 24 Sep 12:25
    Extracting images from a PDF Joel Rees via gimp-user-list 24 Sep 18:19
  Extracting images from a PDF Andrew-Yellow-Jackets 24 Sep 15:51
   Extracting images from a PDF Liam R E Quin 24 Sep 16:21
    Extracting images from a PDF dep 24 Sep 16:26
     Extracting images from a PDF Joel Rees via gimp-user-list 24 Sep 17:54
     Extracting images from a PDF Liam R E Quin 24 Sep 18:19
      Extracting images from a PDF dep 24 Sep 18:25
       Extracting images from a PDF Liam R E Quin 24 Sep 19:39
        Extracting images from a PDF fa-flyingalone 27 Sep 02:45
2019-09-14 20:35:54 UTC (over 4 years ago)
postings
3

Extracting images from a PDF

Hello. I am Carlos and this is my first question on this board, which I think will be as great as the articles, tutorials and downloads I have been looking at ...

The thing is...
When I open a PDF file, extract the images and save them: 1) which one should I choose, open as layers or as images? 2) Is there a way to save/export the image at the size it has in the PDF? I saw that there is a default size (width and height) which is saved, but you can edit that size ... so ---
My question basically is: when I go to File / Export, the image to be exported has a specific size. Is that size the original quality from the PDF, or is it the default from GIMP?

Thank you, and I hope you understood my question.

CaLy (via www.gimpusers.com/forums)
2019-09-15 06:27:44 UTC (over 4 years ago)
postings
20

Extracting images from a PDF

Hello Carlos
I guess there are many ways to save images from a PDF in GIMP. This is the way I do mine.

When you load the PDF in GIMP:
1- Use the up arrow of the 'Resolution' tumbler (RS) to increase the size of the images. I go for a height of about 5000 px using RS; that gives a good-sized image with spare room to crop if needed.
2- Now that your image size is worked out, choose to open pages as 'images' if you want to use each one separately. If you choose 'layers', all the pages will be in one image as layers.
3- Select only the pages you want to use, or select all, and click 'Import'.
4- The image files will show up as tabs; do what you want with them.
5- Lastly, use 'Export As', NOT 'Overwrite pdf'. When exporting I choose PNG, which holds more info (including the alpha channel) at a higher file size, instead of JPEG, unless you want a smaller file size.

Is this what you needed?
I did the step by step only to explain the way that works for me.

Good luck Hope this helps

fa-flyingalone (via www.gimpusers.com/forums)
2019-09-15 06:31:18 UTC (over 4 years ago)
postings
20

Extracting images from a PDF

Edit: if you are not sure, ask. :)

fa-flyingalone (via www.gimpusers.com/forums)
rich404
2019-09-15 10:11:08 UTC (over 4 years ago)

Extracting images from a PDF

There are some things to remember for a PDF.

It is a 'finished' format meant for viewing or printing. The document size (A4, US Letter...) is a property. For printing, pixels-per-inch (ppi, aka dpi) has an effect, but there is not always an easy way to determine the original size. See screenshot https://i.imgur.com/g8IP95S.jpg Scanned documents in particular can have various original ppi.

A PDF document can take several forms: vectors, a bitmap (say, a scanned image) or a mix, for example a bitmap image plus text.

Gimp is not always the best tool. It might be possible to open the PDF in some other application and copy an image "as original" to Gimp. This example uses LibreOffice https://i.imgur.com/a9N3Mtx.jpg but it does not work if the document comes from a scan.

Gimp is a bitmap editor. Open a PDF in Gimp and it is converted to 100% bitmap regardless of original format.

Open a PDF in Gimp and the default ppi is 100. For quality, change this to 300 ppi. Only need a specific page? Then select just that page. screenshot: https://i.imgur.com/OH5Sw2s.jpg

Then use the rectangle select tool, position and size it over the image, and use Edit -> Copy for the contents, then Edit -> Paste as -> New Image. Export that as required.

Not the same size in pixels as an extracted image? All depends on the ppi.
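
As a rough illustration of how the import resolution sets the pixel size: PDF page dimensions are measured in points (72 points to the inch), so the rendered size is simply the page size in inches times the ppi. The sketch below is plain arithmetic and not tied to any GIMP API; the function name is made up for this example, and the A4 size of 595 x 842 points is a rounded assumption.

```python
def rendered_size(width_pt, height_pt, ppi):
    """Pixel size of a PDF page rendered at the given resolution.

    PDF page dimensions are in points; 72 points make one inch.
    """
    width_px = round(width_pt / 72 * ppi)
    height_px = round(height_pt / 72 * ppi)
    return width_px, height_px


# A4 is roughly 595 x 842 points (210 x 297 mm); assumed here for illustration.
A4 = (595, 842)

for ppi in (100, 300):
    print(ppi, rendered_size(*A4, ppi))
```

With these numbers an A4 page comes out at about 826 x 1169 pixels at 100 ppi and about 2479 x 3508 pixels at 300 ppi, which is why the same page can give very different pixel sizes.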

rich404 (via www.gimpusers.com/forums)
Dwain Alford via gimp-user-list
2019-09-15 15:52:52 UTC (over 4 years ago)

Extracting images from a PDF

On Sun, Sep 15, 2019 at 5:11 AM rich404 wrote:

There are some things to remember for a PDF.

It is a 'finished' format meant for viewing or printing. The document size, A4 ,
US Letter...is a property. For printing pixels-per-inch (ppi aka dpi) has an
effect but not always an easy way to determine the original size. see screenshot
https://i.imgur.com/g8IP95S.jpg Scanned documents in particular can have various
original ppi.

A PDF document can be in forms, vectors or a bitmap ( say a scanned image) or a
mix, image a bitmap + text.

also, a pdf is an "image" of the original document. adobe created the format as a universal document format to be shared, so that the recipient does not have to have the program(s) used to create the document (e.g. ms office or libreoffice). so a pdf is like the johnny cash song about working in a cadillac plant and building his caddy over the years with the different year parts. i guess you could say that it's a faux page layout program similar to scribus, indesign, quarkxpress, etc.

hope this helps

2019-09-16 10:41:38 UTC (over 4 years ago)
postings
3

Extracting images from a PDF

Thank you ALL for your solutions! I will have a look. It was just curiosity, because somewhere I saw posted images from a magazine, which I think were extracted from a PDF, right? And I was curious about the sizes, dimensions and all. Maybe the images were not extracted from the PDF of the magazine; maybe they come from before it was put into the PDF, I mean they are "raw" images, like for example the JPG of the magazine cover and of the inside pages ... because there are some, covers for example, which are in bad quality, or resized to appear like HQ when it is not real HQ: when I increase the zoom in IrfanView or other image viewers, the squares of pixels start to appear ...

CaLy (via www.gimpusers.com/forums)
2019-09-17 07:41:18 UTC (over 4 years ago)
postings
20

Extracting images from a PDF

Hello Carlos
This is what the resolution tumbler is good for. If you import the images at low resolution you can get low-quality images. From my experimenting, I roll the resolution tumbler up until it reads 300 to 500, which gives a height of approx 5000 pixels. If you leave the resolution at the default of 100, the images are small and look low quality, and you will see the squares of pixels after zooming in only a small amount. Exporting images as JPEG at 85% will also give lower quality when zoomed in.
I prefer to export in PNG format, which keeps more information, including the alpha channel.
If you do want to export as JPEG, go for 100% quality. You can always make copies and export those at 85% or lower, and keep the better images in your own folder as your collection. A lot of experimenting is usual with Gimp, I have found; I try to work on copies, so if I stuff it up it is not a problem. Hope this helps
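
For anyone who wants to check the "300 to 500 gives about 5000 pixels" rule of thumb, here is a small sketch of the arithmetic. It assumes the page height is known in points (72 points per inch); the page sizes used below (A4 and US Letter) are assumptions for illustration and the function name is made up.

```python
def ppi_for_height(target_height_px, page_height_pt):
    """Resolution needed so a page renders about target_height_px pixels tall."""
    page_height_in = page_height_pt / 72  # 72 points per inch
    return target_height_px / page_height_in


# Assumed page heights in points (not taken from the thread).
for name, height_pt in (("A4", 842), ("US Letter", 792)):
    print(name, round(ppi_for_height(5000, height_pt)))
```

For these two page sizes the answer lands at roughly 428 and 455, so inside the 300 to 500 range mentioned above. Smaller or larger pages would need a different setting to reach the same pixel height.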

fa-flyingalone (via www.gimpusers.com/forums)
Dwain Alford via gimp-user-list
2019-09-17 19:06:28 UTC (over 4 years ago)

Extracting images from a PDF

Hello Carlos

I prefer to export in png format keeps more information including alpha channel,


if you are exporting for print, .tif maintains more information than .png. if there is no alpha channel in the file, you can add one to allow transparency if using paper other than white for the final product.

hth

Joel Rees via gimp-user-list
2019-09-18 18:27:32 UTC (over 4 years ago)

Extracting images from a PDF

For what it's worth, if you have a lot of images buried in a PDF file, extracting them all, one at a time, with the GIMP can be tedious.

I generally use imagemagick to extract images from PDFs (imagemagick convert).

2019-09-20 04:22:28 UTC (over 4 years ago)
postings
20

Extracting images from a PDF

Hello Joel Rees
I'll be looking at imagemagick and its features and options; I did not know imagemagick could do that.
As good as gimp is, it is tedious to extract many images one at a time from a pdf.
Thanks a heap.

fa-flyingalone (via www.gimpusers.com/forums)
2019-09-24 12:25:08 UTC (over 4 years ago)
postings
3

Extracting images from a PDF

Sorry to disturb you, I see several executables to download for 64-bit Windows:

1) Win64 static at 16 bits-per-pixel component
2) Win64 dynamic at 8 bits-per-pixel component
3) Win64 static at 8 bits-per-pixel component
4) Win64 dynamic at 16 bits-per-pixel component with high dynamic-range imaging enabled

Which one should I download to extract the images from a PDF? And by the way, imagemagick is used from a command line, right?

Thanks !

CaLy (via www.gimpusers.com/forums)
2019-09-24 15:51:51 UTC (over 4 years ago)
postings
1

Extracting images from a PDF

I think OP wants the image without re-sampling. I've found two different ways to do this.

1) I have an Adobe Acrobat license at work. I can go into edit mode, pick the image, then tell it to open with GIMP.

1.5) Also with Acrobat, Save the PDF as Word, PowerPoint, etc., then grab the image out of the new document.

2) Open the PDF with Inkscape. It doesn't appear you can export individual images without having to specify a resolution. But if the PDF has more than one raster image, you can export them in bulk, and it appears to use the original resolution.

Andrew-Yellow-Jackets (via www.gimpusers.com/forums)
Liam R E Quin
2019-09-24 16:21:52 UTC (over 4 years ago)

Extracting images from a PDF

On Tue, 2019-09-24 at 17:51 +0200, Andrew-Yellow-Jackets wrote:

1) I have an Adobe Acrobat license at work. I can go into edit mode, pick the
image, then tell it to open with GIMP.

2) Open the PDF with Inkscape.

On Linux™ systems you can also do this by opening the PDF in the GNOME Document Viewer (evince), right-clicking the image and choosing Save Image.

slave ankh

web slave for https://www.fromoldbooks.org/
with fabulous vintage art and fascinating texts to read.
Click here to order a new mouse pointer.
dep
2019-09-24 16:26:04 UTC (over 4 years ago)

Extracting images from a PDF

said Liam R E Quin:
| On Tue, 2019-09-24 at 17:51 +0200, Andrew-Yellow-Jackets wrote:
| > 1) I have an Adobe Acrobat license at work. I can go into edit mode,
| > pick the image, then tell it to open with GIMP.
| >
| > 2) Open the PDF with Inkscape.
|
| On Linux™ systems you can also do this by opening the image in the
| GNOME Document Viewer (evince) and right-clicking Save Image.

The heritage of the PDF has something to do with it also. A lot of PDFs are basically themselves a single image per page, in which case you open it in the GIMP and crop it down to the image, which you save by whatever name and in whatever format you want, all the while weeping over the poor quality of the thing, unless it's a very high resolution PDF. No?

dep

Some pictures:
http://www.ipernity.com/doc/depscribe/album
Joel Rees via gimp-user-list
2019-09-24 17:54:30 UTC (over 4 years ago)

Extracting images from a PDF

On Wed, Sep 25, 2019 at 1:26 AM dep wrote:

The heritage of the PDF has something to do with it also. A lot of PDFs are
basically themselves a single image per page, in which case you open it in the GIMP and crop it down to the image, which you save by whatever name and in whatever format you want,

:)

all the while weeping over the poor quality of the thing, unless it's a very high resolution PDF. No?
-- dep

:-(

Yeah.

Joel Rees via gimp-user-list
2019-09-24 18:19:07 UTC (over 4 years ago)

Extracting images from a PDF

On Tue, Sep 24, 2019 at 9:25 PM CaLy wrote:

Sorry to disturb you, i see two executables to download for 64bits (windows):

Where?

1) Win64 static at 16 bits-per-pixel component
2) Win64 dynamic at 8 bits-per-pixel component
3) Win64 static at 8 bits-per-pixel component
4) Win64 dynamic at 16 bits-per-pixel component with high dynamic-range imaging enabled

I hope you are looking at this tagged point on this page:

https://imagemagick.org/script/download.php#windows

Wich one should i download to extract the images from pdf?

I would download the recommended one there if you have a 64 bit CPU, since you seem to be using Microsoft Windows on a machine with a fairly reasonable amount of memory, etc.

If you need to compile it and don't like Microsoft's Visual C environment, Cygwin can be useful. But I doubt you need to compile it, anyway.

Mac OS downloads are above that tag on the same page.

If you are using a Linux or BSD OS, you can install the imagemagick package from the distribution's packaging system, which I recommend, even though it won't be the latest-greatest version.

and btw it is from a command line in imagemagick right?

Thanks !

Yes, imagemagick is a command-line tool, although there are (limited, of course) GUI imagemagick tools available in some OS/environment combinations.

Some of the GUI tools support bulk operations, but the command-line tool gives greatest flexibility, with a bit of practice.
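
The exact command is not given in this thread, so the following is only a guess at a typical invocation, wrapped in Python so the pieces are easy to see. It assumes ImageMagick is installed together with Ghostscript (which ImageMagick relies on to read PDFs); the file name `magazine.pdf`, the output pattern and both function names are hypothetical. Note that this renders whole pages at the chosen density rather than pulling out the embedded image files untouched.

```python
import shutil
import subprocess


def build_convert_command(pdf_path, out_pattern="page-%03d.png", density=300):
    """Build an ImageMagick command line that renders each PDF page to an image.

    -density sets the rendering resolution and has to come before the input file.
    """
    # ImageMagick 7 installs `magick`; older versions install `convert`.
    # On Windows, `convert` may also be an unrelated system tool, so `magick`
    # is tried first.
    exe = shutil.which("magick") or shutil.which("convert") or "convert"
    return [exe, "-density", str(density), pdf_path, out_pattern]


def render_pages(pdf_path, **options):
    """Run the command; raises CalledProcessError if ImageMagick reports failure."""
    subprocess.run(build_convert_command(pdf_path, **options), check=True)


# Show the command that would be run for a hypothetical file.
print(" ".join(build_convert_command("magazine.pdf")))
```

Calling `render_pages("magazine.pdf")` would then write page-000.png, page-001.png and so on into the current folder, one file per page.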

Liam R E Quin
2019-09-24 18:19:14 UTC (over 4 years ago)

Extracting images from a PDF

On Tue, 2019-09-24 at 16:26 +0000, dep wrote:

The heritage of the PDF has something to do with it also. A lot of PDFs are basically themselves a single image per page, in which case you open it in the GIMP and crop it down to the image,

To be clear, don't open the PDF in GIMP, as this will re-sample the image.

all the while weeping over the poor quality of the thing, unless it's a very high resolution PDF. No?

A lot depends on what you’re trying to accomplish. Sometimes i've used the Inkscape front end to “potrace” to make an SVG vector version of something, for example. And there've been times when i’ve bought a physical printed copy of a book so i can scan the images at 2400dpi.

Liam (slave ankh)

Liam Quin - web slave for https://www.fromoldbooks.org/
dep
2019-09-24 18:25:43 UTC (over 4 years ago)

Extracting images from a PDF

said Liam R E Quin:

| To be clear, don't open the PDF in GIMP, as this will re-sample the
| image.

Could you elaborate a bit here? Specifically, the harm you see coming from this?

dep

Some pictures:
http://www.ipernity.com/doc/depscribe/album
Liam R E Quin
2019-09-24 19:39:18 UTC (over 4 years ago)

Extracting images from a PDF

On Tue, 2019-09-24 at 18:25 +0000, dep wrote:

said Liam R E Quin:

To be clear, don't open the PDF in GIMP, as this will re-sample the image.

Could you elaborate a bit here? Specifically, the harm you see coming from this?

If the PDF file contains a JPEG image that was encoded at 200dpi (say), and you open the PDF in GIMP at 300dpi, GIMP will use a library to render the PDF to a bitmap image at 300dpi, so that library will take the 200dpi embedded image, render it to a bitmap, and then enlarge it (artifacts of compression and all), probably using a simple linear or cubic interpolation.

This means that every pixel in the image GIMP sees will be an average of the actual pixel values around it in the original.

What you want to do is to extract the original 200dpi (in this example) image and then have GIMP open that, not lose the quality by changing the size first.
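
To make the averaging effect concrete, here is a toy sketch of linear interpolation on a single row of pixel values. It is not what GIMP or its PDF library actually runs (the real code works in two dimensions and may use a different filter); the function name and the sample values are invented for illustration. Going from 5 samples to 7 with the end points aligned gives the same 2:3 spacing ratio as going from 200dpi to 300dpi.

```python
def resample_linear(row, new_len):
    """Resample a 1-D row of pixel values to new_len samples by linear interpolation."""
    if new_len < 2 or len(row) < 2:
        raise ValueError("need at least two samples on both sides")
    old_len = len(row)
    out = []
    for i in range(new_len):
        # Position of output sample i on the input grid (end points aligned).
        pos = i * (old_len - 1) / (new_len - 1)
        left = int(pos)
        right = min(left + 1, old_len - 1)
        frac = pos - left
        # Weighted average of the two nearest original pixels.
        out.append(round(row[left] * (1 - frac) + row[right] * frac))
    return out


original = [0, 255, 0, 255, 0]  # sharp one-pixel stripes
print(resample_linear(original, 7))
```

The enlarged row comes out as [0, 170, 170, 0, 170, 170, 0]: the two pure 255 pixels of the original are gone, replaced by blends of their neighbours. That softening is the quality loss being described.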

There's no easy way to know the resolution of the embedded images; in some cases ImageMagick's "identify" command will list them, and e.g. https://superuser.com/questions/193485/extract-images-in-pdf-without-affecting-the-resolution links to a simple program to extract the actual JPEG images from PDF without reencoding them -
https://www.perlmonks.org/?node_id=720495
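
I have not checked how the linked script works, so the following is a generic sketch of the idea rather than a port of it: when a PDF stores a JPEG as-is, the original file's bytes sit inside the PDF and can be copied out by looking for the JPEG start marker (FF D8 FF) and the next end marker (FF D9). The function name is made up. This is crude: it will miss images stored in other formats or wrapped in extra encoding, and it can cut an image short if the JPEG carries an embedded thumbnail with its own end marker.

```python
def carve_jpegs(data):
    """Return byte strings that look like complete JPEG files inside data.

    Scans for the start-of-image marker (FF D8 FF) and takes everything up to
    and including the next end-of-image marker (FF D9). Nothing is decoded or
    re-encoded; the bytes are copied as they are.
    """
    soi = b"\xff\xd8\xff"
    eoi = b"\xff\xd9"
    found = []
    pos = 0
    while True:
        start = data.find(soi, pos)
        if start == -1:
            break
        end = data.find(eoi, start + len(soi))
        if end == -1:
            break  # start marker without an end marker: give up
        end += len(eoi)
        found.append(data[start:end])
        pos = end
    return found


def save_jpegs(pdf_path, prefix="image"):
    """Write every carved JPEG next to the script as prefix-001.jpg, prefix-002.jpg, ..."""
    with open(pdf_path, "rb") as f:
        blobs = carve_jpegs(f.read())
    for n, blob in enumerate(blobs, 1):
        with open(f"{prefix}-{n:03d}.jpg", "wb") as out:
            out.write(blob)
    return len(blobs)
```

Calling `save_jpegs("magazine.pdf")` on a hypothetical file would write whatever it finds and return the count; opening those files in GIMP then shows the images at their stored pixel size.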

If you just need to rotate them, you can then use e.g. jpegtran, which is lossless.

Also note the free version of Acrobat also changes the sizes of the images by resampling.

slave liam (ankh on IRC)

https://www.fromoldbooks.org/
2019-09-27 02:45:10 UTC (over 4 years ago)
postings
20

Extracting images from a PDF

I'm not too concerned about the technical side as long as it does the job; I am not an expert. Just to clear things up, this is the method that gets me very good results and works on any PDF: import the PDF pages as individual images at a height of approx 5000 px, then export as PNG. That size gives plenty of room to crop and still save at reasonably high quality. Keep it simple ...

fa-flyingalone (via www.gimpusers.com/forums)