Transloadit

Extract thumbnail images from documents

🤖/document/thumbs generates an image for each page in a PDF file or an animated GIF file that loops through all pages.

Things to keep in mind

  • If you convert a multi-page PDF file into several images, the result images are sorted in page order: the first image is the thumbnail of the first document page, the second image is the thumbnail of the second page, and so on.
  • You can also check the meta.thumb_index key of each result image to find out which page it corresponds to. Keep in mind that these thumb indices start at 0, not at 1.

Usage example

Convert all pages of a PDF document into separate 200px-wide images:

{
  "steps": {
    "thumbnailed": {
      "use": ":original",
      "robot": "/document/thumbs",
      "width": 200,
      "resize_strategy": "fit",
      "trim_whitespace": false
    }
  }
}

Parameters

  • output_meta

    Record<string, boolean> | boolean

    Allows you to specify a set of metadata that is more expensive on CPU power to calculate, and thus is disabled by default to keep your Assemblies processing fast.

    For images, you can add "has_transparency": true to this object to detect whether the image contains transparent parts, and "dominant_colors": true to extract an array of hexadecimal color codes from the image.

    For videos, you can add "colorspace": true to extract the colorspace of the output video.

    For audio, you can add "mean_volume": true to get a single value representing the mean volume of the audio file.

    You can also set this to false to skip metadata extraction and speed up transcoding.
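
    As a minimal sketch, a Step could request the two image metadata keys described above as follows. The Step name "thumbnailed" is borrowed from the usage example, and whether both keys are populated for document thumbnails is an assumption worth checking against your own results:

    ```json
    {
      "steps": {
        "thumbnailed": {
          "use": ":original",
          "robot": "/document/thumbs",
          "output_meta": {
            "has_transparency": true,
            "dominant_colors": true
          }
        }
      }
    }
    ```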

  • result

    boolean (default: false)

    Whether the results of this Step should be present in the Assembly Status JSON.

  • queue

    "batch"

    Setting the queue to "batch" manually downgrades the priority of jobs for this Step, so that jobs that don't need zero queue waiting times do not consume Priority job slots.

  • force_accept

    boolean (default: false)

    Force a Robot to accept a file type it would have ignored.

    By default Robots ignore files they are not familiar with. 🤖/video/encode, for example, will happily ignore input images.

    With the force_accept parameter set to true you can force Robots to accept all files thrown at them. This will typically lead to errors and should only be used for debugging or combatting edge cases.

  • use

    string | Array<string> | Array<object> | object

    Specifies which Step(s) to use as input.

    • You can pick any names for Steps except ":original" (reserved for user uploads handled by Transloadit)
    • You can provide several Steps as input with arrays:
      {
        "use": [
          ":original",
          "encoded",
          "resized"
        ]
      }
      
  • page

    string | number | null (default: null)

    The PDF page that you want to convert to an image. By default the value is null which means that all pages will be converted into images.
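
    To illustrate, a Step that converts only a single page might look like the sketch below. The value 1 is purely illustrative: the description above does not state whether page numbers start at 0 or at 1, so verify which page you get back before relying on it:

    ```json
    {
      "steps": {
        "thumbnailed": {
          "use": ":original",
          "robot": "/document/thumbs",
          "page": 1,
          "width": 200
        }
      }
    }
    ```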

  • format

    (default: "png")

    The format of the extracted image(s).

    If you specify the value "gif", an animated GIF cycling through all pages is created. Please check out this demo to learn more about this.

  • delay

    string | number

    If your output format is "gif", this parameter sets the delay between frames of the animation, in hundredths of a second. Set this to 100, for example, to allow 1 second to pass between the frames of the animated GIF.

    If your output format is not "gif", then this parameter does not have any effect.
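
    Combining the format and delay parameters, a sketch of a Step that produces one animated GIF with 1 second between pages could look like this. The Step name "animated" and the width are illustrative choices, not values taken from the description above:

    ```json
    {
      "steps": {
        "animated": {
          "use": ":original",
          "robot": "/document/thumbs",
          "format": "gif",
          "delay": 100,
          "width": 200
        }
      }
    }
    ```

    Since the delay is given in hundredths of a second, a value of 50 would correspond to half a second between frames.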

  • width

    string | number

    Width of the new image, in pixels. If not specified, defaults to the width of the input image.

  • height

    string | number

    Height of the new image, in pixels. If not specified, defaults to the height of the input image.

  • resize_strategy

    (default: "pad")

  • background

    string (default: "#FFFFFF")

    Either the hexadecimal code or name of the color used to fill the background (only used for the pad resize strategy).

    By default, the background of transparent images is changed to white. For details about how to preserve transparency across all image types, see this demo.
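
    As a rough sketch of how background interacts with the pad resize strategy, the Step below requests 200×200 thumbnails and fills the padded area with black instead of the default white. The dimensions and the color are illustrative assumptions:

    ```json
    {
      "steps": {
        "thumbnailed": {
          "use": ":original",
          "robot": "/document/thumbs",
          "width": 200,
          "height": 200,
          "resize_strategy": "pad",
          "background": "#000000"
        }
      }
    }
    ```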

  • alpha

    "Remove" | "Set"

    Change how the alpha channel of the resulting image should work. Valid values are "Set" to enable transparency and "Remove" to remove transparency.

    For a list of all valid values, see the ImageMagick documentation.

  • density

    string

    While in-memory quality and file format depth specify the color resolution, the density of an image is its spatial resolution. It is expressed in pixels per inch and defines how far apart (or how big) the individual pixels are, which determines the size of the image in real-world terms when it is displayed on devices or printed.

    You can set this value to a specific width or in the format widthxheight.

    If your converted image has a low resolution, please try using the density parameter to resolve that.
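
    For instance, a Step that raises the density to address blurry output might be sketched as follows. The value "150" is an illustrative assumption rather than a recommended setting; per the description above, the widthxheight form (such as "150x150") should be accepted as well:

    ```json
    {
      "steps": {
        "thumbnailed": {
          "use": ":original",
          "robot": "/document/thumbs",
          "width": 800,
          "density": "150"
        }
      }
    }
    ```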

  • antialiasing

    boolean (default: false)

    Controls whether or not antialiasing is used to remove jagged edges from text or images in a document.

  • colorspace

    Sets the image colorspace. For details about the available values, see the ImageMagick documentation.

    Please note that if you were using "RGB", we recommend using "sRGB". ImageMagick might try to find the most efficient colorspace based on the color of an image, and default to e.g. "Gray". To force colors, you might then have to use this parameter.
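
    A minimal sketch of forcing color output, assuming "sRGB" is the colorspace you want, could look like this:

    ```json
    {
      "steps": {
        "thumbnailed": {
          "use": ":original",
          "robot": "/document/thumbs",
          "width": 200,
          "colorspace": "sRGB"
        }
      }
    }
    ```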

  • trim_whitespace

    boolean (default: true)

    Determines whether additional whitespace around the PDF should be trimmed away before it is converted to an image. If you set this to true, only the real PDF page contents will be shown in the image.

    If you need to reflect the PDF's dimensions in your image, it is generally a good idea to set this to false.

  • pdf_use_cropbox

    boolean (default: true)

    Some PDF documents lie about their dimensions. For instance, they claim to be in landscape mode, but when opened in a decent desktop reader they turn out to be in portrait mode. This can happen if the document has a cropbox defined. When this option is enabled (the default), the cropbox takes precedence in determining the dimensions of the resulting thumbnails.