
Recognize objects in images
🤖/image/describe recognizes objects in images and returns them as English words.
In your application, you can use the returned labels to automatically classify images. You can also pass the labels down to other Robots to filter images that contain (or do not contain) certain content.
Usage example
Recognize objects in an uploaded image and store the labels in a JSON file:
{
  "steps": {
    "described": {
      "robot": "/image/describe",
      "use": ":original",
      "provider": "aws"
    }
  }
}
Parameters
output_meta
Record<string, boolean> | boolean
Lets you request metadata that is more CPU-intensive to calculate and is therefore disabled by default to keep your Assemblies processing fast.
For images, you can add "has_transparency": true to this object to detect whether the image contains transparent parts, and "dominant_colors": true to extract an array of hexadecimal color codes from the image.
For videos, you can add "colorspace": true to extract the colorspace of the output video.
For audio, you can add "mean_volume": true to get a single value representing the mean volume of the audio file.
You can also set this to false to skip metadata extraction and speed up transcoding.
result
boolean
(default: false)
Whether the results of this Step should be present in the Assembly Status JSON.
queue
"batch"
Setting the queue to "batch" manually downgrades the priority of jobs for this Step, so that jobs that do not need zero queue waiting times do not consume Priority job slots.
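As a minimal sketch of how result and queue fit into a Step, the following Assembly Instructions reuse the "described" Step from the usage example above and add only the two parameters just documented. The combination is illustrative, not a recommended default:

```json
{
  "steps": {
    "described": {
      "robot": "/image/describe",
      "use": ":original",
      "provider": "aws",
      "result": true,
      "queue": "batch"
    }
  }
}
```

With these settings the labels should appear in the Assembly Status JSON, while the job waits in the lower-priority batch queue.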
force_accept
boolean
(default: false)
Force a Robot to accept a file type it would have ignored.
By default Robots ignore files they are not familiar with. 🤖/video/encode, for example, will happily ignore input images.
With the force_accept parameter set to true you can force Robots to accept all files thrown at them. This will typically lead to errors and should only be used for debugging or combatting edge cases.
use
string | Array<string> | Array<object> | object
Specifies which Step(s) to use as input.
- You can pick any names for Steps except ":original" (reserved for user uploads handled by Transloadit).
- You can provide several Steps as input with arrays:
{ "use": [ ":original", "encoded", "resized" ] }
Tip
That’s likely all you need to know about use, but you can view Advanced use cases.
provider
· required
Which AI provider to leverage.
Transloadit outsources this task and abstracts the interface, so you can expect the same data structures but different latencies and different information being returned. Different cloud vendors excel in different areas, so we recommend trying each of them to see which yields the best results for your use case.
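One way to compare providers is to run the same upload through two Steps side by side. The sketch below assumes that "gcp" is the identifier for the GCP provider mentioned elsewhere on this page (only "aws" appears in the usage example), and the Step names are hypothetical:

```json
{
  "steps": {
    "described_aws": {
      "robot": "/image/describe",
      "use": ":original",
      "provider": "aws",
      "result": true
    },
    "described_gcp": {
      "robot": "/image/describe",
      "use": ":original",
      "provider": "gcp",
      "result": true
    }
  }
}
```

Each Step should then produce its own set of labels for the same image, which makes differences in labels and latency easier to inspect.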
granularity
"full" | "list"
(default: "full")
Whether to return a full response ("full"), including confidence percentages for each found label, or just a flat list of labels ("list").
format
"json" | "meta" | "text"
(default: "json")
In what format to return the descriptions.
- "json" returns a JSON file.
- "meta" does not return a file, but stores the data inside Transloadit's file object (under ${file.meta.descriptions}) that is passed around between encoding Steps, so that you can use the values to burn the data into videos, filter on them, etc.
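To illustrate filtering on the stored labels, the sketch below combines "granularity": "list" with "format": "meta" and hands the result to a second Step. It assumes that 🤖/file/filter takes an accepts array of [value, operator, value] conditions and supports an "includes" operator, so please check that Robot's own documentation. The Step name "bridges_only" and the label "Bridge" are hypothetical, and the labels you actually get depend on the provider:

```json
{
  "steps": {
    "described": {
      "robot": "/image/describe",
      "use": ":original",
      "provider": "aws",
      "granularity": "list",
      "format": "meta"
    },
    "bridges_only": {
      "robot": "/file/filter",
      "use": "described",
      "accepts": [["${file.meta.descriptions}", "includes", "Bridge"]]
    }
  }
}
```

Under these assumptions, only images whose label list contains "Bridge" would continue to any Steps that use "bridges_only" as input.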
explicit_descriptions
boolean
(default: false)
Whether to return only explicit or only non-explicit descriptions of the provided image. Explicit descriptions include labels for NSFW content (nudity, violence, etc.). If set to false, only non-explicit descriptions (such as human or chair) will be returned. If set to true, only explicit descriptions will be returned.
The possible descriptions depend on the chosen provider. The list of labels from AWS can be found in their documentation. GCP labels the image based on five categories, as described in their documentation.
For an example of how to automatically reject NSFW content and malware, please check out this blog post.
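Because explicit_descriptions returns only one kind of label per Step, getting both kinds for the same image takes two Steps. A minimal sketch, with hypothetical Step names:

```json
{
  "steps": {
    "safe_labels": {
      "robot": "/image/describe",
      "use": ":original",
      "provider": "aws",
      "explicit_descriptions": false
    },
    "explicit_labels": {
      "robot": "/image/describe",
      "use": ":original",
      "provider": "aws",
      "explicit_descriptions": true
    }
  }
}
```

Here "safe_labels" would carry only the non-explicit descriptions and "explicit_labels" only the explicit ones.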
Related blog posts
- Tech preview: new AI Robots for enhanced media processing
- Introducing the OCR Robot for easy text extraction
- Celebrating Transloadit’s 2021 milestones and progress
- Building an alt-text to speech generator with Transloadit
- How to automate content moderation using Transloadit (NSFW)
- Use Transloadit to automatically filter NSFW images