Overview
The features described here are available in two ways: through our cloud-based managed service API, for direct integration, and through our SDK framework, a convenience wrapper for streamlined development. The 'Jobs' and 'Endpoints' detailed below are the same on both platforms: each is an operation you can run either as a direct API call or through the corresponding SDK function.
Our multi-modal data discovery and preparation services help you unlock the full potential of your datasets and archives.
We excel at converting images, videos, sounds, texts, 3D point clouds, and multimodal image + text data into high-accuracy embedding vectors and hash representations.
We also specialize in running vector operations at scale, including highlight sampling, trimming, reverse search, and clustering.
You can fine-tune models on your own data and keep them proprietary, for optimal accuracy, similarity, and relevance.
Our services allow you to chain tasks effectively, providing a seamless workflow for:
Validating datasets and balancing data distribution
Automating content preprocessing, redundancy and relevance filtering, and abuse detection
Boosting productivity in data labeling with smart prioritization and propagation
Enhancing search and recommendation capabilities on your platforms
All data files must belong to the same content type. The supported file extensions are:
Image: "jpg", "jpeg", "png", "bmp", "gif", "tiff", "tif"
Video: "mp4", "avi", "mov", "wmv"
Sound: "mp3", "wav"
Point_Cloud: "xyz", "xyzn", "xyzrgb", "pts", "pcd", "ply", "stl", "obj", "off", "gltf"
Text: "txt"
Multimodal Image + Text: "txt", "jpg", "jpeg", "png", "bmp", "gif", "tiff", "tif" (txt files are embedded as text and image files are embedded as images; image files whose filenames start with “text_” are embedded as image + text, for example “text_the sentence to embed.jpg”)
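
The extension rules above can be checked locally before any files are submitted. The following is a minimal sketch in plain Python, not part of the API or the SDK: the function names (content_type_of, check_single_content_type, multimodal_mode) and the dictionary layout are assumptions made for illustration, and only the extension lists and the “text_” filename rule come from this section. It also assumes that extensions are matched case-insensitively, which this section does not state.

```python
from pathlib import Path

# Extension lists copied from the section above.
# The dictionary itself is a hypothetical helper, not a service or SDK object.
SUPPORTED_EXTENSIONS = {
    "Image": {"jpg", "jpeg", "png", "bmp", "gif", "tiff", "tif"},
    "Video": {"mp4", "avi", "mov", "wmv"},
    "Sound": {"mp3", "wav"},
    "Point_Cloud": {"xyz", "xyzn", "xyzrgb", "pts", "pcd",
                    "ply", "stl", "obj", "off", "gltf"},
    "Text": {"txt"},
}


def _extension(filename):
    """Return the lowercased extension without the leading dot."""
    # Assumption: extensions are compared case-insensitively.
    return Path(filename).suffix.lower().lstrip(".")


def content_type_of(filename):
    """Return the content type for a filename, or None if unsupported."""
    ext = _extension(filename)
    for content_type, extensions in SUPPORTED_EXTENSIONS.items():
        if ext in extensions:
            return content_type
    return None


def check_single_content_type(filenames):
    """Return the one content type shared by all files, else raise ValueError."""
    found = set()
    for name in filenames:
        content_type = content_type_of(name)
        if content_type is None:
            raise ValueError(f"Unsupported file extension: {name}")
        found.add(content_type)
    if len(found) != 1:
        raise ValueError(f"Expected exactly one content type, got: {sorted(found)}")
    return found.pop()


def multimodal_mode(filename):
    """Describe how a file is embedded in a multimodal image + text dataset."""
    ext = _extension(filename)
    if ext == "txt":
        return "text"
    if ext in SUPPORTED_EXTENSIONS["Image"]:
        # Image files whose names start with "text_" are embedded as image + text.
        if Path(filename).name.startswith("text_"):
            return "image + text"
        return "image"
    return None  # not a valid file for a multimodal dataset
```

For example, check_single_content_type(["a.mp4", "b.mov"]) returns "Video", while mixing "a.mp4" with "b.wav" raises ValueError. Multimodal image + text datasets deliberately mix txt and image files, so they would be checked file by file with multimodal_mode rather than with check_single_content_type.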