Local Curator Package

Your Comprehensive Multi-Modal Data Processing Solution

The Local Curator Package is a Python package for local multi-modal data discovery, preparation, and analysis. It provides tools that convert diverse data types (images, videos, sounds, texts, and 3D point clouds) into high-accuracy embedding vectors and hash representations. Because everything runs locally, users keep complete control over their data processing environment and their data stays private.

Overview

The Local Curator Package allows you to:

  • Process data locally for enhanced control and compliance.

  • Use a modular design, selecting only the subpackages you need.

  • Work with images, videos, sounds, texts, and 3D point clouds.

  • Scale vector operations to handle large datasets.

  • Customize models for optimal accuracy.

Subpackages

Trimming/Sampling Package:

Generates representative data subsets and extracts segments.

  • sample_data: Extracts samples using highlight or time interval sampling.

  • trim_by_highlights: Extracts trimmed segments based on highlights.
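The two sampling ideas above can be illustrated with a minimal Python sketch. This is not the package's implementation or API: the function names sample_by_interval and trim_segments, their parameters, and the idea that a highlight is a timestamp in seconds are all assumptions made for illustration.

```python
def sample_by_interval(duration_s: float, interval_s: float) -> list[float]:
    """Return timestamps (seconds) spaced interval_s apart, starting at 0."""
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")
    timestamps = []
    i = 0
    # Multiply instead of accumulating to avoid floating-point drift.
    while i * interval_s < duration_s:
        timestamps.append(i * interval_s)
        i += 1
    return timestamps


def trim_segments(
    highlights: list[float], half_width_s: float, duration_s: float
) -> list[tuple[float, float]]:
    """Return a (start, end) window around each highlight, clamped to the clip.

    Hypothetical: a highlight is assumed to be a timestamp in seconds.
    """
    return [
        (max(0.0, t - half_width_s), min(duration_s, t + half_width_s))
        for t in highlights
    ]
```

For a 10-second clip, sample_by_interval(10, 2.5) yields one timestamp every 2.5 seconds, and trim_segments([1.0, 9.5], 2.0, 10.0) yields two windows that are cut off at the clip boundaries.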

Segmentation Package:

Automates image segmentation through fine-tuning.

  • auto_image_segmentation_with_fine_tuning: Trains a custom model on labeled data to segment objects and export masks.

Vectorization/Indexing Package:

Transforms data into vector embeddings and indexes them.

  • create_archive: Initializes an archive.

  • update_parameters: Modifies archive settings.

  • index: Vectorizes and indexes data.

  • list_content: Retrieves content lists.

  • remove_content: Deletes content.
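To make the archive workflow concrete, here is a toy in-memory sketch of the index, list, and remove cycle. It is an assumption-laden illustration, not the package's API: the ToyArchive class, its method signatures, and the trigram-hashing toy_vectorize function are hypothetical stand-ins for the real vectorizer models and storage.

```python
import hashlib
import math


def toy_vectorize(text: str, dim: int = 16) -> list[float]:
    """Hash character trigrams into a fixed-size, L2-normalized vector.

    Hypothetical stand-in for a real embedding model.
    """
    vec = [0.0] * dim
    for i in range(max(1, len(text) - 2)):
        gram = text[i:i + 3].lower()
        digest = hashlib.sha256(gram.encode("utf-8")).hexdigest()
        vec[int(digest, 16) % dim] += 1.0
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec]


class ToyArchive:
    """Hypothetical in-memory archive mapping content IDs to vectors."""

    def __init__(self, dim: int = 16):
        self.dim = dim
        self._vectors: dict[str, list[float]] = {}

    def index(self, content_id: str, text: str) -> None:
        # Vectorize the content and store it under its ID.
        self._vectors[content_id] = toy_vectorize(text, self.dim)

    def list_content(self) -> list[str]:
        return sorted(self._vectors)

    def remove_content(self, content_id: str) -> None:
        # Removing an unknown ID is a no-op in this sketch.
        self._vectors.pop(content_id, None)
```

A typical sequence would create a ToyArchive, call index once per item, inspect the result with list_content, and drop stale items with remove_content.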

Fine-Tuned Vectorizers Package:

Offers pre-trained and customizable vectorization models.

  • fine_tune_vectorizer: Customizes the vectorizer model to better suit your specific data.

Reverse Search, Clustering, Inliers and Outliers Package:

Enables similarity-based data exploration and organization.

  • search: Performs similarity searches.

  • inliers_outliers: Identifies inliers and outliers.

  • cluster_by_number_of_clusters: Groups data into a specified number of clusters.
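As a rough illustration of similarity-based exploration, the sketch below ranks stored vectors by cosine similarity to a query and splits items into inliers and outliers by their similarity to the centroid. The function signatures, the choice of cosine similarity, and the centroid threshold rule are assumptions for illustration. The package's actual search and inliers_outliers jobs may work differently.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between a and b (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search(query, vectors, top_k=3):
    """Return the top_k (content_id, similarity) pairs, most similar first."""
    scored = [(cid, cosine_similarity(query, v)) for cid, v in vectors.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


def split_inliers_outliers(vectors, threshold):
    """Label items by cosine similarity to the centroid (hypothetical rule)."""
    dim = len(next(iter(vectors.values())))
    centroid = [
        sum(v[i] for v in vectors.values()) / len(vectors) for i in range(dim)
    ]
    inliers, outliers = [], []
    for cid, v in vectors.items():
        if cosine_similarity(v, centroid) >= threshold:
            inliers.append(cid)
        else:
            outliers.append(cid)
    return inliers, outliers
```

Here vectors is assumed to be a dictionary from content ID to embedding. A brute-force scan like this is only practical for small collections, and larger datasets would normally use an approximate nearest-neighbor index.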

Calibration Package:

Fine-tunes similarity and relevance metrics.

  • calibrate_similarity: Fine-tunes similarity metrics.

  • extract_similarity_dataset: Extracts highlight pairs for calibration.

  • cluster_by_calibrated_similarity: Clusters using calibrated similarity.

  • calibrate_relevance: Fine-tunes relevance metrics.

Distribution Analysis Package:

Analyzes and optimizes data distribution.

  • data_balance: Provides data balancing recommendations.

  • cross_reference_connection_insights: Identifies document connections.

  • pca_vector_dim_reduction: Reduces vector dimensionality using principal component analysis (PCA).

  • find_optimal: Performs optimization-based searches.
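For readers unfamiliar with PCA, the following pure-Python sketch finds the first principal component of a small set of vectors by power iteration on the covariance matrix, then projects the data onto it. It illustrates the general technique only. The function names and the use of power iteration are assumptions, not a description of how pca_vector_dim_reduction is implemented.

```python
import math


def first_principal_component(rows, iterations=100):
    """Return (means, unit vector) for the direction of greatest variance.

    Uses power iteration on the sample covariance matrix; needs >= 2 rows.
    """
    n, dim = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(dim)]
    centered = [[r[j] - means[j] for j in range(dim)] for r in rows]
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(dim)]
        for i in range(dim)
    ]
    # Start from a uniform unit vector; this fails only if it happens to be
    # exactly orthogonal to the leading eigenvector.
    v = [1.0 / math.sqrt(dim)] * dim
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(dim)) for i in range(dim)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0:
            break  # No variance along the current direction.
        v = [x / norm for x in w]
    return means, v


def project(rows, means, component):
    """Reduce each row to one coordinate along the given component."""
    return [
        sum((r[j] - means[j]) * component[j] for j in range(len(means)))
        for r in rows
    ]
```

For points that lie on a straight line, such as [1, 2], [2, 4], [3, 6], the component returned points along that line, and each projected value is the signed distance of a point from the mean. Production code would typically rely on a numerical library and keep more than one component.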

Translator Package:

Enables cross-modal search.

  • fine_tune_translator: Customizes translator models.

Licensing

The Local Curator Package is available under a One-Time Perpetual License or a Twelve-Month Payment Plan. Licensing depends on the selected subpackages, the desired modalities, and customer scale. With either option, you can:

  • Add Subpackages: Expand the functionality of your Local Curator Package as your needs evolve.

  • Obtain License Updates: Stay current with the latest features and improvements.

Support

Optional support services are available.