The Sinkove Python SDK provides a programmatic interface for generating AI datasets. This guide covers installation, authentication, core concepts, dataset operations, error handling, and common issues.

  • Installation: Install and configure the SDK
  • Quick Start: Create your first dataset in 5 minutes
  • Examples: Practical code examples and patterns
  • API Reference: Complete method and class documentation

Prerequisites

Before using the SDK, ensure you have:
  • Python 3.12+ installed
  • API Key from the Sinkove dashboard
  • Organization ID (UUID format) from the Sinkove dashboard
API keys and Organization IDs are covered in the Get Started section.

Installation

pip install sinkove-sdk

Authentication

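The SDK requires an API key for authentication, which it reads from the SINKOVE_API_KEY environment variable:
import os
import uuid
from sinkove import Client

# SINKOVE_API_KEY must be set outside the source code,
# for example with `export SINKOVE_API_KEY=...` in your shell.
if not os.environ.get("SINKOVE_API_KEY"):
    raise RuntimeError("SINKOVE_API_KEY is not set")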

# Initialize client
client = Client(uuid.UUID("your-organization-id"))
Never hardcode API keys in your source code. Use environment variables or secure configuration management.
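
One way to follow this advice is to load and validate both credentials in a single place before creating the client. The sketch below is illustrative only: SINKOVE_API_KEY is the variable named in this guide, while SINKOVE_ORGANIZATION_ID and the load_sinkove_settings helper are hypothetical names that are not part of the SDK.

```python
import os
import uuid
from typing import Mapping, Tuple


def load_sinkove_settings(env: Mapping[str, str] = os.environ) -> Tuple[str, uuid.UUID]:
    """Read and validate credentials from an environment-like mapping.

    SINKOVE_API_KEY is the variable named in this guide.
    SINKOVE_ORGANIZATION_ID is a hypothetical variable used only in this sketch.
    """
    api_key = env.get("SINKOVE_API_KEY", "").strip()
    if not api_key:
        raise ValueError("SINKOVE_API_KEY is not set")

    raw_org_id = env.get("SINKOVE_ORGANIZATION_ID", "").strip()
    try:
        # uuid.UUID raises ValueError for malformed or empty strings
        organization_id = uuid.UUID(raw_org_id)
    except ValueError as exc:
        raise ValueError("SINKOVE_ORGANIZATION_ID must be a valid UUID") from exc

    return api_key, organization_id
```

The returned UUID can then be passed to Client(organization_id), so a missing key or malformed ID is reported before any request is made.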

Basic Usage

import uuid
from sinkove import Client

# Initialize client
organization_id = uuid.UUID("your-organization-id")
model_id = uuid.UUID("your-model-id")
client = Client(organization_id)

# Create dataset
dataset = client.datasets.create(
    model_id=model_id,
    num_samples=20,
    args={"prompt": "chest x-ray showing pneumonia"}
)

# Wait for completion and download
dataset.wait()
dataset.download("medical_dataset.zip", strategy="replace")
print(f"Dataset {dataset.id} ready!")

Core Concepts

Client

The Client class is your main entry point to the SDK:
from sinkove import Client
import uuid

client = Client(uuid.UUID("your-organization-id"))

# Access organization properties
print(f"Organization ID: {client.id}")
print(f"Organization Name: {client.organization_name}")

Datasets

Datasets are the core resource in Sinkove. They go through several states during processing:
State     Description
PENDING   Dataset creation request received
STARTED   Dataset generation in progress
READY     Dataset successfully generated and ready for download
FAILED    Dataset generation failed
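
The lifecycle above can be modelled locally as a small enum, which may help when writing your own status checks. This is a sketch, not the SDK's own type: DatasetState, is_finished, and is_ready are hypothetical names, and it assumes that READY and FAILED are the only states that end processing.

```python
from enum import Enum


class DatasetState(str, Enum):
    """Hypothetical local mirror of the four states listed above."""

    PENDING = "PENDING"
    STARTED = "STARTED"
    READY = "READY"
    FAILED = "FAILED"


# Assumption: READY and FAILED are the only states that end processing.
TERMINAL_STATES = frozenset({DatasetState.READY, DatasetState.FAILED})


def is_finished(state: DatasetState) -> bool:
    """True once generation has stopped, whether it succeeded or failed."""
    return state in TERMINAL_STATES


def is_ready(state: DatasetState) -> bool:
    """True only when the dataset can be downloaded."""
    return state is DatasetState.READY
```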

Dataset Operations

Creating Datasets

dataset = client.datasets.create(
    model_id=uuid.UUID("your-model-id"),
    num_samples=50,
    args={"prompt": "chest x-ray showing pneumonia"}
)
print(f"Created dataset: {dataset.id}")

Listing and Retrieving Datasets

# Get all datasets
datasets = client.datasets.list()

# Get specific dataset
dataset = client.datasets.get(uuid.UUID("dataset-id"))

# Check dataset properties
print(f"State: {dataset.state}")
print(f"Ready: {dataset.ready}")
print(f"Progress: {dataset.metadata.progress if dataset.metadata else 'N/A'}")

Downloading Datasets

The SDK provides flexible download options:
# Basic download (fails if file exists)
dataset.download("output.zip")

# Wait for completion then download
dataset.download("output.zip", wait=True, timeout=300)

# Download with file handling strategy
dataset.download("output.zip", strategy="replace")  # or "skip", "fail"
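
To make the three strategy values concrete, the sketch below shows how such file-handling rules could be expressed locally. It is an illustration of the idea, not the SDK's implementation: should_write is a hypothetical helper, and the "fail" default is assumed from the "fails if file exists" comment above.

```python
from pathlib import Path


def should_write(path: Path, strategy: str = "fail") -> bool:
    """Decide whether to write to `path` under a given strategy (illustrative only)."""
    if strategy not in {"replace", "skip", "fail"}:
        raise ValueError(f"Unknown strategy: {strategy!r}")
    if not path.exists():
        return True  # nothing to conflict with
    if strategy == "replace":
        return True  # overwrite the existing file
    if strategy == "skip":
        return False  # keep the existing file and do nothing
    # strategy == "fail": refuse to touch an existing file
    raise FileExistsError(f"{path} already exists")
```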

Monitoring Progress

# Wait for completion
dataset.wait(timeout=600)  # Wait max 10 minutes

# Manual status checking
if dataset.finished:
    if dataset.ready:
        print("Dataset is ready!")
    else:
        print(f"Failed with state: {dataset.state}")
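
If you need custom behavior between status checks, such as logging, a hand-written polling loop is one option. The sketch below shows the general pattern under stated assumptions: wait_until_finished and its get_state callable are hypothetical, the state names are taken from the table in this guide, and nothing here describes how dataset.wait() is implemented.

```python
import time
from typing import Callable


def wait_until_finished(
    get_state: Callable[[], str],
    timeout: float = 600.0,
    poll_interval: float = 5.0,
) -> str:
    """Poll get_state() until it returns READY or FAILED, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    while True:
        state = get_state()
        if state in ("READY", "FAILED"):
            return state
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Still {state} after {timeout} seconds")
        time.sleep(poll_interval)
```

Here get_state would be a function that fetches the dataset again and returns its current state as a string.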

Error Handling

Always implement proper error handling for production use:
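import uuid
from sinkove import Client

organization_id = uuid.UUID("your-organization-id")
model_id = uuid.UUID("your-model-id")

try:
    client = Client(organization_id)
    dataset = client.datasets.create(
        model_id=model_id,
        num_samples=50,
        args={"prompt": "chest x-ray"},
    )
    dataset.wait(timeout=1800)  # 30 minutes
    dataset.download("dataset.zip", strategy="replace")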

except ValueError as e:
    print(f"Configuration error: {e}")
except TimeoutError:
    print("Dataset generation timed out")
except Exception as e:
    print(f"Error: {e}")

Advanced Features

Custom API Endpoint

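import os
from sinkove import Client

# Override the default API endpoint
os.environ['SINKOVE_API_URL'] = 'https://api.your-instance.com'
client = Client(organization_id)  # organization_id as defined in Basic Usage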

Multiple Organizations

import os
from sinkove.connector import Connector
from sinkove.organizations.client import OrganizationClient

# Read the API key from the environment instead of hardcoding it
connector = Connector(api_key=os.environ["SINKOVE_API_KEY"])
org_client = OrganizationClient(connector)

# List all organizations
organizations = org_client.list()
for org in organizations:
    print(f"{org.organization_name}: {len(org.datasets.list())} datasets")

Best Practices

  • Security: Use environment variables for API keys
  • Error Handling: Implement timeouts and proper exception handling
  • Performance: Use parallel operations for multiple datasets
  • Resource Management: Clean up downloaded files and monitor disk space
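
As a sketch of the parallel-operations advice, the helper below runs one job per prompt in a thread pool and collects failures separately, so a single error does not discard the other results. The run_in_parallel function and the job callable are hypothetical. Whether a single SDK client can safely be shared across threads is not stated in this guide, so treat that as something to verify.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, Dict, Iterable, Tuple


def run_in_parallel(
    job: Callable[[str], str],
    prompts: Iterable[str],
    max_workers: int = 4,
) -> Tuple[Dict[str, str], Dict[str, Exception]]:
    """Run job(prompt) for each prompt in a thread pool.

    Returns (results, errors), both keyed by prompt.
    """
    results: Dict[str, str] = {}
    errors: Dict[str, Exception] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(job, prompt): prompt for prompt in prompts}
        for future in as_completed(futures):
            prompt = futures[future]
            try:
                results[prompt] = future.result()
            except Exception as exc:  # record the failure and keep going
                errors[prompt] = exc
    return results, errors
```

In practice, job would wrap the create, wait, and download calls shown earlier for a single prompt and return the path of the downloaded file.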

Common Issues

Problem                                       Solution
ValueError: An API key is required            Set the SINKOVE_API_KEY environment variable
TimeoutError: Dataset processing timed out    Increase the timeout or check dataset complexity
Exception: Failed to retrieve download URL    Ensure the dataset is in the READY state

Next Steps

  • Installation Guide: Detailed installation and setup instructions
  • Quick Start Tutorial: Step-by-step guide to your first dataset
  • Code Examples: Advanced patterns and real-world use cases
  • Full API Reference: Complete method and class documentation