synor/sdk/flutter/lib/synor_compute.dart
Gulshan Yadav 89fc542da4 feat(compute): add model registry and training APIs
Adds comprehensive model management and training capabilities:

synor-compute (Rust):
- ModelRegistry with pre-registered popular models
  - LLMs: Llama 3/3.1, Mistral, Mixtral, Qwen, DeepSeek, Phi, CodeLlama
  - Embedding: BGE, E5
  - Image: Stable Diffusion XL, FLUX.1
  - Speech: Whisper
  - Multi-modal: LLaVA
- ModelInfo with parameters, format, precision, context length
- Custom model upload and registration
- Model search by name/category

Flutter SDK:
- Model registry APIs: listModels, getModel, searchModels
- Custom model upload with multipart upload
- Training APIs: train(), fineTune(), trainStream()
- TrainingOptions: framework, epochs, batch_size, learning_rate
- TrainingProgress for real-time updates
- ModelUploadOptions and ModelUploadResult

Example code for:
- Listing available models by category
- Fine-tuning pre-trained models
- Uploading custom Python/ONNX models
- Streaming training progress

This enables users to:
1. Use pre-registered models like 'llama-3-70b'
2. Upload their own custom models
3. Fine-tune models on custom datasets
4. Track training progress in real-time
2026-01-11 15:22:26 +05:30

/// Synor Compute SDK for Flutter/Dart
///
/// A high-performance SDK for distributed heterogeneous computing.
/// Supports CPU, GPU, TPU, NPU, LPU, FPGA, DSP, WebGPU, and WASM processors.
///
/// ## Quick Start
///
/// ```dart
/// import 'dart:io';
///
/// import 'package:synor_compute/synor_compute.dart';
///
/// void main() async {
///   // Create client
///   final client = SynorCompute(apiKey: 'your-api-key');
///
///   // Matrix multiplication
///   final a = Tensor.rand([512, 512]);
///   final b = Tensor.rand([512, 512]);
///   final result = await client.matmul(a, b, options: MatMulOptions(
///     precision: Precision.fp16,
///     processor: ProcessorType.gpu,
///   ));
///
///   print('Result shape: ${result.result!.shape}');
///   print('Execution time: ${result.executionTimeMs}ms');
///
///   // LLM inference
///   final response = await client.inference(
///     'llama-3-70b',
///     'Explain quantum computing',
///     options: InferenceOptions(maxTokens: 256),
///   );
///   print(response.result);
///
///   // Streaming inference
///   await for (final token in client.inferenceStream(
///     'llama-3-70b',
///     'Write a haiku about computing',
///   )) {
///     stdout.write(token);
///   }
///
///   // Clean up
///   client.dispose();
/// }
/// ```
///
/// ## Features
///
/// - **Matrix Operations**: matmul, conv2d, attention, elementwise, reduce
/// - **LLM Inference**: Standard and streaming inference
/// - **Tensor Management**: Upload, download, and delete tensors
/// - **Job Management**: Submit, poll, cancel, and list jobs
/// - **Pricing**: Get real-time pricing for all processor types
/// - **Usage Statistics**: Track compute usage and costs
///
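/// ## Job Management
///
/// A minimal sketch of non-blocking job submission. The method and field
/// names (`submitJob`, `getJobStatus`, `cancelJob`, `job.id`) are
/// illustrative assumptions; consult the `SynorCompute` client and the
/// `Job`/`JobStatusUpdate` types for the exact API.
///
/// ```dart
/// // Submit without awaiting the result, then poll (assumed API).
/// final job = await client.submitJob(
///   client.matmulJob(a, b),              // hypothetical job builder
/// );
///
/// JobStatusUpdate status = await client.getJobStatus(job.id);
/// while (status.status == JobStatus.running) {
///   await Future.delayed(const Duration(seconds: 1));
///   status = await client.getJobStatus(job.id);
/// }
///
/// // Cancel a job that is no longer needed (assumed API).
/// await client.cancelJob(job.id);
/// ```
///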
/// ## Supported Processors
///
/// | Processor | Best For |
/// |-----------|----------|
/// | CPU | General compute, small batches |
/// | GPU | Large matrix operations, training |
/// | TPU | Tensor operations, inference |
/// | NPU | Neural network inference |
/// | LPU | Large language model inference |
/// | FPGA | Custom operations, low latency |
/// | DSP | Signal processing |
/// | WebGPU | Browser-based compute |
/// | WASM | Portable compute |
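///
/// ## Model Registry & Training
///
/// A sketch of the model-registry and training APIs (`listModels`,
/// `fineTune`, `trainStream`). Argument shapes and progress-field names
/// (`datasetId`, `epoch`, `loss`) are illustrative assumptions and may
/// differ from the released API.
///
/// ```dart
/// // List pre-registered models by category.
/// final models = await client.listModels(category: ModelCategory.llm);
/// for (final ModelInfo m in models) {
///   print('${m.name}: ${m.parameters} params, ${m.contextLength} ctx');
/// }
///
/// // Fine-tune a pre-registered model on a custom dataset.
/// final result = await client.fineTune(
///   'llama-3-70b',
///   datasetId: 'my-dataset',             // illustrative parameter name
///   options: TrainingOptions(
///     framework: MlFramework.pytorch,
///     epochs: 3,
///     batchSize: 16,
///     learningRate: 2e-5,
///   ),
/// );
///
/// // Stream real-time training progress.
/// await for (final TrainingProgress p in client.trainStream(result.jobId)) {
///   print('epoch ${p.epoch}: loss ${p.loss}');
/// }
/// ```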
library synor_compute;
export 'src/types.dart'
    show
        Precision,
        ProcessorType,
        Priority,
        JobStatus,
        BalancingStrategy,
        DType,
        SynorConfig,
        MatMulOptions,
        Conv2dOptions,
        AttentionOptions,
        InferenceOptions,
        PricingInfo,
        UsageStats,
        SynorException,
        // Model types
        ModelCategory,
        ModelFormat,
        MlFramework,
        ModelInfo,
        ModelUploadOptions,
        ModelUploadResult,
        // Training types
        TrainingOptions,
        TrainingResult,
        TrainingProgress;
export 'src/tensor.dart' show Tensor;
export 'src/job.dart' show JobResult, JobStatusUpdate, Job, JobBatch;
export 'src/client.dart' show SynorCompute;