Adds comprehensive model management and training capabilities.

synor-compute (Rust):
- ModelRegistry with pre-registered popular models
  - LLMs: Llama 3/3.1, Mistral, Mixtral, Qwen, DeepSeek, Phi, CodeLlama
  - Embedding: BGE, E5
  - Image: Stable Diffusion XL, FLUX.1
  - Speech: Whisper
  - Multi-modal: LLaVA
- ModelInfo with parameters, format, precision, context length
- Custom model upload and registration
- Model search by name/category

Flutter SDK:
- Model registry APIs: listModels, getModel, searchModels
- Custom model upload with multipart upload
- Training APIs: train(), fineTune(), trainStream()
- TrainingOptions: framework, epochs, batch_size, learning_rate
- TrainingProgress for real-time updates
- ModelUploadOptions and ModelUploadResult

Example code for:
- Listing available models by category
- Fine-tuning pre-trained models
- Uploading custom Python/ONNX models
- Streaming training progress

This enables users to:
1. Use pre-registered models like 'llama-3-70b'
2. Upload their own custom models
3. Fine-tune models on custom datasets
4. Track training progress in real-time
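The training workflow summarized above can be sketched as follows. The method names (`fineTune`, `trainStream`) and the `TrainingOptions`/`TrainingProgress` types come from the list above, but the exact signatures, the camelCase option names, the `MlFramework.pytorch` value, and the dataset identifier are assumptions; treat this as an illustrative sketch, not the definitive API.

```dart
import 'dart:io';

import 'package:synor_compute/synor_compute.dart';

Future<void> main() async {
  final client = SynorCompute(apiKey: 'your-api-key');

  // Hyperparameters mirror the TrainingOptions fields listed above
  // (framework, epochs, batch_size, learning_rate); Dart-side names
  // are assumed to be camelCase.
  final opts = TrainingOptions(
    framework: MlFramework.pytorch,
    epochs: 3,
    batchSize: 8,
    learningRate: 2e-5,
  );

  // Stream real-time training progress while fine-tuning a
  // pre-registered model on a hypothetical dataset id.
  await for (final TrainingProgress p
      in client.trainStream('llama-3-70b', 'my-dataset', options: opts)) {
    stdout.write('epoch ${p.epoch}: loss ${p.loss}\r');
  }

  // Alternatively, run the fine-tune to completion in one call.
  final result = await client.fineTune(
    'llama-3-70b',
    'my-dataset',
    options: opts,
  );
  print('Fine-tuned model: ${result.modelId}');

  client.dispose();
}
```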
/// Synor Compute SDK for Flutter/Dart
///
/// A high-performance SDK for distributed heterogeneous computing.
/// Supports CPU, GPU, TPU, NPU, LPU, FPGA, DSP, WebGPU, and WASM processors.
///
/// ## Quick Start
///
/// ```dart
/// import 'dart:io';
///
/// import 'package:synor_compute/synor_compute.dart';
///
/// void main() async {
///   // Create client
///   final client = SynorCompute(apiKey: 'your-api-key');
///
///   // Matrix multiplication
///   final a = Tensor.rand([512, 512]);
///   final b = Tensor.rand([512, 512]);
///   final result = await client.matmul(a, b, options: MatMulOptions(
///     precision: Precision.fp16,
///     processor: ProcessorType.gpu,
///   ));
///
///   print('Result shape: ${result.result!.shape}');
///   print('Execution time: ${result.executionTimeMs}ms');
///
///   // LLM inference
///   final response = await client.inference(
///     'llama-3-70b',
///     'Explain quantum computing',
///     options: InferenceOptions(maxTokens: 256),
///   );
///   print(response.result);
///
///   // Streaming inference
///   await for (final token in client.inferenceStream(
///     'llama-3-70b',
///     'Write a haiku about computing',
///   )) {
///     stdout.write(token);
///   }
///
///   // Clean up
///   client.dispose();
/// }
/// ```
///
/// ## Features
///
/// - **Matrix Operations**: matmul, conv2d, attention, elementwise, reduce
/// - **LLM Inference**: Standard and streaming inference
/// - **Tensor Management**: Upload, download, and delete tensors
/// - **Job Management**: Submit, poll, cancel, and list jobs
/// - **Pricing**: Get real-time pricing for all processor types
/// - **Usage Statistics**: Track compute usage and costs
///
/// ## Supported Processors
///
/// | Processor | Best For                       |
/// |-----------|--------------------------------|
/// | CPU       | General compute, small batches |
/// | GPU       | Large matrix operations, training |
/// | TPU       | Tensor operations, inference   |
/// | NPU       | Neural network inference       |
/// | LPU       | Large language model inference |
/// | FPGA      | Custom operations, low latency |
/// | DSP       | Signal processing              |
/// | WebGPU    | Browser-based compute          |
/// | WASM      | Portable compute               |
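/// ## Model Registry & Training
///
/// The exported model types (`ModelInfo`, `ModelCategory`, `ModelFormat`,
/// `ModelUploadOptions`, `ModelUploadResult`) support browsing the model
/// registry and registering custom models. The sketch below is illustrative
/// only: `listModels` and `searchModels` are the registry API names, but
/// their parameter shapes, the `uploadModel` method name, and the field
/// names on `ModelInfo` are assumptions — see `src/client.dart` for the
/// actual signatures.
///
/// ```dart
/// // List registered image models, then search the registry by name.
/// final models = await client.listModels(category: ModelCategory.image);
/// for (final ModelInfo m in models) {
///   print('${m.name}: ${m.parameters} params, ${m.contextLength} ctx');
/// }
/// final hits = await client.searchModels('llama');
///
/// // Upload and register a custom ONNX model (multipart upload).
/// final upload = await client.uploadModel(
///   File('model.onnx'),
///   options: ModelUploadOptions(format: ModelFormat.onnx),
/// );
/// print('Registered as ${upload.modelId}');
/// ```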
library synor_compute;
export 'src/types.dart'
    show
        Precision,
        ProcessorType,
        Priority,
        JobStatus,
        BalancingStrategy,
        DType,
        SynorConfig,
        MatMulOptions,
        Conv2dOptions,
        AttentionOptions,
        InferenceOptions,
        PricingInfo,
        UsageStats,
        SynorException,
        // Model types
        ModelCategory,
        ModelFormat,
        MlFramework,
        ModelInfo,
        ModelUploadOptions,
        ModelUploadResult,
        // Training types
        TrainingOptions,
        TrainingResult,
        TrainingProgress;

export 'src/tensor.dart' show Tensor;

export 'src/job.dart' show JobResult, JobStatusUpdate, Job, JobBatch;

export 'src/client.dart' show SynorCompute;