# Synor Compute SDKs
Access distributed heterogeneous compute resources (CPU, GPU, TPU, NPU, LPU, FPGA, DSP, WebGPU, WASM) at up to 90% lower cost than traditional cloud.
## Available SDKs
| Language | Package | Status |
|----------|---------|--------|
| [JavaScript/TypeScript](./js) | `synor-compute` | Production |
| [Python](./python) | `synor-compute` | Production |
| [Go](./go) | `github.com/synor/compute-sdk-go` | Production |
| [Flutter/Dart](./flutter) | `synor_compute` | Production |
| [Java](./java) | `io.synor:compute-sdk` | Production |
| [Kotlin](./kotlin) | `io.synor:compute-sdk-kotlin` | Production |
| [Swift](./swift) | `SynorCompute` | Production |
| [Rust](./rust) | `synor-compute` | Production |
| [C](./c) | `libsynor-compute` | Production |
| [C++](./cpp) | `synor-compute` | Production |
| [C#/.NET](./csharp) | `SynorCompute` | Production |
| [Ruby](./ruby) | `synor_compute` | Production |
## Features
- **Matrix Operations**: MatMul, Conv2D, Pooling, BatchNorm
- **AI/ML**: Flash Attention, FFT, Inference (LLMs, Vision, Embeddings)
- **Multi-Precision**: FP64, FP32, FP16, BF16, INT8, INT4
- **Automatic Routing**: Cost, Speed, Energy, or Balanced optimization
- **Streaming**: SSE-based streaming for LLM inference
- **Job Management**: Async job submission with status polling
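The submit-then-poll flow behind job management can be sketched as follows. This is an illustration only: the method names `submit_job` and `job_status` are hypothetical, and a stub stands in for a real Synor client so the pattern is runnable on its own.

```python
import time

class StubClient:
    """Stands in for a Synor client; completes a job after two status polls."""
    def __init__(self):
        self._polls = 0

    def submit_job(self, op, payload):
        # A real client would POST the job and return a server-issued ID.
        return "job-123"

    def job_status(self, job_id):
        self._polls += 1
        return "completed" if self._polls >= 2 else "running"

def wait_for_job(client, job_id, interval=0.01, timeout=5.0):
    """Poll until the job leaves the 'running' state or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = client.job_status(job_id)
        if status != "running":
            return status
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} still running after {timeout}s")

client = StubClient()
job_id = client.submit_job("matmul", {"shape": (512, 512)})
print(wait_for_job(client, job_id))  # completed
```

In production code the polling interval would typically grow (exponential backoff) rather than stay fixed, to avoid hammering the API on long-running jobs.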
## Quick Start
### JavaScript/TypeScript

```typescript
import { SynorCompute } from 'synor-compute';

const client = new SynorCompute('your-api-key');

// Matrix multiplication
const result = await client.matmul(a, b, {
  precision: 'fp16',
  processor: 'gpu'
});

// LLM inference with streaming
for await (const chunk of client.inferenceStream('llama-3-70b', prompt)) {
  process.stdout.write(chunk);
}
```
### Python

```python
from synor_compute import SynorCompute, Tensor

client = SynorCompute('your-api-key')

# Matrix multiplication
a = Tensor.random((512, 512))
b = Tensor.random((512, 512))
result = await client.matmul(a, b, precision='fp16', processor='gpu')

# LLM inference with streaming
async for chunk in client.inference_stream('llama-3-70b', prompt):
    print(chunk, end='')
```
### Go

```go
import "github.com/synor/compute-sdk-go"

client := synor.NewClient("your-api-key")

// Matrix multiplication
result, err := client.MatMul(ctx, a, b, synor.WithPrecision(synor.FP16))

// LLM inference
response, err := client.Inference(ctx, "llama-3-70b", prompt)
```
### Rust

```rust
use synor_compute::{SynorCompute, Tensor, Precision, ProcessorType};

let client = SynorCompute::new("your-api-key");

// Matrix multiplication
let result = client.matmul(&a, &b)
    .precision(Precision::FP16)
    .processor(ProcessorType::GPU)
    .send()
    .await?;

// LLM inference with streaming
let mut stream = client.inference_stream("llama-3-70b", prompt).await?;
while let Some(token) = stream.next().await {
    print!("{}", token?);
}
```
## API Endpoints
All SDKs connect to the Synor Compute API:

- **Production**: `https://api.synor.io/compute/v1`
- **Local (Docker)**: `http://localhost:17250`
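A common way to switch between the two endpoints is an environment-variable override with a production fallback. The exact mechanism varies per SDK; in this sketch the `SYNOR_API_URL` variable name and the `resolve_base_url` helper are illustrative assumptions, not documented SDK behavior.

```python
import os

PRODUCTION_URL = "https://api.synor.io/compute/v1"
LOCAL_URL = "http://localhost:17250"

def resolve_base_url(env=None):
    """Prefer an environment override; fall back to the production endpoint."""
    env = os.environ if env is None else env
    return env.get("SYNOR_API_URL", PRODUCTION_URL)

print(resolve_base_url({}))                            # https://api.synor.io/compute/v1
print(resolve_base_url({"SYNOR_API_URL": LOCAL_URL}))  # http://localhost:17250
```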
## Processor Types
| Type | Description |
|------|-------------|
| `cpu` | General-purpose CPU computation |
| `gpu` | NVIDIA/AMD GPU acceleration |
| `tpu` | Google TPU for ML workloads |
| `npu` | Neural Processing Units |
| `lpu` | Language Processing Units (Groq) |
| `fpga` | Field-Programmable Gate Arrays |
| `dsp` | Digital Signal Processors |
| `webgpu` | Browser-based GPU |
| `wasm` | WebAssembly runtime |
| `auto` | Automatic selection (default) |
## Precision Levels
| Level | Bits | Use Case |
|-------|------|----------|
| `fp64` | 64 | Scientific computing |
| `fp32` | 32 | General purpose (default) |
| `fp16` | 16 | AI/ML training |
| `bf16` | 16 | Large language models |
| `int8` | 8 | Quantized inference |
| `int4` | 4 | Extreme quantization |
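The bit widths in the table translate directly into memory footprint, which is often what drives the choice of precision. The following sketch computes the size of a 512x512 matrix at each level; it is plain arithmetic with no SDK dependency.

```python
# Bits per element, taken from the precision table above.
BITS = {"fp64": 64, "fp32": 32, "fp16": 16, "bf16": 16, "int8": 8, "int4": 4}

def tensor_bytes(shape, precision):
    """Total bytes for a tensor of the given shape at the given precision."""
    elements = 1
    for dim in shape:
        elements *= dim
    return elements * BITS[precision] // 8

for name in BITS:
    kib = tensor_bytes((512, 512), name) / 1024
    print(f"{name}: {kib:.0f} KiB")
# fp32 needs 1024 KiB; fp16/bf16 halve that; int4 is an 8x saving over fp32.
```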
## Balancing Strategies
| Strategy | Priority |
|----------|----------|
| `speed` | Minimize latency |
| `cost` | Minimize cost |
| `energy` | Minimize carbon footprint |
| `latency` | Real-time requirements |
| `balanced` | Optimal tradeoff (default) |
## Local Development with Docker
Deploy the compute infrastructure locally:

```bash
cd /path/to/Blockchain.cc
docker-compose -f docker-compose.compute.yml up -d
```
Services available:

- **Compute API**: `http://localhost:17250`
- **CPU Workers**: `http://localhost:17260-17261`
- **WASM Worker**: `http://localhost:17262`
- **Spot Market**: `http://localhost:17270`
- **Redis**: `localhost:17280`
- **Prometheus**: `http://localhost:17290`
## License
MIT License - see individual SDK packages for details.