viva_glyph/rvq
RVQ - Residual Vector Quantization
Multi-stage quantization where each stage encodes the residual (error) from the previous stage. This allows progressive refinement.
Theory
Based on EnCodec (Défossez et al., 2022):
Stage 1: q1 = argmin_k ||x - codebook1[k]||, r1 = x - codebook1[q1]
Stage 2: q2 = argmin_k ||r1 - codebook2[k]||, r2 = r1 - codebook2[q2]
...
Reconstruction: x' = codebook1[q1] + codebook2[q2] + ... + codebookN[qN]
More stages give a better reconstruction at the cost of more tokens: each stage adds one index per encoded vector.
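The stage equations above can be sketched in a few lines of Gleam. This is an illustrative, self-contained sketch and not the module's implementation: the `Codebook` alias, `sketch_encode`, `sketch_decode`, and the tiny two-stage codebooks are all hypothetical, and squared Euclidean distance is assumed as the nearest-codeword metric.

```gleam
import gleam/float
import gleam/io
import gleam/list
import gleam/string

/// Hypothetical stand-in type: a codebook is a plain list of codewords.
pub type Codebook =
  List(List(Float))

/// Squared Euclidean distance between two vectors of equal length.
fn squared_distance(a: List(Float), b: List(Float)) -> Float {
  list.map2(a, b, fn(x, y) { { x -. y } *. { x -. y } })
  |> float.sum
}

/// Index, codeword, and distance of the entry nearest to `target`.
/// An empty codebook yields index 0 and an empty codeword.
fn nearest(codebook: Codebook, target: List(Float)) -> #(Int, List(Float), Float) {
  list.index_fold(codebook, #(0, [], -1.0), fn(best, codeword, index) {
    let distance = squared_distance(target, codeword)
    // A negative stored distance marks "nothing chosen yet".
    case best.2 <. 0.0 || distance <. best.2 {
      True -> #(index, codeword, distance)
      False -> best
    }
  })
}

/// Quantize stage by stage; each stage sees the residual left by the previous one.
/// Returns the indices (one per stage) and the final residual.
pub fn sketch_encode(
  codebooks: List(Codebook),
  input: List(Float),
) -> #(List(Int), List(Float)) {
  let #(indices, residual) =
    list.fold(codebooks, #([], input), fn(state, codebook) {
      let #(indices, residual) = state
      let #(index, codeword, _) = nearest(codebook, residual)
      let next = list.map2(residual, codeword, fn(r, c) { r -. c })
      #([index, ..indices], next)
    })
  #(list.reverse(indices), residual)
}

/// Sum the selected codeword from every stage.
pub fn sketch_decode(
  codebooks: List(Codebook),
  indices: List(Int),
  dimension: Int,
) -> List(Float) {
  list.zip(codebooks, indices)
  |> list.fold(list.repeat(0.0, dimension), fn(sum, pair) {
    let #(codebook, index) = pair
    case list.drop(codebook, index) {
      [codeword, ..] -> list.map2(sum, codeword, fn(s, c) { s +. c })
      [] -> sum
    }
  })
}

pub fn main() {
  let codebooks = [
    [[0.0, 0.0], [1.0, 1.0]],
    [[0.0, 0.0], [0.25, -0.25]],
  ]
  let #(indices, residual) = sketch_encode(codebooks, [1.2, 0.8])
  // indices == [1, 1]; decoding them gives [1.25, 0.75]
  io.println(string.inspect(indices))
  io.println(string.inspect(sketch_decode(codebooks, indices, 2)))
  // The final residual is what a further stage would quantize.
  io.println(string.inspect(residual))
}
```

In this toy input the first stage alone would reconstruct `[1.0, 1.0]`; the second stage moves the result to `[1.25, 0.75]`, closer to the input `[1.2, 0.8]`, which is the progressive refinement described above.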
References
- Défossez et al. (2022). High Fidelity Neural Audio Compression. arXiv:2210.13438.
- Zeghidour et al. (2021). SoundStream: An End-to-End Neural Audio Codec. IEEE/ACM TASLP.
Types
RVQ configuration
pub type RvqConfig {
  RvqConfig(num_stages: Int, dimension: Int, codebook_size: Int)
}
Constructors
- RvqConfig(num_stages: Int, dimension: Int, codebook_size: Int)
Arguments
- num_stages: Number of quantization stages
- dimension: Dimension of vectors
- codebook_size: Size of each codebook (vocabulary)
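As a brief illustration, a configuration can be built with the labelled constructor fields listed above. The values and the `small_config` and `tokens_per_vector` helpers are hypothetical, and the import path is assumed from the module name `viva_glyph/rvq`.

```gleam
import viva_glyph/rvq

/// Hypothetical small configuration, chosen only for illustration.
pub fn small_config() -> rvq.RvqConfig {
  rvq.RvqConfig(num_stages: 2, dimension: 6, codebook_size: 16)
}

/// Fields are read back with the usual record accessor syntax.
/// One token is produced per stage, so this equals num_stages.
pub fn tokens_per_vector(config: rvq.RvqConfig) -> Int {
  config.num_stages
}
```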
RVQ encoder: list of codebooks (one per stage)
pub type RvqEncoder {
  RvqEncoder(
    codebooks: List(codebook.Codebook),
    config: RvqConfig,
  )
}
Constructors
- RvqEncoder(codebooks: List(codebook.Codebook), config: RvqConfig)
Quantized result: list of indices (one per stage)
pub type RvqTokens {
  RvqTokens(indices: List(Int), total_error: Float)
}
Constructors
- RvqTokens(indices: List(Int), total_error: Float)
Arguments
- indices: Token indices, one per stage
- total_error: Total quantization error after all stages
Values
pub fn decode(
  encoder: RvqEncoder,
  tokens: RvqTokens,
) -> List(Float)
Decode tokens back to a vector by summing the selected codeword from each stage
pub fn default_config() -> RvqConfig
Default configuration (optimized per DeepSeek R1 validation)
- 6D latent: Johnson-Lindenstrauss optimal for 3D PAD
- 4 stages: matches EnCodec error distribution
- 256 codebook entries: adequate for emotional granularity
pub fn encode(
  encoder: RvqEncoder,
  input: List(Float),
) -> RvqTokens
Encode a vector to tokens using residual quantization
pub fn from_codebooks(
  codebooks: List(codebook.Codebook),
) -> RvqEncoder
Create an RVQ encoder from existing codebooks
pub fn new(config: RvqConfig) -> RvqEncoder
Create an RVQ encoder with deterministic initialization
pub fn reconstruct(
  encoder: RvqEncoder,
  input: List(Float),
) -> List(Float)
Round-trip: encode, then decode
pub fn reconstruction_error(
  encoder: RvqEncoder,
  input: List(Float),
) -> Float
Compute the reconstruction error between the input and its round-trip reconstruction
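A possible round trip through the functions documented above might look like the following. This is a usage sketch under assumptions: arguments are passed positionally in the order the signatures list them, the import path follows the module name, and the 6-element input is a hypothetical vector sized to match the 6D latent of the default configuration.

```gleam
import gleam/float
import gleam/int
import gleam/io
import gleam/list
import viva_glyph/rvq

pub fn main() {
  // Assumption: positional arguments, in the documented order.
  let encoder = rvq.new(rvq.default_config())

  // Hypothetical input with 6 components.
  let input = [0.1, -0.4, 0.7, 0.0, 0.3, -0.2]

  let tokens = rvq.encode(encoder, input)
  let decoded = rvq.decode(encoder, tokens)
  let round_trip = rvq.reconstruct(encoder, input)

  // One index per stage is expected in tokens.indices.
  io.println("tokens: " <> int.to_string(list.length(tokens.indices)))
  io.println("decoded length: " <> int.to_string(list.length(decoded)))
  io.println("round-trip length: " <> int.to_string(list.length(round_trip)))
  io.println("total_error: " <> float.to_string(tokens.total_error))
  io.println(
    "reconstruction_error: "
    <> float.to_string(rvq.reconstruction_error(encoder, input)),
  )
}
```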
pub fn total_vocabulary(encoder: RvqEncoder) -> Int
Total vocabulary: codebook_size ^ num_stages.
This is the number of unique glyphs that can be represented.
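To make the arithmetic concrete, the sketch below computes codebook_size ^ num_stages with integer multiplication. The `vocabulary` helper is hypothetical and independent of the module's own implementation.

```gleam
import gleam/int
import gleam/io
import gleam/list

/// codebook_size multiplied by itself num_stages times.
/// Zero stages give 1, the empty product.
pub fn vocabulary(codebook_size: Int, num_stages: Int) -> Int {
  list.repeat(codebook_size, num_stages)
  |> int.product
}

pub fn main() {
  // 4 stages of 256 entries: 256 * 256 * 256 * 256 = 4294967296
  io.println(int.to_string(vocabulary(256, 4)))
}
```

With the documented defaults of 4 stages and 256 entries per codebook, each index fits in 8 bits, so one glyph corresponds to 32 bits in total.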