viva_glyph/rvq

RVQ - Residual Vector Quantization

Multi-stage quantization where each stage encodes the residual (error) from the previous stage. This allows progressive refinement.

Theory

Based on EnCodec (Défossez et al., 2022):

Stage 1: q1 = argmin_k ||x - codebook1[k]||, r1 = x - codebook1[q1]
Stage 2: q2 = argmin_k ||r1 - codebook2[k]||, r2 = r1 - codebook2[q2]
...
Reconstruction: x' = Σ_i codebook_i[q_i]

More stages generally give a better reconstruction, at the cost of one additional token per stage.
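The stage-by-stage procedure above can be sketched as follows. This is a minimal illustration in Python, not the library's Gleam implementation: the function names, the toy 2-D codebooks, and the use of Euclidean distance for both the nearest-codeword search and the final error are all assumptions made for the example.

```python
import math

def nearest(codebook, v):
    """Return the index of the codeword closest to v (Euclidean distance)."""
    return min(range(len(codebook)), key=lambda k: math.dist(codebook[k], v))

def rvq_encode(codebooks, x):
    """Quantize x stage by stage; each stage encodes the previous residual."""
    residual = list(x)
    indices = []
    for codebook in codebooks:
        q = nearest(codebook, residual)
        indices.append(q)
        residual = [r - c for r, c in zip(residual, codebook[q])]
    # The norm of the last residual is the error left after all stages.
    return indices, math.hypot(*residual)

def rvq_decode(codebooks, indices):
    """Reconstruct x' as the sum of the selected codeword from each stage."""
    out = [0.0] * len(codebooks[0][0])
    for codebook, q in zip(codebooks, indices):
        out = [o + c for o, c in zip(out, codebook[q])]
    return out

# Hypothetical toy codebooks: a coarse stage and a finer one.
coarse = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fine = [[0.0, 0.0], [0.25, 0.0], [0.0, 0.25], [0.25, 0.25]]

indices, error = rvq_encode([coarse, fine], [1.2, 0.3])
approx = rvq_decode([coarse, fine], indices)
# indices == [1, 3], approx == [1.25, 0.25], error is about 0.07
```

The first stage picks the coarse codeword [1.0, 0.0], leaving a residual near [0.2, 0.3]; the second stage then picks [0.25, 0.25] to absorb most of that residual.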

Types

RVQ configuration

pub type RvqConfig {
  RvqConfig(num_stages: Int, dimension: Int, codebook_size: Int)
}

Constructors

  • RvqConfig(num_stages: Int, dimension: Int, codebook_size: Int)

    Arguments

    num_stages

    Number of quantization stages

    dimension

    Dimension of vectors

    codebook_size

    Size of each codebook (vocabulary)

RVQ encoder: list of codebooks (one per stage)

pub type RvqEncoder {
  RvqEncoder(
    codebooks: List(codebook.Codebook),
    config: RvqConfig,
  )
}

Constructors

  • RvqEncoder(codebooks: List(codebook.Codebook), config: RvqConfig)

Quantized result: list of indices (one per stage)

pub type RvqTokens {
  RvqTokens(indices: List(Int), total_error: Float)
}

Constructors

  • RvqTokens(indices: List(Int), total_error: Float)

    Arguments

    indices

    Token indices, one per stage

    total_error

    Total quantization error after all stages

Values

pub fn decode(
  encoder: RvqEncoder,
  tokens: RvqTokens,
) -> List(Float)

Decode tokens back to a vector by summing the selected codeword from each stage

pub fn default_config() -> RvqConfig

Default configuration (optimized per DeepSeek R1 validation):

  • dimension = 6 (6D latent): Johnson-Lindenstrauss optimal for 3D PAD
  • num_stages = 4: matches EnCodec error distribution
  • codebook_size = 256: adequate for emotional granularity

pub fn encode(
  encoder: RvqEncoder,
  input: List(Float),
) -> RvqTokens

Encode vector to tokens using residual quantization

pub fn from_codebooks(
  codebooks: List(codebook.Codebook),
) -> RvqEncoder

Create RVQ encoder from existing codebooks

pub fn new(config: RvqConfig) -> RvqEncoder

Create RVQ encoder with deterministic initialization

pub fn num_stages(encoder: RvqEncoder) -> Int

Get number of stages

pub fn reconstruct(
  encoder: RvqEncoder,
  input: List(Float),
) -> List(Float)

Round-trip: encode then decode

pub fn reconstruction_error(
  encoder: RvqEncoder,
  input: List(Float),
) -> Float

Compute reconstruction error
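To make the round trip and its error concrete, here is a small Python sketch (not the library's Gleam code). The error metric is an assumption: the documentation does not state which norm is used, so Euclidean distance between the input and its reconstruction is chosen for illustration. The helper names and toy codebooks are hypothetical.

```python
import math

def round_trip(codebooks, x):
    """Encode then decode: sum the nearest codeword to each successive residual."""
    residual = list(x)
    approx = [0.0] * len(x)
    for codebook in codebooks:
        best = min(codebook, key=lambda c: math.dist(c, residual))
        approx = [a + c for a, c in zip(approx, best)]
        residual = [r - c for r, c in zip(residual, best)]
    return approx

def reconstruction_error(codebooks, x):
    """Assumed metric: Euclidean distance between x and its round trip."""
    return math.dist(x, round_trip(codebooks, x))

# Hypothetical toy codebooks: a coarse stage and a finer one.
coarse = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
fine = [[0.0, 0.0], [0.25, 0.0], [0.0, 0.25], [0.25, 0.25]]

x = [1.2, 0.3]
one_stage = reconstruction_error([coarse], x)        # about 0.36
two_stage = reconstruction_error([coarse, fine], x)  # about 0.07
```

With these codebooks the second stage cuts the error roughly fivefold, which is the progressive refinement that residual quantization is meant to provide.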

pub fn total_vocabulary(encoder: RvqEncoder) -> Int

Total vocabulary: codebook_size ^ num_stages. This is the number of unique glyphs that can be represented.
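As a worked example of this formula, the short Python sketch below plugs in the default configuration's values (256 entries per codebook, 4 stages). The function here is a stand-alone illustration of the arithmetic, not the library's Gleam function, and the bit count is an added observation rather than something the module reports.

```python
import math

def total_vocabulary(codebook_size: int, num_stages: int) -> int:
    """Number of distinct index sequences: one index per stage."""
    return codebook_size ** num_stages

total = total_vocabulary(256, 4)   # 4294967296
bits = 4 * math.log2(256)          # 32.0 bits per encoded vector
```

Each stage contributes log2(256) = 8 bits, so a four-stage glyph fits in 32 bits.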

pub fn vocab_size(encoder: RvqEncoder) -> Int

Get codebook size (vocabulary per stage)