Interface EncodingRegistry


public interface EncodingRegistry
The EncodingRegistry registers custom encodings and retrieves encodings by name, by type, or by model. The built-in encodings are registered automatically.
  • Method Details

    • getEncoding

      Optional<Encoding> getEncoding(String encodingName)
      Returns the encoding with the given name, if it exists. Otherwise, returns an empty Optional. Prefer using getEncoding(EncodingType) or getEncodingForModel(ModelType) for built-in encodings.
      Parameters:
      encodingName - the name of the encoding
      Returns:
      the encoding, if it exists
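
      Because name-based lookups return an Optional, callers usually decide up front what happens when a name is unknown. The following is a minimal sketch of that caller-side pattern. DemoEncoding, DemoLookup, and resolveName are hypothetical stand-ins written for this illustration; they mirror only the documented "empty Optional if missing" contract and are not the library's implementation.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for the library's Encoding type; only the name matters here.
final class DemoEncoding {
    private final String name;

    DemoEncoding(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

// Hypothetical stand-in registry that mirrors the documented lookup contract.
class DemoLookup {
    private final Map<String, DemoEncoding> byName = new HashMap<>();

    DemoLookup add(DemoEncoding encoding) {
        byName.put(encoding.getName(), encoding);
        return this;
    }

    // Returns an empty Optional when no encoding has the given name.
    Optional<DemoEncoding> getEncoding(String encodingName) {
        return Optional.ofNullable(byName.get(encodingName));
    }

    // Tries the requested name first, then the fallback, and fails loudly if both are missing.
    static String resolveName(DemoLookup registry, String requested, String fallback) {
        return registry.getEncoding(requested)
                .or(() -> registry.getEncoding(fallback))
                .map(DemoEncoding::getName)
                .orElseThrow(() -> new IllegalStateException(
                        "No encoding named " + requested + " or " + fallback));
    }

    public static void main(String[] args) {
        DemoLookup registry = new DemoLookup().add(new DemoEncoding("cl100k_base"));
        // "my_custom_encoding" was never added, so the fallback is used.
        System.out.println(resolveName(registry, "my_custom_encoding", "cl100k_base")); // prints "cl100k_base"
    }
}
```

      Chaining or(...) and orElseThrow(...) keeps the missing-name case explicit and avoids an unguarded Optional.get().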
    • getEncoding

      Encoding getEncoding(EncodingType encodingType)
      Returns the encoding with the given type.
      Parameters:
      encodingType - the type of the encoding
      Returns:
      the encoding
    • getEncodingForModel

      Optional<Encoding> getEncodingForModel(String modelName)
      Returns the encoding that is used for the model with the given name, if it exists. Otherwise, returns an empty Optional. Prefer using getEncodingForModel(ModelType) for built-in encodings.

      Note that you can use this method to retrieve the correct encodings for snapshots of models, for example "gpt-4-0314" or "gpt-3.5-turbo-0301".

      Parameters:
      modelName - the name of the model to get the encoding for
      Returns:
      the encoding, if it exists
    • getEncodingForModel

      Encoding getEncodingForModel(ModelType modelType)
      Returns the encoding that is used for the given model type.
      Parameters:
      modelType - the model type
      Returns:
      the encoding
    • registerGptBytePairEncoding

      EncodingRegistry registerGptBytePairEncoding(GptBytePairEncodingParams parameters)
      Registers a new byte pair encoding under the name specified in the given parameters. The encoding must be thread-safe.
      Parameters:
      parameters - the parameters for the encoding
      Returns:
      the registry for method chaining
      Throws:
      IllegalArgumentException - if the encoding name is already registered
    • registerCustomEncoding

      EncodingRegistry registerCustomEncoding(Encoding encoding)
      Registers a new custom encoding under the encoding's own name. The encoding must be thread-safe.
      Parameters:
      encoding - the encoding
      Returns:
      the registry for method chaining
      Throws:
      IllegalArgumentException - if the encoding name is already registered
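
      Both register methods return the registry for chaining and reject a name that is already taken. The sketch below illustrates that contract with a hypothetical stand-in class (DemoRegistration, with made-up register, contains, and rejectsDuplicate methods). It is an assumption about one reasonable way to get this behavior in a thread-safe manner, not the library's actual code.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in that models the documented registration contract.
class DemoRegistration {
    // ConcurrentHashMap makes the check-and-insert below atomic across threads.
    private final Map<String, Object> encodings = new ConcurrentHashMap<>();

    // Registers the encoding under the given name and returns this for chaining.
    DemoRegistration register(String name, Object encoding) {
        if (encodings.putIfAbsent(name, encoding) != null) {
            throw new IllegalArgumentException("Encoding " + name + " is already registered");
        }
        return this;
    }

    boolean contains(String name) {
        return encodings.containsKey(name);
    }

    // Returns true if registering the name is rejected because it is already taken.
    static boolean rejectsDuplicate(DemoRegistration registry, String name) {
        try {
            registry.register(name, new Object());
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        DemoRegistration registry = new DemoRegistration()
                .register("custom_a", new Object())
                .register("custom_b", new Object());
        System.out.println(rejectsDuplicate(registry, "custom_a")); // prints "true"
    }
}
```

      Callers that cannot rule out a name clash may want to look the name up first, or catch IllegalArgumentException as shown in rejectsDuplicate.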