Package com.knuddels.jtokkit.api
Interface EncodingRegistry
public interface EncodingRegistry
The EncodingRegistry is used to register custom encodings and to retrieve
encodings by name or type. The out-of-the-box supported encodings are registered automatically.
Method Summary
Modifier and Type | Method | Description
Encoding | getEncoding(EncodingType encodingType) | Returns the encoding with the given type.
Optional<Encoding> | getEncoding(String encodingName) | Returns the encoding with the given name, if it exists.
Encoding | getEncodingForModel(ModelType modelType) | Returns the encoding that is used for the given model type.
Optional<Encoding> | getEncodingForModel(String modelName) | Returns the encoding that is used for the given model name, if it exists.
EncodingRegistry | registerCustomEncoding(Encoding encoding) | Registers a new custom encoding with the given name.
EncodingRegistry | registerGptBytePairEncoding(GptBytePairEncodingParams parameters) | Registers a new byte pair encoding with the given name.
Method Details
getEncoding
Returns the encoding with the given name, if it exists. Otherwise, returns an empty Optional. Prefer using getEncoding(EncodingType) or getEncodingForModel(ModelType) for built-in encodings.
Parameters:
encodingName - the name of the encoding
Returns:
the encoding, if it exists
getEncoding
Returns the encoding with the given type.
Parameters:
encodingType - the type of the encoding
Returns:
the encoding
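The difference between the two getEncoding overloads can be sketched as follows. This is a minimal illustration, not JTokkit's implementation: SketchEncodingType and LookupSketch are hypothetical stand-ins, and an encoding is represented by a plain string label instead of a real tokenizer.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for an encoding type enum; NOT the JTokkit type.
enum SketchEncodingType {
    CL100K_BASE("cl100k_base"),
    P50K_BASE("p50k_base");

    final String encodingName;

    SketchEncodingType(String encodingName) {
        this.encodingName = encodingName;
    }
}

// Hypothetical registry sketch; the map value is only a label.
final class LookupSketch {
    private final Map<String, String> byName = new ConcurrentHashMap<>();

    LookupSketch() {
        // Built-in encodings are registered up front.
        for (SketchEncodingType type : SketchEncodingType.values()) {
            byName.put(type.encodingName, "encoding:" + type.encodingName);
        }
    }

    // Lookup by name may miss, so the result is wrapped in an Optional.
    Optional<String> getEncoding(String encodingName) {
        return Optional.ofNullable(byName.get(encodingName));
    }

    // Lookup by type always hits for built-ins, so no Optional is needed.
    String getEncoding(SketchEncodingType type) {
        return byName.get(type.encodingName);
    }
}
```

Under these assumptions, a caller of the name-based overload has to handle the empty case explicitly (for example with orElseThrow), while the type-based overload can be used directly.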
getEncodingForModel
Returns the encoding that is used for the given model name, if it exists. Otherwise, returns an empty Optional. Prefer using getEncodingForModel(ModelType) for built-in encodings.
Note that you can use this method to retrieve the correct encodings for snapshots of models, for example "gpt-4-0314" or "gpt-3.5-turbo-0301".
Parameters:
modelName - the name of the model to get the encoding for
Returns:
the encoding, if it exists
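One way snapshot names such as "gpt-4-0314" could resolve to the encoding of their base model is prefix matching. The sketch below assumes that strategy purely for illustration; the page does not say how the lookup is implemented, and ModelNameSketch and its table are hypothetical.

```java
import java.util.Optional;

// Hypothetical sketch of resolving a model name (including dated snapshots)
// to an encoding name. This is an assumed strategy, not the library's code.
final class ModelNameSketch {
    // Base model name -> encoding name.
    private static final String[][] BASE_MODELS = {
        {"gpt-3.5-turbo", "cl100k_base"},
        {"gpt-4", "cl100k_base"},
    };

    static Optional<String> encodingNameForModel(String modelName) {
        for (String[] entry : BASE_MODELS) {
            String base = entry[0];
            // Match the base model itself or a "-suffix" snapshot of it.
            if (modelName.equals(base) || modelName.startsWith(base + "-")) {
                return Optional.of(entry[1]);
            }
        }
        // Unknown model names yield an empty Optional instead of an exception.
        return Optional.empty();
    }
}
```

Returning an Optional keeps unknown or misspelled model names from failing at lookup time; the caller decides whether a miss is an error.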
getEncodingForModel
Returns the encoding that is used for the given model type.
Parameters:
modelType - the model type
Returns:
the encoding
registerGptBytePairEncoding
Registers a new byte pair encoding with the given name. The encoding must be thread-safe.
Parameters:
parameters - the parameters for the encoding
Returns:
the registry for method chaining
Throws:
IllegalArgumentException - if the encoding name is already registered
registerCustomEncoding
Registers a new custom encoding with the given name. The encoding must be thread-safe.
Parameters:
encoding - the encoding
Returns:
the registry for method chaining
Throws:
IllegalArgumentException - if the encoding name is already registered
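The registration contract described for both register methods (chaining, rejection of duplicate names, safe use from several threads) might look roughly like the sketch below. NamedEncoding and RegistrationSketch are hypothetical stand-ins, and the use of ConcurrentHashMap.putIfAbsent is an assumption chosen for illustration, not a statement about the library's internals.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for an encoding; only the name matters for registration.
interface NamedEncoding {
    String getName();
}

// Hypothetical registry sketch showing duplicate rejection and method chaining.
final class RegistrationSketch {
    private final Map<String, NamedEncoding> encodings = new ConcurrentHashMap<>();

    // Returns this so calls can be chained; rejects names that are already taken.
    RegistrationSketch registerCustomEncoding(NamedEncoding encoding) {
        // putIfAbsent is atomic, so two threads cannot both claim the same name.
        NamedEncoding previous = encodings.putIfAbsent(encoding.getName(), encoding);
        if (previous != null) {
            throw new IllegalArgumentException(
                "Encoding " + encoding.getName() + " already registered");
        }
        return this;
    }

    int size() {
        return encodings.size();
    }
}
```

Because the registry hands the same encoding instance to every caller, the requirement that a registered encoding be thread-safe falls on the encoding itself; the map only protects the registration step.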