Index
All Classes and Interfaces|All Packages|Constant Field Values
A
- ADA - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
- add(int) - Method in class com.knuddels.jtokkit.api.IntArrayList
-
Appends the specified element to the end of this list.
B
- BABBAGE - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
- boxed() - Method in class com.knuddels.jtokkit.api.IntArrayList
-
Returns a List<Integer> containing all the elements in this list in proper sequence (from first to last element).
C
- CL100K_BASE - Enum constant in enum class com.knuddels.jtokkit.api.EncodingType
- clear() - Method in class com.knuddels.jtokkit.api.IntArrayList
-
Removes all the elements from this list.
- CODE_CUSHMAN_001 - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
- CODE_CUSHMAN_002 - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
- CODE_DAVINCI_001 - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
- CODE_DAVINCI_002 - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
- CODE_DAVINCI_EDIT_001 - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
- CODE_SEARCH_ADA_CODE_001 - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
- CODE_SEARCH_BABBAGE_CODE_001 - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
- com.knuddels.jtokkit - package com.knuddels.jtokkit
- com.knuddels.jtokkit.api - package com.knuddels.jtokkit.api
- countTokens(String) - Method in interface com.knuddels.jtokkit.api.Encoding
-
Encodes the given text into a list of token ids and returns the number of tokens.
- countTokensOrdinary(String) - Method in interface com.knuddels.jtokkit.api.Encoding
-
Encodes the given text into a list of token ids and returns the number of tokens.
- CURIE - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
- CUSHMAN_CODEX - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
D
- DAVINCI - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
- DAVINCI_CODEX - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
- decode(IntArrayList) - Method in interface com.knuddels.jtokkit.api.Encoding
-
Decodes the given list of token ids into text.
- decodeBytes(IntArrayList) - Method in interface com.knuddels.jtokkit.api.Encoding
-
Decodes the given list of token ids into a byte array.
E
- encode(String) - Method in interface com.knuddels.jtokkit.api.Encoding
-
Encodes the given text into a list of token ids.
- encode(String, int) - Method in interface com.knuddels.jtokkit.api.Encoding
-
Encodes the given text into a list of token ids.
- encodeOrdinary(String) - Method in interface com.knuddels.jtokkit.api.Encoding
-
Encodes the given text into a list of token ids, ignoring special tokens.
- encodeOrdinary(String, int) - Method in interface com.knuddels.jtokkit.api.Encoding
-
Encodes the given text into a list of token ids, ignoring special tokens.
- Encoding - Interface in com.knuddels.jtokkit.api
- EncodingRegistry - Interface in com.knuddels.jtokkit.api
-
The EncodingRegistry is used to register custom encodings and to retrieve encodings by name or type.
- EncodingResult - Class in com.knuddels.jtokkit.api
-
The result of an encoding operation.
- EncodingResult(IntArrayList, boolean) - Constructor for class com.knuddels.jtokkit.api.EncodingResult
- Encodings - Class in com.knuddels.jtokkit
- EncodingType - Enum Class in com.knuddels.jtokkit.api
- ensureCapacity(int) - Method in class com.knuddels.jtokkit.api.IntArrayList
-
Increases the capacity of this IntArrayList instance, if necessary, to ensure that it can hold at least the number of elements specified by the minimum capacity argument.
- equals(Object) - Method in class com.knuddels.jtokkit.api.IntArrayList
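The EncodingResult entries in this index pair a list of token ids with a truncation flag. The following is a minimal sketch of that idea, assuming the int argument of encode(String, int) is a maximum token count. TruncationSketch, Result, and limit are hypothetical names that are not part of JTokkit, and the sketch ignores any tokenizer-specific rules the library may apply when it cuts a token list.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of JTokkit.
final class TruncationSketch {

    // Pairs a token list with a flag that records whether tokens were dropped.
    static final class Result {
        private final List<Integer> tokens;
        private final boolean truncated;

        Result(List<Integer> tokens, boolean truncated) {
            this.tokens = tokens;
            this.truncated = truncated;
        }

        List<Integer> getTokens() {
            return tokens;
        }

        boolean isTruncated() {
            return truncated;
        }
    }

    // Keeps at most maxTokens ids from the front of the list.
    static Result limit(List<Integer> allTokens, int maxTokens) {
        if (maxTokens < 0) {
            throw new IllegalArgumentException("maxTokens must not be negative");
        }
        if (allTokens.size() <= maxTokens) {
            return new Result(new ArrayList<>(allTokens), false);
        }
        return new Result(new ArrayList<>(allTokens.subList(0, maxTokens)), true);
    }
}
```

A caller can check isTruncated() before sending the tokens on, for example to warn that the input did not fit the limit.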
F
- fromName(String) - Static method in enum class com.knuddels.jtokkit.api.EncodingType
- fromName(String) - Static method in enum class com.knuddels.jtokkit.api.ModelType
-
Returns a ModelType for the given name, or Optional.empty() if no such model type exists.
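The fromName entries above describe a lookup that yields Optional.empty() for an unknown name instead of throwing. The following is a minimal sketch of that pattern on a made-up enum. SampleModel, its constants, and its context lengths are hypothetical and do not correspond to any JTokkit ModelType.

```java
import java.util.Optional;

// Hypothetical enum for illustration; the names and numbers are invented.
enum SampleModel {
    SMALL("sample-small", 2048),
    LARGE("sample-large", 8192);

    private final String name;
    private final int maxContextLength;

    SampleModel(String name, int maxContextLength) {
        this.name = name;
        this.maxContextLength = maxContextLength;
    }

    String getName() {
        return name;
    }

    int getMaxContextLength() {
        return maxContextLength;
    }

    // Returns the constant with the given external name, or an empty Optional.
    static Optional<SampleModel> fromName(String name) {
        for (SampleModel model : values()) {
            if (model.name.equals(name)) {
                return Optional.of(model);
            }
        }
        return Optional.empty();
    }
}
```

Returning an Optional lets the caller choose a fallback, for example `SampleModel.fromName(input).map(SampleModel::getMaxContextLength).orElse(-1)`, without catching an exception as valueOf(String) would require.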
G
- get(int) - Method in class com.knuddels.jtokkit.api.IntArrayList
-
Returns the element at the specified position in this list.
- getEncoder() - Method in class com.knuddels.jtokkit.api.GptBytePairEncodingParams
- getEncoding(EncodingType) - Method in interface com.knuddels.jtokkit.api.EncodingRegistry
-
Returns the encoding with the given type.
- getEncoding(String) - Method in interface com.knuddels.jtokkit.api.EncodingRegistry
-
Returns the encoding with the given name, if it exists.
- getEncodingForModel(ModelType) - Method in interface com.knuddels.jtokkit.api.EncodingRegistry
-
Returns the encoding that is used for the given model type.
- getEncodingForModel(String) - Method in interface com.knuddels.jtokkit.api.EncodingRegistry
-
Returns the encoding that is used for the given model type, if it exists.
- getEncodingType() - Method in enum class com.knuddels.jtokkit.api.ModelType
-
Returns the encoding type that is used by this model type.
- getMaxContextLength() - Method in enum class com.knuddels.jtokkit.api.ModelType
-
Returns the maximum context length that is supported by this model type.
- getName() - Method in interface com.knuddels.jtokkit.api.Encoding
-
Returns the name of this encoding.
- getName() - Method in enum class com.knuddels.jtokkit.api.EncodingType
- getName() - Method in class com.knuddels.jtokkit.api.GptBytePairEncodingParams
- getName() - Method in enum class com.knuddels.jtokkit.api.ModelType
-
Returns the name of the model type as used by the OpenAI API.
- getPattern() - Method in class com.knuddels.jtokkit.api.GptBytePairEncodingParams
- getSpecialTokensEncoder() - Method in class com.knuddels.jtokkit.api.GptBytePairEncodingParams
- getTokens() - Method in class com.knuddels.jtokkit.api.EncodingResult
-
Returns the list of token ids.
- GPT_3_5_TURBO - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
- GPT_3_5_TURBO_16K - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
- GPT_4 - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
- GPT_4_32K - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
- GptBytePairEncodingParams - Class in com.knuddels.jtokkit.api
-
Parameters for the byte pair encoding used to tokenize for the OpenAI GPT models.
- GptBytePairEncodingParams(String, Pattern, Map<byte[], Integer>, Map<String, Integer>) - Constructor for class com.knuddels.jtokkit.api.GptBytePairEncodingParams
-
Creates a new instance of GptBytePairEncodingParams.
H
- hashCode() - Method in class com.knuddels.jtokkit.api.IntArrayList
I
- IntArrayList - Class in com.knuddels.jtokkit.api
- IntArrayList() - Constructor for class com.knuddels.jtokkit.api.IntArrayList
-
Constructs an empty list with an initial capacity of ten.
- IntArrayList(int) - Constructor for class com.knuddels.jtokkit.api.IntArrayList
-
Constructs an empty list with the specified initial capacity.
- isEmpty() - Method in class com.knuddels.jtokkit.api.IntArrayList
-
Returns true if this list contains no elements.
- isTruncated() - Method in class com.knuddels.jtokkit.api.EncodingResult
-
Returns true if the token list was truncated because the maximum token length was exceeded.
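The IntArrayList entries in this index describe a growable list of primitive int values with a default capacity of ten. The following is a minimal sketch of how such a structure can work. SimpleIntList is a hypothetical stand-in written for this example, not JTokkit's source, and its growth policy (doubling) is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for illustration; this is not JTokkit's IntArrayList source.
final class SimpleIntList {
    private int[] elements;
    private int size;

    // Empty list with an initial capacity of ten.
    SimpleIntList() {
        this(10);
    }

    // Empty list with the specified initial capacity.
    SimpleIntList(int initialCapacity) {
        if (initialCapacity < 0) {
            throw new IllegalArgumentException("Negative capacity: " + initialCapacity);
        }
        elements = new int[initialCapacity];
    }

    // Appends the element, growing the backing array if it is full.
    void add(int element) {
        ensureCapacity(size + 1);
        elements[size++] = element;
    }

    int get(int index) {
        checkIndex(index);
        return elements[index];
    }

    // Replaces the element at index and returns the previous value.
    int set(int index, int element) {
        checkIndex(index);
        int previous = elements[index];
        elements[index] = element;
        return previous;
    }

    // Grows the backing array so it can hold at least minCapacity elements.
    void ensureCapacity(int minCapacity) {
        if (minCapacity > elements.length) {
            int doubled = elements.length * 2; // assumed growth policy
            elements = Arrays.copyOf(elements, Math.max(minCapacity, doubled));
        }
    }

    int size() {
        return size;
    }

    boolean isEmpty() {
        return size == 0;
    }

    // Forgets all elements but keeps the allocated capacity.
    void clear() {
        size = 0;
    }

    // Copies the used part of the backing array, first to last element.
    int[] toArray() {
        return Arrays.copyOf(elements, size);
    }

    // Boxes every element into a List<Integer>, first to last element.
    List<Integer> boxed() {
        List<Integer> result = new ArrayList<>(size);
        for (int i = 0; i < size; i++) {
            result.add(elements[i]);
        }
        return result;
    }

    private void checkIndex(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("Index: " + index + ", size: " + size);
        }
    }
}
```

Storing ids in an int[] avoids allocating an Integer object per token, which is the usual reason to prefer a primitive list over List<Integer> for large token sequences.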
M
- ModelType - Enum Class in com.knuddels.jtokkit.api
N
- newDefaultEncodingRegistry() - Static method in class com.knuddels.jtokkit.Encodings
-
Creates a new EncodingRegistry with the default encodings found in the EncodingType enum.
- newLazyEncodingRegistry() - Static method in class com.knuddels.jtokkit.Encodings
-
Creates a new EncodingRegistry without any EncodingType registered.
P
- P50K_BASE - Enum constant in enum class com.knuddels.jtokkit.api.EncodingType
- P50K_EDIT - Enum constant in enum class com.knuddels.jtokkit.api.EncodingType
R
- R50K_BASE - Enum constant in enum class com.knuddels.jtokkit.api.EncodingType
- registerCustomEncoding(Encoding) - Method in interface com.knuddels.jtokkit.api.EncodingRegistry
-
Registers a new custom encoding with the given name.
- registerGptBytePairEncoding(GptBytePairEncodingParams) - Method in interface com.knuddels.jtokkit.api.EncodingRegistry
-
Registers a new byte pair encoding with the given name.
S
- set(int, int) - Method in class com.knuddels.jtokkit.api.IntArrayList
-
Replaces the element at the specified position in this list with the specified element.
- size() - Method in class com.knuddels.jtokkit.api.IntArrayList
-
Returns the number of elements in this list.
T
- TEXT_ADA_001 - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
- TEXT_BABBAGE_001 - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
- TEXT_CURIE_001 - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
- TEXT_DAVINCI_001 - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
- TEXT_DAVINCI_002 - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
- TEXT_DAVINCI_003 - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
- TEXT_DAVINCI_EDIT_001 - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
- TEXT_EMBEDDING_3_LARGE - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
- TEXT_EMBEDDING_3_SMALL - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
- TEXT_EMBEDDING_ADA_002 - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
- TEXT_SEARCH_ADA_DOC_001 - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
- TEXT_SEARCH_BABBAGE_DOC_001 - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
- TEXT_SEARCH_CURIE_DOC_001 - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
- TEXT_SEARCH_DAVINCI_DOC_001 - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
- TEXT_SIMILARITY_ADA_001 - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
- TEXT_SIMILARITY_BABBAGE_001 - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
- TEXT_SIMILARITY_CURIE_001 - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
- TEXT_SIMILARITY_DAVINCI_001 - Enum constant in enum class com.knuddels.jtokkit.api.ModelType
- toArray() - Method in class com.knuddels.jtokkit.api.IntArrayList
-
Returns an array containing all the elements in this list in proper sequence (from first to last element).
- toString() - Method in class com.knuddels.jtokkit.api.EncodingResult
- toString() - Method in class com.knuddels.jtokkit.api.IntArrayList
V
- valueOf(String) - Static method in enum class com.knuddels.jtokkit.api.EncodingType
-
Returns the enum constant of this class with the specified name.
- valueOf(String) - Static method in enum class com.knuddels.jtokkit.api.ModelType
-
Returns the enum constant of this class with the specified name.
- values() - Static method in enum class com.knuddels.jtokkit.api.EncodingType
-
Returns an array containing the constants of this enum class, in the order they are declared.
- values() - Static method in enum class com.knuddels.jtokkit.api.ModelType
-
Returns an array containing the constants of this enum class, in the order they are declared.
- VERY_LARGE_TOKENIZER_BYTE_THRESHOLD_KEY - Static variable in interface com.knuddels.jtokkit.api.Encoding
-
Name of the environment variable key to control when JTokkit should switch to a different tokenizer.