Custom Tokenizer
GenAI
(JavaScript 1.0)
1. Train Tokenizer
Hello world from tokenizer demo corpus
Train Tokenizer
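The page does not show how training works. The sketch below is one plausible reading, assuming a word-level tokenizer that splits the corpus on whitespace and reserves IDs 0 to 3 for the four special tokens. The name `trainTokenizer` and the whitespace-splitting rule are assumptions made for illustration, not details taken from the demo.

```javascript
// Hypothetical helper: builds a word-level vocabulary from a corpus string.
// Assumptions: special tokens take IDs 0-3, words are split on whitespace.
const SPECIAL_TOKENS = ["<PAD>", "<UNK>", "<BOS>", "<EOS>"];

function trainTokenizer(corpus) {
  const tokenToId = new Map();
  // Reserve the lowest IDs for the special tokens.
  for (const token of SPECIAL_TOKENS) {
    tokenToId.set(token, tokenToId.size);
  }
  // Give each previously unseen word the next free ID, in order of appearance.
  for (const word of corpus.trim().split(/\s+/)) {
    if (word !== "" && !tokenToId.has(word)) {
      tokenToId.set(word, tokenToId.size);
    }
  }
  return tokenToId;
}

const vocab = trainTokenizer("Hello world from tokenizer demo corpus");
console.log(vocab.size); // 10
console.log(vocab.get("Hello")); // 4
```

Under these assumptions the six corpus words plus four special tokens give a vocabulary of 10 entries.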
2. Encode/Decode
Encode
Fill Example
Example: The quick brown fox jumps over the lazy dog! @2025
Encoded IDs:
Decoded Text:
Demo
Input: Hello world from tokenizer
Tokens: [2, 4, 5, 6, 7, 3]
Decoded: Hello world from tokenizer
Vocab size: 10
Special tokens: <PAD>, <UNK>, <BOS>, <EOS>
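The demo values above are consistent with a simple word-level scheme: special tokens take IDs 0 to 3, each corpus word gets the next ID, and every encoded sequence is wrapped in <BOS> and <EOS>. The sketch below reproduces them under those assumptions. The class name `WordTokenizer`, its methods, and the whitespace-splitting rule are hypothetical, and the page's actual implementation may differ.

```javascript
// Hypothetical word-level tokenizer; names and splitting rule are assumptions.
class WordTokenizer {
  constructor(corpus) {
    // Special tokens occupy IDs 0-3, in the order listed by the demo.
    this.idToToken = ["<PAD>", "<UNK>", "<BOS>", "<EOS>"];
    this.tokenToId = new Map(this.idToToken.map((t, i) => [t, i]));
    for (const word of WordTokenizer.split(corpus)) {
      if (!this.tokenToId.has(word)) {
        this.tokenToId.set(word, this.idToToken.length);
        this.idToToken.push(word);
      }
    }
  }

  // Split on runs of whitespace and drop empty pieces.
  static split(text) {
    return text.trim().split(/\s+/).filter((w) => w.length > 0);
  }

  // Map words to IDs, fall back to <UNK>, and wrap in <BOS> ... <EOS>.
  encode(text) {
    const unk = this.tokenToId.get("<UNK>");
    const ids = WordTokenizer.split(text).map((w) => this.tokenToId.get(w) ?? unk);
    return [this.tokenToId.get("<BOS>"), ...ids, this.tokenToId.get("<EOS>")];
  }

  // Map IDs back to words, skipping <PAD>, <BOS>, and <EOS>.
  decode(ids) {
    const skipped = new Set(["<PAD>", "<BOS>", "<EOS>"]);
    return ids
      .map((id) => this.idToToken[id] ?? "<UNK>")
      .filter((token) => !skipped.has(token))
      .join(" ");
  }
}

const tok = new WordTokenizer("Hello world from tokenizer demo corpus");
const ids = tok.encode("Hello world from tokenizer");
console.log(ids); // [ 2, 4, 5, 6, 7, 3 ]
console.log(tok.decode(ids)); // Hello world from tokenizer
console.log(tok.idToToken.length); // 10
```

In this sketch, matching is case-sensitive and any word absent from the training corpus maps to <UNK>, so a sentence such as the fox example would not round-trip to its original text.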