Releases: ModelCloud/Tokenicer
Toke(n)icer v0.0.4
What's Changed
⚡ The tokenicer instance now dynamically inherits the native `tokenizer.__class__` of the tokenizer passed in or loaded via our `Tokenicer.load()` API.
⚡ CI now tests tokenizers from 64 models.
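The dynamic-inheritance idea above can be sketched with plain Python: build a new class at runtime that subclasses both a mixin and the native tokenizer class, so the returned object still passes `isinstance` checks against the native class. This is a minimal illustration of the technique, not Tokenicer's actual implementation; `NativeTokenizer`, `TokenicerMixin`, and `load` are hypothetical names.

```python
class NativeTokenizer:
    """Stand-in for a native HF tokenizer class (hypothetical)."""
    def tokenize(self, text):
        return text.split()

class TokenicerMixin:
    """Extra behavior layered on top of the native tokenizer (hypothetical)."""
    def pad_info(self):
        return "pad fixes applied"

def load(native_cls):
    # Create a subclass at runtime that inherits BOTH the mixin and the
    # native tokenizer class, so isinstance(obj, native_cls) stays True
    # and all native methods remain available.
    dynamic_cls = type(
        f"Tokenicer_{native_cls.__name__}",
        (TokenicerMixin, native_cls),
        {},
    )
    return dynamic_cls()

tok = load(NativeTokenizer)
print(isinstance(tok, NativeTokenizer))  # True
print(tok.tokenize("a b"))               # ['a', 'b']
```

Because the wrapper *is* a subclass of the native tokenizer class, downstream code that type-checks or calls native methods keeps working unchanged.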
- fix mpt pad token bug by @CL-ModelCloud in #24
- fix model_config bugs by @CL-ModelCloud in #25
- test code clean up by @CL-ModelCloud in #26
- Inherits PretrainedTokenizer by @Qubitium in #28
- loop & test all models by @CSY-ModelCloud in #30
Full Changelog: v0.0.2...v0.0.4
Toke(n)icer v0.0.3
What's Changed
The tokenicer instance now dynamically inherits the native `tokenizer.__class__` of the tokenizer passed in or loaded via our `Tokenicer.load()` API.
- fix mpt pad token bug by @CL-ModelCloud in #24
- fix model_config bugs by @CL-ModelCloud in #25
- test code clean up by @CL-ModelCloud in #26
- Inherits PretrainedTokenizer by @Qubitium in #28
Full Changelog: v0.0.2...v0.0.3
Toke(n)icer v0.0.2
What's Changed
⚡ Auto-fix models that do not set a padding_token
⚡ Auto-fix models released with the wrong padding_token: many models incorrectly reuse eos_token as pad_token, which leads to subtle, hidden errors in post-training and inference whenever batching is used (which is almost always).
⚡ Compatible with all HF Transformers recognized tokenizers
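The pad-token fix described above can be sketched as follows. This is a simplified illustration of the idea, assuming a mock tokenizer object; the real library operates on HF Transformers tokenizers, and `MockTokenizer`, `fix_pad_token`, and the `<|pad|>` fallback are hypothetical names chosen for this example.

```python
class MockTokenizer:
    """Hypothetical stand-in for an HF tokenizer with token attributes."""
    def __init__(self, eos_token, pad_token=None):
        self.eos_token = eos_token
        self.pad_token = pad_token

def fix_pad_token(tokenizer, fallback="<|pad|>"):
    # If pad_token is missing, or aliases eos_token, assign a distinct
    # token so batch padding is never confused with end-of-sequence.
    if tokenizer.pad_token is None or tokenizer.pad_token == tokenizer.eos_token:
        tokenizer.pad_token = fallback
    return tokenizer

# A model shipped with pad_token == eos_token gets a distinct pad token:
tok = fix_pad_token(MockTokenizer(eos_token="</s>", pad_token="</s>"))
print(tok.pad_token)  # <|pad|>
```

Keeping pad_token distinct from eos_token matters because loss masking and generation stopping both key off eos_token; when padding reuses it, padded positions can silently truncate or corrupt batched sequences.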
- Auto fix pad token by @CL-ModelCloud in #5
- Forward to Tokenizer by @CL-ModelCloud in #6
- read requirements.txt in setup.py by @CSY-ModelCloud in #7
- [CI] add tokenicer forward test by @CL-ModelCloud in #10
- add unit tests by @CSY-ModelCloud in #11
- refactor by @Qubitium in #8
- add deepseek_v3 map by @CL-ModelCloud in #15
New Contributors
- @CSY-ModelCloud made their first contribution in #1
- @Qubitium made their first contribution in #3
- @CL-ModelCloud made their first contribution in #5
Full Changelog: https://github.com/ModelCloud/Tokenicer/commits/v0.0.2