
Remove dead is_llama detection code in TransformerTokenizer #1876

Open
joaquinhuigomez wants to merge 1 commit into dottxt-ai:main from joaquinhuigomez:fix/remove-dead-is-llama-code


@joaquinhuigomez


convert_token_to_string applies the SPIECE_UNDERLINE / <0x20> space workaround unconditionally, so the is_llama flag and the get_llama_tokenizer_types() helper that computed it are no longer read anywhere. This removes both, deletes the test that exercised the helper, and drops the now-meaningless is_llama assignments in test_transformer_tokenizer_convert_token_to_string (which still passes). Fixes #1874.
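
For readers unfamiliar with the workaround, here is a minimal, self-contained sketch of what an unconditional check looks like. It is not the actual diff: `StubTokenizer` is a hypothetical stand-in for a Hugging Face tokenizer that drops leading spaces when decoding a single token, and `SPIECE_UNDERLINE` is hard-coded here to the SentencePiece marker character rather than imported from `transformers`.

```python
# SentencePiece word-boundary marker (U+2581); hard-coded so the sketch
# runs without transformers installed.
SPIECE_UNDERLINE = "\u2581"


class StubTokenizer:
    """Hypothetical stand-in: decodes tokens but loses the leading space."""

    def convert_tokens_to_string(self, tokens):
        text = "".join(tokens)
        text = text.replace("<0x20>", " ").replace(SPIECE_UNDERLINE, " ")
        return text.lstrip(" ")


def convert_token_to_string(tokenizer, token: str) -> str:
    """Decode one token, restoring the leading space if the token carried one.

    The check runs for every tokenizer, with no is_llama gate in front of it.
    """
    string = tokenizer.convert_tokens_to_string([token])
    if token.startswith(SPIECE_UNDERLINE) or token == "<0x20>":
        return " " + string
    return string


if __name__ == "__main__":
    tok = StubTokenizer()
    for t in [SPIECE_UNDERLINE + "hello", "hello", "<0x20>"]:
        print(repr(t), "->", repr(convert_token_to_string(tok, t)))
```

Because the branch depends only on the token text, a flag recording the tokenizer type has nothing left to control, which is why `is_llama` and the helper that computed it can be removed.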

convert_token_to_string applies the SPIECE_UNDERLINE / <0x20> space
workaround unconditionally, so the is_llama flag and the
get_llama_tokenizer_types() helper that computed it are no longer read
anywhere. Remove both, along with the test that exercised the helper, and
drop the now-meaningless is_llama assignments in the conversion test.

Fixes dottxt-ai#1874

Development

Successfully merging this pull request may close these issues.

Remove dead code: is_llama and get_llama_tokenizer_types() in TransformerTokenizer

2 participants