Mirror of https://github.com/saymrwulf/transformers.git, synced 2026-05-14 20:58:08 +00:00.
* Fix StopStringCriteria to handle tokens above len(tokenizer)

  This fixes #35244 by clipping token IDs to be within the tokenizer's vocabulary size before performing the embedding lookup. This prevents index errors when model.config.vocab_size > len(tokenizer).

  The fix:
  1. Adds a clamp operation to ensure token IDs are within bounds
  2. Adds a test case to verify the behavior

* Use self.stop_strings instead of stop_strings
* Handle clipping correctly
* make fixup
* Update test to the new embedding vecs
* Use much bigger values in the mismatch test
* Typo fix
* Slight simplification

---------

Co-authored-by: openhands <openhands@all-hands.dev>
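The clamping technique described in the commit message can be sketched as follows. This is an illustrative, self-contained example of clipping out-of-range token IDs before an embedding lookup, not the actual transformers implementation; the function and variable names are hypothetical.

```python
# Sketch of the fix: a model's config.vocab_size can exceed the
# tokenizer's vocabulary, so generated token IDs may point past the
# end of an embedding table built from len(tokenizer). Clamping the
# IDs into [0, table_size - 1] before the lookup avoids index errors.

def safe_embedding_lookup(token_ids, embedding_table):
    """Clamp each token ID into the valid index range, then look it up."""
    max_id = len(embedding_table) - 1
    clipped = [min(max(tid, 0), max_id) for tid in token_ids]
    return [embedding_table[tid] for tid in clipped]

# Tokenizer knows 4 tokens, but the model may emit larger IDs.
vocab_embeddings = [[0.0], [1.0], [2.0], [3.0]]
print(safe_embedding_lookup([1, 3, 9], vocab_embeddings))  # ID 9 clamps to 3
```

In the PyTorch code the same effect would typically come from a `torch.clamp` on the ID tensor, which is the "clamp operation" the commit message refers to.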
- __init__.py
- test_beam_constraints.py
- test_beam_search.py
- test_candidate_generator.py
- test_configuration_utils.py
- test_flax_logits_process.py
- test_flax_utils.py
- test_framework_agnostic.py
- test_fsdp.py
- test_logits_process.py
- test_stopping_criteria.py
- test_streamers.py
- test_tf_logits_process.py
- test_tf_utils.py
- test_utils.py