@rover12421
Contributor

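The text description and the code output do not match. The code output is probably from an older version, while the text describes the result from the newer version.
The code output has been corrected to the latest result and now matches the text description.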
@jsksxs360
Owner

Screenshot 2024-05-18 200216

transformers==4.34.0

Hi, I tried it out, and my result actually matches the output already in the code. Haha. It's not a big deal, though.

@jsksxs360 jsksxs360 merged commit 176f602 into jsksxs360:gh-pages May 18, 2024
@rover12421
Contributor Author

> Screenshot 2024-05-18 200216
>
> transformers==4.34.0
>
> Hi, I tried it out, and my result actually matches the output already in the code. Haha. It's not a big deal, though.

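You're right. I don't know why I initially pulled an old version of the model. By tracing the revision history, I found the commit where the behavior changed. I have resubmitted the change based on the latest output. Sorry for the trouble, but please merge it once more.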

Test code:

from transformers import AutoTokenizer


def test_tokenizer(revision):
    tokenizer = AutoTokenizer.from_pretrained("bert-base-cased", revision=revision)
    sequence = "Using a Transformer network is simple"
    tokens = tokenizer.tokenize(sequence)
    print(f"revision: {revision}")
    print(tokens)


test_tokenizer("main")
test_tokenizer("b89a729bdafaf5e18b8cb1774aee7fbc363169f1")
test_tokenizer("ae1d3b2cce5ef798cab884c0e7e61e34f46bc412")

Result:

revision: main
['Using', 'a', 'Trans', '##former', 'network', 'is', 'simple']
revision: b89a729bdafaf5e18b8cb1774aee7fbc363169f1
['Using', 'a', 'Trans', '##former', 'network', 'is', 'simple']
revision: ae1d3b2cce5ef798cab884c0e7e61e34f46bc412
['using', 'a', 'transform', '##er', 'network', 'is', 'simple']
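
The manual comparison above can be generalized into a small helper that walks a list of revisions and reports where the tokenizer output changes. This is only a sketch: `find_change_points` and `tokenize_with_transformers` are hypothetical names introduced here, and the order of the revisions in the usage example (oldest first) is an assumption, not something confirmed by the repository history.

```python
from typing import Callable, List, Sequence, Tuple


def find_change_points(
    revisions: Sequence[str],
    tokenize_at: Callable[[str], List[str]],
) -> List[Tuple[str, str]]:
    """Return (previous, current) revision pairs whose token output differs."""
    changes = []
    previous_rev = None
    previous_tokens = None
    for rev in revisions:
        tokens = tokenize_at(rev)
        # Record a change only when there is an earlier revision to compare with.
        if previous_tokens is not None and tokens != previous_tokens:
            changes.append((previous_rev, rev))
        previous_rev, previous_tokens = rev, tokens
    return changes


def tokenize_with_transformers(revision: str) -> List[str]:
    # Imported lazily so find_change_points itself has no hard dependency.
    from transformers import AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("bert-base-cased", revision=revision)
    return tokenizer.tokenize("Using a Transformer network is simple")


if __name__ == "__main__":
    # Assumed order: oldest revision first (downloads files from the Hub).
    revisions = [
        "ae1d3b2cce5ef798cab884c0e7e61e34f46bc412",
        "b89a729bdafaf5e18b8cb1774aee7fbc363169f1",
        "main",
    ]
    for old, new in find_change_points(revisions, tokenize_with_transformers):
        print(f"output changed between {old} and {new}")
```

Passing the tokenizer as a callable keeps the comparison logic independent of the download step, so it can be checked offline with a stub.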
