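Posted 2024-04-29 · 2 minutes read (About 312 words)
Shift in Data and Its Distribution
This post covers a lot of what I would call data shift: it walks through many examples and then looks at how to correct for the shift. It is mainly a collection of thoughts on cleaning data.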
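Posted 2024-04-09 · 3 minutes read (About 384 words)
Downloading and Preprocessing the Dataset

import os
# Assumption: the PyTorch flavour of d2l; the post does not show its imports.
from d2l import torch as d2l

def read_data_nmt():
    """Load the English-French dataset."""
    # Download and unpack the 'fra-eng' archive, then read the whole file.
    data_dir = d2l.download_extract('fra-eng')
    with open(os.path.join(data_dir, 'fra.txt'), 'r', encoding='utf-8') as f:
        return f.read()

raw_text = read_data_nmt()
print(raw_text[:75])  # Preview the first 75 characters.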
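The teaser only shows the download step, so here is a minimal sketch of what the preprocessing half might look like. It assumes the usual cleanup for this kind of tab-separated text: replace non-breaking spaces, lowercase everything, and put a space before punctuation so that it becomes its own token. The function name `preprocess_nmt` and the sample string are assumptions for illustration, not code taken from the post.

```python
def preprocess_nmt(text):
    """Normalize raw English-French text (illustrative sketch)."""
    def no_space(char, prev_char):
        # True when punctuation directly follows a non-space character.
        return char in ',.!?' and prev_char != ' '

    # Replace non-breaking spaces with regular spaces and lowercase the text.
    text = text.replace('\u202f', ' ').replace('\xa0', ' ').lower()
    # Insert a space before punctuation that is glued to the previous word.
    out = [' ' + char if i > 0 and no_space(char, text[i - 1]) else char
           for i, char in enumerate(text)]
    return ''.join(out)


# Hypothetical two-line sample in the "english<TAB>french" layout.
sample = "Go.\tVa !\nHi.\tSalut\u202f!\n"
cleaned = preprocess_nmt(sample)
print(cleaned)
```

On real data the same call would presumably be applied to `raw_text` before splitting each line on the tab character into a source and a target sentence.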