After running
python -m sockeye.translate --config small_model/args.yaml --input sentence_parallel_files/src_test.txt --output sentence_parallel_files/tgt_test.txt --models small_model --strip-unknown-words --prevent-unk
I get a TracerWarning, and then the program is killed without producing any translation output. The sockeye_translations.log file does not show any errors.
The output in the terminal:
[INFO:sockeye.utils] Sockeye: 3.1.34, commit 4c30942ddb523533bccb4d2cbb3e894e45b1db93, path /Users/christyman/miniconda3/envs/ats-program-37/lib/python3.7/site-packages/sockeye/__init__.py [INFO:sockeye.utils] PyTorch: 1.13.1 (/Users/christyman/miniconda3/envs/ats-program-37/lib/python3.7/site-packages/torch/__init__.py) [INFO:sockeye.utils] Command: /Users/christyman/miniconda3/envs/ats-program-37/lib/python3.7/site-packages/sockeye/translate.py --config small_model/args.yaml --input sentence_parallel_files/src_test.txt --output sentence_parallel_files/tgt_test.txt --models small_model --strip-unknown-words --prevent-unk [INFO:sockeye.utils] Arguments: Namespace(allow_missing_params=False, amp=False, apex_amp=False, batch_sentences_multiple_of=8, batch_size=4096, batch_type='word', beam_search_stop='all', beam_size=5, bow_task_pos_weight=10, bow_task_weight=1.0, brevity_penalty_constant_length_ratio=0.0, brevity_penalty_type='none', brevity_penalty_weight=1.0, bucket_scaling=False, bucket_width=8, cache_last_best_params=0, cache_metric='perplexity', cache_strategy='best', checkpoint_improvement_threshold=0.0, checkpoint_interval=10, checkpoints=None, chunk_size=None, clamp_to_dtype=False, config='small_model/args.yaml', decode_and_evaluate=500, decoder='transformer', deepspeed_bf16=False, deepspeed_fp16=False, device_id=0, dist=False, dry_run=False, dtype='float32', embed_dropout=[0.3, 0.3], encoder='transformer', end_of_prepending_tag=None, ensemble_mode='linear', env=None, fixed_param_names=[], fixed_param_strategy=None, gradient_clipping_threshold=1.0, gradient_clipping_type='none', greedy=False, ignore_extra_params=False, initial_learning_rate=0.0002, input='sentence_parallel_files/src_test.txt', input_factors=None, json_input=False, keep_initializations=False, keep_last_params=-1, knn_index=None, knn_lambda=0.8, label_smoothing=0.3, label_smoothing_impl='mxnet', learning_rate_reduce_factor=0.9, learning_rate_reduce_num_not_improved=8, 
learning_rate_scheduler_type='plateau-reduce', learning_rate_warmup=0, length_penalty_alpha=1.0, length_penalty_beta=0.0, length_task=None, length_task_layers=1, length_task_weight=1.0, lhuc=None, local_rank=None, loglevel='INFO', loglevel_secondary_workers='INFO', max_checkpoints=None, max_input_length=None, max_num_checkpoint_not_improved=None, max_num_epochs=None, max_output_length=None, max_output_length_num_stds=2, max_samples=10000000, max_seconds=None, max_seq_len=[95, 95], max_updates=None, min_num_epochs=None, min_samples=None, min_updates=None, models=['small_model'], momentum=0.0, nbest_size=1, neural_vocab_selection=None, neural_vocab_selection_block_loss=False, no_bucketing=False, no_logfile=False, no_reload_on_learning_rate_reduce=False, num_embed=[None, None], num_layers=[3, 3], num_words=[20000, 20000], nvs_thresh=0.5, optimized_metric='bleu', optimizer='adam', optimizer_betas=[0.9, 0.999], optimizer_eps=1e-08, output='sentence_parallel_files/tgt_test.txt', output_type='translation', overwrite_output=False, pad_vocab_to_multiple_of=8, params=None, prepared_data=None, prevent_unk=True, quiet=False, quiet_secondary_workers=False, restrict_lexicon=None, restrict_lexicon_topk=None, sample=None, seed=1, shared_vocab=True, skip_nvs=False, source='../sentence_parallel_files/src_train.txt', source_factor_vocabs=[], source_factors=[], source_factors_combine=[], source_factors_num_embed=[], source_factors_share_embedding=[], source_factors_use_source_vocab=[], source_vocab=None, stop_training_on_decoder_failure=False, strip_unknown_words=True, target='../sentence_parallel_files/tgt_train.txt', target_factor_vocabs=[], target_factors=[], target_factors_combine=[], target_factors_num_embed=[], target_factors_share_embedding=[], target_factors_use_target_vocab=[], target_factors_weight=[1.0], target_vocab=None, tf32=True, transformer_activation_type=['relu', 'relu'], transformer_attention_heads=[4, 4], transformer_block_prepended_cross_attention=False, 
transformer_dropout_act=[0.1, 0.1], transformer_dropout_attention=[0.1, 0.1], transformer_dropout_prepost=[0.1, 0.1], transformer_feed_forward_num_hidden=[512, 512], transformer_feed_forward_use_glu=False, transformer_model_size=[128, 128], transformer_positional_embedding_type='fixed', transformer_postprocess=['dr', 'dr'], transformer_preprocess=['n', 'n'], update_interval=1, use_cpu=False, validation_source='../sentence_parallel_files/src_validation.txt', validation_source_factors=[], validation_target='../sentence_parallel_files/tgt_validation.txt', validation_target_factors=[], weight_decay=0.0, weight_tying_type='src_trg_softmax', word_min_count=[1, 1]) [INFO:sockeye.utils] CUDA not available, defaulting to CPU device [INFO:__main__] Translate Device: cpu [INFO:sockeye.model] Loading 1 model(s) from ['small_model'] ... [INFO:sockeye.vocab] Vocabulary (20008 words) loaded from "small_model/vocab.src.0.json" [INFO:sockeye.vocab] Vocabulary (20008 words) loaded from "small_model/vocab.trg.0.json" [INFO:sockeye.model] Model version: 3.1.34 [INFO:sockeye.model] Loaded model config from "small_model/config" [INFO:sockeye.model] Disabling dropout layers for performance reasons [INFO:sockeye.model] ModelConfig(config_data=DataConfig(data_statistics=DataStatistics(num_sents=16245, num_discarded=1, num_tokens_source=317343, num_tokens_target=177556, num_unks_source=20594, num_unks_target=6918, max_observed_len_source=83, max_observed_len_target=80, size_vocab_source=20008, size_vocab_target=20008, length_ratio_mean=0.7195621962140187, length_ratio_std=0.5842629746461441, buckets=[(8, 8), (16, 16), (24, 24), (32, 32), (40, 40), (48, 48), (56, 56), (64, 64), (72, 72), (80, 80), (88, 88), (96, 96)], num_sents_per_bucket=[888, 5731, 5333, 2625, 1170, 381, 80, 13, 5, 3, 16, 0], average_len_target_per_bucket=[6.146396396396393, 9.93160006979582, 11.494468404275306, 12.341714285714282, 12.621367521367512, 13.341207349081374, 13.412499999999998, 16.846153846153847, 18.8, 
35.666666666666664, 8.75, None], length_ratio_stats_per_bucket=[(1.3744302338052337, 0.8585728161087813), (0.949670248710557, 0.6720002114978064), (0.6097747334248025, 0.4043806126550977), (0.4536296476514696, 0.2288022427671748), (0.36131486260088264, 0.20559448455653748), (0.3143034722464711, 0.19507350713250618), (0.2827698400110712, 0.2829753677267691), (0.642658803608251, 1.160508093478276), (0.27178585119143306, 0.13837005297982366), (2.3391053391053394, 3.0600525538493457), (0.10542168674698794, 0.0822129160479288), (None, None)]), max_seq_len_source=96, max_seq_len_target=96, num_source_factors=1, num_target_factors=1, eop_id=-1), vocab_source_size=20008, vocab_target_size=20008, config_embed_source=EmbeddingConfig(vocab_size=20008, num_embed=128, dropout=0.0, num_factors=1, factor_configs=None, allow_sparse_grad=False), config_embed_target=EmbeddingConfig(vocab_size=20008, num_embed=128, dropout=0.0, num_factors=1, factor_configs=None, allow_sparse_grad=False), config_encoder=TransformerConfig(model_size=128, attention_heads=4, feed_forward_num_hidden=512, act_type='relu', num_layers=3, dropout_attention=0.0, dropout_act=0.0, dropout_prepost=0.0, positional_embedding_type='fixed', preprocess_sequence='n', postprocess_sequence='dr', max_seq_len_source=96, max_seq_len_target=96, decoder_type='transformer', block_prepended_cross_attention=False, use_lhuc=False, depth_key_value=128, use_glu=False), config_decoder=TransformerConfig(model_size=128, attention_heads=4, feed_forward_num_hidden=512, act_type='relu', num_layers=3, dropout_attention=0.0, dropout_act=0.0, dropout_prepost=0.0, positional_embedding_type='fixed', preprocess_sequence='n', postprocess_sequence='dr', max_seq_len_source=96, max_seq_len_target=96, decoder_type='transformer', block_prepended_cross_attention=False, use_lhuc=False, depth_key_value=128, use_glu=False), config_length_task=None, weight_tying_type='src_trg_softmax', lhuc=False, dtype='float32', neural_vocab_selection=None, 
neural_vocab_selection_block_loss=False)
[INFO:sockeye.model] Loaded params from "small_model/params.best" to "cpu"
[INFO:sockeye.model] Model dtype: overridden to float32
[INFO:sockeye.model] 1 model(s) loaded in 0.2285s
[INFO:sockeye.inference] Translator (1 model(s) beam_size=5 algorithm=BeamSearch, beam_search_stop=all max_input_length=95 nbest_size=1 ensemble_mode=None max_batch_size=4096 dtype=torch.float32 skip_nvs=False nvs_thresh=0.5)
[INFO:__main__] Translating...
/Users/christyman/miniconda3/envs/ats-program-37/lib/python3.7/site-packages/torch/jit/_trace.py:983: TracerWarning: Encountering a list at the output of the tracer might cause the trace to be incorrect, this is only valid if the container structure does not change based on the module's inputs. Consider using a constant container instead (e.g. for list, use a tuple instead. for dict, use a NamedTuple instead). If you absolutely need this and know the side effects, pass strict=False to trace() to allow this behavior.
argument_names,
zsh: killed python -m sockeye.translate --config small_model/args.yaml --input --output
sockeye_translations.log:
[2023-10-25:20:06:15:INFO:sockeye.utils:log_sockeye_version] Sockeye: 3.1.34, commit 4c30942ddb523533bccb4d2cbb3e894e45b1db93, path /Users/christyman/miniconda3/envs/ats-program-37/lib/python3.7/site-packages/sockeye/__init__.py [2023-10-25:20:06:15:INFO:sockeye.utils:log_torch_version] PyTorch: 1.13.1 (/Users/christyman/miniconda3/envs/ats-program-37/lib/python3.7/site-packages/torch/__init__.py) [2023-10-25:20:06:15:INFO:sockeye.utils:log_basic_info] Command: /Users/christyman/miniconda3/envs/ats-program-37/lib/python3.7/site-packages/sockeye/translate.py --config ../small_model/args.yaml --input ../sentence_parallel_files/src_test.txt --output ../sockeye_translations --models ../small_model --strip-unknown-words --prevent-unk [2023-10-25:20:06:15:INFO:sockeye.utils:log_basic_info] Arguments: Namespace(allow_missing_params=False, amp=False, apex_amp=False, batch_sentences_multiple_of=8, batch_size=4096, batch_type='word', beam_search_stop='all', beam_size=5, bow_task_pos_weight=10, bow_task_weight=1.0, brevity_penalty_constant_length_ratio=0.0, brevity_penalty_type='none', brevity_penalty_weight=1.0, bucket_scaling=False, bucket_width=8, cache_last_best_params=0, cache_metric='perplexity', cache_strategy='best', checkpoint_improvement_threshold=0.0, checkpoint_interval=10, checkpoints=None, chunk_size=None, clamp_to_dtype=False, config='../small_model/args.yaml', decode_and_evaluate=500, decoder='transformer', deepspeed_bf16=False, deepspeed_fp16=False, device_id=0, dist=False, dry_run=False, dtype='float32', embed_dropout=[0.3, 0.3], encoder='transformer', end_of_prepending_tag=None, ensemble_mode='linear', env=None, fixed_param_names=[], fixed_param_strategy=None, gradient_clipping_threshold=1.0, gradient_clipping_type='none', greedy=False, ignore_extra_params=False, initial_learning_rate=0.0002, input='../sentence_parallel_files/src_test.txt', input_factors=None, json_input=False, keep_initializations=False, keep_last_params=-1, knn_index=None, knn_lambda=0.8, 
label_smoothing=0.3, label_smoothing_impl='mxnet', learning_rate_reduce_factor=0.9, learning_rate_reduce_num_not_improved=8, learning_rate_scheduler_type='plateau-reduce', learning_rate_warmup=0, length_penalty_alpha=1.0, length_penalty_beta=0.0, length_task=None, length_task_layers=1, length_task_weight=1.0, lhuc=None, local_rank=None, loglevel='INFO', loglevel_secondary_workers='INFO', max_checkpoints=None, max_input_length=None, max_num_checkpoint_not_improved=None, max_num_epochs=None, max_output_length=None, max_output_length_num_stds=2, max_samples=10000000, max_seconds=None, max_seq_len=[95, 95], max_updates=None, min_num_epochs=None, min_samples=None, min_updates=None, models=['../small_model'], momentum=0.0, nbest_size=1, neural_vocab_selection=None, neural_vocab_selection_block_loss=False, no_bucketing=False, no_logfile=False, no_reload_on_learning_rate_reduce=False, num_embed=[None, None], num_layers=[3, 3], num_words=[20000, 20000], nvs_thresh=0.5, optimized_metric='bleu', optimizer='adam', optimizer_betas=[0.9, 0.999], optimizer_eps=1e-08, output='../sockeye_translations', output_type='translation', overwrite_output=False, pad_vocab_to_multiple_of=8, params=None, prepared_data=None, prevent_unk=True, quiet=False, quiet_secondary_workers=False, restrict_lexicon=None, restrict_lexicon_topk=None, sample=None, seed=1, shared_vocab=True, skip_nvs=False, source='../sentence_parallel_files/src_train.txt', source_factor_vocabs=[], source_factors=[], source_factors_combine=[], source_factors_num_embed=[], source_factors_share_embedding=[], source_factors_use_source_vocab=[], source_vocab=None, stop_training_on_decoder_failure=False, strip_unknown_words=True, target='../sentence_parallel_files/tgt_train.txt', target_factor_vocabs=[], target_factors=[], target_factors_combine=[], target_factors_num_embed=[], target_factors_share_embedding=[], target_factors_use_target_vocab=[], target_factors_weight=[1.0], target_vocab=None, tf32=True, 
transformer_activation_type=['relu', 'relu'], transformer_attention_heads=[4, 4], transformer_block_prepended_cross_attention=False, transformer_dropout_act=[0.1, 0.1], transformer_dropout_attention=[0.1, 0.1], transformer_dropout_prepost=[0.1, 0.1], transformer_feed_forward_num_hidden=[512, 512], transformer_feed_forward_use_glu=False, transformer_model_size=[128, 128], transformer_positional_embedding_type='fixed', transformer_postprocess=['dr', 'dr'], transformer_preprocess=['n', 'n'], update_interval=1, use_cpu=False, validation_source='../sentence_parallel_files/src_validation.txt', validation_source_factors=[], validation_target='../sentence_parallel_files/tgt_validation.txt', validation_target_factors=[], weight_decay=0.0, weight_tying_type='src_trg_softmax', word_min_count=[1, 1]) [2023-10-25:20:06:15:INFO:sockeye.utils:init_device] CUDA not available, defaulting to CPU device [2023-10-25:20:06:15:INFO:__main__:run_translate] Translate Device: cpu [2023-10-25:20:06:15:INFO:sockeye.model:load_models] Loading 1 model(s) from ['../small_model'] ... 
[2023-10-25:20:06:15:INFO:sockeye.vocab:vocab_from_json] Vocabulary (20008 words) loaded from "../small_model/vocab.src.0.json" [2023-10-25:20:06:15:INFO:sockeye.vocab:vocab_from_json] Vocabulary (20008 words) loaded from "../small_model/vocab.trg.0.json" [2023-10-25:20:06:15:INFO:sockeye.model:load_model] Model version: 3.1.34 [2023-10-25:20:06:15:INFO:sockeye.model:load_config] Loaded model config from "../small_model/config" [2023-10-25:20:06:15:INFO:sockeye.model:load_model] Disabling dropout layers for performance reasons [2023-10-25:20:06:15:INFO:sockeye.model:__init__] ModelConfig(config_data=DataConfig(data_statistics=DataStatistics(num_sents=16245, num_discarded=1, num_tokens_source=317343, num_tokens_target=177556, num_unks_source=20594, num_unks_target=6918, max_observed_len_source=83, max_observed_len_target=80, size_vocab_source=20008, size_vocab_target=20008, length_ratio_mean=0.7195621962140187, length_ratio_std=0.5842629746461441, buckets=[(8, 8), (16, 16), (24, 24), (32, 32), (40, 40), (48, 48), (56, 56), (64, 64), (72, 72), (80, 80), (88, 88), (96, 96)], num_sents_per_bucket=[888, 5731, 5333, 2625, 1170, 381, 80, 13, 5, 3, 16, 0], average_len_target_per_bucket=[6.146396396396393, 9.93160006979582, 11.494468404275306, 12.341714285714282, 12.621367521367512, 13.341207349081374, 13.412499999999998, 16.846153846153847, 18.8, 35.666666666666664, 8.75, None], length_ratio_stats_per_bucket=[(1.3744302338052337, 0.8585728161087813), (0.949670248710557, 0.6720002114978064), (0.6097747334248025, 0.4043806126550977), (0.4536296476514696, 0.2288022427671748), (0.36131486260088264, 0.20559448455653748), (0.3143034722464711, 0.19507350713250618), (0.2827698400110712, 0.2829753677267691), (0.642658803608251, 1.160508093478276), (0.27178585119143306, 0.13837005297982366), (2.3391053391053394, 3.0600525538493457), (0.10542168674698794, 0.0822129160479288), (None, None)]), max_seq_len_source=96, max_seq_len_target=96, num_source_factors=1, num_target_factors=1, 
eop_id=-1), vocab_source_size=20008, vocab_target_size=20008, config_embed_source=EmbeddingConfig(vocab_size=20008, num_embed=128, dropout=0.0, num_factors=1, factor_configs=None, allow_sparse_grad=False), config_embed_target=EmbeddingConfig(vocab_size=20008, num_embed=128, dropout=0.0, num_factors=1, factor_configs=None, allow_sparse_grad=False), config_encoder=TransformerConfig(model_size=128, attention_heads=4, feed_forward_num_hidden=512, act_type='relu', num_layers=3, dropout_attention=0.0, dropout_act=0.0, dropout_prepost=0.0, positional_embedding_type='fixed', preprocess_sequence='n', postprocess_sequence='dr', max_seq_len_source=96, max_seq_len_target=96, decoder_type='transformer', block_prepended_cross_attention=False, use_lhuc=False, depth_key_value=128, use_glu=False), config_decoder=TransformerConfig(model_size=128, attention_heads=4, feed_forward_num_hidden=512, act_type='relu', num_layers=3, dropout_attention=0.0, dropout_act=0.0, dropout_prepost=0.0, positional_embedding_type='fixed', preprocess_sequence='n', postprocess_sequence='dr', max_seq_len_source=96, max_seq_len_target=96, decoder_type='transformer', block_prepended_cross_attention=False, use_lhuc=False, depth_key_value=128, use_glu=False), config_length_task=None, weight_tying_type='src_trg_softmax', lhuc=False, dtype='float32', neural_vocab_selection=None, neural_vocab_selection_block_loss=False) [2023-10-25:20:06:15:INFO:sockeye.model:load_parameters] Loaded params from "../small_model/params.best" to "cpu" [2023-10-25:20:06:15:INFO:sockeye.model:load_model] Model dtype: overridden to float32 [2023-10-25:20:06:15:INFO:sockeye.model:load_models] 1 model(s) loaded in 0.2356s [2023-10-25:20:06:15:INFO:sockeye.inference:__init__] Translator (1 model(s) beam_size=5 algorithm=BeamSearch, beam_search_stop=all max_input_length=95 nbest_size=1 ensemble_mode=None max_batch_size=4096 dtype=torch.float32 skip_nvs=False nvs_thresh=0.5) [2023-10-25:20:06:15:INFO:__main__:read_and_translate] 
Translating...
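Since zsh only prints "killed" and the log stops at "Translating..." without an error, it may help to check how the process actually ended. Below is a minimal sketch, assuming a POSIX system, that runs a command and reports whether it exited normally or was terminated by a signal. The helper names `describe_exit` and `run_and_report` are hypothetical and not part of Sockeye.

```python
import signal
import subprocess
import sys


def describe_exit(returncode: int) -> str:
    """Turn a subprocess return code into a human-readable description."""
    if returncode < 0:
        # On POSIX, subprocess reports death by signal as the negative signal number.
        try:
            name = signal.Signals(-returncode).name
        except ValueError:
            name = f"signal {-returncode}"
        return f"terminated by signal {name}"
    return f"exited with status {returncode}"


def run_and_report(cmd):
    """Run cmd (a list of arguments), wait for it, and describe how it ended."""
    completed = subprocess.run(cmd)
    return describe_exit(completed.returncode)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python check_exit.py python -m sockeye.translate ...
    print(run_and_report(sys.argv[1:]))
```

If this reports SIGKILL, the process was stopped from outside Python (for example by the operating system), which would be consistent with no traceback appearing in the log.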
After running
python -m sockeye.translate --config small_model/args.yaml --input sentence_parallel_files/src_test.txt --output sentence_parallel_files/tgt_test.txt --models small_model --strip-unknown-words --prevent-unkI encounter a TracerWarning, then the program is killed without translation outputs, although the sockeye_translations.log does not show any errors either.The output in the terminal:
[INFO:sockeye.utils] Sockeye: 3.1.34, commit 4c30942ddb523533bccb4d2cbb3e894e45b1db93, path /Users/christyman/miniconda3/envs/ats-program-37/lib/python3.7/site-packages/sockeye/__init__.py [INFO:sockeye.utils] PyTorch: 1.13.1 (/Users/christyman/miniconda3/envs/ats-program-37/lib/python3.7/site-packages/torch/__init__.py) [INFO:sockeye.utils] Command: /Users/christyman/miniconda3/envs/ats-program-37/lib/python3.7/site-packages/sockeye/translate.py --config small_model/args.yaml --input sentence_parallel_files/src_test.txt --output sentence_parallel_files/tgt_test.txt --models small_model --strip-unknown-words --prevent-unk [INFO:sockeye.utils] Arguments: Namespace(allow_missing_params=False, amp=False, apex_amp=False, batch_sentences_multiple_of=8, batch_size=4096, batch_type='word', beam_search_stop='all', beam_size=5, bow_task_pos_weight=10, bow_task_weight=1.0, brevity_penalty_constant_length_ratio=0.0, brevity_penalty_type='none', brevity_penalty_weight=1.0, bucket_scaling=False, bucket_width=8, cache_last_best_params=0, cache_metric='perplexity', cache_strategy='best', checkpoint_improvement_threshold=0.0, checkpoint_interval=10, checkpoints=None, chunk_size=None, clamp_to_dtype=False, config='small_model/args.yaml', decode_and_evaluate=500, decoder='transformer', deepspeed_bf16=False, deepspeed_fp16=False, device_id=0, dist=False, dry_run=False, dtype='float32', embed_dropout=[0.3, 0.3], encoder='transformer', end_of_prepending_tag=None, ensemble_mode='linear', env=None, fixed_param_names=[], fixed_param_strategy=None, gradient_clipping_threshold=1.0, gradient_clipping_type='none', greedy=False, ignore_extra_params=False, initial_learning_rate=0.0002, input='sentence_parallel_files/src_test.txt', input_factors=None, json_input=False, keep_initializations=False, keep_last_params=-1, knn_index=None, knn_lambda=0.8, label_smoothing=0.3, label_smoothing_impl='mxnet', learning_rate_reduce_factor=0.9, learning_rate_reduce_num_not_improved=8, 
learning_rate_scheduler_type='plateau-reduce', learning_rate_warmup=0, length_penalty_alpha=1.0, length_penalty_beta=0.0, length_task=None, length_task_layers=1, length_task_weight=1.0, lhuc=None, local_rank=None, loglevel='INFO', loglevel_secondary_workers='INFO', max_checkpoints=None, max_input_length=None, max_num_checkpoint_not_improved=None, max_num_epochs=None, max_output_length=None, max_output_length_num_stds=2, max_samples=10000000, max_seconds=None, max_seq_len=[95, 95], max_updates=None, min_num_epochs=None, min_samples=None, min_updates=None, models=['small_model'], momentum=0.0, nbest_size=1, neural_vocab_selection=None, neural_vocab_selection_block_loss=False, no_bucketing=False, no_logfile=False, no_reload_on_learning_rate_reduce=False, num_embed=[None, None], num_layers=[3, 3], num_words=[20000, 20000], nvs_thresh=0.5, optimized_metric='bleu', optimizer='adam', optimizer_betas=[0.9, 0.999], optimizer_eps=1e-08, output='sentence_parallel_files/tgt_test.txt', output_type='translation', overwrite_output=False, pad_vocab_to_multiple_of=8, params=None, prepared_data=None, prevent_unk=True, quiet=False, quiet_secondary_workers=False, restrict_lexicon=None, restrict_lexicon_topk=None, sample=None, seed=1, shared_vocab=True, skip_nvs=False, source='../sentence_parallel_files/src_train.txt', source_factor_vocabs=[], source_factors=[], source_factors_combine=[], source_factors_num_embed=[], source_factors_share_embedding=[], source_factors_use_source_vocab=[], source_vocab=None, stop_training_on_decoder_failure=False, strip_unknown_words=True, target='../sentence_parallel_files/tgt_train.txt', target_factor_vocabs=[], target_factors=[], target_factors_combine=[], target_factors_num_embed=[], target_factors_share_embedding=[], target_factors_use_target_vocab=[], target_factors_weight=[1.0], target_vocab=None, tf32=True, transformer_activation_type=['relu', 'relu'], transformer_attention_heads=[4, 4], transformer_block_prepended_cross_attention=False, 
transformer_dropout_act=[0.1, 0.1], transformer_dropout_attention=[0.1, 0.1], transformer_dropout_prepost=[0.1, 0.1], transformer_feed_forward_num_hidden=[512, 512], transformer_feed_forward_use_glu=False, transformer_model_size=[128, 128], transformer_positional_embedding_type='fixed', transformer_postprocess=['dr', 'dr'], transformer_preprocess=['n', 'n'], update_interval=1, use_cpu=False, validation_source='../sentence_parallel_files/src_validation.txt', validation_source_factors=[], validation_target='../sentence_parallel_files/tgt_validation.txt', validation_target_factors=[], weight_decay=0.0, weight_tying_type='src_trg_softmax', word_min_count=[1, 1]) [INFO:sockeye.utils] CUDA not available, defaulting to CPU device [INFO:__main__] Translate Device: cpu [INFO:sockeye.model] Loading 1 model(s) from ['small_model'] ... [INFO:sockeye.vocab] Vocabulary (20008 words) loaded from "small_model/vocab.src.0.json" [INFO:sockeye.vocab] Vocabulary (20008 words) loaded from "small_model/vocab.trg.0.json" [INFO:sockeye.model] Model version: 3.1.34 [INFO:sockeye.model] Loaded model config from "small_model/config" [INFO:sockeye.model] Disabling dropout layers for performance reasons [INFO:sockeye.model] ModelConfig(config_data=DataConfig(data_statistics=DataStatistics(num_sents=16245, num_discarded=1, num_tokens_source=317343, num_tokens_target=177556, num_unks_source=20594, num_unks_target=6918, max_observed_len_source=83, max_observed_len_target=80, size_vocab_source=20008, size_vocab_target=20008, length_ratio_mean=0.7195621962140187, length_ratio_std=0.5842629746461441, buckets=[(8, 8), (16, 16), (24, 24), (32, 32), (40, 40), (48, 48), (56, 56), (64, 64), (72, 72), (80, 80), (88, 88), (96, 96)], num_sents_per_bucket=[888, 5731, 5333, 2625, 1170, 381, 80, 13, 5, 3, 16, 0], average_len_target_per_bucket=[6.146396396396393, 9.93160006979582, 11.494468404275306, 12.341714285714282, 12.621367521367512, 13.341207349081374, 13.412499999999998, 16.846153846153847, 18.8, 
35.666666666666664, 8.75, None], length_ratio_stats_per_bucket=[(1.3744302338052337, 0.8585728161087813), (0.949670248710557, 0.6720002114978064), (0.6097747334248025, 0.4043806126550977), (0.4536296476514696, 0.2288022427671748), (0.36131486260088264, 0.20559448455653748), (0.3143034722464711, 0.19507350713250618), (0.2827698400110712, 0.2829753677267691), (0.642658803608251, 1.160508093478276), (0.27178585119143306, 0.13837005297982366), (2.3391053391053394, 3.0600525538493457), (0.10542168674698794, 0.0822129160479288), (None, None)]), max_seq_len_source=96, max_seq_len_target=96, num_source_factors=1, num_target_factors=1, eop_id=-1), vocab_source_size=20008, vocab_target_size=20008, config_embed_source=EmbeddingConfig(vocab_size=20008, num_embed=128, dropout=0.0, num_factors=1, factor_configs=None, allow_sparse_grad=False), config_embed_target=EmbeddingConfig(vocab_size=20008, num_embed=128, dropout=0.0, num_factors=1, factor_configs=None, allow_sparse_grad=False), config_encoder=TransformerConfig(model_size=128, attention_heads=4, feed_forward_num_hidden=512, act_type='relu', num_layers=3, dropout_attention=0.0, dropout_act=0.0, dropout_prepost=0.0, positional_embedding_type='fixed', preprocess_sequence='n', postprocess_sequence='dr', max_seq_len_source=96, max_seq_len_target=96, decoder_type='transformer', block_prepended_cross_attention=False, use_lhuc=False, depth_key_value=128, use_glu=False), config_decoder=TransformerConfig(model_size=128, attention_heads=4, feed_forward_num_hidden=512, act_type='relu', num_layers=3, dropout_attention=0.0, dropout_act=0.0, dropout_prepost=0.0, positional_embedding_type='fixed', preprocess_sequence='n', postprocess_sequence='dr', max_seq_len_source=96, max_seq_len_target=96, decoder_type='transformer', block_prepended_cross_attention=False, use_lhuc=False, depth_key_value=128, use_glu=False), config_length_task=None, weight_tying_type='src_trg_softmax', lhuc=False, dtype='float32', neural_vocab_selection=None, 
neural_vocab_selection_block_loss=False) [INFO:sockeye.model] Loaded params from "small_model/params.best" to "cpu" [INFO:sockeye.model] Model dtype: overridden to float32 [INFO:sockeye.model] 1 model(s) loaded in 0.2285s [INFO:sockeye.inference] Translator (1 model(s) beam_size=5 algorithm=BeamSearch, beam_search_stop=all max_input_length=95 nbest_size=1 ensemble_mode=None max_batch_size=4096 dtype=torch.float32 skip_nvs=False nvs_thresh=0.5) [INFO:__main__] Translating... /Users/christyman/miniconda3/envs/ats-program-37/lib/python3.7/site-packages/torch/jit/_trace.py:983: TracerWarning: Encountering a list at the output of the tracer might cause the trace to be incorrect, this is only valid if the container structure does not change based on the module's inputs. Consider using a constant container instead (e.g. forlist, use atupleinstead. fordict, use aNamedTupleinstead). If you absolutely need this and know the side effects, pass strict=False to trace() to allow this behavior. argument_names, zsh: killed python -m sockeye.translate --config small_model/args.yaml --input --outputsockeye_translations.log:
[2023-10-25:20:06:15:INFO:sockeye.utils:log_sockeye_version] Sockeye: 3.1.34, commit 4c30942ddb523533bccb4d2cbb3e894e45b1db93, path /Users/christyman/miniconda3/envs/ats-program-37/lib/python3.7/site-packages/sockeye/__init__.py [2023-10-25:20:06:15:INFO:sockeye.utils:log_torch_version] PyTorch: 1.13.1 (/Users/christyman/miniconda3/envs/ats-program-37/lib/python3.7/site-packages/torch/__init__.py) [2023-10-25:20:06:15:INFO:sockeye.utils:log_basic_info] Command: /Users/christyman/miniconda3/envs/ats-program-37/lib/python3.7/site-packages/sockeye/translate.py --config ../small_model/args.yaml --input ../sentence_parallel_files/src_test.txt --output ../sockeye_translations --models ../small_model --strip-unknown-words --prevent-unk [2023-10-25:20:06:15:INFO:sockeye.utils:log_basic_info] Arguments: Namespace(allow_missing_params=False, amp=False, apex_amp=False, batch_sentences_multiple_of=8, batch_size=4096, batch_type='word', beam_search_stop='all', beam_size=5, bow_task_pos_weight=10, bow_task_weight=1.0, brevity_penalty_constant_length_ratio=0.0, brevity_penalty_type='none', brevity_penalty_weight=1.0, bucket_scaling=False, bucket_width=8, cache_last_best_params=0, cache_metric='perplexity', cache_strategy='best', checkpoint_improvement_threshold=0.0, checkpoint_interval=10, checkpoints=None, chunk_size=None, clamp_to_dtype=False, config='../small_model/args.yaml', decode_and_evaluate=500, decoder='transformer', deepspeed_bf16=False, deepspeed_fp16=False, device_id=0, dist=False, dry_run=False, dtype='float32', embed_dropout=[0.3, 0.3], encoder='transformer', end_of_prepending_tag=None, ensemble_mode='linear', env=None, fixed_param_names=[], fixed_param_strategy=None, gradient_clipping_threshold=1.0, gradient_clipping_type='none', greedy=False, ignore_extra_params=False, initial_learning_rate=0.0002, input='../sentence_parallel_files/src_test.txt', input_factors=None, json_input=False, keep_initializations=False, keep_last_params=-1, knn_index=None, knn_lambda=0.8, 
label_smoothing=0.3, label_smoothing_impl='mxnet', learning_rate_reduce_factor=0.9, learning_rate_reduce_num_not_improved=8, learning_rate_scheduler_type='plateau-reduce', learning_rate_warmup=0, length_penalty_alpha=1.0, length_penalty_beta=0.0, length_task=None, length_task_layers=1, length_task_weight=1.0, lhuc=None, local_rank=None, loglevel='INFO', loglevel_secondary_workers='INFO', max_checkpoints=None, max_input_length=None, max_num_checkpoint_not_improved=None, max_num_epochs=None, max_output_length=None, max_output_length_num_stds=2, max_samples=10000000, max_seconds=None, max_seq_len=[95, 95], max_updates=None, min_num_epochs=None, min_samples=None, min_updates=None, models=['../small_model'], momentum=0.0, nbest_size=1, neural_vocab_selection=None, neural_vocab_selection_block_loss=False, no_bucketing=False, no_logfile=False, no_reload_on_learning_rate_reduce=False, num_embed=[None, None], num_layers=[3, 3], num_words=[20000, 20000], nvs_thresh=0.5, optimized_metric='bleu', optimizer='adam', optimizer_betas=[0.9, 0.999], optimizer_eps=1e-08, output='../sockeye_translations', output_type='translation', overwrite_output=False, pad_vocab_to_multiple_of=8, params=None, prepared_data=None, prevent_unk=True, quiet=False, quiet_secondary_workers=False, restrict_lexicon=None, restrict_lexicon_topk=None, sample=None, seed=1, shared_vocab=True, skip_nvs=False, source='../sentence_parallel_files/src_train.txt', source_factor_vocabs=[], source_factors=[], source_factors_combine=[], source_factors_num_embed=[], source_factors_share_embedding=[], source_factors_use_source_vocab=[], source_vocab=None, stop_training_on_decoder_failure=False, strip_unknown_words=True, target='../sentence_parallel_files/tgt_train.txt', target_factor_vocabs=[], target_factors=[], target_factors_combine=[], target_factors_num_embed=[], target_factors_share_embedding=[], target_factors_use_target_vocab=[], target_factors_weight=[1.0], target_vocab=None, tf32=True, 
transformer_activation_type=['relu', 'relu'], transformer_attention_heads=[4, 4], transformer_block_prepended_cross_attention=False, transformer_dropout_act=[0.1, 0.1], transformer_dropout_attention=[0.1, 0.1], transformer_dropout_prepost=[0.1, 0.1], transformer_feed_forward_num_hidden=[512, 512], transformer_feed_forward_use_glu=False, transformer_model_size=[128, 128], transformer_positional_embedding_type='fixed', transformer_postprocess=['dr', 'dr'], transformer_preprocess=['n', 'n'], update_interval=1, use_cpu=False, validation_source='../sentence_parallel_files/src_validation.txt', validation_source_factors=[], validation_target='../sentence_parallel_files/tgt_validation.txt', validation_target_factors=[], weight_decay=0.0, weight_tying_type='src_trg_softmax', word_min_count=[1, 1]) [2023-10-25:20:06:15:INFO:sockeye.utils:init_device] CUDA not available, defaulting to CPU device [2023-10-25:20:06:15:INFO:__main__:run_translate] Translate Device: cpu [2023-10-25:20:06:15:INFO:sockeye.model:load_models] Loading 1 model(s) from ['../small_model'] ... 
[2023-10-25:20:06:15:INFO:sockeye.vocab:vocab_from_json] Vocabulary (20008 words) loaded from "../small_model/vocab.src.0.json"
[2023-10-25:20:06:15:INFO:sockeye.vocab:vocab_from_json] Vocabulary (20008 words) loaded from "../small_model/vocab.trg.0.json"
[2023-10-25:20:06:15:INFO:sockeye.model:load_model] Model version: 3.1.34
[2023-10-25:20:06:15:INFO:sockeye.model:load_config] Loaded model config from "../small_model/config"
[2023-10-25:20:06:15:INFO:sockeye.model:load_model] Disabling dropout layers for performance reasons
[2023-10-25:20:06:15:INFO:sockeye.model:__init__] ModelConfig(config_data=DataConfig(data_statistics=DataStatistics(num_sents=16245, num_discarded=1, num_tokens_source=317343, num_tokens_target=177556, num_unks_source=20594, num_unks_target=6918, max_observed_len_source=83, max_observed_len_target=80, size_vocab_source=20008, size_vocab_target=20008, length_ratio_mean=0.7195621962140187, length_ratio_std=0.5842629746461441, buckets=[(8, 8), (16, 16), (24, 24), (32, 32), (40, 40), (48, 48), (56, 56), (64, 64), (72, 72), (80, 80), (88, 88), (96, 96)], num_sents_per_bucket=[888, 5731, 5333, 2625, 1170, 381, 80, 13, 5, 3, 16, 0], average_len_target_per_bucket=[6.146396396396393, 9.93160006979582, 11.494468404275306, 12.341714285714282, 12.621367521367512, 13.341207349081374, 13.412499999999998, 16.846153846153847, 18.8, 35.666666666666664, 8.75, None], length_ratio_stats_per_bucket=[(1.3744302338052337, 0.8585728161087813), (0.949670248710557, 0.6720002114978064), (0.6097747334248025, 0.4043806126550977), (0.4536296476514696, 0.2288022427671748), (0.36131486260088264, 0.20559448455653748), (0.3143034722464711, 0.19507350713250618), (0.2827698400110712, 0.2829753677267691), (0.642658803608251, 1.160508093478276), (0.27178585119143306, 0.13837005297982366), (2.3391053391053394, 3.0600525538493457), (0.10542168674698794, 0.0822129160479288), (None, None)]), max_seq_len_source=96, max_seq_len_target=96, num_source_factors=1, num_target_factors=1, eop_id=-1), vocab_source_size=20008, vocab_target_size=20008, config_embed_source=EmbeddingConfig(vocab_size=20008, num_embed=128, dropout=0.0, num_factors=1, factor_configs=None, allow_sparse_grad=False), config_embed_target=EmbeddingConfig(vocab_size=20008, num_embed=128, dropout=0.0, num_factors=1, factor_configs=None, allow_sparse_grad=False), config_encoder=TransformerConfig(model_size=128, attention_heads=4, feed_forward_num_hidden=512, act_type='relu', num_layers=3, dropout_attention=0.0, dropout_act=0.0, dropout_prepost=0.0, positional_embedding_type='fixed', preprocess_sequence='n', postprocess_sequence='dr', max_seq_len_source=96, max_seq_len_target=96, decoder_type='transformer', block_prepended_cross_attention=False, use_lhuc=False, depth_key_value=128, use_glu=False), config_decoder=TransformerConfig(model_size=128, attention_heads=4, feed_forward_num_hidden=512, act_type='relu', num_layers=3, dropout_attention=0.0, dropout_act=0.0, dropout_prepost=0.0, positional_embedding_type='fixed', preprocess_sequence='n', postprocess_sequence='dr', max_seq_len_source=96, max_seq_len_target=96, decoder_type='transformer', block_prepended_cross_attention=False, use_lhuc=False, depth_key_value=128, use_glu=False), config_length_task=None, weight_tying_type='src_trg_softmax', lhuc=False, dtype='float32', neural_vocab_selection=None, neural_vocab_selection_block_loss=False)
[2023-10-25:20:06:15:INFO:sockeye.model:load_parameters] Loaded params from "../small_model/params.best" to "cpu"
[2023-10-25:20:06:15:INFO:sockeye.model:load_model] Model dtype: overridden to float32
[2023-10-25:20:06:15:INFO:sockeye.model:load_models] 1 model(s) loaded in 0.2356s
[2023-10-25:20:06:15:INFO:sockeye.inference:__init__] Translator (1 model(s) beam_size=5 algorithm=BeamSearch, beam_search_stop=all max_input_length=95 nbest_size=1 ensemble_mode=None max_batch_size=4096 dtype=torch.float32 skip_nvs=False nvs_thresh=0.5)
[2023-10-25:20:06:15:INFO:__main__:read_and_translate] Translating...