Are there any successful guides for training a multilingual model? I get a good ctc_loss after a few hundred thousand iterations, but the model always predicts only 1-2 tokens. My units.txt has around 5.7k entries, only single characters for 3 languages (plus 0-9), and my train config looks like this:
```yaml
# train_u2++_conformer.yaml
encoder: conformer
encoder_conf:
    output_size: 256
    attention_heads: 4
    linear_units: 2048
    num_blocks: 12
    dropout_rate: 0.1
    positional_dropout_rate: 0.1
    attention_dropout_rate: 0.1
    input_layer: conv2d  # conv2d / conv2d6 / conv2d8
    normalize_before: true
    cnn_module_kernel: 8
    use_cnn_module: True
    activation_type: 'swish'
    pos_enc_layer_type: 'rel_pos'
    selfattention_layer_type: 'rel_selfattn'
    causal: true
    use_dynamic_chunk: true
    cnn_module_norm: 'layer_norm'
    use_dynamic_left_chunk: false

# decoder
decoder: bitransformer
decoder_conf:
    attention_heads: 4
    linear_units: 2048
    num_blocks: 3
    r_num_blocks: 3
    dropout_rate: 0.1
    positional_dropout_rate: 0.1
    self_attention_dropout_rate: 0.1
    src_attention_dropout_rate: 0.1

tokenizer: char
tokenizer_conf:
    symbol_table_path: <PATH_TO_MY_UNITS.TXT>
    non_lang_syms_path: <PATH_TO_NON_LANG.TXT>
    split_with_space: false
    bpe_path: null
    is_multilingual: false
    num_languages: 1
    special_tokens:
        <blank>: 0
        <unk>: 1
        <sos>: 5766
        <eos>: 5766

ctc: ctc
ctc_conf:
    ctc_blank_id: 0

cmvn: null
cmvn_conf:
    cmvn_file: 'data/train/global_cmvn'
    is_json_cmvn: true

# hybrid CTC/attention
model: asr_model
model_conf:
    ctc_weight: 0.5  # started with 0.3, same result
    lsm_weight: 0.1
    length_normalized_loss: false
    reverse_weight: 0.3

dataset: asr
dataset_conf:
    filter_conf:
        max_length: 40960
        min_length: 0
        token_max_length: 200
        token_min_length: 1
    resample_conf:
        resample_rate: 16000
    speed_perturb: false
    fbank_conf:
        num_mel_bins: 80
        frame_shift: 10
        frame_length: 25
        dither: 1.0
    spec_aug: true
    spec_aug_conf:
        num_t_mask: 2
        num_f_mask: 2
        max_t: 50
        max_f: 10
    spec_sub: false  # was true
    spec_sub_conf:
        num_t_sub: 3
        max_t: 30
    spec_trim: false
    spec_trim_conf:
        max_t: 50
    shuffle: true
    shuffle_conf:
        shuffle_size: 1500
    sort: true
    sort_conf:
        sort_size: 500
    batch_conf:
        batch_type: 'dynamic'
        batch_size: 48
        max_frames_in_batch: 80000

grad_clip: 5
accum_grad: 1
max_epoch: 240
log_interval: 100

optim: adam
optim_conf:
    lr: 0.001
scheduler: warmuplr
scheduler_conf:
    warmup_steps: 25000

# Saving
save_interval: 5000
keep_checkpoint_max: 50
```