Multilingual Model Training #2807

Are there any successful guides for training a multilingual model? My ctc_loss looks good after several hundred thousand iterations, but the model always predicts only 1-2 tokens. My units.txt contains around 5.7k entries, all single characters covering 3 languages (plus the digits 0-9), and my training config looks like this:
```yaml
# train_u2++_conformer.yaml

encoder: conformer
encoder_conf:
    output_size: 256
    attention_heads: 4
    linear_units: 2048
    num_blocks: 12
    dropout_rate: 0.1
    positional_dropout_rate: 0.1
    attention_dropout_rate: 0.1
    input_layer: conv2d # conv2d / conv2d6 / conv2d8
    normalize_before: true
    cnn_module_kernel: 8
    use_cnn_module: True
    activation_type: 'swish'
    pos_enc_layer_type: 'rel_pos'
    selfattention_layer_type: 'rel_selfattn'
    causal: true
    use_dynamic_chunk: true
    cnn_module_norm: 'layer_norm'
    use_dynamic_left_chunk: false

# decoder
decoder: bitransformer
decoder_conf:
    attention_heads: 4
    linear_units: 2048
    num_blocks: 3
    r_num_blocks: 3
    dropout_rate: 0.1
    positional_dropout_rate: 0.1
    self_attention_dropout_rate: 0.1
    src_attention_dropout_rate: 0.1

tokenizer: char
tokenizer_conf:
  symbol_table_path: <PATH_TO_MY_UNITS.TXT>
  non_lang_syms_path: <PATH_TO_NON_LANG.TXT>
  split_with_space: false
  bpe_path: null
  is_multilingual: false
  num_languages: 1
  special_tokens:
    <blank>: 0
    <unk>: 1
    <sos>: 5766
    <eos>: 5766

ctc: ctc
ctc_conf:
  ctc_blank_id: 0

cmvn: null
cmvn_conf:
  cmvn_file: 'data/train/global_cmvn'
  is_json_cmvn: true

# hybrid CTC/attention
model: asr_model
model_conf:
    ctc_weight: 0.5 # started with 0.3, the same result
    lsm_weight: 0.1
    length_normalized_loss: false
    reverse_weight: 0.3

dataset: asr
dataset_conf:
    filter_conf:
        max_length: 40960
        min_length: 0
        token_max_length: 200
        token_min_length: 1
    resample_conf:
        resample_rate: 16000
    speed_perturb: false
    fbank_conf:
        num_mel_bins: 80
        frame_shift: 10
        frame_length: 25
        dither: 1.0
    spec_aug: true
    spec_aug_conf:
        num_t_mask: 2
        num_f_mask: 2
        max_t: 50
        max_f: 10
    spec_sub: false # was true
    spec_sub_conf:
        num_t_sub: 3
        max_t: 30
    spec_trim: false
    spec_trim_conf:
        max_t: 50
    shuffle: true
    shuffle_conf:
        shuffle_size: 1500
    sort: true
    sort_conf:
        sort_size: 500
    batch_conf:
        batch_type: 'dynamic'
        batch_size: 48
        max_frames_in_batch: 80000

grad_clip: 5
accum_grad: 1
max_epoch: 240
log_interval: 100


optim: adam
optim_conf:
    lr: 0.001
scheduler: warmuplr
scheduler_conf:
    warmup_steps: 25000

# Saving
save_interval: 5000
keep_checkpoint_max: 50
log_interval: 100
```
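One thing that may be worth ruling out is a mismatch between units.txt and the `special_tokens` ids in the config above. The following is only a minimal sketch and not part of the training code. It assumes units.txt holds one whitespace-separated `token id` pair per line and that sos/eos may be stored under a single shared entry such as `<sos/eos>`. The helper names `load_symbol_table` and `check_symbol_table` are hypothetical.

```python
def load_symbol_table(lines):
    """Parse 'token id' lines into a dict (assumed file format, one pair per line)."""
    table = {}
    for lineno, raw in enumerate(lines, start=1):
        line = raw.rstrip("\n")
        if not line.strip():
            continue  # skip empty lines
        parts = line.rsplit(maxsplit=1)
        if len(parts) != 2:
            raise ValueError(f"line {lineno}: expected 'token id', got {line!r}")
        token, idx = parts
        if token in table:
            raise ValueError(f"line {lineno}: duplicate token {token!r}")
        table[token] = int(idx)
    return table


def check_symbol_table(table, special_tokens):
    """Return a list of human-readable problems; an empty list means no issue found."""
    problems = []
    ids = sorted(table.values())
    # Ids should cover 0..N-1 with no gaps and no repeats.
    if ids != list(range(len(ids))):
        problems.append("ids are not a contiguous range starting at 0")
    max_id = ids[-1] if ids else -1
    for token, expected in special_tokens.items():
        if token in table and table[token] != expected:
            problems.append(
                f"{token}: file has id {table[token]}, config has {expected}"
            )
        elif not 0 <= expected <= max_id:
            problems.append(f"{token}: config id {expected} is outside 0..{max_id}")
    return problems


# Toy data for illustration only; the real file would be read with
# open(<PATH_TO_MY_UNITS.TXT>, encoding="utf-8") and passed to load_symbol_table.
toy_lines = ["<blank> 0", "<unk> 1", "a 2", "b 3", "<sos/eos> 4"]
toy_special = {"<blank>": 0, "<unk>": 1, "<sos>": 4, "<eos>": 4}
print(check_symbol_table(load_symbol_table(toy_lines), toy_special))
```

With the real file, `special_tokens` would be the four entries from `tokenizer_conf` above. Any reported problem would point at the symbol table rather than at the model hyperparameters.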
