
Commit `249b1f8` (1 parent: `e44d560`)

🚀 Supported ContextNet

31 files changed: +1501 −135 lines

README.md (22 additions, 2 deletions)

```diff
@@ -16,11 +16,12 @@
 </h2>

 <p align="center">
-TensorFlowASR implements some automatic speech recognition architectures such as DeepSpeech2, Conformer, etc. These models can be converted to TFLite to reduce memory and computation for deployment :smile:
+TensorFlowASR implements some automatic speech recognition architectures such as DeepSpeech2, Jasper, ContextNet, Conformer, etc. These models can be converted to TFLite to reduce memory and computation for deployment :smile:
 </p>

 ## What's New?

+- (12/17/2020) Supported ContextNet [http://arxiv.org/abs/2005.03191](http://arxiv.org/abs/2005.03191)
 - (12/12/2020) Add support for using masking
 - (11/14/2020) Supported Gradient Accumulation for Training in Larger Batch Size
 - (11/3/2020) Reduce differences between `librosa.stft` and `tf.signal.stft`
@@ -34,6 +35,8 @@ TensorFlowASR implements some automatic speech recognition architectures such as
 - [What's New?](#whats-new)
 - [Table of Contents](#table-of-contents)
 - [:yum: Supported Models](#yum-supported-models)
+  - [Baselines](#baselines)
+  - [Publications](#publications)
 - [Installation](#installation)
   - [Installing via PyPi](#installing-via-pypi)
   - [Installing from source](#installing-from-source)
@@ -47,21 +50,29 @@ TensorFlowASR implements some automatic speech recognition architectures such as
   - [Vietnamese](#vietnamese)
   - [German](#german)
 - [References & Credits](#references--credits)
+- [Contact](#contact)

 <!-- /TOC -->

 ## :yum: Supported Models

+### Baselines
+
 - **CTCModel** (End2end models using CTC Loss for training)
+- **Transducer Models** (End2end models using RNNT Loss for training)
+
+### Publications
+
 - **Deep Speech 2** (Reference: [https://arxiv.org/abs/1512.02595](https://arxiv.org/abs/1512.02595))
   See [examples/deepspeech2](./examples/deepspeech2)
 - **Jasper** (Reference: [https://arxiv.org/abs/1904.03288](https://arxiv.org/abs/1904.03288))
   See [examples/jasper](./examples/jasper)
-- **Transducer Models** (End2end models using RNNT Loss for training)
 - **Conformer Transducer** (Reference: [https://arxiv.org/abs/2005.08100](https://arxiv.org/abs/2005.08100))
   See [examples/conformer](./examples/conformer)
 - **Streaming Transducer** (Reference: [https://arxiv.org/abs/1811.06621](https://arxiv.org/abs/1811.06621))
   See [examples/streaming_transducer](./examples/streaming_transducer)
+- **ContextNet** (Reference: [http://arxiv.org/abs/2005.03191](http://arxiv.org/abs/2005.03191))
+  See [examples/contextnet](./examples/contextnet)

 ## Installation

@@ -104,6 +115,8 @@ python setup.py install

 - For _enabling XLA_, run `TF_XLA_FLAGS=--tf_xla_auto_jit=2 python3 $path_to_py_script`)

+- For _hiding warnings_, run `export TF_CPP_MIN_LOG_LEVEL=2` before running any examples
+
 ## TFLite Convertion

 After converting to tflite, the tflite model is like a function that transforms directly from an **audio signal** to **unicode code points**, then we can convert unicode points to string.
@@ -199,3 +212,10 @@ For pretrained models, go to [drive](https://drive.google.com/drive/folders/1BD0
 2. [https://github.com/noahchalifour/warp-transducer](https://github.com/noahchalifour/warp-transducer)
 3. [Sequence Transduction with Recurrent Neural Network](https://arxiv.org/abs/1211.3711)
 4. [End-to-End Speech Processing Toolkit in PyTorch](https://github.com/espnet/espnet)
+5. [https://github.com/iankur/ContextNet](https://github.com/iankur/ContextNet)
+
+## Contact
+
+Huy Le Nguyen
```
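The README's TFLite note (quoted as context in the diff above) says the converted model maps an audio signal directly to unicode code points, which are then turned into a string. That last step is just a `chr`-join; a minimal sketch with made-up example values (not real model output, whose shape depends on the model):

```python
# Hypothetical post-processing step: join the model's emitted
# unicode code points into a readable transcript string.
def decode_code_points(code_points):
    return "".join(chr(cp) for cp in code_points)

# Example values only.
print(decode_code_points([104, 101, 108, 108, 111]))  # -> hello
```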

examples/conformer/README.md (3 additions, 3 deletions)

```diff
@@ -84,11 +84,11 @@ learning_config:
 ## Usage

-Training, see `python examples/conformer/train_conformer.py --help`
+Training, see `python examples/conformer/train_*.py --help`

-Testing, see `python examples/conformer/train_conformer.py --help`
+Testing, see `python examples/conformer/test_*.py --help`

-TFLite Conversion, see `python examples/conformer/tflite_conformer.py --help`
+TFLite Conversion, see `python examples/conformer/tflite_*.py --help`

 ## Conformer Subwords - Results on LibriSpeech
```

examples/conformer/config.yml (6 additions, 6 deletions)

```diff
@@ -48,7 +48,7 @@ model_config:
   encoder_fc_factor: 0.5
   encoder_dropout: 0.1
   prediction_embed_dim: 320
-  prediction_embed_dropout: 0.1
+  prediction_embed_dropout: 0
   prediction_num_rnns: 1
   prediction_rnn_units: 320
   prediction_rnn_type: lstm
@@ -70,12 +70,12 @@ learning_config:

   dataset_config:
     train_paths:
-      - /mnt/d/Datasets/Speech/LibriSpeech/train-clean-100/transcripts.tsv
+      - /mnt/Miscellanea/Datasets/Speech/LibriSpeech/train-clean-100/transcripts.tsv
     eval_paths:
-      - /mnt/d/Datasets/Speech/LibriSpeech/dev-clean/transcripts.tsv
-      - /mnt/d/Datasets/Speech/LibriSpeech/dev-other/transcripts.tsv
+      - /mnt/Miscellanea/Datasets/Speech/LibriSpeech/dev-clean/transcripts.tsv
+      - /mnt/Miscellanea/Datasets/Speech/LibriSpeech/dev-other/transcripts.tsv
     test_paths:
-      - /mnt/d/Datasets/Speech/LibriSpeech/test-clean/transcripts.tsv
+      - /mnt/Miscellanea/Datasets/Speech/LibriSpeech/test-clean/transcripts.tsv
     tfrecords_dir: null

   optimizer_config:
@@ -88,7 +88,7 @@ learning_config:
     batch_size: 2
     accumulation_steps: 4
     num_epochs: 20
-    outdir: /mnt/d/Models/local/conformer
+    outdir: /mnt/Miscellanea/Models/local/conformer
     log_interval_steps: 300
     eval_interval_steps: 500
     save_interval_steps: 1000
```
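The config pairs `batch_size: 2` with `accumulation_steps: 4`: gradients from 4 micro-batches are accumulated before each optimizer update, for an effective batch size of 8. A framework-free sketch of that accumulation pattern, with scalar "gradients" standing in for real tensors (illustrative, not the repo's trainer code):

```python
# Average each run of `accumulation_steps` micro-batch gradients
# into one optimizer update, as gradient accumulation does.
def accumulate_updates(grads, accumulation_steps):
    updates, accum = [], 0.0
    for i, g in enumerate(grads, start=1):
        accum += g
        if i % accumulation_steps == 0:
            updates.append(accum / accumulation_steps)
            accum = 0.0
    return updates

# 8 micro-batches with accumulation_steps=4 -> 2 optimizer updates,
# i.e. each update sees batch_size * accumulation_steps = 8 examples.
print(len(accumulate_updates([0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8], 4)))  # -> 2
```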

examples/conformer/train_conformer.py (2 additions, 2 deletions)

```diff
@@ -113,9 +113,9 @@
 optimizer_config = config.learning_config.optimizer_config
 optimizer = tf.keras.optimizers.Adam(
     TransformerSchedule(
-        d_model=config.model_config["encoder_dmodel"],
+        d_model=conformer.dmodel,
         warmup_steps=optimizer_config["warmup_steps"],
-        max_lr=(0.05 / math.sqrt(config.model_config["encoder_dmodel"]))
+        max_lr=(0.05 / math.sqrt(conformer.dmodel))
     ),
     beta_1=optimizer_config["beta1"],
     beta_2=optimizer_config["beta2"],
```

examples/conformer/train_ga_conformer.py (2 additions, 2 deletions)

```diff
@@ -115,9 +115,9 @@

 optimizer = tf.keras.optimizers.Adam(
     TransformerSchedule(
-        d_model=config.model_config["encoder_dmodel"],
+        d_model=conformer.dmodel,
         warmup_steps=config.learning_config.optimizer_config["warmup_steps"],
-        max_lr=(0.05 / math.sqrt(config.model_config["encoder_dmodel"]))
+        max_lr=(0.05 / math.sqrt(conformer.dmodel))
     ),
     beta_1=config.learning_config.optimizer_config["beta1"],
     beta_2=config.learning_config.optimizer_config["beta2"],
```

examples/conformer/train_ga_subword_conformer.py (2 additions, 2 deletions)

```diff
@@ -131,9 +131,9 @@

 optimizer = tf.keras.optimizers.Adam(
     TransformerSchedule(
-        d_model=config.model_config["encoder_dmodel"],
+        d_model=conformer.dmodel,
         warmup_steps=config.learning_config.optimizer_config["warmup_steps"],
-        max_lr=(0.05 / math.sqrt(config.model_config["encoder_dmodel"]))
+        max_lr=(0.05 / math.sqrt(conformer.dmodel))
     ),
     beta_1=config.learning_config.optimizer_config["beta1"],
     beta_2=config.learning_config.optimizer_config["beta2"],
```

examples/conformer/train_subword_conformer.py (2 additions, 2 deletions)

```diff
@@ -128,9 +128,9 @@

 optimizer = tf.keras.optimizers.Adam(
     TransformerSchedule(
-        d_model=config.model_config["encoder_dmodel"],
+        d_model=conformer.dmodel,
         warmup_steps=config.learning_config.optimizer_config["warmup_steps"],
-        max_lr=(0.05 / math.sqrt(config.model_config["encoder_dmodel"]))
+        max_lr=(0.05 / math.sqrt(conformer.dmodel))
     ),
     beta_1=config.learning_config.optimizer_config["beta1"],
     beta_2=config.learning_config.optimizer_config["beta2"],
```
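The same two-line change repeats across all four training scripts: the schedule now reads `dmodel` off the constructed model instead of the config dict, so the learning rate always matches the encoder actually built. Assuming `TransformerSchedule` follows the standard Transformer (Noam) schedule with the `max_lr` cap these scripts pass in, its shape can be sketched in plain Python (function name and the `d_model=144` default here are illustrative, not the repo's API):

```python
import math

def transformer_lr(step, d_model, warmup_steps, max_lr=None):
    """Noam schedule: linear warmup for `warmup_steps` steps, then
    1/sqrt(step) decay, scaled by d_model**-0.5; optionally capped."""
    lr = (d_model ** -0.5) * min(step ** -0.5, step * warmup_steps ** -1.5)
    return min(lr, max_lr) if max_lr is not None else lr

# The rate peaks at step == warmup_steps and decays afterwards;
# max_lr=0.05/sqrt(d_model) mirrors the cap passed in the scripts.
peak = transformer_lr(10_000, d_model=144, warmup_steps=10_000,
                      max_lr=0.05 / math.sqrt(144))
later = transformer_lr(40_000, d_model=144, warmup_steps=10_000,
                       max_lr=0.05 / math.sqrt(144))
assert later < peak
```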
