|
1 | 1 | # Changelog |
2 | 2 |
|
| 3 | +## 0.6.0 (05/05/2022) |
| 4 | + |
| 5 | +### Highlights |
| 6 | + |
| 7 | +1. A new recognition algorithm, [MASTER](https://arxiv.org/abs/1910.02562), has been added to MMOCR. It was the winning solution for the "ICDAR 2021 Competition on Scientific Table Image Recognition to Latex", and a model pre-trained on SynthText and MJSynth is available for testing. Credit to @JiaquanYe |
| 8 | +2. [DBNet++](https://arxiv.org/abs/2202.10304) is now available. It adds an Adaptive Scale Fusion module for feature enhancement, which helps the new model achieve a 2% higher h-mean score than its predecessor on the ICDAR2015 dataset. |
| 9 | +3. Three more dataset converters are added: LSVT, RCTW and HierText. Check the dataset zoo ([Det](https://mmocr.readthedocs.io/en/latest/datasets/det.html#) & [Recog](https://mmocr.readthedocs.io/en/latest/datasets/recog.html)) for further information. |
| 10 | +4. To improve data storage efficiency, MMOCR now supports loading both images and labels from .lmdb annotations for text recognition tasks. To use this feature, pack your cropped images and labels into an lmdb file with the new lmdb_converter.py. For a detailed tutorial, see the Lmdb Dataset section below and the [doc](https://mmocr.readthedocs.io/en/latest/tools.html#convert-text-recognition-dataset-to-lmdb-format). |
| 11 | +5. Testing models on multiple datasets is a widely used evaluation strategy. MMOCR now supports automatically reporting mean scores when there is more than one dataset to evaluate, which enables a more convenient comparison between checkpoints. [Doc](https://mmocr.readthedocs.io/en/latest/tutorials/dataset_types.html#getting-mean-evaluation-scores) |
| 12 | +6. Evaluation is now more flexible and customizable. For text detection tasks, you can set the range of score thresholds to search for the best results. ([Doc](https://mmocr.readthedocs.io/en/latest/tutorials/dataset_types.html#evaluation)) If too many results flood your text recognition training log, you can trim it by specifying a subset of metrics in the evaluation config. Check out the [Evaluation](https://mmocr.readthedocs.io/en/latest/tutorials/dataset_types.html#ocrdataset) section for details. |
| 13 | +7. MMOCR provides a script to convert the .json labels produced by the popular annotation toolkit **Labelme** to the MMOCR-supported data format. @Y-M-Y contributed a log analysis tool that helps users gain a better understanding of the entire training process. Read the [tutorial docs](https://mmocr.readthedocs.io/en/latest/tools.html) to get started. |
| 14 | + |
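The mean-score reporting described in highlight 5 can be pictured with the following minimal sketch. It is not MMOCR's implementation: the function name `add_mean_scores`, the dataset names, the metric name `word_acc`, and the `mean_` key prefix are all assumptions made for illustration.

```python
from statistics import mean
from typing import Dict


def add_mean_scores(per_dataset: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Average every metric shared by all datasets (hypothetical helper).

    Returns an empty dict when fewer than two datasets were evaluated,
    since a mean is only informative across multiple datasets.
    """
    if len(per_dataset) < 2:
        return {}
    # Only metrics reported for every dataset can be averaged fairly.
    shared = set.intersection(*(set(scores) for scores in per_dataset.values()))
    return {
        f'mean_{metric}': mean(scores[metric] for scores in per_dataset.values())
        for metric in sorted(shared)
    }


# Hypothetical evaluation results for two datasets.
results = {
    'dataset_a': {'word_acc': 0.5},
    'dataset_b': {'word_acc': 0.75},
}
mean_scores = add_mean_scores(results)
```

A single averaged number per metric makes it easier to rank checkpoints that trade accuracy between datasets.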
| 15 | +### Lmdb Dataset |
| 16 | + |
| 17 | +Reading images or labels from individual files can be slow when the dataset is very large, e.g. on the scale of millions of samples. Besides, in academia, most scene text recognition datasets are stored in lmdb format, including both images and labels. To follow mainstream practice and improve data storage efficiency, MMOCR now officially supports loading images and labels from lmdb datasets via a new pipeline [LoadImageFromLMDB](https://github.com/open-mmlab/mmocr/blob/878383b9de8d0e598f31fbb844ffcb0c305deb8b/mmocr/datasets/pipelines/loading.py#L140). |
| 18 | +This section is intended to serve as a quick walkthrough for you to master this update and apply it to facilitate your research. |
| 19 | + |
| 20 | +#### Specifications |
| 21 | + |
| 22 | +To better align with the academic community, MMOCR now requires the following specifications for lmdb datasets: |
| 23 | + |
| 24 | + * The parameter describing the data volume of the dataset is `num-samples` instead of `total_number` (deprecated). |
| 25 | + * Images and labels are stored with keys in the form of `image-000000001` and `label-000000001`, respectively. |
| 26 | + |
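As a rough illustration of the key layout specified above, the sketch below builds the `image-…`/`label-…` keys and walks a toy key-value store. A plain `dict` stands in for an lmdb transaction so the example runs without extra dependencies. The 1-based indexing, the 9-digit zero padding (inferred from `image-000000001`), the UTF-8 encoding of values, and the helper names `sample_keys` and `iter_samples` are assumptions, not MMOCR APIs.

```python
from typing import Iterator, Mapping, Tuple


def sample_keys(index: int) -> Tuple[bytes, bytes]:
    """Return the (image, label) keys for a sample index (assumed 1-based)."""
    if index < 1:
        raise ValueError('sample indices are assumed to start at 1')
    # Zero-pad to 9 digits, matching the `image-000000001` example.
    return (f'image-{index:09d}'.encode('utf-8'),
            f'label-{index:09d}'.encode('utf-8'))


def iter_samples(store: Mapping[bytes, bytes]) -> Iterator[Tuple[bytes, str]]:
    """Yield (image_bytes, label_text) pairs from a key-value mapping."""
    # `num-samples` is assumed to hold the sample count as a decimal string.
    num_samples = int(store[b'num-samples'].decode('utf-8'))
    for index in range(1, num_samples + 1):
        image_key, label_key = sample_keys(index)
        yield store[image_key], store[label_key].decode('utf-8')


# A plain dict stands in for an lmdb transaction in this sketch.
toy_store = {
    b'num-samples': b'2',
    b'image-000000001': b'<encoded image bytes>',
    b'label-000000001': b'HELLO',
    b'image-000000002': b'<encoded image bytes>',
    b'label-000000002': b'WORLD',
}
samples = list(iter_samples(toy_store))
```

With a real lmdb file, the same lookups would go through a read transaction's `get` method instead of dictionary indexing.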
| 27 | + |
| 28 | +#### Usage |
| 29 | + |
| 30 | +1. Use an existing academic lmdb dataset if it meets the specifications, or use the tool provided by MMOCR to pack images and annotations into an lmdb dataset. |
| 31 | + |
| 32 | + - Previously, MMOCR had a function `txt2lmdb` (deprecated) that only supported converting labels to lmdb format. However, it is quite different from academic lmdb datasets, which usually contain both images and labels. Now MMOCR provides a new utility [lmdb_converter](https://github.com/open-mmlab/mmocr/blob/main/tools/data/utils/lmdb_converter.py) to convert recognition datasets with both images and labels to lmdb format. |
| 33 | + - Say that your recognition data in MMOCR's format are organized as follows. (See an example in [ocr_toy_dataset](https://github.com/open-mmlab/mmocr/tree/main/tests/data/ocr_toy_dataset)). |
| 34 | + |
| 35 | + ```text |
| 36 | + # Directory structure |
| 37 | +
|
| 38 | + ├──img_path |
| 39 | + | |—— img1.jpg |
| 40 | + | |—— img2.jpg |
| 41 | + | |—— ... |
| 42 | + |——label.txt (or label.jsonl) |
| 43 | +
|
| 44 | + # Annotation format |
| 45 | +
|
| 46 | + label.txt: img1.jpg HELLO |
| 47 | + img2.jpg WORLD |
| 48 | + ... |
| 49 | +
|
| 50 | + label.jsonl: {"filename": "img1.jpg", "text": "HELLO"} |
| 51 | + {"filename": "img2.jpg", "text": "WORLD"} |
| 52 | + ... |
| 53 | + ``` |
| 54 | +
|
| 55 | + - Then pack these files up: |
| 56 | +
|
| 57 | + ```bash |
| 58 | + python tools/data/utils/lmdb_converter.py {PATH_TO_LABEL} {OUTPUT_PATH} --i {PATH_TO_IMAGES} |
| 59 | + ``` |
| 60 | +
|
| 61 | + - Check out [tools.md](https://github.com/open-mmlab/mmocr/blob/main/docs/en/tools.md) for more details. |
| 62 | +
|
| 63 | +2. The second step is to modify the configuration files. For example, to train CRNN on MJ and ST datasets: |
| 64 | +
|
| 65 | + - Set parser as `LineJsonParser` and `file_format` as 'lmdb' in [dataset config](https://github.com/open-mmlab/mmocr/blob/main/configs/_base_/recog_datasets/ST_MJ_train.py#L9) |
| 66 | +
|
| 67 | + ```python |
| 68 | + # configs/_base_/recog_datasets/ST_MJ_train.py |
| 69 | + train1 = dict( |
| 70 | + type='OCRDataset', |
| 71 | + img_prefix=train_img_prefix1, |
| 72 | + ann_file=train_ann_file1, |
| 73 | + loader=dict( |
| 74 | + type='AnnFileLoader', |
| 75 | + repeat=1, |
| 76 | + file_format='lmdb', |
| 77 | + parser=dict( |
| 78 | + type='LineJsonParser', |
| 79 | + keys=['filename', 'text'], |
| 80 | + )), |
| 81 | + pipeline=None, |
| 82 | + test_mode=False) |
| 83 | + ``` |
| 84 | + - Use `LoadImageFromLMDB` in [pipeline](https://github.com/open-mmlab/mmocr/blob/main/configs/_base_/recog_pipelines/crnn_pipeline.py#L4): |
| 85 | +
|
| 86 | + ```python |
| 87 | + # configs/_base_/recog_pipelines/crnn_pipeline.py |
| 88 | + train_pipeline = [ |
| 89 | + dict(type='LoadImageFromLMDB', color_type='grayscale'), |
| 90 | + ... |
| 91 | + ``` |
| 92 | +
|
| 93 | +3. You are good to go! Start training and MMOCR will load data from your lmdb dataset. |
| 94 | +
|
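To make the two annotation formats from step 1 concrete, here is a minimal sketch of how a single line of `label.txt` or `label.jsonl` could be split into a filename and its text. The function `parse_annotation_line` is hypothetical and is not part of MMOCR; it assumes that a `.txt` line separates the filename from the text with the first space, and that each `.jsonl` line is a valid JSON object with `filename` and `text` keys.

```python
import json
from typing import Tuple


def parse_annotation_line(line: str, file_format: str) -> Tuple[str, str]:
    """Split one annotation line into (filename, text) (hypothetical helper)."""
    line = line.strip()
    if file_format == 'jsonl':
        # Each line is assumed to be a standalone JSON object.
        record = json.loads(line)
        return record['filename'], record['text']
    if file_format == 'txt':
        # Split on the first space only, so the text itself may contain spaces.
        filename, text = line.split(' ', 1)
        return filename, text
    raise ValueError(f'unsupported annotation format: {file_format}')


txt_sample = parse_annotation_line('img1.jpg HELLO', 'txt')
jsonl_sample = parse_annotation_line(
    '{"filename": "img2.jpg", "text": "WORLD"}', 'jsonl')
```

Running a check like this over a label file before packing can catch malformed lines early.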
| 95 | +### New Features & Enhancements |
| 96 | +
|
| 97 | +* Add analyze_logs in tools and its description in docs by @Y-M-Y in https://github.com/open-mmlab/mmocr/pull/899 |
| 98 | +* Add LSVT Data Converter by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/896 |
| 99 | +* Add RCTW dataset converter by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/914 |
| 100 | +* Support computing mean scores in UniformConcatDataset by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/981 |
| 101 | +* Support loading images and labels from lmdb file by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/982 |
| 102 | +* Add recog2lmdb and new toy dataset files by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/979 |
| 103 | +* Add labelme converter for textdet and textrecog by @cuhk-hbsun in https://github.com/open-mmlab/mmocr/pull/972 |
| 104 | +* Update CircleCI configs by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/918 |
| 105 | +* Update Git Action by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/930 |
| 106 | +* More customizable fields in dataloaders by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/933 |
| 107 | +* Skip CIs when docs are modified by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/941 |
| 108 | +* Rename Github tests, fix ignored paths by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/946 |
| 109 | +* Support latest MMCV by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/959 |
| 110 | +* Support dynamic threshold range in eval_hmean by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/962 |
| 111 | +* Update the version requirement of mmdet in docker by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/966 |
| 112 | +* Replace `opencv-python-headless` with `opencv-python` by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/970 |
| 113 | +* Update Dataset Configs by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/980 |
| 114 | +* Add SynthText dataset config by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/983 |
| 115 | +* Automatically report mean scores when applicable by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/995 |
| 116 | +* Add DBNet++ by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/973 |
| 117 | +* Add MASTER by @JiaquanYe in https://github.com/open-mmlab/mmocr/pull/807 |
| 118 | +* Allow choosing metrics to report in text recognition tasks by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/989 |
| 119 | +* Add HierText converter by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/948 |
| 120 | +* Fix lint_only in CircleCI by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/998 |
| 121 | +
|
| 122 | +### Bug Fixes |
| 123 | +
|
| 124 | +* Fix CircleCi Main Branch Accidentally Run PR Stage Test by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/927 |
| 125 | +* Fix a deprecation warning about mmdet.datasets.pipelines.formating by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/944 |
| 126 | +* Fix a Bug in ResNet plugin by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/967 |
| 127 | +* revert a wrong setting in db_r18 cfg by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/978 |
| 128 | +* Fix TotalText Anno version issue by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/945 |
| 129 | +* Update installation step of `albumentations` by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/984 |
| 130 | +* Fix ImgAug transform by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/949 |
| 131 | +* Fix GPG key error in CI and docker by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/988 |
| 132 | +* update label.lmdb by @Mountchicken in https://github.com/open-mmlab/mmocr/pull/991 |
| 133 | +* correct meta key by @garvan2021 in https://github.com/open-mmlab/mmocr/pull/926 |
| 134 | +* Use new image by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/976 |
| 135 | +* Fix Data Converter Issues by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/955 |
| 136 | +
|
| 137 | +### Docs |
| 138 | +
|
| 139 | +* Update CONTRIBUTING.md by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/905 |
| 140 | +* Fix the misleading description in test.py by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/908 |
| 141 | +* Update recog.md for lmdb Generation by @xinke-wang in https://github.com/open-mmlab/mmocr/pull/934 |
| 142 | +* Add MMCV by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/954 |
| 143 | +* Add wechat QR code to CN readme by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/960 |
| 144 | +* Update CONTRIBUTING.md by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/947 |
| 145 | +* Use QR codes from MMCV by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/971 |
| 146 | +* Renew dataset_types.md by @gaotongxiao in https://github.com/open-mmlab/mmocr/pull/997 |
| 147 | +
|
| 148 | +### New Contributors |
| 149 | +* @Y-M-Y made their first contribution in https://github.com/open-mmlab/mmocr/pull/899 |
| 150 | +
|
| 151 | +**Full Changelog**: https://github.com/open-mmlab/mmocr/compare/v0.5.0...v0.6.0 |
| 152 | +
|
3 | 153 | ## 0.5.0 (31/03/2022) |
4 | 154 |
|
5 | 155 | ### Highlights |
|