feat (WIP): bring png OCR scanning support (#1670)

* Add pytesseract

* Add simple ocr endpoint

replace extension argument

* feat/ocr-editor gui

* fix frontend linting issues

* Add service unit tests

* Add split text modes & single ingredient/instruction editing

* make split mode really reactive

* Remove default step and ingredient

* make the linter happy

* Accept only image uploads

* Add automatic recipe title suggestion

* Correct regex

* fix incorrect array.map method usage

* make the linter happy again

* Swap route to use asset name

* Rearrange buttons

* fix test data

* feat: Allow making image the recipe image

* Add translation

* Make the linter happy

* Restrict function setPropertyValueByPath generic

* Restrict template literal type

* Add a more friendly icon to creation page

* update poetry lock file

* Correct sloppy ocr classes

* Make MyPy happy

* Rewrite safer tests

* Add tesseract to backend test CI container dependencies

* Make canvas element a component global

* Remove unwanted spaces in selected text

* Add way to know if recipe was created with ocr

* Access to ocr-editor for ocr recipes

* Update Alembic revision

* Make the frontend build

* Fix scrolling offset bug

* Allow creation of recipes with custom settings

* Fix rebasing mistakes

* Add format_tsv_output test

* Exclude the tests data directory only

* Enforce camelCase for frontend functions

* Remove import of unused component

* Fix type and class initialization

* Add multi-language support

* Highlight words in mount

* Fix image ratio bug

* Better ocr creation page

* Revert awkward feature to scroll in Selection mode

* Rebasing alembic migrations sux

* Remove obsolete getShared function

* Add function docstring

* Move down ocr creation option

* Make toolbar icons more generic

* Show help at the bottom of the page

* move ocr types to own file

* Use template ref for the canvas

* Use i18n.tc to get strings directly

* Correct naming mistake

* Move Ocr editor to own directory

* Create Ocr Editor parts

* Safeguard recipe properties access

* Add loading frontend animation due to longer request time

* minor cleanup chores

Co-authored-by: Miroito <alban.vachette@gmail.com>
Hayden
2022-09-25 15:00:45 -08:00
committed by GitHub
parent a8f3922907
commit 39adea4ee3
44 changed files with 1659 additions and 34 deletions


@@ -0,0 +1,73 @@
level page_num block_num par_num line_num word_num left top width height conf text
1 1 0 0 0 0 0 0 640 480 -1
2 1 1 0 0 0 36 92 582 269 -1
3 1 1 1 0 0 36 92 582 92 -1
4 1 1 1 1 0 36 92 544 30 -1
5 1 1 1 1 1 36 92 60 24 87.137558 This
5 1 1 1 1 2 109 92 20 24 87.137558 is
5 1 1 1 1 3 141 98 15 18 87.823906 a
5 1 1 1 1 4 169 92 32 24 87.823906 lot
5 1 1 1 1 5 212 92 28 24 92.965874 of
5 1 1 1 1 6 251 92 31 24 93.247513 12
5 1 1 1 1 7 296 92 68 30 92.734741 point
5 1 1 1 1 8 374 93 53 23 92.996040 text
5 1 1 1 1 9 437 93 26 23 93.160057 to
5 1 1 1 1 10 474 93 52 23 92.312637 test
5 1 1 1 1 11 536 92 44 24 92.312637 the
4 1 1 1 2 0 36 126 582 31 -1
5 1 1 1 2 1 36 132 45 18 90.505524 ocr
5 1 1 1 2 2 91 126 69 24 90.505524 code
5 1 1 1 2 3 172 126 51 24 91.169167 and
5 1 1 1 2 4 236 132 50 18 89.765854 see
5 1 1 1 2 5 299 126 15 24 85.827324 if
5 1 1 1 2 6 325 126 14 24 93.116241 it
5 1 1 1 2 7 348 126 85 24 92.394562 works
5 1 1 1 2 8 445 132 33 18 30.119690 on
5 1 1 1 2 9 500 126 29 24 30.119690 all
5 1 1 1 2 10 541 127 77 30 92.090988 types
4 1 1 1 3 0 36 160 187 24 -1
5 1 1 1 3 1 36 160 28 24 92.476135 of
5 1 1 1 3 2 72 160 41 24 90.919365 file
5 1 1 1 3 3 123 160 100 24 91.360367 format.
3 1 1 2 0 0 36 194 561 167 -1
4 1 1 2 1 0 36 194 549 31 -1
5 1 1 2 1 1 36 194 55 24 89.098892 The
5 1 1 2 1 2 102 194 75 30 89.098892 quick
5 1 1 2 1 3 189 194 85 24 91.415680 brown
5 1 1 2 1 4 287 194 52 31 91.943085 dog
5 1 1 2 1 5 348 194 108 31 92.167969 jumped
5 1 1 2 1 6 468 200 63 18 91.970985 over
5 1 1 2 1 7 540 194 45 24 92.843704 the
4 1 1 2 2 0 37 228 548 31 -1
5 1 1 2 2 1 37 228 55 31 92.262550 lazy
5 1 1 2 2 2 103 228 50 24 92.693161 fox.
5 1 1 2 2 3 165 228 55 24 92.947639 The
5 1 1 2 2 4 232 228 75 30 90.589806 quick
5 1 1 2 2 5 319 228 85 24 91.051247 brown
5 1 1 2 2 6 417 228 51 31 91.925011 dog
5 1 1 2 2 7 478 228 107 31 91.471077 jumped
4 1 1 2 3 0 36 262 561 31 -1
5 1 1 2 3 1 36 268 63 18 90.210129 over
5 1 1 2 3 2 109 262 44 24 90.210129 the
5 1 1 2 3 3 165 262 56 31 91.178192 lazy
5 1 1 2 3 4 231 262 50 24 92.794647 fox.
5 1 1 2 3 5 294 262 55 24 91.388016 The
5 1 1 2 3 6 360 262 75 30 92.525742 quick
5 1 1 2 3 7 447 262 85 24 90.425552 brown
5 1 1 2 3 8 545 262 52 31 90.425552 dog
4 1 1 2 4 0 43 296 518 31 -1
5 1 1 2 4 1 43 296 107 31 91.759590 jumped
5 1 1 2 4 2 162 302 64 18 92.923576 over
5 1 1 2 4 3 235 296 44 24 92.017929 the
5 1 1 2 4 4 292 296 55 31 91.558884 lazy
5 1 1 2 4 5 357 296 50 24 92.687485 fox.
5 1 1 2 4 6 420 296 55 24 91.922661 The
5 1 1 2 4 7 486 296 75 30 91.870224 quick
4 1 1 2 5 0 37 330 524 31 -1
5 1 1 2 5 1 37 330 85 24 92.923935 brown
5 1 1 2 5 2 135 330 52 31 91.468765 dog
5 1 1 2 5 3 196 330 108 31 91.425491 jumped
5 1 1 2 5 4 316 336 63 18 91.489830 over
5 1 1 2 5 5 388 330 45 24 91.740379 the
5 1 1 2 5 6 445 330 55 31 92.110054 lazy
5 1 1 2 5 7 511 330 50 24 93.180054 fox.
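
The fixture above follows Tesseract's TSV layout: rows at levels 1 to 4 describe the page, block, paragraph, and line boxes (with `conf` of -1 and no text), and level 5 rows carry one recognized word each. As a minimal sketch of how such output might be turned back into text lines, the Python below parses the rows and groups the words by line. The names `OcrWord`, `parse_tsv`, and `words_to_lines` are hypothetical and are not taken from this commit; this is not the `format_tsv_output` implementation the commit message mentions. It also assumes fields can be split on any whitespace, as they are displayed here (Tesseract itself emits tabs).

```python
from dataclasses import dataclass
from itertools import groupby
from typing import List


@dataclass
class OcrWord:
    """One row of Tesseract TSV output (hypothetical container)."""
    level: int
    page_num: int
    block_num: int
    par_num: int
    line_num: int
    word_num: int
    left: int
    top: int
    width: int
    height: int
    conf: float
    text: str = ""


def parse_tsv(tsv: str) -> List[OcrWord]:
    """Parse TSV text into rows, skipping the header line."""
    rows = []
    for line in tsv.strip().splitlines()[1:]:
        parts = line.split()
        if len(parts) < 11:
            continue  # not a data row
        ints = [int(p) for p in parts[:10]]
        conf = float(parts[10])
        # Structural rows (levels 1-4) have no text column.
        text = " ".join(parts[11:])
        rows.append(OcrWord(*ints, conf, text))
    return rows


def words_to_lines(rows: List[OcrWord], min_conf: float = 0.0) -> List[str]:
    """Join level-5 words into lines, dropping words below min_conf.

    Relies on rows being in reading order, as in the fixture,
    because groupby only merges consecutive items.
    """
    words = [r for r in rows if r.level == 5 and r.conf >= min_conf]

    def line_key(w: OcrWord):
        return (w.page_num, w.block_num, w.par_num, w.line_num)

    return [" ".join(w.text for w in grp) for _, grp in groupby(words, key=line_key)]


# A few rows copied from the fixture above.
SAMPLE = """level page_num block_num par_num line_num word_num left top width height conf text
4 1 1 1 2 0 36 126 582 31 -1
5 1 1 1 2 7 348 126 85 24 92.394562 works
5 1 1 1 2 8 445 132 33 18 30.119690 on
5 1 1 1 2 9 500 126 29 24 30.119690 all
5 1 1 1 2 10 541 127 77 30 92.090988 types
4 1 1 1 3 0 36 160 187 24 -1
5 1 1 1 3 1 36 160 28 24 92.476135 of
5 1 1 1 3 2 72 160 41 24 90.919365 file
5 1 1 1 3 3 123 160 100 24 91.360367 format.
"""

if __name__ == "__main__":
    rows = parse_tsv(SAMPLE)
    print(words_to_lines(rows))               # all words
    print(words_to_lines(rows, min_conf=50))  # drops "on" and "all"
```

The `min_conf` threshold is only an illustration of how the two low-confidence words in the fixture ("on" and "all", both at 30.119690) could be filtered; whether the commit's own code filters by confidence is not shown on this page.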