feat (WIP): bring png OCR scanning support (#1670)

* Add pytesseract

* Add simple ocr endpoint

replace extension argument

* feat/ocr-editor gui

* fix frontend linting issues

* Add service unit tests

* Add split text modes & single ingredient/instruction editing

* make split mode really reactive

* Remove default step and ingredient

* make the linter happy

* Accept only image uploads

* Add automatic recipe title suggestion

* Correct regex

* fix incorrect array.map method usage

* make the linter happy again

* Swap route to use asset name

* Rearrange buttons

* fix test data

* feat: Allow making image the recipe image

* Add translation

* Make the linter happy

* Restrict function setPropertyValueByPath generic

* Restrict template literal type

* Add a more friendly icon to creation page

* update poetry lock file

* Correct sloppy ocr classes

* Make MyPy happy

* Rewrite safer tests

* Add tesseract to backend test CI container dependencies

* Make canvas element a component global

* Remove unwanted spaces in selected text

* Add way to know if recipe was created with ocr

* Access to ocr-editor for ocr recipes

* Update Alembic revision

* Make the frontend build

* Fix scrolling offset bug

* Allow creation of recipes with custom settings

* Fix rebasing mistakes

* Add format_tsv_output test

* Exclude the tests data directory only

* Enforce camelCase for frontend functions

* Remove import of unused component

* Fix type and class initialization

* Add multi-language support

* Highlight words in mount

* Fix image ratio bug

* Better ocr creation page

* Revert awkward feature to scroll in Selection mode

* Rebasing alembic migrations sux

* Remove obsolete getShared function

* Add function docstring

* Move down ocr creation option

* Make toolbar icons more generic

* Show help at the bottom of the page

* move ocr types to own file

* Use template ref for the canvas

* Use i18n.tc to get strings directly

* Correct naming mistake

* Move Ocr editor to own directory

* Create Ocr Editor parts

* Safeguard recipe properties access

* Add loading frontend animation due to longer request time

* minor cleanup chores

Co-authored-by: Miroito <alban.vachette@gmail.com>
Hayden
2022-09-25 15:00:45 -08:00
committed by GitHub
parent a8f3922907
commit 39adea4ee3
44 changed files with 1659 additions and 34 deletions

@@ -0,0 +1,56 @@
+from io import BytesIO
+
+import pytesseract
+from PIL import Image
+
+from mealie.schema.ocr.ocr import OcrTsvResponse
+from mealie.services._base_service import BaseService
+
+
+class OcrService(BaseService):
+    """
+    Class for ocr engines.
+    """
+
+    def image_to_string(self, image_data):
+        """
+        Returns a plain text translation of an image
+        """
+        return pytesseract.image_to_string(Image.open(image_data))
+
+    def image_to_tsv(self, image_data, lang=None):
+        """
+        Returns the pytesseract default tsv output
+        """
+        if lang is not None:
+            return pytesseract.image_to_data(Image.open(BytesIO(image_data)), lang=lang)
+
+        return pytesseract.image_to_data(Image.open(BytesIO(image_data)))
+
+    def format_tsv_output(self, tsv: str) -> list[OcrTsvResponse]:
+        """
+        Returns a OcrTsvResponse from a default pytesseract tsv output
+        """
+        lines = tsv.split("\n")
+        titles = [t.strip() for t in lines[0].split("\t")]
+        response: list[OcrTsvResponse] = []
+
+        for i in range(1, len(lines)):
+            if lines[i] == "":
+                continue
+
+            line = OcrTsvResponse()
+            for key, value in zip(titles, lines[i].split("\t")):
+                if key == "text":
+                    setattr(line, key, value.strip())
+                elif key == "conf":
+                    setattr(line, key, float(value.strip()))
+                elif key in OcrTsvResponse.__fields__:
+                    setattr(line, key, int(value.strip()))
+                else:
+                    continue
+
+            if isinstance(line, OcrTsvResponse):
+                response.append(line)
+
+        return response
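
As a rough illustration of what `format_tsv_output` does, the sketch below parses a hand-written sample in the tab-separated layout that `pytesseract.image_to_data` emits by default. `TsvRow` and `parse_tsv` are hypothetical stand-ins (the real `OcrTsvResponse` schema is not shown in this diff), and the sample values are made up, so treat this as a sketch under those assumptions rather than the service's actual behaviour.

```python
from dataclasses import dataclass, fields


@dataclass
class TsvRow:
    """Hypothetical stand-in for OcrTsvResponse; names follow the Tesseract TSV header."""

    level: int = 0
    page_num: int = 0
    block_num: int = 0
    par_num: int = 0
    line_num: int = 0
    word_num: int = 0
    left: int = 0
    top: int = 0
    width: int = 0
    height: int = 0
    conf: float = 0.0
    text: str = ""


def parse_tsv(tsv: str) -> list[TsvRow]:
    """Convert Tesseract-style TSV text into typed rows, skipping empty lines."""
    lines = tsv.split("\n")
    titles = [t.strip() for t in lines[0].split("\t")]
    known = {f.name for f in fields(TsvRow)}
    rows: list[TsvRow] = []

    for raw in lines[1:]:
        if raw == "":
            continue

        row = TsvRow()
        for key, value in zip(titles, raw.split("\t")):
            if key == "text":
                row.text = value.strip()
            elif key == "conf":
                # Confidence may be fractional, and is -1 for non-word rows.
                row.conf = float(value.strip())
            elif key in known:
                setattr(row, key, int(value.strip()))
            # Unknown columns are ignored.
        rows.append(row)

    return rows


# Made-up sample: one page-level row (no text) and one word-level row.
SAMPLE_TSV = (
    "level\tpage_num\tblock_num\tpar_num\tline_num\tword_num\t"
    "left\ttop\twidth\theight\tconf\ttext\n"
    "1\t1\t0\t0\t0\t0\t0\t0\t640\t480\t-1\t\n"
    "5\t1\t1\t1\t1\t1\t36\t92\t60\t30\t96.5\tFlour\n"
)

rows = parse_tsv(SAMPLE_TSV)
words = [r.text for r in rows if r.text]
```

Rows without text (page, block, paragraph and line levels) are kept here, as in the service method; a caller that only wants recognised words can filter on `text`, as the last line does.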

@@ -111,14 +111,18 @@ class RecipeService(BaseService):
             additional_attrs=create_data.dict(),
         )
 
-        data.settings = RecipeSettings(
-            public=self.group.preferences.recipe_public,
-            show_nutrition=self.group.preferences.recipe_show_nutrition,
-            show_assets=self.group.preferences.recipe_show_assets,
-            landscape_view=self.group.preferences.recipe_landscape_view,
-            disable_comments=self.group.preferences.recipe_disable_comments,
-            disable_amount=self.group.preferences.recipe_disable_amount,
-        )
+        if isinstance(create_data, CreateRecipe) or create_data.settings is None:
+            if self.group.preferences is not None:
+                data.settings = RecipeSettings(
+                    public=self.group.preferences.recipe_public,
+                    show_nutrition=self.group.preferences.recipe_show_nutrition,
+                    show_assets=self.group.preferences.recipe_show_assets,
+                    landscape_view=self.group.preferences.recipe_landscape_view,
+                    disable_comments=self.group.preferences.recipe_disable_comments,
+                    disable_amount=self.group.preferences.recipe_disable_amount,
+                )
+            else:
+                data.settings = RecipeSettings()
 
         return self.repos.recipes.create(data)
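
The new branch can be read as a fallback chain: settings supplied by the caller win, then the group's preferences, then plain defaults. The sketch below shows that order with simplified, hypothetical stand-ins (`Settings`, `GroupPreferences`, `resolve_settings`, and only two of the six fields). It assumes caller-supplied settings pass through the recipe factory unchanged, which this hunk does not show.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Settings:
    """Hypothetical, trimmed-down stand-in for RecipeSettings."""

    public: bool = False
    show_nutrition: bool = False


@dataclass
class GroupPreferences:
    """Hypothetical stand-in for the group's recipe preferences."""

    recipe_public: bool = True
    recipe_show_nutrition: bool = True


def resolve_settings(
    requested: Optional[Settings],
    preferences: Optional[GroupPreferences],
) -> Settings:
    """Pick settings in priority order: caller, group preferences, defaults."""
    if requested is not None:
        # Assumption: custom settings from the request are kept as-is.
        return requested

    if preferences is not None:
        return Settings(
            public=preferences.recipe_public,
            show_nutrition=preferences.recipe_show_nutrition,
        )

    # No custom settings and no group preferences: fall back to defaults.
    return Settings()


custom = resolve_settings(Settings(public=False, show_nutrition=True), GroupPreferences())
from_group = resolve_settings(None, GroupPreferences())
defaults = resolve_settings(None, None)
```

The `preferences is not None` guard mirrors the one in the hunk, which keeps the code safe when a group has no preferences row.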