Merlijn Wajer / archive-hocr-tools
Commit 55a02ede, authored Jan 13, 2022 by Merlijn Wajer
hocr/parse: make x_wconf optional
Parent: 6cdb14db
Changed files: 1
hocr/parse.py
@@ -183,7 +183,10 @@ def hocr_page_to_word_data(hocr_page, scaler=1):
         box = BBOX_REGEX.search(word.attrib['title']).group(1).split()
         box = [float(i) for i in box]
-        conf = int(X_WCONF_REGEX.search(word.attrib['title']).group(1).split()[0])
+        conf = None
+        m = X_WCONF_REGEX.search(word.attrib['title'])
+        if m:
+            conf = int(m.group(1).split()[0])
         f_sizeraw = X_FSIZE_REGEX.search(word.attrib['title'])
         if f_sizeraw:
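
The effect of the change can be sketched in isolation. This is a minimal illustration, not the project's code: the helper name `parse_word_confidence` and the regular expression below are assumptions made for the example, and the real `X_WCONF_REGEX` in hocr/parse.py may be defined differently. The point is only that a missing `x_wconf` property now yields `None` instead of raising `AttributeError` on a failed match.

```python
import re

# Hypothetical pattern for illustration only; the actual X_WCONF_REGEX
# defined in hocr/parse.py may differ.
X_WCONF_REGEX = re.compile(r'x_wconf([^;]+)')


def parse_word_confidence(title):
    """Return the x_wconf value from an hOCR title string, or None if absent.

    Hypothetical helper mirroring the logic added in this commit.
    """
    m = X_WCONF_REGEX.search(title)
    if m:
        # group(1) holds the text after 'x_wconf'; take its first token.
        return int(m.group(1).split()[0])
    return None


# A word with a confidence value, and one without.
with_conf = parse_word_confidence('bbox 10 20 110 45; x_wconf 93')
without_conf = parse_word_confidence('bbox 10 20 110 45')
```

Callers that consume the word data would then presumably need to handle a `None` confidence, since not every word is guaranteed to carry one.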