# The ALGORITHM!!!
The general idea is to build a filter/mask for each candidate font
and try to match each mask against the given letter.
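
As a rough sketch of that matching loop (not the project's actual code): assuming every font mask can be scored against the letter image and that a higher score means a better match, picking a font comes down to an argmax over the masks. `pick_font`, `font_masks` and `score_fn` are hypothetical names used only for illustration.

```python
def pick_font(letter_image, font_masks, score_fn):
    """Score every font mask against the letter and return (best_name, all_scores).

    font_masks: dict mapping a font name to its mask (hypothetical layout).
    score_fn:   any callable (image, mask) -> float, higher meaning a better match.
    """
    scores = {name: score_fn(letter_image, mask) for name, mask in font_masks.items()}
    best_name = max(scores, key=scores.get)
    return best_name, scores
```

Returning all the scores alongside the winner makes it easier to see how close the runner-up fonts were.
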
## Scoring system
Each font's score is based on the average color (`acolor`)
underneath its mask (each mask may yield a different `acolor`).
After obtaining the `acolor` for a mask, the score is calculated
as the sum of the individual pixel scores.
For a given pixel (`po` in the original image and `pm` in the mask, at the same position),
the score is calculated as follows:
(`v` stands for the variance)
```
||po - acolor|| - ||v - acolor||
S_p = (|po - acolor| - v) x (0.5 - pm)
```
It is assumed that the font mask has values in the range `0..1` and is rendered as
'white on black' text (so `1` is where the font is).
This score calculation takes into account color
variations where the letter should be, while also taking into
account the fact that the background should be a different
color.
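
To make the formula concrete, here is a minimal sketch of the scoring rule in plain Python. It uses nested lists so it runs without dependencies; a real implementation would more likely use array operations. Several things here are assumptions that the text above does not pin down: `acolor` is taken as the mask-weighted mean color, `v` as the mask-weighted mean squared distance to `acolor`, `|po - acolor|` as the Euclidean distance between colors, and colors are scaled to `0..1`. The function names are made up for this example.

```python
import math


def mask_average_color(image, mask):
    """Mask-weighted mean color (`acolor`) of the pixels under the mask."""
    channels = len(image[0][0])
    total = [0.0] * channels
    weight = 0.0
    for img_row, mask_row in zip(image, mask):
        for po, pm in zip(img_row, mask_row):
            weight += pm
            for c in range(channels):
                total[c] += pm * po[c]
    if weight == 0:
        raise ValueError("mask is empty, acolor is undefined")
    return tuple(t / weight for t in total)


def mask_variance(image, mask, acolor):
    """Assumed definition of `v`: mask-weighted mean squared distance to acolor."""
    total = 0.0
    weight = 0.0
    for img_row, mask_row in zip(image, mask):
        for po, pm in zip(img_row, mask_row):
            total += pm * math.dist(po, acolor) ** 2
            weight += pm
    return total / weight


def font_score(image, mask):
    """Sum of S_p = (|po - acolor| - v) * (0.5 - pm) over all pixels."""
    acolor = mask_average_color(image, mask)
    v = mask_variance(image, mask, acolor)
    score = 0.0
    for img_row, mask_row in zip(image, mask):
        for po, pm in zip(img_row, mask_row):
            score += (math.dist(po, acolor) - v) * (0.5 - pm)
    return score


# Toy 2x2 image: one white "letter" pixel on a black background.
image = [[(1, 1, 1), (0, 0, 0)],
         [(0, 0, 0), (0, 0, 0)]]
matching_mask = [[1, 0], [0, 0]]   # covers the white pixel
shifted_mask = [[0, 1], [0, 0]]    # covers a background pixel instead
print(font_score(image, matching_mask))  # roughly 2.60
print(font_score(image, shifted_mask))   # roughly 0.87
```

With this sign convention a higher score is better: pixels under the mask are rewarded for being close to `acolor`, and pixels outside it are rewarded for being far from it.
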
I seem to be missing something in the original idea, as some fonts get a better
score on incorrect guesses with a bigger color variance, and others get
the smallest color variance on some other fonts.
## Potential improvements
Some potential improvements would be:
- Only consider pixels in the font and its outline.
  This might be helpful, as it would mean we don't care
  about pixels that are too far away, but assuming good bounding boxes,
  it probably won't give much better results (if any).
  Additionally, it raises the question of which pixels should be considered,
  as both the text and the mask are anti-aliased (thus having "weak" pixels).
- Increase the area around the font.
  This would make sure we are not looking too far inwards,
  although it shouldn't matter, since we are classifying from
  a predefined set rather than searching at random, so the potentially
  good information missed shouldn't matter that much
  (i.e. all scores will be 0.1 lower but the correct font will still be picked).
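
One possible way to try the first idea, sketched in plain Python: threshold the mask to decide which anti-aliased pixels count as "font", then grow that set by a few pixels to include the outline. The `threshold` and `radius` parameters and the name `outline_region` are assumptions made for this example, and the square (Chebyshev) neighbourhood is just the simplest choice, not something the project prescribes.

```python
def outline_region(mask, threshold=0.5, radius=1):
    """Return a grid of booleans marking the font pixels plus their outline.

    A mask pixel counts as "font" when its value is >= threshold, which is one
    (assumed) answer to the question of how to treat weak anti-aliased pixels.
    Every pixel within `radius` steps (including diagonals) of a font pixel is
    marked True.
    """
    height, width = len(mask), len(mask[0])
    region = [[False] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            if mask[y][x] < threshold:
                continue
            # Mark the square neighbourhood around this font pixel.
            for ny in range(max(0, y - radius), min(height, y + radius + 1)):
                for nx in range(max(0, x - radius), min(width, x + radius + 1)):
                    region[ny][nx] = True
    return region


def count_selected(region):
    """Number of pixels that would take part in the score."""
    return sum(cell for row in region for cell in row)
```

The per-pixel scores would then be summed only where the region is `True`. Lowering `threshold` pulls more weak pixels into the font, so comparing a couple of values is a cheap way to see how sensitive the ranking is to that choice.
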