Efficient pattern matching algo (python or js)
New here? Learn about Bountify and follow @bountify to get notified of new bounties! x

I need to compare a cropped image or slightly skewed image against a corpus of images to find matches. My first thought was it can be something like fingerprint matching since there is a concept of partial fingerprint matches.

here is the scenario. I have 100K images in a corpus. I have processed these to come up with a fingerprint of the image. It looks like this https://www.dropbox.com/s/a7028so6mfnj9sf/2020-06-26_07-23-44.png?dl=0 you can think of it as a matrix where we have either a 1 or zero in a bubble zone.

now i have an image that i need to check against the corpus to find the match. its a smaller subset of the 1/0 pattern and visually looks like this - https://www.dropbox.com/s/2uchttganjbfc3i/2020-06-26_07-25-35.png?dl=0

here you can see the subsection - https://www.dropbox.com/s/rxg8ni6b6w22qn0/2020-06-26_07-25-50.png?dl=0

so we need an algo that we can use the data matrix of values to find the pattern or partial pattern match. Ideally it can be resilient to slight differences (Im thinking some sort of confidence score like 90% sure this image is a subset of image X, or 70% sure its a subset of image Y.

My thought is that we can express the matrix numerically as a curve (signal) and then attempt to find signal or partial signal matches. Totally guessing and thats why Im turning to the community here for a bounty project.

The first place bounty will be $100, second place $50 and 3rd place $25. Ideally with your submission there is some technical details or pseudo code to represent your idea.

more of my thoughts on how to approach maybe (update 1)

so i think there should be a reasonable way to represent this string as a "signal" and then i can try to match other signals against it for full match or fuzzy match
thinking something like a cosine wave
or maybe i mean sine wave
reason I'm thinking signals is i did some research on fingerprint recognition and thats generally how they approach it
as a series of cosine waves per print. and then they try to fuzzy match those
aka - Fingerprint matching by using 2D discrete cosine transform and 2D Fourier transforms. Not sure if this helps or not but thought i would share

shazam (update 2)
started looking how an app like shazam works and it seems similar to the ideas above, nicely laid out here. https://www.toptal.com/algorithms/shazam-it-music-processing-fingerprinting-and-recognition so maybe the solution is to convert the image representation to a signal and then process it in a similar way to the music?

Do we have access to the 100K image corpus?
kostasx over 2 years ago
good question. We dont at the moment. My thought was that if we need "fake data" we could unwind the matrix... here is my random thought. Using the original image above. It could be expressed by - https://www.dropbox.com/s/3nu505wlmkfry5z/2020-06-26_07-52-10.png?dl=0 - which then reduces to this string https://gist.github.com/quotient/7eabd45a776b369231c60deb7095a0ef
Qdev over 2 years ago
Hi QDev, a small question, do you have limitations on processing/memory times ? I'm thinking of an algorithm that can compare all images fingerprint to the cropped fingerprint but it might take some times to process (depending on your corpus). Other question, does all the fingerprint images of your corpus have the same size (for example 20x20 pixels ?)
kerncy over 2 years ago
at this stage, i dont care a ton about processing. we can assume in the future that all of the corpus (original) images are already pre computed and we have the signatures stored. then when we get present an image to find/locate we can do some scanning against the signatures - seems a lot like shazam probably here which is where i was thinking this would go.
Qdev over 2 years ago
Hi QDev, any update on this bounty ?
kerncy over 2 years ago

Crowdsource coding tasks.

2 Solutions

I started experimenting with a somewhat brute-force approach algorithm: the target (image) is scanned for the pattern from top to bottom and left to right. At each scan-cycle, the white and black cells that match are added to an accuracy score (2x for black ones, 0.5x for white ones) and then are calculated depending on the total cells that are compared.

I've set up a little demo here: https://codepen.io/kostasx/pen/RwrZEvO?editors=0011

As you can see, in the beginning, the pattern placed at the exact subsection, matches at about 92% accuracy. As you move the pattern (window) back and forth, you'll see it being compared to other subsections of the image along with the accompanying accuracy.

The algorithm is contained in the findPattern function which returns an object with each match categorized according to its accuracy along with a full list array with all the comparisons. You can play around with the scores (section: if ( entry === 1 ){ ... } else { ... }) and see different accuracy scores.

I am not sure if this approach will get you close to the desired result, but my curiosity and my intuition led me here (mostly my curiosity to see how this algorithm would behave).

Other ideas, that came to mind, are the perceptual hashing algorithm and of course Machine Learning.

If you have more examples to provide (target + pattern to match), perhaps I can test this setup and see how it goes.

Hope this helps you in some way. You can take it from here and experiement.

P.S. So far I was just experimenting, but of course there's room for optimizations and performance refactoring.

thanks for this, reviewing! still thinking the final route is headed down the wave/signal route. One note on the phash stuff, we actually use that for matching images today, it doesnt work well when the compare image is a crop of the original - here is an example for contents. suppose the image on right is the original and the image on left is the cropped and resized one we are trying to compare - https://www.dropbox.com/s/8t88nnu1nxvjh5e/2020-06-29_17-07-44.png?dl=0 . so my goal is to detect edges in the images and store that as a fingerprint and think of it as a signal. then i think (a big think) is that we can do matching in much the same way as shazam.
Qdev over 2 years ago


As far as I know, all recognition algorithm always work with derivative data of fingerprints, so you are not subject to resizing issue. So here is my idea ( a bit complicated I assume it)

If you want to compare two curves for example, it's easier to compare their derivative (when they are going up or down) instead of their absolute value (relative delta between all values).

for example if you take :

  • y = sin(x)
  • y = 0.5 * sin(x)

both curves have exactly the same derivative, but the relative delta would be huge (moreover, it depends on your domain). With this intuition you can see that the resizing is not take into account.

If you expand this idea to your image, your fingerprint which contains absolute positions of 0 and 1 are pure data that are subject to resizing. An idea would be to compute slopes between nearest points (3 for example) on your curve. That will lead you to a list of slopes for each "remarquable point". Than you will have to compare your list of slopes to your cropped images, and if it matches you find your solution.

As an example, consider your original image :

slopes ((yb-ya) / (xb-xa)) with 3 neighbours would be for each point (I take clockwise 3 nearest neighbour)

  • Point A [1,1] --> [to B : 0 | to D : 2/3 | to E : INF]
  • Point B [5,1] --> [to C : 1/2 | to D : -1/2 | to F : -1/2]
  • ...
  • ...
  • Point H [2,8] --> [to E : -4 / -1| to F : -3 / 1 | to G : -1 / 4]

Than do the same for your cropped/resized image. For each point of your cropped compute fingerprint with the same algorithm.
Next step, for each point of your cropped image, compute all the relative slopes with all points of your original images. The one with relative slopes nearest to 0 matches your original image

The more point in your image near 0 that matches your original image slopes, than the greatest chance you have to match your original image.

EDIT : just an update, this only works if your resize is an homothety (same factor for x and y resize) otherwise it would fail

View Timeline