PIH

Marco
Posts: 2347
Joined: 27 Jun 2011 15:20

PIH

Post by Marco »

Do you have any other algorithms to try?
Attachments
Appunti-20200130.png (335.58 KiB)

admin
Site Admin
Posts: 60357
Joined: 22 May 2004 16:48
Location: Win8.1 @100%, Win10 @100%

Re: PIH

Post by admin »

Hihi LOL.

Yes, I'm working on it... :biggrin:

eil
Posts: 1617
Joined: 13 Jan 2011 19:44

Re: PIH

Post by eil »

Code: Select all

Set it to a value larger than 0 to define the tolerated distance for two 
      images to be counted as dupes.
      Already tolerance 1 will ignore scale, brightness, contrast, saturation, 
      and hue. Higher tolerance will ignore JPEG artifacts and minor differences 
      in detail.
Just a thought, but wouldn't it be better to make tolerance more flexible by giving each "ignore" its own value?
Something like:
1 = ignore scale
2 = ignore brightness
4 = ignore contrast
8 = ignore saturation
Or at least use 1, 2, 3, 4 for those, and only from 5 upward apply "Higher tolerance will ignore JPEG artifacts".
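For illustration only, here is a minimal Python sketch of how the 1/2/4/8 values above could be combined as bit flags. The flag names and the `describe` helper are hypothetical and not part of XYplorer; the sketch only shows how such a mask would be decoded, not whether an image hash can actually separate these effects.

```python
from enum import IntFlag

class Ignore(IntFlag):
    # Hypothetical flag names; the values mirror the 1/2/4/8 scheme above.
    SCALE = 1
    BRIGHTNESS = 2
    CONTRAST = 4
    SATURATION = 8

def describe(mask):
    """Return the names of the differences a given mask would ignore."""
    return [flag.name for flag in Ignore if flag & mask]

# 5 = 1 + 4, i.e. ignore scale and contrast but nothing else.
print(describe(5))
```

With this scheme a single number still fits in one setting, but each kind of difference can be switched on or off independently.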
Win 7 SP1 x64 100% 1366x768

admin
Site Admin
Posts: 60357
Joined: 22 May 2004 16:48
Location: Win8.1 @100%, Win10 @100%

Re: PIH

Post by admin »

I see what you mean, but this is (partly) easier said than done. At the moment I'm quite satisfied with the current state as "phase one" and would like to publish it to the masses and see what kind of feedback I get.

But what about the feedback here in the forum? Does it work as well for you as it does for me? (On my personal photo collection it does a perfect job at the moment.)

eil
Posts: 1617
Joined: 13 Jan 2011 19:44

Re: PIH

Post by eil »

That's what I can show as feedback: ~79 GB scanned, quite fast, especially since it was done on a rather old laptop.
Default SHA-1, tolerance 9:
2020-02-03_220005.png (2.86 KiB)
Just as a test, SHA-256, tolerance 9:
2020-02-03_220707.png (2.8 KiB)
Kinda surprised it found 2 extras. :kidding:

admin
Site Admin
Posts: 60357
Joined: 22 May 2004 16:48
Location: Win8.1 @100%, Win10 @100%

Re: PIH

Post by admin »

Thanks! And did the results look good? Not too many false positives or false negatives?

Now the interesting part: You cannot set the hash method here, this dropdown belongs to the "Content" dupe search above. So you have run the same search two times and got different results! Hmm, there is no learning built in... :) But I once saw the same here during my tests. I'll watch it...

PS: Your second search was much slower??? That cannot be. Or was it actually the first search? (There is heavy caching here, any non-first search will be MUCH faster).

eil
Posts: 1617
Joined: 13 Jan 2011 19:44

Re: PIH

Post by eil »

admin wrote: Thanks! And did the results look good? Not too many false positives or false negatives?
Visual presentation is good, groups are easy to tell apart, and the text info is informative. I didn't check all dupes (I don't really want to sort them right now), but I checked about 10-15 groups and found no false matches at all.
admin wrote: You cannot set the hash method here, this dropdown belongs to the "Content" dupe search above. So you have run the same search two times and got different results!
Though you say it makes no difference, speed probably still depends on how busy the CPU is: I ran 9 searches with, so to say, "different methods", and the times differ. I must mention that searches 1-4 ran while I was doing other stuff, and 5-9 ran while I did nothing but wait for the results. Also, I can't say that the later runs were actually "much faster", though again that may be due to the old laptop.
2020-02-04_121627.png (12.21 KiB)
2020-02-04_121849.png (12.15 KiB)
2020-02-04_122055.png (12.25 KiB)
2020-02-04_122307.png (12.17 KiB)
2020-02-04_122556.png (12.19 KiB)
2020-02-04_122803.png (12.32 KiB)
2020-02-04_122900.png (12.21 KiB)
2020-02-04_123017.png (12.15 KiB)
2020-02-04_123144.png (12.18 KiB)

admin
Site Admin
Posts: 60357
Joined: 22 May 2004 16:48
Location: Win8.1 @100%, Win10 @100%

Re: PIH

Post by admin »

I see you have Content and Image ticked. That does not make much sense since Content dupes are logically always also Image dupes. :)

Thanks for that insight, which allows me to make my code faster: if Content is ticked, any check for Image dupes can be skipped for logical reasons.
a) if content differs we don't have to check for image since all Dupes filters are AND-ed
b) if content is same then image is same as well
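The reasoning in a) and b) can be sketched roughly as follows. This is an illustrative Python sketch, not XYplorer's actual code; `is_dupe`, `needs_image_check`, and the toy comparison functions are hypothetical names, and all other dupe filters are left out.

```python
def needs_image_check(content_ticked, image_ticked):
    """Image comparison is only needed when Content is not ticked."""
    return image_ticked and not content_ticked

def is_dupe(a, b, content_ticked, image_ticked, same_content, same_image):
    """AND the ticked filters; other filters (name, size, ...) are omitted."""
    if content_ticked and not same_content(a, b):
        return False  # case a): content differs, so the AND already fails
    if needs_image_check(content_ticked, image_ticked) and not same_image(a, b):
        return False
    return True       # covers case b): same content implies same image

# Toy comparison functions for demonstration; image_calls records every
# (expensive) image comparison that actually runs.
image_calls = []

def same_content(a, b):
    return a == b

def same_image(a, b):
    image_calls.append((a, b))
    return a.lower() == b.lower()

both_ticked = is_dupe("cat", "cat", True, True, same_content, same_image)
image_only = is_dupe("cat", "CAT", False, True, same_content, same_image)
```

In this sketch the image comparison runs only for the second call, where Content is not ticked.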

admin
Site Admin
Posts: 60357
Joined: 22 May 2004 16:48
Location: Win8.1 @100%, Win10 @100%

Re: PIH

Post by admin »

FYI, you can now search images by similarity to a given image. ATM it's a two-step process:

1. Retrieve image hash of the given image:

Code: Select all

text "<prop #image.hash>";
2. Go to your search location and use this search term (1b116dc5c3c5311b is an example for the hash returned above):

Code: Select all

prop:#image.hash:1b116dc5c3c5311b
Future enhancements could be
- a way to pass a tolerance value to the search (at the moment the default here is tolerance = 1)
- a fully wrapped comfort feature where you point to an image and order "find similar images!"
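To make the idea of a tolerance value concrete, here is a small Python sketch that compares two hex-encoded hashes by counting differing bits (Hamming distance). This is an assumption for illustration: the thread does not say which distance measure XYplorer uses, and `hamming_distance`, `is_similar`, and the `candidate` value are made up.

```python
def hamming_distance(hash_a, hash_b):
    """Count differing bits between two hex-encoded hashes of equal length."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return bin(int(hash_a, 16) ^ int(hash_b, 16)).count("1")

def is_similar(hash_a, hash_b, tolerance=1):
    """Treat two hashes as similar if their distance is within tolerance."""
    return hamming_distance(hash_a, hash_b) <= tolerance

reference = "1b116dc5c3c5311b"  # example hash from the post above
candidate = "1b116dc5c3c5311a"  # made-up value that differs in one bit

print(hamming_distance(reference, candidate))
print(is_similar(reference, candidate, tolerance=1))
```

Under this assumption, raising the tolerance simply widens the set of hashes that count as a match.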

admin
Site Admin
Posts: 60357
Joined: 22 May 2004 16:48
Location: Win8.1 @100%, Win10 @100%

Re: PIH

Post by admin »

PS: Of course, the same syntax also lets you define color filters for similar images, how cool is that?

Code: Select all

prop:#image.hash:1b116dc5c3c5311b

Enternal
Posts: 1174
Joined: 10 Jan 2012 18:26

Re: PIH

Post by Enternal »

Well, not very cool. I'm supposed to be managing files and not playing with neat new features. What are you doing to me?

(And yes, wow, this really is a neat feature! I can finally get rid of this other ancient tool I have been using for the longest time.)

admin
Site Admin
Posts: 60357
Joined: 22 May 2004 16:48
Location: Win8.1 @100%, Win10 @100%

Re: PIH

Post by admin »

It might even get uncooler. So many things are imaginable now. What about sorting photos by their dominant color hue (like some folks do with their books on the shelf)?
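As a rough idea of what sorting by hue could look like, here is a Python sketch using the standard `colorsys` module. It assumes a dominant RGB color has already been determined for each photo; the file names, colors, and the `hue_of` helper are invented for the example.

```python
import colorsys

def hue_of(rgb):
    """Return the hue (0.0 to 1.0) of an (R, G, B) tuple with 0-255 channels."""
    r, g, b = (channel / 255.0 for channel in rgb)
    hue, _saturation, _value = colorsys.rgb_to_hsv(r, g, b)
    return hue

# Hypothetical photos with an assumed, precomputed dominant color each.
photos = [
    ("forest.jpg", (34, 139, 34)),
    ("sunset.jpg", (255, 94, 19)),
    ("sea.jpg", (0, 105, 148)),
]

sorted_names = [name for name, rgb in sorted(photos, key=lambda p: hue_of(p[1]))]
print(sorted_names)
```

Sorting by hue alone ignores saturation and brightness, so grey or very dark photos would land in fairly arbitrary places.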

Enternal
Posts: 1174
Joined: 10 Jan 2012 18:26

Re: PIH

Post by Enternal »

Yes, please. Uhh, I meant no. :biggrin:

eil
Posts: 1617
Joined: 13 Jan 2011 19:44

Re: PIH

Post by eil »

admin wrote: 05 Feb 2020 15:31 FYI, you can now search images by similarity to a given image.
Sounds like a neat feature for huge photo archives. :biggrin:
I wonder whether in the future it could hash an image with lots of green (a photo taken in a forest) and, with some tolerance, find images that are not similar but likewise have "lots of green". Just asking out of curiosity. :roll:

admin
Site Admin
Posts: 60357
Joined: 22 May 2004 16:48
Location: Win8.1 @100%, Win10 @100%

Re: PIH

Post by admin »

Sure, that's possible.
