Allow option for "fuzzy" search, as well as combining search options #848

etyarews · 2022-10-06T15:19:44Z

My issue is that I have previous versions of files whose names or sizes are very similar to the current ones, but not exactly the same. Hash comparison doesn't find them either, because of those differences.

A solution to this would be to allow a user-defined "margin of error" when using the name and size options. For example, two names could match when they are 95% similar (ignoring the extension, which should be identical), or files could be grouped together as long as their sizes fall within a 10% margin of each other. The problem is that a margin of error on either option alone would produce a huge number of false positives, making it quite useless.

However, if you were to combine both options, you would get something more reliable. Take for example three files:

Super Special Project.odt (3 MB)
The Super Special Project Final.odt (3.5 MB)
The Super Special Project Final Final.odt (4.0 MB)

It is very clear to a human that they are all versions of the same file made at different dates, but not to czkawka. You could argue that they aren't really the same file, but I disagree: they are just different versions, and if the content is incremental, it is safe to remove the older ones. Also, if the versions sit in different folders or are spread across your computer, it becomes increasingly difficult to clean up those duplicates manually, so a tool like czkawka is needed.
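To make the combined check concrete, here is a minimal Python sketch of the idea. It is not how czkawka works and uses none of its code: the function names, the thresholds (80% name similarity, 15% size margin) and the use of `difflib.SequenceMatcher` as the similarity measure are all assumptions chosen for illustration, and the sizes are the rough figures from the example above.

```python
from difflib import SequenceMatcher
from itertools import combinations
from pathlib import PurePath


def name_similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between two file names, ignoring extension and case."""
    stem_a = PurePath(a).stem.lower()
    stem_b = PurePath(b).stem.lower()
    return SequenceMatcher(None, stem_a, stem_b).ratio()


def size_within_margin(size_a: int, size_b: int, margin: float) -> bool:
    """True if the sizes differ by at most `margin`, relative to the larger file."""
    larger = max(size_a, size_b)
    if larger == 0:
        return True  # two empty files are trivially the same size
    return abs(size_a - size_b) / larger <= margin


def likely_versions(files, min_name_ratio=0.8, size_margin=0.15):
    """Yield name pairs that share an extension and pass BOTH fuzzy checks.

    `files` is a list of (name, size_in_bytes) tuples. The default
    thresholds are hypothetical values, not anything czkawka defines.
    """
    for (name_a, size_a), (name_b, size_b) in combinations(files, 2):
        # The extension must be identical, as the proposal suggests.
        if PurePath(name_a).suffix.lower() != PurePath(name_b).suffix.lower():
            continue
        if (name_similarity(name_a, name_b) >= min_name_ratio
                and size_within_margin(size_a, size_b, size_margin)):
            yield name_a, name_b


# The three files from the example, with approximate sizes in bytes.
files = [
    ("Super Special Project.odt", 3_000_000),
    ("The Super Special Project Final.odt", 3_500_000),
    ("The Super Special Project Final Final.odt", 4_000_000),
]

for pair in likely_versions(files):
    print(pair)
```

With these assumed thresholds, each version pairs with the next one, while the first and last files do not pair directly (their size difference is 25%), so a real implementation would probably need to merge overlapping pairs into one group. Under this particular similarity measure, the 95% and 10% figures suggested above would be too strict to catch any of the three example files, so the margins would have to be user-adjustable.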

bogn83 · 2024-02-06T01:37:46Z

It would be super useful for finding similar code files too.

@qarmin (GREAT RUST PROJECT btw!):
Tons of developers could then use it (I'd love to right now) to find code that they want to refactor for reuse. I know DRY is not always ideal, but it's often advisable not to repeat the same stuff too much when you have 200+ view template files that are mostly similar and for which abstract syntax tree (AST) comparison is hard to do.
