Questions tagged [string-matching]
39 questions
1 vote, 1 answer, 158 views
Data structure for grouping strings in a collection when they share common substrings [closed]
I am looking for a data structure and an algorithm to manage a dynamic collection of strings, grouping together strings that have a substring in common. I will try to describe it through an example.
@Christophe:...
-1 votes, 3 answers, 352 views
Is there a text distance (or string similarity) algorithm which accounts for the distance between characters?
I'm interested in finding a text distance (or string similarity) algorithm which computes a greater distance (or lower similarity) when characters are further apart.
For example, I want the distance ...
2 votes, 2 answers, 915 views
Are there typo-tolerance algorithms (as opposed to string similarity)? [closed]
I want to build a search with basic typo tolerance.
There are quite a few string similarity algorithms (and implementations for almost all languages, I guess).
However, humans tend to make some typos ...
2 votes, 0 answers, 237 views
Algorithm to search very long blacklist in another very long data set
I have two data sets. The first data set has approx. 50,000 movie and song titles and the second one has 20,000 blacklist strings. I am looking for the best algorithm to detect movie/song title which ...
0 votes, 1 answer, 241 views
Data Matching In VBA - Best way to deal with dynamic data and user entry?
Background
I am currently building this project in VBA; please keep that in mind when thinking about my question.
Imagine two adjacent blocks in Excel. The first block is made up of ...
1 vote, 0 answers, 213 views
Name matching in SWIFT messages
I am basically looking for a performance improvement here.
I need to match names in a SWIFT message (let's say MT 103) against sanctions lists (sanctions lists by UN, by OFAC, some custom lists) and ...
2 votes, 3 answers, 2k views
Algorithm for optimizing text compression
I am looking for text compression algorithms (natural language compression, rather than compression of arbitrary binary data).
I have seen, for example, An Efficient Compression Code for Text ...
1 vote, 0 answers, 195 views
Phonetic algorithms for words that aren't surnames?
I've been doing a little research into algorithms for matching spelling mistakes in names, from Soundex through to Metaphone and Beider-Morse. All of these algorithms generally focus on last names ...
1 vote, 1 answer, 169 views
Find a string in list of strings
Background:
I am writing an application for a small embedded device. There is a static list of strings: currently about 500 strings, with an average length of 12 characters. The list might ...
2 votes, 4 answers, 3k views
What is the optimal way to perform 5000 unique string replace functions in terms of performance?
I'm restructuring some code, and the way I built it up over time has left portions that look something like this:
s.replace("ABW"," Aruba ");
s.replace("AFG"," Afghanistan ");
s.replace("AGO"," Angola ");
s....
2 votes, 1 answer, 4k views
Efficient multiple substrings search
I have many substrings (2-5 words each) which I would like to search for in a text of about 40-50 words. What is the most efficient way to flag the matching substrings?
Currently I am simply using:
...
6 votes, 2 answers, 4k views
Detecting plagiarism – what algorithm?
I'm currently writing a program to read a body of text and compare it to search-engine results (from searching for substrings of the given text), with the goal of detecting plagiarism in, for example, ...
-6 votes, 2 answers, 341 views
Which piece of code is more efficient with respect to time and memory cost? [closed]
Code 1:
private static int myCompare(String a, String b) {
/* my version of the compareTo method from the String Java class */
int len1 = a.length();
int len2 = b.length();
if (...
37 votes, 7 answers, 51k views
What algorithm would you best use for string similarity?
I am designing a plugin to uniquely identify content on various web pages, based on addresses.
So I may have one address which looks like:
1 someawesome street, anytown, F100 211
later I may find ...
3 votes, 3 answers, 139 views
Replace strings based on substring match
I have N strings and M search-replace pairs. Each string contains exactly one of the search terms, and the whole string needs to be replaced by the corresponding replacement.
Say you have returns,between,...