All Questions

1 vote
1 answer
158 views

Data structure for grouping strings in a collection when they share common substrings [closed]

I am looking for a data structure and an algorithm to manage a dynamic collection of strings, grouping together strings that have a substring in common. I'll try to describe it through an example. @Christophe:...
differentrain
-1 votes
3 answers
353 views

Is there a text distance (or string similarity) algorithm which accounts for the distance between characters?

I'm interested in finding a text distance (or string similarity) algorithm which computes a greater distance (or lower similarity) when characters are further apart. For example, I want the distance ...
Vermillion
2 votes
3 answers
2k views

Algorithm for optimizing text compression

I am looking for text compression algorithms (natural language compression, rather than compression of arbitrary binary data). I have seen for example An Efficient Compression Code for Text ...
Lance Pollard
2 votes
4 answers
3k views

What is the optimal way to perform 5000 unique string replace functions in terms of performance?

I'm restructuring some code, and the way I built it up over time has left portions that look something like this: s.replace("ABW"," Aruba "); s.replace("AFG"," Afghanistan "); s.replace("AGO"," Angola "); s....
Anon
  • 3,633
2 votes
1 answer
4k views

Efficient multiple substrings search

I have many substrings (2-5 words each) that I would like to search for in a text of about 40-50 words. What is the most efficient way to flag matching substrings? Currently I am simply using: ...
skadoosh
  • 121
6 votes
2 answers
4k views

Detecting plagiarism – what algorithm?

I'm currently writing a program to read a body of text and compare it to search-engine results (from searching for substrings of the given text), with the goal of detecting plagiarism in, for example, ...
Vivian
  • 189
-6 votes
2 answers
341 views

Which piece of code is more efficient with respect to Time and Memory cost? [closed]

Code 1: private static int myCompare(String a, String b) { /* my version of the compareTo method from the String Java class */ int len1 = a.length(); int len2 = b.length(); if (...
Avid Programmer
7 votes
2 answers
282 views

Finding and counting equal substrings in a set of strings

I'm thinking about a way of finding similar parts in strings. I have a set of strings of varying length, e.g.: The quick brown fox jumps fox force five the bunny is much quicker than the fox is First, I ...
Chris
  • 207
0 votes
1 answer
2k views

Most Pythonic way to remove first match of potential leading strings?

This is a bit difficult to describe, but I'll do my best. In Python, I can use string.startswith(tuple) to test for multiple matches. But startswith only returns a boolean answer, whether or not it ...
Hactar
  • 115
6 votes
4 answers
6k views

Is "use "abc".equals(myString) instead of myString.equals("abc") to avoid null pointer exception" already problematic in terms of business logic?

I heard numerous times that when comparing Strings in Java, to avoid null pointer exception, we should use "abc".equals(myString) instead of myString.equals("abc"), but my question is, is this idea ...
ggrr
  • 5,863
-1 votes
1 answer
1k views

Find missing number in sequence in string [closed]

I have a string that contains numbers in sequence. There are no delimiters between the numbers. I have to find the missing number in that sequence. For example: 176517661768 is missing the number: 1767 ...
Neo
  • 31
3 votes
2 answers
1k views

Burrows-Wheeler transform backward search: how to find suffix index?

The BWT backward search algorithm is pretty straightforward if we only need the multiplicity of a pattern. However, I also need to find the suffix indices (i.e. positions in the reference string where a ...
user798275
4 votes
2 answers
2k views

Why does a regex with global search and the {0,} quantifier match the end of the string?

I asked a question here about JS, regex, quantifiers, and global search. I finally understood how this works, but let's take a concrete example and then I'll write my question. Based on the ...
Gigi Ionel
1 vote
0 answers
404 views

clustering of strings with variable-length prefixes

I've got a bunch of strings with variable-length prefixes (or postfixes - I can always reverse them) as follows: 0155555555 523455555555 755555555 ... 87129999999999999 119999999999999 09119999999999999 ...
god
  • 232
0 votes
0 answers
1k views

Compare names and the use of Levenshtein's algorithm

I need to cross-match names from two lists. What is the best way to compare the names? As you might expect, one list can have the complete name and the other just the first and last. Besides that, ...
cap7
  • 287
