edited tags

edited Jun 20, 2021 at 23:02

Christophe

added 15 characters in body

edited Aug 23, 2018 at 7:13

Lance Pollard

What I'm wondering is whether there is a way to automatically find the best encoding for the bytes: automatically find all the sequences that can be put into a dictionary. I don't see why that wouldn't be possible, but I imagine it isn't, otherwise it would have been done already. It seems like it would be best solved in the area of DNA sequence analysis.
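
To make "automatically find the sequences that go into a dictionary" concrete, here is a minimal sketch of one greedy heuristic, byte pair encoding: repeatedly replace the most frequent adjacent pair of symbols with a new symbol. It is not claimed to find the best dictionary, only a reasonable one, and the names `most_frequent_pair`, `compress`, `expand` and the `max_rules` limit are made up for this illustration.

```python
from collections import Counter


def most_frequent_pair(seq):
    """Return the most common adjacent pair in seq and its count."""
    pairs = Counter(zip(seq, seq[1:]))
    if not pairs:
        return None, 0
    return pairs.most_common(1)[0]


def compress(data, max_rules=256):
    """Greedily replace frequent pairs with new symbols (256, 257, ...)."""
    seq = list(data)      # byte values 0..255
    rules = {}            # new symbol -> (left, right)
    next_symbol = 256
    while len(rules) < max_rules:
        pair, count = most_frequent_pair(seq)
        if pair is None or count < 2:
            break         # nothing repeats any more
        rules[next_symbol] = pair
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(next_symbol)
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
        next_symbol += 1
    return seq, rules


def expand(seq, rules):
    """Undo compress() by expanding symbols until only bytes remain."""
    out = []
    stack = list(reversed(seq))
    while stack:
        symbol = stack.pop()
        if symbol in rules:
            left, right = rules[symbol]
            stack.append(right)
            stack.append(left)
        else:
            out.append(symbol)
    return bytes(out)


data = b"abababab abababab"
seq, rules = compress(data)
assert expand(seq, rules) == data
```

The dictionary (`rules`) has to be stored alongside the compressed sequence, so the real saving depends on how both are encoded afterwards.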

added 7 characters in body; edited title

edited Aug 23, 2018 at 6:54

Christophe

Algorithm for optimizing text compression

I am looking for text compression algorithms (natural language compression, rather than compression of arbitrary binary data).

I have seen, for example, An Efficient Compression Code for Text Databases. This algorithm basically uses the words as symbols, creates a dictionary from them, and replaces them with integers. So something like this:

Then that would mean the text is turned into:
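
As a rough illustration of the words-as-symbols idea described above, here is a minimal Python sketch. It is not the paper's actual method, which this sketch does not try to reproduce. The helper names `build_dictionary`, `encode` and `decode`, the regular-expression tokenizer, and the sample sentence are all assumptions made for the example.

```python
import re

# Split text into runs of word characters and runs of everything else,
# so that joining the tokens reproduces the text exactly.
TOKEN_RE = re.compile(r"\w+|\W+")


def build_dictionary(text):
    """Assign each distinct token an integer, in order of first appearance."""
    dictionary = {}
    for token in TOKEN_RE.findall(text):
        if token not in dictionary:
            dictionary[token] = len(dictionary)
    return dictionary


def encode(text, dictionary):
    """Replace every token with its integer code."""
    return [dictionary[token] for token in TOKEN_RE.findall(text)]


def decode(codes, dictionary):
    """Map integer codes back to tokens and join them."""
    reverse = {code: token for token, code in dictionary.items()}
    return "".join(reverse[code] for code in codes)


sample = "the cat saw the dog and the dog saw the cat"
dictionary = build_dictionary(sample)
codes = encode(sample, dictionary)
assert decode(codes, dictionary) == sample
```

Repeated words such as "the" map to the same small integer, which is where the saving would come from once the integers are stored compactly.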

deleted 2 characters in body

edited Aug 23, 2018 at 4:25

Lance Pollard

asked Aug 23, 2018 at 4:19

Lance Pollard

Title changed from "The reason why you can't compress text like this" to "Algorithm for optimizing text compression"