Longest collatz sequence using dynamic programming

Question

I am trying to solve longest collatz sequence problem under 1000000 with the below code. Can anyone suggest a faster way to approach this problem? I was thinking of dynamic programming, but I'm having trouble in understanding it.

#include <stdio.h>

long long int col(long long int n)
{
    if(n==1)
    {
        return 1;
    }
    else
    {
        if(n%2==0)
            return (1+col(n/2));
        else
            return (1+col(3*n+1));
    }
}

int main()
{
    long long int i=0, c, max, k=1;
    max=1;
    for(i=1; i<1000000; i++)
    {
        c=col(i);
        if(c>max)
        {
            max=c;
            k=i;
        }
    }

    printf("%lld",max);
    printf("\n%lld\n",k);
    return 0;
}

if you don't mind to use memory, could have a container, say int cache[1000] = {0};, and save in every element of the container the collatz chain length when starting with that number. Using this cache could stop calculating a lot of collatz chain length when encounter an already calculated number. — NetVipeC, Commented Aug 30, 2014 at 3:16
This looks a lot like a solution for projecteuler.net/problem=14 . If it is, this could be included in the question and the question could have the relevant project-euler tag. — SylvainD, Commented Oct 22, 2014 at 14:57

David C. Rankin · Accepted Answer · 2014-08-30 02:40:23Z

2

Well, it appears you are having a bit of trouble getting things off-the-ground, so to speak. Let's start with the basics. Let's name your file collatz.c and after a quick glance, it looks like it should compile:

gcc -Wall -Wextra -o ctz collatz.c

Good, it compiled with no errors and no warnings. Now let's see if it will run:

output:

$ time ./ctz
525
837799

real    0m2.088s
user    0m2.082s
sys     0m0.004s

Also good, max is 525 and k is 837799 and it completed in less than 2.1 seconds. The logic is implemented in a single recursive function akin to that used by a non-math-lib power function, so no great speed improvements come to mind. As was pointed out, there are optimization that can help reduce the execution time. Let's try the suggested -Ofast -fwhole-program:

gcc -Wall -Wextra -o ctz collatz.c -Ofast -fwhole-program

output:

$ time ./ctz
525
837799

real    0m0.480s
user    0m0.474s
sys     0m0.003s

A 400+% improvement. That's better. So it looks like your work is done. Drop a comment or edit your question if you have more specifics in mind.

answered Aug 30, 2014 at 2:40

David C. Rankin

9278 silver badges15 bronze badges

\$\begingroup\$ ...although there's no optimization! \$\endgroup\$
– ikh
Commented Aug 30, 2014 at 2:42
\$\begingroup\$ With -Ofast -fwhole-program of gcc 4.8.3, it takes only 634ms 561ms 46ms (real/user/sys) \$\endgroup\$
– ikh
Commented Aug 30, 2014 at 2:44
1

\$\begingroup\$ Your computer is much much faster than this old box. \$\endgroup\$
– David C. Rankin
Commented Aug 30, 2014 at 2:46
\$\begingroup\$ Oh, sorry :) but in your case the time will be also decreased with optimization. It takes 1488ms 1419ms 31ms without optimization in my case >o< \$\endgroup\$
– ikh
Commented Aug 30, 2014 at 2:49
\$\begingroup\$ (after seeing edit) Hey, the really slow box was mine! (It's okay, because it's actually not mine; it's my school's >o<) \$\endgroup\$
– ikh
Commented Aug 30, 2014 at 2:50

| Show 3 more comments

JS1 · Accepted Answer · 2015-01-22 00:01:11Z

I took the suggestion from NetVipeC and modified your program a bit. The changes were:

I used a #defined constant MAXNUM to control the number of iterations.
I added a result array to store previous results.
In the col function, it now ends when it finds a previous result, instead of just stopping at 1.

The Modified Program

#include <stdio.h>

#define MAXNUM  1000000

unsigned short result[MAXNUM];

long long int col(long long int n)
{
    if (n < MAXNUM && result[n] != 0)
        return result[n];
    if(n%2==0)
        return (1+col(n/2));
    else
        return (1+col(3*n+1));
}

int main()
{
    long long int i=0, c, max, k=1;
    max=1;
    result[1] = 1;
    for(i=2; i<MAXNUM; i++)
    {
        c=result[i]=col(i);
        if(c>max)
        {
            max=c;
            k=i;
        }
    }

    printf("%lld",max);
    printf("\n%lld\n",k);
    return 0;
}

I also went ahead and rewrote your program to remove recursion, add in a small optimization, and allow the max to be set by a command line argument:

The Rewritten Program

#include <stdio.h>
#include <stdint.h>
#include <stdlib.h>
#include <inttypes.h>

#define DEFAULT_MAXNUM        100000000

static uint16_t *result;

uint16_t collatz(uint64_t n)
{
    uint64_t originalNumber = n;
    uint16_t count          = 1;

    while (1) {
        if (n & 1) {
            n = n*3+1;
        } else {
            n >>= 1;
            if (n < originalNumber)
                return count + result[n];
        }
        count++;
    }
}

int main(int argc, char *argv[])
{
    uint64_t i         = 0;
    uint16_t count     = 0;
    uint16_t bestCount = 1;
    uint64_t bestNum   = 1;
    uint64_t maxNum    = DEFAULT_MAXNUM;

    if (argc > 1) {
        maxNum = strtoull(argv[1], NULL, 0);
        if (maxNum < 2)
            maxNum = 2;
    }

    result = malloc(maxNum * sizeof(result[0]));
    if (result == NULL) {
        fprintf(stderr, "Error: Not enough memory\n");
        return 1;
    }

    result[1] = 1;
    for(i=2; i<maxNum; i++) {
        count = result[i] = collatz(i);
        if (count > bestCount) {
            bestCount = count;
            bestNum   = i;
        }
    }

    printf("Max considered: %" PRIu64 "\n", maxNum);
    printf("Largest number: %" PRIu64 "\n", bestNum);
    printf("Largest count : %d\n", bestCount);
    return 0;
}

The Timing Results

Here are the timing results from the three programs. Note that I used 100000000 as the number of iterations (100x the original amount) to get longer run times. I also used gcc -O4 to build all programs.

Original program : 37.7s
Modified program :  2.0s
Rewritten program:  1.2s

Stack Exchange Network

Longest collatz sequence using dynamic programming

2 Answers 2

The Modified Program

The Rewritten Program

The Timing Results

Hot Network Questions

Longest collatz sequence using dynamic programming

2 Answers 2

The Modified Program

The Rewritten Program

The Timing Results

Related

Hot Network Questions