2
\$\begingroup\$

Here is the Hackerrank problem which is related to dense ranking and below is my solution which didn't pass the time limit cases. Any better way to optimize this?

def climbingLeaderboard(ranked, player):
    ranked_set = sorted(list(set(ranked)), reverse=True)
    result = []
    print(ranked_set)
    for p in player:
        appended = False
        for i in range(len(ranked_set)):
            if p >= ranked_set[i]:
               result.append(i+1)
               appended = True
               break
        if appended == False:
            result.append(len(ranked_set)+1)
    return result
\$\endgroup\$

1 Answer 1

2
\$\begingroup\$

My approach, described below, should definitely be an improvement. I have named my version climbing_leaderboard so as to not conflict with your name (I have included your version in mine in a benchmark to measure the performances of each), but also to be more in conformance with the PEP 8 Style Guide standard form naming functions.

General Suggestions for Improvement

It's always a good idea to provide a docstring describing what the function is supposed to be doing. Either use typing to describe the function's arguments and return value or provide that information in the docstring itself. Enough said.

Algorithmic Improvement

For each element p of the player argument, you are searching sequentially through your ranked_set list to find the new ranking for p. For small values of p you might be searching almost every element of ranked_set to accomplish that since ranked_set is sorted in descending order. Also, ranked_set is not the best name for a variable that references a list.

A better approach would be to use a binary search to accomplish this. To discover any given ranking should not require examining more than approximately Log2(N) elements where N is the number of elements in ranked_set.

In the following code I have included your version (less print statements) and mine and print out the times for my desktop.

from typing import List
import time

timer = time.perf_counter

def climbingLeaderboard(ranked, player):
    """Solve the problem descibed at:
    https://www.hackerrank.com/challenges/climbing-the-leaderboard/problem?isFullScreen=true"""

    # Remove duplicate scores:
    ranked_list = sorted(list(set(ranked)), reverse=True)

    result = []
    last_index = len(ranked_list) - 1
    for p in player:
        start = 0
        end = last_index
        while start <= end:
            mid = (start + end) // 2
            score = ranked_list[mid]
            if p > score:
                end = mid - 1
            elif p < score:
                start = mid + 1
            else:
                break

        if start > end:
            # No exact hit
            rank = mid + 1 if p > score else mid + 2
        else:
            rank = mid + 1

        result.append(rank)

    return result


ranked = list(range(100_000, 1, -3))
player = list(range(10, 100_030, 20))

t0 = timer()
result1 = climbingLeaderboard(ranked, player)
t = timer() - t0
print(t)

t0 = timer()
result2 = climbing_leaderboard(ranked, player)
t = timer() - t0
print(t)

assert result1 == result2

Prints:

3.8725043
0.01452220000000004

Update

The following is a benchmark of 3 methods of initializing the ranked_list variable from the input so as to remove duplicate values when they exist. For the first run there are 30,000 distinct values in descending order and for the second run there are 15,000 distinct values each duplicated once.

  1. Method 1: As in the above code.
  2. Method 2: The input is stepped through element by element and if the element is not the same value as the previous element, it is appended to the initially empty ranked_list.
  3. itertools.groupby is used as suggested by vnp.

The code:

import itertools
import time

timer = time.perf_counter

def benchmark():
    N = 1_000

    for run in (1, 2):
        if run == 1:
            # No duplicate values
            ranked = list(range(30_000, 0, -1))
        else:
            # Two of each value:
            ranked = []
            for n in range(15_000, 0, -1):
                ranked.append(n)
                ranked.append(n)

        t = timer()
        for _ in range(N):
            ranked_list = sorted(list(set(ranked)), reverse=True)
        t1 = timer() - t

        t = timer()
        for _ in range(N):
            ranked_list = []
            last = None
            for r in ranked:
                if r != last:
                    last = r
                ranked_list.append(r)
        t2 = timer() - t

        t = timer()
        for _ in range(N):
            ranked_list = [k for k, _ in itertools.groupby(ranked)]
        t3 = timer() - t

        print(t1, t2, t3)

benchmark()

The output:

1.9442110000000001 2.5163936 3.0017806
0.7499568000000005 2.2753618000000007 1.654442699999998

Conclusion

Method 1 is the fastest in both cases. With no duplicates in the input, method 3 is the slowest. When there are duplicates, method 2 is the slowest.

\$\endgroup\$
6
  • 1
    \$\begingroup\$ ranked_list = sorted(list(set(ranked)), reverse=True) is somewhat suboptimal. It ignores the fact that ranked is already sorted.. Considetrranked_list = [k for k, _ in itertools.groupby(ranked)]. \$\endgroup\$
    – vnp
    Commented Nov 24, 2023 at 21:27
  • \$\begingroup\$ @vnp You are probably correct. I say that because in my original code I looped successively through all the elements of the ranked list argument and if it was not equal to the previous element it was added to the initially empty ranked_list. At least for the input I used in the benchmark above there seemed to be no appreciable difference in timings for that method, your method using itertools.groupby and the OP's original code, which I decided to retain. I would have to do a proper benchmark on just this code fragment executing many repetitions to know for sure. \$\endgroup\$
    – Booboo
    Commented Nov 24, 2023 at 21:56
  • \$\begingroup\$ @vpn I added a benchmark to see which of 3 methods of initializing ranked_list is the fastest (at least for the input I have chosen). The method used in the code, which involves creating a set to remove duplicates and then re-sorting, is the fastest whether there are duplicate values in the input or not. Have a look. \$\endgroup\$
    – Booboo
    Commented Nov 25, 2023 at 0:07
  • \$\begingroup\$ It's a good idea not to go through the whole ranked_list to rank a player but to start from the middle. \$\endgroup\$
    – peternish
    Commented Nov 25, 2023 at 8:27
  • \$\begingroup\$ @peternish That is what a binary search does. That is what my code does. I am not sure what your point is. \$\endgroup\$
    – Booboo
    Commented Nov 25, 2023 at 10:25

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.