My approach, described below, should definitely be an improvement. I have named my version climbing_leaderboard
so as to not conflict with your name (I have included your version in mine in a benchmark to measure the performances of each), but also to be more in conformance with the PEP 8 Style Guide standard form naming functions.
General Suggestions for Improvement
It's always a good idea to provide a docstring describing what the function is supposed to be doing. Either use typing to describe the function's arguments and return value or provide that information in the docstring itself. Enough said.
Algorithmic Improvement
For each element p
of the player argument, you are searching sequentially through your ranked_set
list to find the new ranking for p
. For small values of p
you might be searching almost every element of ranked_set
to accomplish that since ranked_set
is sorted in descending order. Also, ranked_set
is not the best name for a variable that references a list.
A better approach would be to use a binary search to accomplish this. To discover any given ranking should not require examining more than approximately Log2(N) elements where N is the number of elements in ranked_set
.
In the following code I have included your version (less print
statements) and mine and print out the times for my desktop.
from typing import List
import time
timer = time.perf_counter
def climbingLeaderboard(ranked, player):
"""Solve the problem descibed at:
https://www.hackerrank.com/challenges/climbing-the-leaderboard/problem?isFullScreen=true"""
# Remove duplicate scores:
ranked_list = sorted(list(set(ranked)), reverse=True)
result = []
last_index = len(ranked_list) - 1
for p in player:
start = 0
end = last_index
while start <= end:
mid = (start + end) // 2
score = ranked_list[mid]
if p > score:
end = mid - 1
elif p < score:
start = mid + 1
else:
break
if start > end:
# No exact hit
rank = mid + 1 if p > score else mid + 2
else:
rank = mid + 1
result.append(rank)
return result
ranked = list(range(100_000, 1, -3))
player = list(range(10, 100_030, 20))
t0 = timer()
result1 = climbingLeaderboard(ranked, player)
t = timer() - t0
print(t)
t0 = timer()
result2 = climbing_leaderboard(ranked, player)
t = timer() - t0
print(t)
assert result1 == result2
Prints:
3.8725043
0.01452220000000004
Update
The following is a benchmark of 3 methods of initializing the ranked_list
variable from the input so as to remove duplicate values when they exist. For the first run there are 30,000 distinct values in descending order and for the second run there are 15,000 distinct values each duplicated once.
- Method 1: As in the above code.
- Method 2: The input is stepped through element by element and if the element is not the same value as the previous element, it is appended to the initially empty
ranked_list
.
itertools.groupby
is used as suggested by vnp.
The code:
import itertools
import time
timer = time.perf_counter
def benchmark():
N = 1_000
for run in (1, 2):
if run == 1:
# No duplicate values
ranked = list(range(30_000, 0, -1))
else:
# Two of each value:
ranked = []
for n in range(15_000, 0, -1):
ranked.append(n)
ranked.append(n)
t = timer()
for _ in range(N):
ranked_list = sorted(list(set(ranked)), reverse=True)
t1 = timer() - t
t = timer()
for _ in range(N):
ranked_list = []
last = None
for r in ranked:
if r != last:
last = r
ranked_list.append(r)
t2 = timer() - t
t = timer()
for _ in range(N):
ranked_list = [k for k, _ in itertools.groupby(ranked)]
t3 = timer() - t
print(t1, t2, t3)
benchmark()
The output:
1.9442110000000001 2.5163936 3.0017806
0.7499568000000005 2.2753618000000007 1.654442699999998
Conclusion
Method 1 is the fastest in both cases. With no duplicates in the input, method 3 is the slowest. When there are duplicates, method 2 is the slowest.