Given a list of numbers, the majority is more than half. If we randomly pick up a number from numbers, what is the probability of at least one number is the majority number?

In probability, it should be 1-(every picked number is not the majority)

1-(1-P(majority))**n

Since P(majority) ≥0.5

1-(1-P(majority))**n ≤(1–0.5**n)

We can have the output of n and upper bound of 1-(1-P(majority))**n

>>> for n in range(1, 21):
... print(n, ":", 1-0.5**n)
...
1 : 0.5
2 : 0.75
3 : 0.875
4 : 0.9375
5 : 0.96875
6 : 0.984375
7 : 0.9921875
8 : 0.99609375
9 : 0.998046875
10 : 0.9990234375
11 : 0.99951171875
12 : 0.999755859375
13 : 0.9998779296875
14 : 0.99993896484375
15 : 0.999969482421875
16 : 0.9999847412109375
17 : 0.9999923706054688
18 : 0.9999961853027344
19 : 0.9999980926513672
20 : 0.9999990463256836

It means that if we randomly pick up 20 numbers, at least one of them is the majority’s possibility can be 0.9999990463256836.

One leetcode problem is related to this basic probability.

1157. Online Majority Element In Subarray

I have the following code, however it got TLE.

21 / 27 test cases passed.

class MajorityChecker:def __init__(self, arr: List[int]):
self.arr = arr
self.i_v_cnt = collections.defaultdict(lambda :collections.defaultdict(int))
self.i_v_cnt[0][arr[0]] +=1
for i in range(1, len(arr)):
v = arr[i]
###need deep copy this line will generate a shallow copy
#self.i_v_cnt[i] =self.i_v_cnt[i-1]
#### this is deep copy
self.i_v_cnt[i] = copy.deepcopy(self.i_v_cnt[i-1])
self.i_v_cnt[i][v] += 1

def query(self, left: int, right: int, threshold: int) -> int:
from random import randrange
this_arr = [self.arr[i] for i in (randrange(left, right+1) for _ in range(20))]
for num in set(this_arr):
if self.i_v_cnt[right][num] - self.i_v_cnt[left-1][num] >=threshold:return num
return -1
# Your MajorityChecker object will be instantiated and called as such:
# obj = MajorityChecker(arr)
# param_1 = obj.query(left,right,threshold)

The reason might be the size of self.i_v_cnt can be super large. As the arr will have a maximum length of 20000 and the worst case it will have 10000 unique numbers. It will make the self.i_v_cnt super large.

Inspired by

https://leetcode.com/problems/online-majority-element-in-subarray/discuss/355848/Python-Binary-Search-+-Find-the-Majority-Element

We can only save the index of num in arr. Then bisect it to the indexes to find the occurrence of this num within a scope.

Solution based on ideas above

Runtime: 1548 ms, faster than 6.95% of Python3 online submissions for Online Majority Element In Subarray.

class MajorityChecker:def __init__(self, arr: List[int]):
a2i = collections.defaultdict(list)
for i, x in enumerate(arr):
a2i[x].append(i)
self.arr, self.a2i = arr, a2i

def query(self, left: int, right: int, threshold: int) -> int:
from random import randrange
this_arr = [self.arr[i] for i in (randrange(left, right+1) for _ in range(40))]
def get_count(a):
idxes = self.a2i[a]
return bisect.bisect(idxes, right) - bisect.bisect_left(idxes, left)

for num in set(this_arr):
count = get_count(num)
if count>=threshold: return num
return -1
# Your MajorityChecker object will be instantiated and called as such:
# obj = MajorityChecker(arr)
# param_1 = obj.query(left,right,threshold)

Another one related to using probability.

837. New 21 Game

https://leetcode.com/problems/new-21-game/

Naive DFS will get TLE.

86 / 146 test cases passed.

class Solution:
def new21Game(self, N: int, K: int, W: int) -> float:
res = []
total = []
def dfs(remain, cur_pro):
if remain==1:
total.append(cur_pro)
valid = sum(K-1+w<=N for w in range(1, W+1))
res.append(cur_pro*valid/W)
elif remain<=0:
total.append(cur_pro)
if K-remain<=N:res.append(cur_pro)
else:
for w in range(1, W+1):
dfs(remain-w, cur_pro/W)

dfs (K, 1)
return sum(res)/sum(total)

Using cache also got TLE but got 10 more test cases AC

96 / 146 test cases passed.

class Solution:
def new21Game(self, N: int, K: int, W: int) -> float:
from functools import lru_cache
@lru_cache(None)
def dfs(remain):
if remain==1:
valid = sum(K-1+w<=N for w in range(1, W+1))
return 1, valid/W
elif remain<=0:
return 1, K-remain<=N
else:
total, res = 0, 0
for w in range(1, W+1):
this_remain =remain- w
this_total, this_res = dfs(this_remain)
total += this_total/W
res += this_res/W
return total, res

total, res = dfs(K)
return res/total

This code adjusted from the post below also got TLE.

Complexity is O((K + W) * W)

96 / 146 test cases passed.

class Solution:
def new21Game(self, N: int, K: int, W: int) -> float:
if K == 0 or N >= K + W: return 1
max_ = K + W - 1
dp = [0]*(max_+1)
dp[0] = 1
for i in range(1, max_+1):
for j in range(1, W+1):
if i-j>=0 and i-j<K:
dp[i] += dp[i-j]/W
return sum(dp[i] for i in range(K, N+1))

Then I realize that the transition equation dp[i] = (dp[i - W] + dp[i - W + 1] + ... + dp[i - 1]) / W could be simplified to dp[i] = (sum[i - 1] - sum[i - W - 1]) / W.

Furthermore, if we use dp[i] to directly represent the sum[i], we can get dp[i] = dp[i - 1] + (dp[i - 1] - dp[i - W - 1]) / W. This equation takes us to the final O(K + W) solution. Just take care with the beginning and the end of the array.

dp (200 ms)

class Solution:
def new21Game(self, N: int, K: int, W: int) -> float:
if K == 0 or N >= K + W: return 1
max_ = K + W - 1
dp = [0]*(max_+1)
dp[0] = 1
for i in range(1, max_+1):
dp[i] = dp[i-1]
if i<=W:
dp[i] += dp[i-1]/W
else:
dp[i] += (dp[i-1]-dp[i-W-1]) / W
if i>K:
dp[i] -= (dp[i-1]-dp[K-1])/W
return dp[N]-dp[K-1]

Finally, from the post from lee215, I realized that it is a DP problem. The time complexity if O(n) and the code from the link below

https://leetcode.com/problems/new-21-game/discuss/132334/One-Pass-DP-O(N)

Runtime (70 ms)

class Solution:
def new21Game(self, N: int, K: int, W: int) -> float:
if K == 0 or N >= K + W: return 1
dp = [1.0] + [0.0] * N
Wsum = 1.0
for i in range(1, N + 1):
dp[i] = Wsum / W
if i < K: Wsum += dp[i]
if i - W >= 0: Wsum -= dp[i - W]
return sum(dp[K:])

Or another DP (60 ms)

class Solution:
def new21Game(self, N: int, K: int, W: int) -> float:
dp = [0.0] * (N + W + 1)
# dp[x] = the answer when Alice has x points
for k in range(K, N + 1):
dp[k] = 1.0
S = min(N - K + 1, W)
# S = dp[k+1] + dp[k+2] + ... + dp[k+W]
for k in range(K - 1, -1, -1):
dp[k] = S / W
S += dp[k] - dp[k + W]
return dp[0]

This post is nice.

https://leetcode.com/problems/new-21-game/discuss/437832/Python-DP-iterate-on-points-rather-than-draw-count.-Should-be-a-"hard"-problem

I copied the content as below:

“dp(n) means the probability getting point “n” at any count of draws.
e.g. when random number is 1~3, we have:
dp(n)=dp(n-3)*(1/3)+dp(n-2)*(1/3)+dp(n-1)*(1/3)
And this kind of sum can be optimized using sliding window sum.

We cannot iterate DP on draw count, because it is hard to decide when to stop iterating. Instead we iterate on the obtained point, because each point can be obtained by the dp value of previous points.

This problem should be “hard” rather than “medium”.”

def new21Game(self, N: int, K: int, W: int) -> float:
if K==0 or K-1+W<=N:
return 1
dp=[1]+[0]*N
cursum=1
for i in range(1,N+1):
# dp[i]=sum(dp[max(0,i-W):min(i,K)])*(1/W)
dp[i]=cursum*(1/W)
if i<K:
cursum+=dp[i]
if i>=W:
cursum-=dp[i-W]
return sum(dp[K:])

808. Soup Servings

from functools import lru_cache
class Solution:
def soupServings(self, N: int) -> float:
if N > 4800: return 1
@lru_cache(None)
def f(a, b):
if a <= 0 and b <= 0: return 0.5
if a <= 0: return 1
if b <= 0: return 0
return (f(a - 4, b) + f(a - 3, b - 1) + f(a - 2, b - 2) + f(a - 1, b - 3))/4
N = math.ceil(N / 25)
return f(N, N)

Thanks for reading.

Data Scientist/MLE/SWE @takemobi