I need to return different values based on a weighted round-robin such that 1 in 20 gets A, 1 in 20 gets B, and the rest go to C.
So:
A => 5%
B => 5%
C => 90%
Here’s a basic version that appears to work:
import random

def pick():
    x = random.randint(1, 100)
    if x <= 5:
        return 'A'
    elif x > 5 and x <= 10:
        return 'B'
    else:
        return 'C'
Is this algorithm correct? If so, can it be improved?
Answers:
Method 1
Your algorithm is correct. How about something more elegant:
import random

my_list = ['A'] * 5 + ['B'] * 5 + ['C'] * 90
random.choice(my_list)
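On Python 3.6 or later, the standard library's `random.choices` takes the weights directly, so the expanded list isn't needed. A minimal sketch of that alternative; the names `values`, `weights` and `pick` are only illustrative and don't come from the question:

```python
import random
from collections import Counter

values = ['A', 'B', 'C']
weights = [5, 5, 90]  # relative weights; they do not have to sum to 100

def pick():
    # random.choices returns a list of k items, so take the only element
    return random.choices(values, weights=weights, k=1)[0]

# Quick sanity check of the resulting distribution
counts = Counter(pick() for _ in range(20000))
print(counts)
```

This keeps memory use constant no matter how large the weights get, which the list-multiplication approach does not.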
Method 2
That’s fine. More generally, you can define something like:
from collections import Counter
from random import randint

def weighted_random(pairs):
    total = sum(pair[0] for pair in pairs)
    r = randint(1, total)
    for (weight, value) in pairs:
        r -= weight
        if r <= 0: return value

results = Counter(weighted_random([(1, 'a'), (1, 'b'), (18, 'c')])
                  for _ in range(20000))
print(results)
which gives
Counter({'c': 17954, 'b': 1039, 'a': 1007})
which is as close to 18:1:1 as you can expect.
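To put a number on "as close as you can expect": each count follows a binomial distribution, so you can compare the observed counts with the expected value and standard deviation. A rough sketch, using the counts printed above; the variable names are just for illustration:

```python
from math import sqrt

n = 20000
weights = {'a': 1, 'b': 1, 'c': 18}
observed = {'a': 1007, 'b': 1039, 'c': 17954}  # counts from the run above
total = sum(weights.values())

z_scores = {}
for value, weight in weights.items():
    p = weight / total
    expected = n * p
    sd = sqrt(n * p * (1 - p))  # binomial standard deviation
    z_scores[value] = (observed[value] - expected) / sd
    print(f"{value}: expected {expected:.0f}, sd {sd:.1f}, "
          f"z = {z_scores[value]:+.2f}")
```

All three counts land within about 1.3 standard deviations of their expected values, which is ordinary sampling noise.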
Method 3
If you want to use relative weights rather than percentages, you can write your own randomizer class:
import random

class WeightedRandomizer:
    def __init__(self, weights):
        self.__max = .0
        self.__weights = []
        for value, weight in weights.items():
            self.__max += weight
            self.__weights.append((self.__max, value))

    def random(self):
        r = random.random() * self.__max
        for ceil, value in self.__weights:
            if ceil > r: return value

w = {'A': 1.0, 'B': 1.0, 'C': 18.0}
# or w = {'A': 5, 'B': 5, 'C': 90}
# or w = {'A': 1.0/18, 'B': 1.0/18, 'C': 1.0}
# and so on

wr = WeightedRandomizer(w)

results = {'A': 0, 'B': 0, 'C': 0}
for i in range(10000):
    results[wr.random()] += 1

print('After 10000 rounds the distribution is:')
print(results)
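The class above scans its running totals linearly. With many values, a binary search over the same running totals may be faster. A sketch of that variant, assuming the same dict-of-weights input; `BisectRandomizer` is a made-up name, not part of the answer above:

```python
import bisect
import itertools
import random

class BisectRandomizer:
    """Hypothetical variant of WeightedRandomizer using binary search."""

    def __init__(self, weights):
        self._values = list(weights)
        # Running totals, e.g. [1.0, 2.0, 20.0] for weights 1, 1, 18
        self._cumulative = list(itertools.accumulate(weights.values()))

    def random(self):
        r = random.random() * self._cumulative[-1]
        # Index of the first running total strictly greater than r
        return self._values[bisect.bisect_right(self._cumulative, r)]

wr = BisectRandomizer({'A': 1.0, 'B': 1.0, 'C': 18.0})
results = {'A': 0, 'B': 0, 'C': 0}
for _ in range(10000):
    results[wr.random()] += 1
print(results)
```

For three values the difference is negligible. It only starts to matter when the weight table is large.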
Method 4
It seems correct: since you are using a uniform random variable with independent draws, the probability of each number is 1/n (n = 100).
You can easily verify your algorithm by running it, say, 1000 times and checking the frequency of each letter.
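A frequency check along those lines might look like this. It is only a sketch: the thresholds are the ones from the question, wrapped in a function called `pick` for convenience:

```python
import random
from collections import Counter

def pick():
    # Same thresholds as in the question: 1-5 -> A, 6-10 -> B, 11-100 -> C
    x = random.randint(1, 100)
    if x <= 5:
        return 'A'
    elif x <= 10:
        return 'B'
    return 'C'

trials = 1000
counts = Counter(pick() for _ in range(trials))
for letter in 'ABC':
    # Observed relative frequency; should be near 0.05, 0.05 and 0.90
    print(letter, counts[letter] / trials)
```

With only 1000 trials the frequencies for A and B can easily be off by a percentage point or so. Raise `trials` for a tighter estimate.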
Another algorithm you might consider is to build an array containing your letters at the frequency you want for each, and then generate a single random number to use as the index into the array.
It will be less efficient in memory but should perform better.
Edit:
To respond to @Joel Cornett's comment, an example would be very similar to @jurgenreza's but more memory efficient:
import random

data_list = ['A'] + ['B'] + ['C'] * 18
random.choice(data_list)
All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.