A great deal here depends on exactly what you really want. If you want numbers, each of which has exactly N pseudo-randomly chosen bits set, it's probably easiest to start by setting the chosen number of bits, then use std::shuffle
to randomize the positions:
std::vector<bool> bits(total_length - set_bits);
for (int i = 0; i < set_bits; i++)
bits.push_back(1);
std::shuffle(bits.begin(), bits.end(), rnd);
If we print out the results in binary, we get something like this:
00000100101110101000000010001000
11001010100100000000001001100000
10100100101010000001000100000100
10010001000000001100000011001010
11000010101000010001000001000010
00101000001000001000000110010110
00100010100000011000000110010100
10000000001000101100100100101000
01010110010000000011001000010000
11010011000100000100000100001000
To summarize that: 32 bit numbers, each of which has exactly 9 bits set, but the positions of the bits that are set varies (pseudo-)randomly.
More commonly, you'll want numbers in which each bit has a 1/N chance of being set, so any individual number might have more or less than that set, but in the long term it should tend toward a total of approximately 1/N bits being set. In this case, we can use the library's Bernoulli distribution:
const double odds = 1.0/9.0;
const int total_length = 32;
std::mt19937 rnd{ std::random_device()() };
std::bernoulli_distribution dist(odds);
std::vector<bool> bits;
std::generate_n(std::back_inserter(bits),
total_length,
[&] { return dist(rnd); });
Output from this would look something like this:
10000000100000000000001001100111
00000010010101010000000000100010
01000000100000000000111000110000
00001011000000000001001000001100
11010110101111010000001001000100
00000000110000010100100101100000
00000000000001100001110101001010
01010101001011110000100110100000
00000001000100010110000010001000
01001100111000000001011100100000
The previous generator was deterministic when we looked at the last bit of any output number--if we'd already see 9 set bits, then the last one had to be clear (and if we'd only seen 8, it had to be set). This one has independent odds for each bit of the result, so even though we should get about 9/32 bits set as a long-term average, we'll frequently see more or fewer than that set in any particular output. In fact, we should see (for example) all bits set with a frequency of approximately 1/932 (i.e., about 1 out of 2.9 x 1031 times we generate a number).
Now, a few more comments more specifically about your code and how you've written it:
Avoid new[]
In most reasonably written code, you should almost never see new
used at all, unless you're looking at the internals of an allocator, or something on that order.
Using the array form of new
, like T *something = new T[size];
should be even more unusual--to the point that I honestly can't think of a single situation where it's what I'd choose to do/use (and, in fact, I haven't used it in years).
Avoid magic numbers
Right now, the basic intent of your code isn't entirely clear in some places. For example, at least one (and probably a couple) of reviews have mis-read your intent as being to have even odds of a 0 or 1 for any bit in your result. Although you've since clarified in comments that you intended a skewed distribution of 0's vs. 1's, it would be much better if the code were written to make that intent clear1.
Avoid rand()
and srand()
Unless you need to write your code so it's compatible with C compilers, it's generally better to use the generators and distribution classes from <random>
instead. They use specified algorithms, so the quality of results have been tested and are well known. A generator encapsulates its own state, so using one generator won't (can't) affect the state of another. They are a little bit of extra work to use, but generally not so much that it's really a major problem.
1. I should probably add that although using odds
for the number is probably better than a magic number, it's still not entirely perfect. odds_of_set_bit
or something on that order would make the intent even more explicit.
t
variable, because that is causing your distribution to not be 50/50, so I'm not sure if that's what you intended. \$\endgroup\$t
ones in your array, since the randomized indices might contain duplicates. That will make the number and distribution of ones in the output somewhat difficult to analyze. \$\endgroup\$