As the title says, I want a function that takes a vector and a number of bins, and splits the vector into that many bins, with a minimum length of 1 for each bin.
    import numpy as np

    def split_into_bins(nbin, vector):
        """
        Randomly split vector into nbin number of bins, each of random size
        """
        permutation = list(np.random.permutation(vector))
        # Location of the splits: nbin-1 distinct cut points in 1..len(vector)-1
        splits = sorted(np.random.choice(range(1, len(vector)), nbin - 1, replace=False))
        # Initializing empty bins
        bins = [None] * nbin
        start = 0
        for i in range(nbin):
            # The last bin has no split after it, so it runs to the end
            try:
                end = splits[i]
            except IndexError:
                end = len(vector)
            bins[i] = permutation[start:end]
            start = end
        return bins
I'm wondering if there's a cleaner way to split the vector into bins besides randomly selecting split locations. My method of splitting up the list given the location of the splits also seems pretty messy. Performance does matter, so I'm wondering if I should initialize the empty bins outside of the function.
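One possible way to tidy up the slicing step is to hand the sorted split locations to `np.split`, which cuts an array at the given indices. The sketch below keeps the same idea (shuffle, then pick `nbin - 1` distinct cut points) and is only an illustration: the name `split_into_bins_np` and the optional `rng` argument are assumptions, not part of the original code.

```python
import numpy as np

def split_into_bins_np(nbin, vector, rng=None):
    """Shuffle vector and cut it into nbin non-empty bins at random points."""
    # rng is a hypothetical convenience argument for reproducible runs
    rng = np.random.default_rng() if rng is None else rng
    vector = np.asarray(vector)
    n = len(vector)
    if not 1 <= nbin <= n:
        raise ValueError("need 1 <= nbin <= len(vector)")
    shuffled = rng.permutation(vector)
    # nbin-1 distinct cut points in 1..n-1, so no bin can be empty
    cuts = np.sort(rng.choice(np.arange(1, n), size=nbin - 1, replace=False))
    # np.split returns a list of nbin arrays
    return np.split(shuffled, cuts)

bins = split_into_bins_np(4, range(10), rng=np.random.default_rng(0))
print([len(b) for b in bins])
```

This returns NumPy arrays rather than lists. Since the bins are built inside a single `np.split` call, there is nothing to preallocate outside the function.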
I also don't want any bias with regard to the size of the bins; they should all have the same size on average. I'm pretty sure this method isn't biased, though.
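A quick way to check the "same size on average" claim empirically is to simulate only the bin sizes many times and compare the mean size at each bin position with `len(vector) / nbin`. This is a rough sanity check rather than a proof; the helper `random_bin_sizes` and the chosen values of `n`, `nbin`, and `trials` are assumptions made for illustration.

```python
import numpy as np

def random_bin_sizes(nbin, n, rng):
    """Sizes of nbin bins obtained from nbin-1 distinct random cut points."""
    cuts = np.sort(rng.choice(np.arange(1, n), size=nbin - 1, replace=False))
    # Bin sizes are the gaps between consecutive edges 0, cuts..., n
    edges = np.concatenate(([0], cuts, [n]))
    return np.diff(edges)

rng = np.random.default_rng(12345)
n, nbin, trials = 20, 4, 20000
sizes = np.array([random_bin_sizes(nbin, n, rng) for _ in range(trials)])
mean_sizes = sizes.mean(axis=0)
print(mean_sizes)  # each entry should be close to n / nbin
```

If one position were systematically larger or smaller, its mean would drift away from `n / nbin` as `trials` grows.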