The Python Oracle

varying degree of shuffling using random module python

--------------------------------------------------
Hire the world's top talent on demand or became one of them at Toptal: https://topt.al/25cXVn
and get $2,000 discount on your first invoice
--------------------------------------------------

Music by Eric Matyas
https://www.soundimage.org
Track title: Puzzle Game 5 Looping

--

Chapters
00:00 Varying Degree Of Shuffling Using Random Module Python
01:44 Answer 1 Score 1
02:24 Answer 2 Score 4
03:06 Accepted Answer Score 5
04:03 Answer 4 Score 0
04:40 Thank you

--

Full question
https://stackoverflow.com/questions/4196...

--

Content licensed under CC BY-SA
https://meta.stackexchange.com/help/lice...

--

Tags
#python #random #shuffle

#avk47



ACCEPTED ANSWER

Score 5


Here is an approach based on this paper which proves a result about the mixing time needed to scramble a list by using swaps of adjacent items

from random import choice
from math import log

def jitter(items,percent):
    n = len(items)
    m = (n**2 * log(n))
    items = items[:]
    indices = list(range(n-1))
    for i in range(int(percent*m)):
        j = choice(indices)
        items[j],items[j+1] = items[j+1],items[j]
    return items

A test, each line showing the result of jitter with various percents being applied to the same list:

ls = list(('0'*20 + '1'*20)*2)

for i in range(11):
    p = i/10.0
    print(''.join(jitter(ls,p)))

Typical output:

00000000000000000000111111111111111111110000000000000000000011111111111111111111
00000000000000111100001101111011011111001010000100010001101000110110111111111111
00000000100100000101111110000110111101000001110001101001010101100011111111111110
00000001010010011011000100111010101100001111011100100000111010110111011001011111
00100001100000001101010000011010011011111011001100000111011011111011010101011101
00000000011101000110000110000010011001010110011111100100111101111011101100111110
00110000000001011001000010110011111101001111001001100101010011010111111011101100
01101100000100100110000011011000001101111111010100000100000110111011110011011111
01100010110100010100010100011000000001000101100011111011111011111011010100011111
10011100101000100010001100100000100111001111011011000100101101101010101101011111
10000000001000111101101011000011010010110011010101110011010100101101011110101110

I'm not sure how principled the above is, but it seems like a reasonable place to start.




ANSWER 2

Score 4


There's no clear definition of what "degree of shuffling" (d) means, so you'll need to choose one. One option would be: "the fraction of items remaining unshuffled is (1-d)".

You could implement that as:

  1. Produce a list of indices
  2. Remove (1-d)*N of them
  3. Shuffle the rest
  4. Reinsert the ones removed
  5. Use these to look up values from the original data

def partial_shuffle(x, d):
    """
    x: data to shuffle
    d: fraction of data to leave unshuffled
    """
    n = len(x)
    dn = int(d*n)
    indices = list(range(n))
    random.shuffle(indices)
    ind_fixed, ind_shuff = indices[dn:], indices[:dn]

    # copy across the fixed values
    result = x[:]

    # shuffle the shuffled values
    for src, dest in zip(ind_shuff, sorted(ind_shuff)):
        result[dest] = x[src]

    return result



ANSWER 3

Score 1


The other algorithms you're referring to are probably using the Fisher-Yates shuffle under the hood.

This O(n) shuffle starts with the first element of an array and swaps it with a random higher element, then swaps the second element with a random higher element, and so on.

Naturally, stopping this shuffle before you reach the last element at some fraction [0,1] would give a partially-randomized array, like you want.

Unfortunately, the effect of the foregoing is that all the "randomness" builds up on one side of the array.

Therefore, make a list of array indices, shuffle these completely, and then use the indices as an input to the Fisher-Yates algorithm to partially sort the original array.




ANSWER 4

Score 0


I believe I found a more versatile, robust, and a consistent way to implement this "adjustable shuffling" technique.

import random
import numpy as np


def acc_shuffle(lis, sr, array=False, exc=None):  # "sr" = shuffling rate
    if type(lis) != list:  # Make it compatible with shuffling (mxn) numpy.ndarrays
        arr = lis
        shape = arr.shape
        lis = list(arr.reshape(-1))
    lis = lis[:]  # Done, such that any changes applied on "lis" wont affect original input list "x"
    indices = list(range(len(lis)))

    if exc is not None:  # Exclude any indices if necessary
        for ele in sorted(exc, reverse=True):
            del indices[ele] 

    shuff_range = int(sr * len(lis) / 2)  # How much to shuffle (depends on shuffling rate)
    if shuff_range < 1:
        shuff_range = 1  # "At least one shuffle (swap 2 elements)"

    for _ in range(shuff_range):
        i = random.choice(indices)
        indices.remove(i)  # You can opt not to remove the indices for more flexibility
        j = random.choice(indices)
        indices.remove(j)
        temp = lis[i]
        lis[i] = lis[j]
        lis[j] = temp

    if array is True:
        return np.array(lis).reshape(shape)
    return lis