5

Okay, here's the problem I have. I'd like to do something like this:

V(n,k) generates all n-sized permutations in which each permutations contains all digits from 0 to k-1 in lexicographic order.

V(2,2) = { (0,1), (1,0) }
V(3,2) = { (0,0,1), (0,1,0), (0,1,1), (1,0,0), (1,0,1), (1,1,0) }
V(4,3) = { (0,0,1,2), (0,0,2,1), (0,1,0,2), (0,1,1,2), (0,1,2,0), ..., (2,2,1,0) }

Additional explanations For V(4,3) case, let's look at the sequence:

0012 -> 3 unique digits. Histogram: [2,1,1]
0021 -> 3 unique digits. Histogram: [2,1,1]
0102 -> 3 unique digits. Histogram: [2,1,1]
0112 -> 3 unique digits. Histogram: [1,2,1]
0120 -> 3 unique digits. Histogram: [2,1,1]
0121 -> 3 unique digits. Histogram: [1,2,1]
.....
2210 -> 3 unique digits. Histogram: [1,1,2]

As you can see, they all have k-amount of digits with different frequency. In addition, they are in lexicographic order.


The Code

The problem I have now is avoiding generating permutations where a number is missing. My input test was n=4 and k=3. So far, I do have this code, which correctly generates up to (0,2,0,1).

#@cli rep_rde_compsel_permutations: total_selection,number_of_item,_axis={x|y|z} : number_of_item_group,total_selection,_axis={x|y|z}
#@cli : Generates all possible permutations in which at least 1 item from each item group is select given N size.
#@cli : Author : Reptorian.
#@cli : Default values: '_axis=x'
+rep_rde_compsel_permutations:
skip ${3=x}

check isint($1,1)&&isint($2,1)&&inrange(('$3')[0],_'x',_'z',1,1)

number_of_item_groups,total_selection:=sort([${1-2}])
cutoff_point={$total_selection-$number_of_item_groups}

axis,length:=${-_rep_rde_lib_sterling_number_of_second_kind};[_'$3'-_'x',fact($number_of_item_groups)*sterling_number_of_second_kind($total_selection,$number_of_item_groups)]
{V=[1,1,1];V[$axis]=$length;V},$total_selection,>"begin(
        const n_ig=$number_of_item_groups;
        const dec_n_ig=n_ig-1;
        const last_indice=s-1;

        result=expr('x>$cutoff_point?x-$cutoff_point',s);
        counts=histogram(result,n_ig,0,n_ig);
        list_of_misses=vector(#n_ig);
        min_rp=last_indice;

        started=0;
    );
    started?(
        missings=0;last_missing=-1;swapped=0;
        t_counts=counts;

        if(x==5,print(result));

        result[last_indice-1,2]!=[0,1]?(
            for(rp=last_indice,rp>=0,--rp,
                last_v=result[rp];
                current_v=(last_v+1)%n_ig;
                is_current_missing=0;

                if(!--t_counts[last_v]&&rp,
                    is_current_missing=1;
                    ++missings;
                    last_missing=last_v;
                );

                --counts[last_v];
                ++counts[result[rp]=current_v];

                if(is_current_missing&&missings==1,
                    continue();
                );

                if(last_v<dec_n_ig,
                    break();
                );
            );

            if(missings&&result[rp]!=last_missing;,
                --counts[result[last_indice]];
                ++counts[result[last_indice]=last_missing];
            );
        ):(
            swap(result[last_indice-1],result[last_indice]);
        );
    ):(
        started=1;
    );
    result;"
_rep_rde_lib_sterling_number_of_second_kind:
u "sterling_number_of_second_kind(_a,_b)=(
        ref(vector(##_a-_b+1,1),_arr);
        _max_q=_a-_b;
        _i_max_q=_max_q+1;
        for(_p=2,_p<=_b,++_p,
            for(_q=1,_q<_i_max_q,++_q,
                _arr[_q]+=_p*_arr[_q-1];
            );
        );
        _res=_arr[_max_q];
        unref(_arr);
        _res;
    );"

I know why it happens. (0,2,0,1). At the last indice, it becomes 2, and then second last indice becomes 1. So, it skips (0,2,1,0). (0,2,1,2) is generated next after (0,2,0,1).

4
  • And the question is how to fix the code? Or how to generate the permutations (plenty of that on stackoverflow)? If you would describe the idea of your algorithm with words and not just code and add a question it is easier to help. Commented Sep 5 at 22:31
  • Yes, there are plenty of permutations, but as far as I searched, there is no algorithm for generating permutations of a specified length in which all permutations has all unique digits/chars of varying counts, and the numbers of uniques are of specific size. That's the idea, and I have provided examples. The code is there only as a reference, I believe I am missing something. Commented Sep 5 at 23:02
  • what crazy language is that? Commented Sep 12 at 1:57
  • @NariPDev That's G'MIC. that's the only language I use these day asides from the rare Python things. Commented Sep 13 at 1:34

2 Answers 2

7

First step, know how many of them there are. This is a simple recursive function. The caching layer turns it into a top down dynamic programming algorithm. Which is fancy speak for, "speed up calculations by returning the previous answer if you're asked the same question again."

def count_possibilities (digits, entries, seen, cache={}):
    key = (digits, entries, seen)
    if key not in cache:
        if entries < digits - seen:
            value = 0
        elif entries == 0:
            value = 1
        else:
            value = (digits - seen) * count_possibilities(
                        # See a new digit
                        digits, entries-1, seen+1
                    ) + seen * count_possibilities(
                        # See a digit again
                        digits, entries-1, seen
                    )
        cache[key] = value
    return cache[key]

Now that we know how many there are, we can find any particular possibility by index.

def find_possibility (digits, entries, n):
    if n < 0 or count_possibilities(digits, entries, 0) <= n:
        return None
    answer = []
    seen = set()
    while len(answer) < entries:
        for digit in range(digits):
            if digit in seen:
                count = count_possibilities(digits,
                                            entries-len(answer)-1,
                                            len(seen))
            else:
                count = count_possibilities(digits,
                                            entries-len(answer)-1,
                                            len(seen)+1)
            if count <= n:
                # We are skipping past count possibilities
                # that came before in lexicographic order.
                n -= count
            else:
                # This is our choice
                seen.add(digit)
                answer.append(digit)
                break
    return answer

And now we know how to count how many there are, and how to find individual ones. We just need to combine them as follows:

def enumerate_possibilities(digits, entries):
    max_n = count_possibilities(digits, entries, 0)
    for n in range(max_n):
        yield find_possibility(digits, entries, n)

for possibility in enumerate_possibilities(3, 5):
    print(possibility)

And after running that you will verify that they did indeed come out in lexicographic order.


It was pointed out in the comments that I changed variable names from the question. The following may clarify the connection.

def V (n, k):
    return enumerate_possibilities(k, n)

I also said "possibilities" instead of "permutations" because they are not generally permutations.

Sign up to request clarification or add additional context in comments.

11 Comments

Oops, I focussed on avoiding duplicates and overlooked that the output has to be in lexicographical order.
>>> for possibility in enumerate_possibilities(3, 3): ... print(possibility) ... [0, 1, 2] [0, 2, 1] [1, 0, 2] [1, 2, 0] [2, 0, 1] [2, 1, 0]
That formatted badly. Anyway, it's not limiting itself to lexicographic order. Also, I think the count should be choose(n,k) * (k-1) ^ (n-k).
That was an unclarity in the wording of the question that I had to resolve by looking at the desired sample output. Here is what I concluded. Within any one of the possibilities, the values need not appear in lexicographic order. But the possibilities themselves are sorted in lexicographic order. Which describes the order that you posted.
@btilly "Within any one of the possibilities, the values need not appear in lexicographic order." - Does it even make sense to talk about lexicographic order within one?
Each possibility is an ordered set. So yes, ic can make sense to talk about that internal order.
Order, yes. But lexicographic order?
My first reading was that new values had to be first met in lexicographic order. So yes. There is a way to make sense of that.
But lexicographic order is about ordering multiple sequences: which sequence comes first, which sequence comes second, etc. And a number is not a sequence.
I guess that technically you are right. But I think of lexicographic ordering as handling any arbitrary nesting level of ordered data. It applies to lists, lists of lists, lists of lists of lists, various mixed things as long as the types match up, and so on. (This is how Python does it.) In which case with entirely unnested data, it simply becomes ordering. So when I see "lexicographic ordering", my brain translates it into "the most natural ordering that you can do."
Regardless of right or wrong, that is how my brain first processed it. Call it a misfire if you want. Call it a weird brain. shrug
2

(I overlooked the requirement that the results have to be in lexicographical order. If you use the method described below, you will have to sort the results. You could use a merge sort, because the permutations in the last step are generated in order.)

To generate all unique solutions, without duplicates that would have to be weeded out later, you could take these step (using the example of n=7 and k= 3)

  • Generate all partitions of n-k with a maximum of k parts:
    [4], [3,1], [2,2], [2,1,1]

  • Pad the partitions with zeros to all be size k:
    [4,0,0], [3,1,0], [2,2,0], [2,1,1]

  • Add 1 to each number in the partitions:
    [5,1,1], [4,2,1], [3,3,1], [3,2,2]

  • Generate all unique permutations of these partitions:
    [5,1,1], [1,5,1], [1,1,5]
    [4,2,1], [4,1,2], [2,1,4], [2,4,1], [1,2,4], [1,4,2]
    [3,3,1], [3,1,3], [1,3,3]
    [3,2,2], [2,3,2], [2,2,3]

  • Use these permutations as the number of each digit 0 to k-1 in a solution of size n:
    [5,1,1] = [0,0,0,0,0,1,2]
    [1,5,1] = [0,1,1,1,1,1,2]
    ...
    [2,3,2] = [0,0,1,1,1,2,2]
    [2,2,3] = [0,0,1,1,2,2,2]

  • Generate all unique permutations of these solutions:
    [0,0,0,0,0,1,2], [0,0,0,0,0,2,1], [0,0,0,0,1,0,2], ... [2,1,0,0,0,0,0]
    [0,1,1,1,1,1,2], [0,1,1,1,1,2,1], [0,1,1,1,2,1,1], ... [2,1,1,1,1,1,0]
    ...
    [0,0,1,1,1,2,2], [0,0,1,1,2,1,2], [0,0,1,1,2,2,1], ... [2,2,1,1,1,0,0]
    [0,0,1,1,2,2,2], [0,0,1,2,1,2,2], [0,0,1,2,2,1,0], ... [2,2,2,1,1,0,0]

Generating partitions and permutations can be done with simple recursive algorithms, but generating unique permutations is less straightforward. Have a look at this question for useful algorithms, e.g.:

  • Start with the numbers in ascending order (so no need to sort them in the last step).

  • Locate the rightmost digit that is smaller that the digit to the right of it; let's call its index i.

  • Locate the rightmost digit after index i that is greater than the digit at index i, call its index j.

  • Swap the digits at indexes i and j.

  • Reverse the order of all digits to the right of index i.

Let's run through an example from the last step of our algorithm:

[0,0,1,1,2,2,2]
       i     j

The right-most digit that is smaller than the digit after it, is the last 1 before the 2's. The right-most digit after it that is greater than it, is the last 2. We now swap these two digits:

[0,0,1,2,2,2,1]
       i     j

and then reverse the order of the digits after index i:

[0,0,1,2,1,2,2]
       i

and this is our next permutation. When you get to the point where you can not find a digit that is smaller than the digit to the right of it, you have reached the last permutation:

[2,2,2,1,1,0,0]

For the fourth step, algorithms to generate partitions often generate them from large to small values, like [4,2,1]. So you will either have to modify the algorithm to generate them in the form [1,2,4], or reverse them afterwards, to prepare them for the next step of generating their unique permutations.

1 Comment

Given the large number of values generated by even a small set of inputs, I thought it was important to not require them all to exist in memory first.

Your Answer

By clicking “Post Your Answer”, you agree to our terms of service and acknowledge you have read our privacy policy.

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.