Existe uma estrutura de dados da fila prioritária que ofereça suporte às seguintes operações?
- Inserir (x, p) : adiciona um novo registro x com prioridade p
- StableExtractMin () : retorne e exclua o registro com prioridade mínima, quebrando os vínculos por ordem de inserção .
Assim, após Inserir (a, 1), Inserir (b, 2), Inserir (c, 1), Inserir (d, 2), uma sequência de StableExtractMin retornaria a, depois c, depois b e, em seguida, d.
Obviamente, pode-se usar qualquer estrutura de dados da fila de prioridade armazenando o par como a prioridade real, mas estou interessado em estruturas de dados que não armazenam explicitamente os tempos de inserção (ou ordem de inserção), por analogia para classificação estável.
Equivalentemente (?): Existe uma versão estável do heapsort que não requer espaço extra?
ds.data-structures
Jeffε
fonte
fonte
Respostas:
O método Bently-Saxe fornece uma fila de prioridade estável bastante natural.
Armazene seus dados em uma sequência de matrizes ordenadas . A i tem tamanho 2 i . Cada matriz também mantém um contador c i . As entradas da matriz A i [ c i ] , … , A i [ 2 i - 1 ] contêm dados.A0,…,Ak Ai 2i ci Ai[ci],…,Ai[2i−1]
Para cada , todos os elementos em A i foram adicionados mais recentemente do que os em A i + 1 e dentro de cada elemento A i são ordenados por valor, com os vínculos quebrando-se colocando os elementos mais antigos à frente dos elementos mais novos. Observe que isso significa que podemos mesclar A i e A i + 1 e preservar essa ordem. (No caso de empate durante a mesclagem, pegue o elemento de A i + 1. )i Ai Ai+1 Ai Ai Ai+1 Ai+1
Para inserir um valor , encontrar o menor i tal que A i contém 0 elementos, merge A 0 , ... , A i - 1 e x , armazenar isso em A i e conjunto c 0 , ... , c i adequadamente.x i Ai A0,…,Ai−1 x Ai c0,…,ci
Para extrair o mínimo, encontre o maior índice modo que o primeiro elemento em A i [ c i ] seja mínimo sobre todos os i e incremente c i .i Ai[ci] i ci
Pelo argumento padrão, isso fornece tempo amortizado por operação e é estável devido à ordem descrita acima.O(logn)
Para uma sequência de inserções e extrações, isso usa n entradas de matriz (não mantenha matrizes vazias) mais O ( log n ) palavras de dados da contabilidade. Ele não responde à versão da pergunta de Mihai, mas mostra que a restrição estável não exige muito espaço. Em particular, mostra que não há Ω ( n ) limite inferior no espaço extra necessário.n n O(logn) Ω(n)
Atualização: Rolf Fagerberg ressalta que, se pudermos armazenar valores nulos (não dados), toda essa estrutura de dados poderá ser compactada em uma matriz de tamanho , onde n é o número de inserções até o momento.n n
Primeiro, observe que podemos empacotar em uma matriz nessa ordem (com A k primeiro, seguido por A k - 1 se não estiver vazio, e assim por diante). A estrutura disso é completamente codificada pela representação binária de n , o número de elementos inseridos até o momento. Se a representação binária de n tem um 1 na posição de i , em seguida, um i ocupará 2 i local da matriz, caso contrário irá ocupar nenhum local de matriz.Ak,…,A0 Ak Ak−1 n n i Ai 2i
When inserting,n , and the length of our array, increase by 1, and we can merge A0,…,Ai plus the new element using existing in-place stable merging algorithms.
Now, where we use null values is in getting rid of the countersci . In Ai , we store the first value, followed by ci null values, followed by the remaining 2i−ci−1 values. During an extract-min, we can still find the value to extract in O(logn) time by examining A0[0],…,Ak[0] . When we find this value in Ai[0] Ai[0] Ai Ai[ci] Ai[0] and Ai[ci] .
The end result: The entire structure can be implemented with one array whose length is increment with each insertion and one counter,n , that counts the number of insertions.
fonte
I'm not sure what your constraints are; does the following qualify? Store the data in an array, which we interpret as an implicit binary tree (like a binary heap), but with the data items at the bottom level of the tree rather than at its internal nodes. Each internal node of the tree stores the smaller of the values copied from its two children; in case of ties, copy the left child.
To find the minimum, look at the root of the tree.
To delete an element, mark it as deleted (lazy deletion) and propagate up the tree (each node on the path to the root that held a copy of the deleted element should be replaced with a copy of its other child). Maintain a count of deleted elements and if it ever gets to be too large a fraction of all elements then rebuild the structure preserving the order of the elements at the bottom level — the rebuild takes linear time so this part adds only constant amortized time to the operation complexity.
To insert an element, add it to the next free position on the bottom row of the tree and update the path to the root. Or, if the bottom row becomes full, double the size of the tree (again with an amortization argument; note that this part is not any different from the need to rebuild when a standard binary heap outgrows its array).
It's not an answer to Mihai's stricter version of the question, though, because it uses twice as much memory as a true implicit data structure should, even if we ignore the space cost of handling deletions lazily.
fonte
Is the following a valid interpretation of your problem:
You have to store N keys in an array of A[1..N] with no auxiliary information such that you can support: * insert key * delete min, which picks the earliest inserted element if there are multiple minima
This appear quite hard, given that most implicit data structures play the trick of encoding bits in the local ordering of some elements. Here if multiple guys are equal, their ordering must be preserved, so no such tricks are possible.
Interesting.
fonte
Short answer : You can't.
Slightly longer answer :
You'll needΩ(n) extra space to store the "age" of your entry which will allow you to discriminate between identical priorities. And you'll need Ω(n) space for information that will allow fast insertions and retrievals. Plus your payload (value and priority).
And, for each payload you store, you'll be able to "hide" some information in the address (e.g.addr(X)<addr(Y) means Y is older than X). But in that "hidden" information, you'll either hide the "age", OR the "fast retrieval" information. Not both.
Very long answer with inexact flaky pseudo-math :
Note : the very end of the second part is sketchy, as mentioned. If some math guy could provide a better version, I'd be grateful.
Let's think about the amount of data that is involved on an X-bit machine (say 32 or 64-bit), with records (value and priority)P machine words wide.
You have a set of potential records that is partially ordered :(a,1)<(a,2) and (a,1)=(a,1) but you can't compare (a,1) and (b,1) .
However you want to be able to compare two non-comparable values from your set of records, based on when they were inserted. So you have here another set of values : those that have been inserted, and you want to enhance it with a partial order :X<Y iff X was inserted before Y .
In the worst-case scenario, your memory will be filled with records of the form(?,1) (with ? different for each one), so you'll have to rely entirely upon the insertion time in order to decide which one goes out first.
That means that you must somehow storeX−log2(P) extra bits of information for each record you store. And that's O(n) for n records.
Now, how much bits of information does each memory "cell" provide us ?
Now, let's assumeP≥1 (payload is at least one machine word wide (usually one octet)). This means that X−log2(P)<X , so we can fit the insertion order information in the cell's address. That's what happening in a stack : cells with the lowest address entered the stack first (and will get out last).
So, to store all our information, we have two possibilities :
Obviously, in order to avoid waste, we'll use the first solution.
Now for the operations. I suppose you wish to have :
Let's look atStableExtractMin() :
The really really general algorithm goes like this :
For example, in the case of a heap, it will be slightly differently organized, but the work is the same : 1. Find the min record in0(1)
2. Remove it from the structure in O(1)
3. Fix everything so that next time #1 and #2 are still O(1) i.e. "repair the heap". This needs to be done in "O(log n)"
4. Return the element.
Going back to the general algorithm, we see that to find the record inO(logn) time, we need a fast way to choose the right one between 2(X−log2(P)) candidates (worst case, memory is full).
This means that we need to storeX−log2(P) bits of information in order to retrieve that element (each bit bisects the candidate space, so we have O(logn) bisections, meaning O(logn) time complexity).
These bits of information might be stored as the address of the element (in the heap, the min is at a fixed address), or, with pointers for example (in a binary search tree (with pointers), you need to followO(logn) on average to get to the min).
Now, when deleting that element, we'll need to augment the next min record so it has the right amount of information to allowO(logn) retrieval next time, that is, so it has X−log2(P) bits of information discriminating it from the other candidates.
That is, if it doesn't have already enough information, you'll need to add some. In a (non-balanced) binary search tree, the information is already there : You'll have to put a NULL pointer somewhere to delete the element, and without any further operation, the BST is searchable inO(logn) time on average.
After this point, it's slightly sketchy, I'm not sure about how to formulate that. But I have the strong feeling that each of the remaining elements in your set will need to haveX−log2(P) bits of information that will help find the next min and augment it with enough information so that it can be found in O(logn) time next time.
The insertion algorithm usually just needs to update part of this information, I don't think it will cost more (memory-wise) to have it perform fast.
Now, that means that we'll need to storeX−log2(P) more bits of information for each element. So, for each element, we have :
Since we already use the memory contents to store the payload, and the address to store the insertion time, we don't have any room left to store the "fast search" information. So we'll have to allocate some extra space for each element, and so "waste"Ω(n) extra space.
fonte
If you implement your priority queue as a balanced binary tree (a popular choice), then you just have to make sure that when you add an element to the tree, it gets inserted to the left of any elements with equal priority.
This way, the insertion order is encoded in the structure of the tree itself.
fonte
I don't think that's possible
concrete case:
min heap with all x > 1
heapifying will eventually give something a choice like so
now which 1 to propagate to root?
fonte