What does a faster algorithm mean in theoretical computer science?

18

If there is an algorithm running in time O(f(n)) for some problem A, and somebody comes up with an algorithm running in time O(f(n)/g(n)), where g(n) = o(f(n)), is it considered an improvement over the previous algorithm?

Does it make sense, in the context of theoretical computer science, to come up with such an algorithm?

lovw
4
By "faster algorithm", we mean "asymptotically faster algorithm".
Yuval Filmus
@YuvalFilmus, what do you mean by "asymptotically"?
undefined
1
Running in time o(f(n)).
Yuval Filmus

Answers:

26

No, an algorithm running in time O(f(n)/g(n)), where g(n) = o(f(n)), is not necessarily considered an improvement. For example, suppose that f(n) = n and g(n) = 1/n. Then O(f(n)/g(n)) = O(n^2) is a worse time bound than O(f(n)) = O(n).
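A quick numeric check of this counterexample, as a minimal sketch. The function names `f`, `g`, and `new_bound` are illustrative and not part of the original answer; exact fractions are used only to avoid floating-point noise.

```python
from fractions import Fraction


def f(n: int) -> Fraction:
    """Running-time bound of the original algorithm: f(n) = n."""
    return Fraction(n)


def g(n: int) -> Fraction:
    """The divisor from the example: g(n) = 1/n, which is o(f(n))."""
    return Fraction(1, n)


def new_bound(n: int) -> Fraction:
    """Bound of the supposedly improved algorithm: f(n) / g(n)."""
    return f(n) / g(n)


for n in (10, 100, 1000):
    # g(n)/f(n) = 1/n^2 shrinks toward 0, so g = o(f) holds,
    # yet f(n)/g(n) = n^2 grows faster than f(n) = n.
    print(n, g(n) / f(n), new_bound(n))
```

The ratio g(n)/f(n) tends to 0, which is what g(n) = o(f(n)) requires, while the "improved" bound is n^2 for every n.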

In order to improve on an algorithm running in time f(n), you need to come up with an algorithm running in time o(f(n)), that is, in time g(n) for some function g(n) = o(f(n)).

If all you know is that an algorithm runs in time O(f(n)), then it is not clear whether an algorithm running in time O(g(n)) is an improvement, whatever f(n) and g(n) are. This is because big O is only an upper bound on the running time. Instead, it is common to consider the worst-case time complexity, and to estimate it as a big Θ rather than just as a big O.

Yuval Filmus
21
Perhaps it would be better to take g(n) = 1 in your first paragraph. Using a decreasing function seems a bit misleading.
David Richerby
1
@DavidRicherby: Maybe a bit, but the OP never said they had an algorithm running in O(g(n)), so monotonicity cannot be assumed.
Kevin
7
@Kevin Sure but the context is computer science and, in computer science, big-O notation is usually used for nondecreasing functions. Probably the asker was thinking in those terms.
David Richerby
11

Remember that O(...) notation is meant for analyzing how the task grows for different sizes of input, and specifically leaves out multiplicative factors, lower-order terms, and constants.

Suppose you have an O(n^2) algorithm whose actual run time is 1n^2 + 2n + 1 (assuming you can actually count the instructions and know the exact timings and so on, which is admittedly a huge assumption in modern systems). Then suppose you come up with a new algorithm that happens to be O(n), but whose actual run time is 1000n + 5000. Suppose also that you know the software using this algorithm will never see a problem size of n > 10.

So, which would you choose: the O(n) algorithm that's going to take 15000 units of time, or the O(n^2) one that's only going to take 121 units? Now if your software evolves to handling problem sizes of n > 100000, which one would you pick? What would you do if your problem size varies greatly?
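To make the trade-off concrete, here is a small sketch that evaluates the two cost formulas from this answer and searches for the input size at which the O(n) algorithm starts to win. The function names are illustrative; the formulas n^2 + 2n + 1 and 1000n + 5000 are the ones given above.

```python
def quadratic_cost(n: int) -> int:
    """Exact cost of the O(n^2) algorithm: 1*n^2 + 2*n + 1."""
    return n * n + 2 * n + 1


def linear_cost(n: int) -> int:
    """Exact cost of the O(n) algorithm: 1000*n + 5000."""
    return 1000 * n + 5000


def crossover() -> int:
    """Smallest n at which the O(n) algorithm becomes strictly cheaper."""
    n = 1
    while quadratic_cost(n) <= linear_cost(n):
        n += 1
    return n


print(quadratic_cost(10), linear_cost(10))  # 121 15000
print(crossover())  # 1003
```

Under these formulas the asymptotically worse algorithm is the cheaper one for every n up to 1002, and the asymptotically better one only pays off from n = 1003 onward.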

twalberg
2
"never see a problem size of n>10" - then we'd not use the O notation at all, would we...
AnoE
5
@AnoE Simple numbers for the sake of argument. The same logic applies whether you're analyzing for a problem size of 10 vs 1e5 or analyzing for 1e6 vs 1e9.
twalberg
1
@AnoE Most computer programs do not try to handle an infinitely growing problem size. So there will be a trade-off. That's why big-O is for theoretical computer science, and the concepts can be applied to improve actual programs.
mbomb007
Exactly, @mbomb007. The question title is "What does a faster algorithm mean in theoretical computer science?" and he has this in the body: "Does it make sense, in the context of theoretical computer science...".
AnoE
@AnoE From experience, O notation is used when n<10 all the time! Not that it's a good idea... but it's totally something that's done!
Cort Ammon - Reinstate Monica
5

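Strictly speaking, a new algorithm is faster if its time complexity g(n) is in o(f(n)), where g is the time complexity of the new algorithm and f the time complexity of the old.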

A sorting algorithm may take Θ(n^2) time in the worst case, where competitors take Θ(n log n) time, but still be widely used in practice because of its good average-case running time. It can additionally be tweaked to run very quickly in the cases that are most frequent in the wild, such as arrays that are mostly in the right order.

And sometimes, even theoretical computer scientists use “faster” the same way normal people do. For example, most implementations of String classes have Short String Optimization (also called Small String Optimization), even though it only speeds things up for short strings and is pure overhead for longer ones. As the input size gets larger and larger, the running time of a String operation with SSO is going to be higher by a small constant term, so by the definition I gave in the first paragraph, removing SSO from a String class makes it “faster.” In practice, though, most strings are small, so SSO makes most programs that use them faster, and most computer-science professors know better than to go around demanding that people only talk about orders of asymptotic time complexity.

Davislor
1

There is not one unified definition of what a "faster algorithm" is. There is not a governing body which decides whether an algorithm is faster than another.

To point out why this is, I'd like to offer up two different scenarios which demonstrate this murky concept.

The first example is an algorithm which searches a linked list of unordered data. If I can do the same operation with an array, I have no change on the big Oh measure of performance. Both searches are O(n). If I just look at the big Oh values, I might say that I made no improvement at all. However, it is known that array lookups are faster than walking a linked list in the majority of cases, so one may decide that that made an algorithm "faster," even though the big Oh did not change.

If I may use the traditional example of programming a robot to make a PBJ sandwich, I can show what I mean another way. Consider just the point where one is opening the jar of peanut butter.

Pick up the jar
Grab the lid
Unscrew the lid

Versus

Pick up the jar
Put the jar back down
Pick up the jar
Put the jar back down
Pick up the jar
Put the jar back down
Pick up the jar
Put the jar back down
Pick up the jar
Put the jar back down
Pick up the jar
Grab the lid
Unscrew the lid

Even in the most academic theoretical setting I can think of, you'll find that people accept that the first algorithm is faster than the second, even though the big Oh notation results are the same.

By contrast, we can consider an algorithm to break RSA encryption. At the moment, it is perceived that this process is probably O(2^n), where n is the number of bits. Consider a new algorithm which runs n^100 times faster. This means my new process runs in O(2^n/n^100). However, in the world of cryptography, a polynomial speedup to an exponential algorithm is traditionally not thought of as a theoretical speedup at all. When doing security proofs, it's assumed that an attacker may discover one of these speedups, and that it will have no effect.

So in one circumstance, we can change an O(n) to O(n), and call it faster. In a different circumstance, we can change an O(2^n) to O(2^n/n^100), and claim there was no meaningful speedup at all. This is why I say there is no one unified definition for a "faster algorithm." It is always contextually dependent.
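One way to see why the polynomial factor is treated as immaterial is to look at the exponent that survives the speedup. The sketch below takes the answer's 2^n and n^100 at face value; the function name and the bit lengths 2048, 4096, and 8192 are assumptions chosen for illustration, not figures from the answer.

```python
import math


def log2_work_with_speedup(bits: int, degree: int = 100) -> float:
    """Return log2 of 2^bits / bits^degree, the work left after the speedup."""
    return bits - degree * math.log2(bits)


for bits in (2048, 4096, 8192):
    # The speedup removes degree * log2(bits) from the exponent,
    # a term that grows only logarithmically in the bit length.
    print(bits, log2_work_with_speedup(bits))
# 2048 948.0
# 4096 2896.0
# 8192 6892.0
```

With these numbers, doubling the bit length from 2048 to 4096 leaves an exponent of 2896, more than the 2048 the attacker faced before the speedup, so the defender absorbs the improvement at modest cost.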

Cort Ammon - Reinstate Monica
1

I can't comment yet, but I feel like the current answers, while correct and informative, do not address part of this question. First, let us write an expression equivalent to $A(n) \in O(f(n))$.

$$\exists\ 0 \le c_f < \infty \ \text{ such that } \ \limsup_{n \to \infty} \frac{A(n)}{f(n)} = c_f$$

Now, let us assume we are talking about an arbitrarily increasing function $g(n)$ where $\limsup_{n \to \infty} g(n) = \infty$, and let us create the function $h(n) = \frac{f(n)}{g(n)}$.

We are given that the run-time of the "improved" algorithm $A'(n)$ is in $O(h(n))$. Suppose that the run-time of the original algorithm $A(n)$ is also in $O(h(n))$. This can be written as follows.

$$\exists\ 0 \le c_h < \infty \ \text{ such that } \ \limsup_{n \to \infty} \frac{A(n)}{h(n)} = c_h$$

Using the rules of limits, we can also write:

$$c_h = \limsup_{n \to \infty} \frac{A(n)}{h(n)} = \limsup_{n \to \infty} \frac{A(n)\, g(n)}{f(n)} = c_f \cdot \limsup_{n \to \infty} g(n)$$

Since $c_h < \infty$, this can only be true if $c_f = 0$.

The contrapositive statement is: If $c_f \neq 0$, then $A(n) \notin O(h(n))$.

In words, $A'(n)$ is an "improvement" on $A(n)$ under the additional conditions that $A(n) \in \Theta(f(n))$ and $g(n)$ is arbitrarily increasing.

Additionally, this should show why the statement that $A(n) \in O(f(n))$ is not strong enough to draw a conclusion about whether $A'(n)$ is an "improvement." In short, $A(n)$ could already be in $O(h(n))$.
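As a concrete instance of that last remark, here is a worked example. The particular functions (original running time A(n) = n, with f(n) = n^2 and g(n) = n) are assumptions picked for illustration and do not come from the answer itself.

```latex
\begin{align*}
A(n) &= n, \qquad f(n) = n^2, \qquad g(n) = n, \qquad h(n) = \frac{f(n)}{g(n)} = n, \\
c_f &= \limsup_{n \to \infty} \frac{A(n)}{f(n)} = \limsup_{n \to \infty} \frac{1}{n} = 0, \\
c_h &= \limsup_{n \to \infty} \frac{A(n)}{h(n)} = \limsup_{n \to \infty} \frac{n}{n} = 1 < \infty .
\end{align*}
```

Here the original running time is in O(f(n)), but the bound is not tight (c_f = 0), and the original algorithm is already in O(h(n)). A second algorithm that is merely known to run in O(h(n)) therefore need not be any faster.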

Jared Goguen
1
Your limit should be limit superior.
Yuval Filmus
1
@YuvalFilmus Updated
Jared Goguen