I am taking the Stanford Machine Learning course on Coursera.
In the chapter on logistic regression, the cost function is this:
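For reference, this is the logistic regression cost function as it is usually stated for this course. It is an assumption that this is the exact formula meant here. $m$ is the number of training examples and $h_\theta(x) = 1/(1+e^{-\theta^\top x})$ is the hypothesis.

$$
J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[\, y^{(i)}\log h_\theta\!\left(x^{(i)}\right) + \left(1-y^{(i)}\right)\log\!\left(1-h_\theta\!\left(x^{(i)}\right)\right) \right]
$$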
I tried to derive the derivative of the cost function myself, but I got something completely different.
How is the derivative obtained?
What are the intermediate steps?
Adapted from the course notes, which I don't see available (including this derivation) outside the notes contributed by students on the page of Andrew Ng's Coursera Machine Learning course.
In what follows, the superscript $(i)$ denotes individual measurements, or training "examples".
The derivative of the sigmoid function $\sigma(z) = 1/(1+e^{-z})$ is $\sigma'(z) = \sigma(z)\,(1-\sigma(z))$.
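A quick numerical check of this derivative, comparing the closed form $\sigma(z)(1-\sigma(z))$ against a central finite difference. This is only a sketch: the helper names (`sigmoid_prime`, `central_difference`) and the sample points are illustrative choices, not part of the course material.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic function 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_prime(z: float) -> float:
    """Closed-form derivative: sigma(z) * (1 - sigma(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)


def central_difference(f, z: float, eps: float = 1e-6) -> float:
    """Numerical derivative (f(z + eps) - f(z - eps)) / (2 * eps)."""
    return (f(z + eps) - f(z - eps)) / (2.0 * eps)


# Compare the two at a few arbitrary points.
for z in (-2.0, 0.0, 0.5, 3.0):
    analytic = sigmoid_prime(z)
    numeric = central_difference(sigmoid, z)
    print(f"z={z:+.1f}  analytic={analytic:.6f}  numeric={numeric:.6f}")
```

The two columns should agree to several decimal places. A mismatch would point to an error in the closed form.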
To avoid giving the impression that the matter is excessively complex, let us just look at the structure of the solution.
With some simplification and abuse of notation, let $G(\theta)$ be a term in the sum of $J(\theta)$, and let $h = 1/(1+e^{-z})$ be a function of $z(\theta) = x\theta$:
We may use the chain rule, $\frac{dG}{d\theta} = \frac{dG}{dh}\,\frac{dh}{dz}\,\frac{dz}{d\theta}$, and solve for the factors one by one ($x$ and $y$ are constants).
Finally, $\frac{dz}{d\theta} = x$.
Combining all the results gives the sought-for expression:
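Written out, the chain-rule steps look as follows. This is a sketch under two assumptions: $G$ is the standard summand of the logistic cost, $G = y\log h + (1-y)\log(1-h)$, and $J(\theta) = -\frac{1}{m}\sum_i G^{(i)}$, so the sum and the factor $-\frac{1}{m}$ are left out until the end.

$$
\begin{aligned}
G &= y\log h + (1-y)\log(1-h),\\
\frac{dG}{dh} &= \frac{y}{h} - \frac{1-y}{1-h} = \frac{y-h}{h(1-h)},\\
\frac{dh}{dz} &= h(1-h),\\
\frac{dz}{d\theta} &= x,\\
\frac{dG}{d\theta} &= \frac{y-h}{h(1-h)}\cdot h(1-h)\cdot x = (y-h)\,x.
\end{aligned}
$$

Restoring the sum and the factor $-\frac{1}{m}$ flips the sign, giving $\frac{\partial J(\theta)}{\partial \theta_j} = \frac{1}{m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)x_j^{(i)}$.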
The credit for this answer goes to Antoni Parellada from the comments; I think it deserves a more prominent place on this page (it helped me out when many other answers did not). Also, this is not a full derivation but rather a clear statement of $\frac{\partial J(\theta)}{\partial \theta}$. (For the full derivation, see the other answers.)
Also, a Python implementation for those wanting to calculate the gradient of $J$ with respect to $\theta$.
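A minimal sketch of such an implementation, assuming the usual cost $J(\theta) = -\frac{1}{m}\sum_i \left[y^{(i)}\log h^{(i)} + (1-y^{(i)})\log(1-h^{(i)})\right]$ and gradient $\frac{\partial J}{\partial \theta_j} = \frac{1}{m}\sum_i (h^{(i)} - y^{(i)})\,x_j^{(i)}$ with $h^{(i)} = \sigma(\theta^\top x^{(i)})$. It uses only the standard library so that it runs as-is. The function names (`cost`, `gradient`, `numerical_gradient`) and the toy data are illustrative, not taken from the course.

```python
import math
from typing import List, Sequence


def sigmoid(z: float) -> float:
    """Logistic function; fine for moderate z (no overflow guard here)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(theta: Sequence[float], x: Sequence[float]) -> float:
    """Hypothesis h = sigmoid(theta . x) for one example."""
    return sigmoid(sum(t * xj for t, xj in zip(theta, x)))


def cost(theta, X, y) -> float:
    """Assumed cost: J = -(1/m) * sum(y*log(h) + (1-y)*log(1-h))."""
    total = 0.0
    for xi, yi in zip(X, y):
        h = predict(theta, xi)
        total += yi * math.log(h) + (1 - yi) * math.log(1.0 - h)
    return -total / len(y)


def gradient(theta, X, y) -> List[float]:
    """Analytic gradient: dJ/dtheta_j = (1/m) * sum((h - y) * x_j)."""
    grad = [0.0] * len(theta)
    for xi, yi in zip(X, y):
        error = predict(theta, xi) - yi
        for j, xij in enumerate(xi):
            grad[j] += error * xij
    return [g / len(y) for g in grad]


def numerical_gradient(theta, X, y, eps: float = 1e-6) -> List[float]:
    """Central-difference gradient of cost(), used only as a sanity check."""
    grad = []
    for j in range(len(theta)):
        up, down = list(theta), list(theta)
        up[j] += eps
        down[j] -= eps
        grad.append((cost(up, X, y) - cost(down, X, y)) / (2.0 * eps))
    return grad


# Toy data (made up): the first column is the intercept term.
X = [[1.0, 0.5], [1.0, -1.5], [1.0, 2.0], [1.0, 0.0]]
y = [1, 0, 1, 0]
theta = [0.1, -0.2]

print("analytic :", gradient(theta, X, y))
print("numerical:", numerical_gradient(theta, X, y))
```

The analytic and numerical gradients should match closely. This is also a handy check after adjusting the cost function. With NumPy the two loops in `gradient` would typically collapse into a single matrix product, at the cost of an extra dependency.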
For those of us who are not so strong at calculus but would like to play around with adjusting the cost function and need a way to calculate derivatives: a shortcut to re-learning calculus is this online tool, which automatically provides the derivation with step-by-step explanations of each rule applied.