I am taking Stanford's Machine Learning course on Coursera.
In the chapter on logistic regression, the cost function is:
$$J(\theta) = -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\log\bigl(h_\theta(x^{(i)})\bigr) + \bigl(1-y^{(i)}\bigr)\log\bigl(1-h_\theta(x^{(i)})\bigr)\right]$$
I tried to obtain the derivative of the cost function, but I got something completely different.
How is the derivative obtained?
What are the intermediate steps?
regression
logistic
gradient-descent
derivative
octaviano
Answers:
Adapted from the course notes, which I don't see available (including this derivation) outside of the student-contributed notes on the page of Andrew Ng's Coursera Machine Learning course.
In what follows, the superscript $(i)$ denotes individual measurements or training "examples".
The derivative of the sigmoid function is
$$\frac{d}{dz}\sigma(z) = \sigma(z)\bigl(1-\sigma(z)\bigr)$$
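For reference, the derivation can be sketched as follows, assuming the hypothesis $h_\theta(x^{(i)}) = \sigma(\theta^{T}x^{(i)})$ and the cost $J(\theta)$ from the question:
$$
\begin{aligned}
\frac{\partial J(\theta)}{\partial \theta_j}
&= -\frac{1}{m}\sum_{i=1}^{m}\left[\frac{y^{(i)}}{h_\theta(x^{(i)})} - \frac{1-y^{(i)}}{1-h_\theta(x^{(i)})}\right]\frac{\partial h_\theta(x^{(i)})}{\partial \theta_j} \\
&= -\frac{1}{m}\sum_{i=1}^{m}\left[\frac{y^{(i)}}{h_\theta(x^{(i)})} - \frac{1-y^{(i)}}{1-h_\theta(x^{(i)})}\right] h_\theta(x^{(i)})\bigl(1-h_\theta(x^{(i)})\bigr)\,x_j^{(i)} \\
&= -\frac{1}{m}\sum_{i=1}^{m}\left[y^{(i)}\bigl(1-h_\theta(x^{(i)})\bigr) - \bigl(1-y^{(i)}\bigr)h_\theta(x^{(i)})\right] x_j^{(i)} \\
&= \frac{1}{m}\sum_{i=1}^{m}\bigl(h_\theta(x^{(i)}) - y^{(i)}\bigr)\,x_j^{(i)},
\end{aligned}
$$
where the second line uses the sigmoid derivative above together with the chain rule, $\frac{\partial h_\theta(x^{(i)})}{\partial \theta_j} = h_\theta(x^{(i)})\bigl(1-h_\theta(x^{(i)})\bigr)\,x_j^{(i)}$.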
To avoid the impression of excessive complexity of the matter, let us just see the structure of the solution.
With some simplification and abuse of notation, let $G(\theta)$ be one term in the sum of $J(\theta)$, and let $h = 1/(1+e^{-z})$ be a function of $z(\theta) = x\theta$:
$$G = y\,\log(h) + (1-y)\log(1-h)$$
We may use the chain rule,
$$\frac{dG}{d\theta} = \frac{dG}{dh}\,\frac{dh}{dz}\,\frac{dz}{d\theta},$$
and solve it one factor at a time ($x$ and $y$ are constants):
$$\frac{dG}{dh} = \frac{y}{h} - \frac{1-y}{1-h} = \frac{y-h}{h(1-h)},$$
and for the sigmoid $\frac{dh}{dz} = h(1-h)$ holds, which cancels the denominator above.
Finally, $\frac{dz}{d\theta} = x$.
Combining all the results gives the sought-for expression:
$$\frac{dG}{d\theta} = (y-h)\,x.$$
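As a quick sanity check of this result, here is a small Python sketch (the names `G`, `dG_analytic`, and the test values are illustrative, not from the course) that compares the analytic derivative with a central finite-difference approximation:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def G(theta, x, y):
    # Single-example term of the log-likelihood:
    # G = y*log(h) + (1-y)*log(1-h), with h = sigmoid(x*theta)
    h = sigmoid(x * theta)
    return y * np.log(h) + (1 - y) * np.log(1 - h)

def dG_analytic(theta, x, y):
    # Result derived above: dG/dtheta = (y - h) * x
    h = sigmoid(x * theta)
    return (y - h) * x

# Compare analytic and numerical derivatives at an arbitrary point
theta, x, y = 0.3, 2.0, 1.0
eps = 1e-6
numeric = (G(theta + eps, x, y) - G(theta - eps, x, y)) / (2 * eps)
print(numeric, dG_analytic(theta, x, y))  # the two values should agree closely
```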
The credit for this answer goes to Antoni Parellada from the comments, which I think deserves a more prominent place on this page (as it helped me out when many other answers did not). Also, this is not a full derivation but more of a clear statement of $\frac{\partial J(\theta)}{\partial \theta}$ (for the full derivation, see the other answers):
$$\frac{\partial J(\theta)}{\partial \theta} = \frac{1}{m}\,X^{T}\bigl(\sigma(X\theta) - y\bigr),$$
where $X$ is the $m \times n$ matrix of training examples, $y$ is the vector of $m$ labels, and $\sigma(z) = 1/(1+e^{-z})$ is applied element-wise.
Also, a Python implementation follows, for those wanting to calculate the gradient of $J$ with respect to $\theta$.
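A minimal NumPy sketch of such a gradient function (the function and variable names here are illustrative, not taken from the course materials):

```python
import numpy as np

def sigmoid(z):
    """Element-wise logistic function."""
    return 1.0 / (1.0 + np.exp(-z))

def gradient(theta, X, y):
    """
    Gradient of the logistic-regression cost J(theta) with respect to theta:
    (1/m) * X^T (sigmoid(X theta) - y).

    X     : (m, n) matrix of training examples
    y     : (m,) vector of 0/1 labels
    theta : (n,) parameter vector
    """
    m = X.shape[0]
    h = sigmoid(X @ theta)
    return (X.T @ (h - y)) / m

# Example usage with random data
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3))
y = (rng.random(100) < 0.5).astype(float)
theta = np.zeros(3)
print(gradient(theta, X, y))
```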
For those of us who are not so strong at calculus, but who would like to play around with adjusting the cost function and need a way to calculate derivatives, a shortcut to re-learning calculus is this online tool, which automatically computes the derivative with step-by-step explanations of each rule applied:
https://www.derivative-calculator.net