Xinran He et al. - “Practical Lessons from Predicting Clicks on Ads at Facebook” paper
Interestingly, 5 of the 11 authors had already left Facebook by the time of writing.
They use Normalized Cross Entropy (NE) and calibration as the major evaluation metrics.
Normalized entropy is the predictive log loss normalized by the entropy of the background CTR, i.e. it measures what you gain over simply predicting the empirical CTR of the training set.
$$
NE = \frac{-\frac{1}{n}\sum_{i=1}^{n}\left(\frac{1+y_i}{2}\log(p_i)+\frac{1-y_i}{2}\log(1-p_i)\right)}{-\left(p\log(p) + (1-p)\log(1-p)\right)}
$$
Where $p_i$ is the estimated probability of click for a particular ad, $p$ is the average empirical CTR, and $y_i \in \{-1,+1\}$. NE has the benefit of reflecting both the goodness of the predictions and, implicitly, the calibration. In the numerator, when $p_i = 1, y_i = 1$ the per-impression loss is $\frac{2}{2}\log(1) + \frac{0}{2}\log(0) = 0$ (using the convention $0 \cdot \log(0) = 0$); similarly, when $p_i = 0, y_i = -1$ the loss is $\frac{0}{2}\log(0) + \frac{2}{2}\log(1) = 0$.
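To make the formula concrete, here is a minimal NumPy sketch of NE plus the calibration ratio; the function names and the `eps` clipping are my own additions, not from the paper:

```python
import numpy as np

def normalized_entropy(y, p):
    """Normalized Cross Entropy (NE): average log loss of the predictions
    divided by the log loss of always predicting the background CTR.
    Lower is better; NE < 1 means the model beats the empirical-CTR baseline.

    y : labels in {-1, +1}
    p : predicted click probabilities
    """
    y = np.asarray(y, dtype=float)
    p = np.asarray(p, dtype=float)
    eps = 1e-15                      # guard against log(0)
    p = np.clip(p, eps, 1 - eps)

    # numerator: average per-impression log loss of the model
    log_loss = -np.mean((1 + y) / 2 * np.log(p) + (1 - y) / 2 * np.log(1 - p))

    # denominator: entropy of the background (empirical) CTR
    ctr = np.clip(np.mean((1 + y) / 2), eps, 1 - eps)
    background = -(ctr * np.log(ctr) + (1 - ctr) * np.log(1 - ctr))

    return log_loss / background

def calibration(y, p):
    """Ratio of average predicted CTR to empirical CTR; ideal value is 1."""
    y = np.asarray(y, dtype=float)
    return np.mean(p) / np.mean((1 + y) / 2)
```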