sklearn,numpy里有没有计算ndcg的函数?

  统计/机器学习 推荐系统 模型验证 Python    浏览次数:164        分享
0

sklearn,numpy里有没有计算ndcg的函数?

 

古力夬   2019-10-24 12:25



   1个回答 
2

sklearn里并没有,需要自己实现的,kaggle上有人写过,链接在这里

代码在这里

import numpy as np
from sklearn.preprocessing import LabelBinarizer
from sklearn.metrics import make_scorer


def dcg_score(y_true, y_score, k=5):
    """Discounted cumulative gain (DCG) at rank K.

    Parameters
    ----------
    y_true : array, shape = [n_samples]
        Ground truth (true relevance labels).
    y_score : array, shape = [n_samples, n_classes]
        Predicted scores.
    k : int
        Rank.

    Returns
    -------
    score : float
    """
    order = np.argsort(y_score)[::-1]
    y_true = np.take(y_true, order[:k])

    gain = 2 ** y_true - 1

    discounts = np.log2(np.arange(len(y_true)) + 2)
    return np.sum(gain / discounts)


def ndcg_score(ground_truth, predictions, k=5):
    """Normalized discounted cumulative gain (NDCG) at rank K.

    Normalized Discounted Cumulative Gain (NDCG) measures the performance of a
    recommendation system based on the graded relevance of the recommended
    entities. It varies from 0.0 to 1.0, with 1.0 representing the ideal
    ranking of the entities.

    Parameters
    ----------
    ground_truth : array, shape = [n_samples]
        Ground truth (true labels represended as integers).
    predictions : array, shape = [n_samples, n_classes]
        Predicted probabilities.
    k : int
        Rank.

    Returns
    -------
    score : float

    Example
    -------
    >>> ground_truth = [1, 0, 2]
    >>> predictions = [[0.15, 0.55, 0.2], [0.7, 0.2, 0.1], [0.06, 0.04, 0.9]]
    >>> score = ndcg_score(ground_truth, predictions, k=2)
    1.0
    >>> predictions = [[0.9, 0.5, 0.8], [0.7, 0.2, 0.1], [0.06, 0.04, 0.9]]
    >>> score = ndcg_score(ground_truth, predictions, k=2)
    0.6666666666
    """
    lb = LabelBinarizer()
    lb.fit(range(len(predictions) + 1))
    T = lb.transform(ground_truth)

    scores = []

    # Iterate over each y_true and compute the DCG score
    for y_true, y_score in zip(T, predictions):
        actual = dcg_score(y_true, y_score, k)
        best = dcg_score(y_true, y_true, k)
        score = float(actual) / float(best)
        scores.append(score)

    return np.mean(scores)


# NDCG Scorer function
ndcg_scorer = make_scorer(ndcg_score, needs_proba=True, k=5)
SofaSofa数据科学社区DS面试题库 DS面经

u_u   2019-10-26 16:37



  相关主题

sklearn算法里输入的数据集是要求pandas.DataFrame还是numpy.array呢   2回答

如何往numpy array中插入一列数组?   4回答

如何在numpy array尾部增加一行   2回答

怎么用numpy生成一个三维的tensor?   1回答

'numpy.ndarray' object has no attribute 'head'   1回答

怎么对2维的numpy array取整?   2回答

怎么随机打乱一个numpy array?   3回答

怎么把torch.tensor转成numpy ndarray?   2回答

numpy里的np.clip函数怎么用?   1回答

numpy array怎么保存成文件?   1回答

python numpy里array和matrix的区别?   1回答

调用np.average()时报错TypeError: No loop matching   1回答



回答问题时需要注意什么?

我们谢绝在回答前讲“生动”的故事。

我们谢绝“这么简单,你自己想”、“书上有的,你认真看”这类的回答;如果你认为对方的提问方式或者内容不妥,你可以直接忽略该问题,不用进行任何作答,甚至可以对该问题投反对票。

我们谢绝答非所问。

我们谢绝自己不会、硬要回答。

我们感激每一个用户在编写答案时的努力与付出!