CP Vanity Lite

Nish Nishant

4.99/5 (29 votes)

18 Nov 2010

CPOL

4 min read

This is a lightweight version of Luc Pattyn's well-known CP Vanity application.

Figure 1: CP Vanity Lite - main dialog

Figure 2: The exported CSV file opened in Excel

Introduction

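This is a trimmed-down version of Luc Pattyn's popular CP Vanity application. Because of the inevitable site changes, Luc got tired of having to modify the parsing code so often, and his application has not been updated since the last major site update (a few weeks ago). That prompted me to write this app, which is a mini-version of his. Unlike CP Vanity, CP Vanity Lite does not delve into details; it only fetches the total reputation score for the members who are most likely to have a high one. I take the same approach Luc did: get the top members by message count and by article count, then get the reputation score for each of them. I fetch the top 125 in each category, but since some members appear in both groups, the total number of members whose scores are fetched will be less than 250. In addition to the total score, the app shows a daily average score, calculated from the total score and the number of days the member has been registered.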
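The daily average described above is simple arithmetic, and a minimal sketch may make it concrete. The class and method names here (`DailyAverageSketch`, `Compute`) are hypothetical and not part of the application; the extra guard against a non-positive day count is also my assumption, not something the article states.

```csharp
using System;

public static class DailyAverageSketch
{
    // Total score divided by the number of days since registration,
    // rounded to two decimals. Returns 0 when the registration date is
    // unknown (DateTime.MinValue) or is not in the past.
    public static double Compute(double totalScore, DateTime memberSince, DateTime now)
    {
        if (memberSince == DateTime.MinValue)
        {
            return 0.0;
        }

        double days = (now - memberSince).TotalDays;
        if (days <= 0)
        {
            return 0.0; // avoids dividing by zero for a brand-new member
        }

        return Math.Round(totalScore / days, 2);
    }
}
```

For example, a member with 36,500 points who registered 3,650 days ago would get a daily average of 10.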

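Warning: The application fetches and parses HTML from hard-coded URLs. If the URLs or the HTML content change, the parsing code will break. Until CP provides a web service that returns this information, the premise this app is built on will remain fragile. The same goes for similar applications written by me, Luc Pattyn, and others.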

Using CP Vanity Lite

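Using the app is not difficult: run it and wait for the UI to populate. The title bar reads "fetching" while the HTTP requests are in progress and changes to show completion once all the data is available. The UI is a simple ListView that can be sorted on the Reputation Score column. Each click on the header toggles between ascending and descending order, which is handy if you get tired of seeing CG's name at the top! The data is populated asynchronously by a worker thread, so the app stays responsive for the whole fetch. You can sort before all the data has arrived, but keep in mind that newly added rows will not be sorted, although you can re-sort as often as you like.

Once all the data has been fetched, the Export to CSV button is enabled and you can save a CSV file, which you can then open in an application such as Excel for all kinds of fancy data manipulation and charting. Take a look at Figure 2 and you will see how far ahead of the competition CG is: he has more reputation points than John and Pete put together! If you run the app regularly and save the CSV files, you can do some nice date-based analysis to see how scores change and how fast someone is catching up (if you are the statistically curious type).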
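The article does not show the export code, so the following is only a sketch of how rows like these might be written as CSV that Excel can open. `MemberRow`, `CsvExportSketch`, and the column headers are hypothetical names chosen for illustration. The one detail worth getting right is quoting: a display name that contains a comma or a double quote has to be wrapped in quotes, with embedded quotes doubled.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

// Hypothetical row type; the real application's type is not shown in the article.
public sealed class MemberRow
{
    public int Id { get; set; }
    public string DisplayName { get; set; }
    public int ReputationScore { get; set; }
    public double DailyAverage { get; set; }
}

public static class CsvExportSketch
{
    // Wraps a field in quotes only when it contains a comma, quote, or line break.
    static string Quote(string field)
    {
        if (field == null)
        {
            return "";
        }

        bool needsQuotes = field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        return needsQuotes ? "\"" + field.Replace("\"", "\"\"") + "\"" : field;
    }

    public static string ToCsv(IEnumerable<MemberRow> rows)
    {
        var sb = new StringBuilder();
        sb.Append("Id,DisplayName,ReputationScore,DailyAverage\r\n");

        foreach (var row in rows)
        {
            // Invariant culture keeps the decimal separator a '.' on every machine.
            sb.Append(row.Id.ToString(CultureInfo.InvariantCulture)).Append(',')
              .Append(Quote(row.DisplayName)).Append(',')
              .Append(row.ReputationScore.ToString(CultureInfo.InvariantCulture)).Append(',')
              .Append(row.DailyAverage.ToString("0.00", CultureInfo.InvariantCulture))
              .Append("\r\n");
        }

        return sb.ToString();
    }
}
```

The resulting string could be saved with `System.IO.File.WriteAllText`. Saving one file per run, with the date in the file name, would give you the raw material for the date-based comparisons mentioned above.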

The Code

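The HTML scraping is done in a class called RepScoreScraper. Its code blocks, so it is up to the calling code to use it from a worker thread or through some async/task pattern that does not block the main thread. The IDs of potential high scorers are obtained by querying the "Who's Who" pages for the top message posters and the top article authors. Regular expressions are used to extract the information. (I have wrapped the code blocks to prevent scrolling; the actual source code is not wrapped like this.)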

private Regex memberNumberRegex = new Regex("Member No. (\\d*)");
private Regex repScoreRegex = new Regex(
  "<span id=\"ctl00_.*?_TotalRep\" class=\"large-text\">([\\s\\S]*?)</span>");
private Regex displayNameRegex = new Regex(
  "<h2 id=\"ctl00_.*?_P_Name\">([\\s\\S]*?)</h2>");
private Regex memberSinceRegex = new Regex("Member since (.+)\n");
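To see what these patterns capture, here is a small sketch that runs them against a made-up HTML fragment. The fragment is an assumption: the real page markup is not reproduced in this article, and the `MC` part of the element IDs is invented. The tag-stripping step stands in for the app's `StripOffHtml` helper, whose implementation is not shown here.

```csharp
using System;
using System.Globalization;
using System.Text.RegularExpressions;

public static class RegexSketch
{
    // The same four patterns as above, written as verbatim strings where that helps.
    static readonly Regex MemberNumber = new Regex(@"Member No. (\d*)");
    static readonly Regex RepScore = new Regex(
        @"<span id=""ctl00_.*?_TotalRep"" class=""large-text"">([\s\S]*?)</span>");
    static readonly Regex DisplayName = new Regex(
        @"<h2 id=""ctl00_.*?_P_Name"">([\s\S]*?)</h2>");
    static readonly Regex MemberSince = new Regex("Member since (.+)\n");

    // Invented sample markup, for illustration only.
    public const string SampleHtml =
        "<h2 id=\"ctl00_MC_P_Name\"> <a href=\"#\">Jane Doe</a> </h2>\n" +
        "<span id=\"ctl00_MC_TotalRep\" class=\"large-text\">12,345</span>\n" +
        "Member No. 42\n" +
        "Member since Monday, January 3, 2005\n";

    // Returns "id|name|score|memberSince" for the first member found in the HTML.
    public static string ExtractFirst(string html)
    {
        string id = MemberNumber.Match(html).Groups[1].Value;

        // Assumed stand-in for StripOffHtml: drop any tags inside the heading.
        string rawName = DisplayName.Match(html).Groups[1].Value;
        string name = Regex.Replace(rawName, "<[^>]+>", "").Trim();

        // The score contains a thousands separator, so allow it explicitly.
        double score = double.Parse(
            RepScore.Match(html).Groups[1].Value,
            NumberStyles.Float | NumberStyles.AllowThousands,
            CultureInfo.InvariantCulture);

        string since = MemberSince.Match(html).Groups[1].Value;

        return id + "|" + name + "|" + ((int)score) + "|" + since;
    }
}
```

Note that capture group 1, not the whole match, holds the value of interest in each case; the whole match for the last pattern would include the "Member since " prefix and the trailing newline.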

Please note that I have snipped the URL in the code snippet here to prevent horizontal scrolling.

public void StartScraping()
{
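    // Sort orders for the two "Who's Who" listings: top article authors and top message posters.
    string[] ml_obs = new[] { "ArticleCount", "MessageCount" };
    // IDs already reported, so a member who appears in both listings is raised only once.
    HashSet<int> ids = new HashSet<int>();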

    for (int j = 0; j < ml_obs.Length; j++)
    {
        for (int page = 1; page <= 5; page++)
        {
            string url = String.Format(
                "**SNIPPED**?ml_ob={0}&mgtid=-1&mgm=False&pgnum={1}", 
                ml_obs[j], page);
            string html = GetHttpPage(url, timeOut);

            var memberNumberMatches = memberNumberRegex.Matches(html);
            var repScoreMatches = repScoreRegex.Matches(html);
            var displayNameMatches = displayNameRegex.Matches(html);
            var memberSinceMatches = memberSinceRegex.Matches(html);

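            // Only process the page if every pattern matched the same number of times;
            // otherwise entry i of one collection may not belong to the same member as entry i of another.
            if (memberNumberMatches.Count == repScoreMatches.Count 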
              && memberNumberMatches.Count == displayNameMatches.Count 
              && memberNumberMatches.Count == memberSinceMatches.Count)
            {
                for (int i = 0; i < memberNumberMatches.Count; i++)
                {
                    int id = -1;
                    double score = -1.0;
                    double scorePerDay = 0.0;

                    DateTime memberSince = ParseDateTime(memberSinceMatches[i].Value);

                    if (memberNumberMatches[i].Groups.Count == 2 && 
                      Int32.TryParse(memberNumberMatches[i].Groups[1].Value, out id) &&
                      repScoreMatches[i].Groups.Count == 2 && 
                      Double.TryParse(repScoreMatches[i].Groups[1].Value, out score) &&
                      displayNameMatches[i].Groups.Count == 2)
                    {
                        if (ids.Add(id)) // Add returns false if this ID was already seen
                        {
                            if (memberSince != DateTime.MinValue)
                            {
                                scorePerDay = Math.Round(
                                  score / (DateTime.Now - memberSince).TotalDays, 2);                                
                            }

                            var handler = MemberInfoScraped;
                            if (handler != null)
                            {
                                handler(this, new RepScoreScraperEventArgs() 
                                { 
                                    Id = id, 
                                    DisplayName = StripOffHtml(
                                      displayNameMatches[i].Groups[1].Value.Trim()), 
                                    ReputationScore = (int)score,
                                    DailyAverage = scorePerDay
                                });
                            }
                        }
                    }
                }
            }
        }
    }

    var finishedHandler = ScrapeFinished;
    if (finishedHandler != null)
    {
        finishedHandler(this, EventArgs.Empty);
    }
}
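Because StartScraping blocks and reports its progress through the MemberInfoScraped and ScrapeFinished events, the caller has to move it off the UI thread. The sketch below shows one way that might look with `Task.Run`. `FakeScraper` and `MemberEventArgs` are hypothetical stand-ins that raise the same two events with canned data and no HTTP, so the example runs on its own; they are not the application's classes.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical event args, reduced to two fields for the sketch.
public class MemberEventArgs : EventArgs
{
    public int Id { get; set; }
    public string DisplayName { get; set; }
}

// Hypothetical stand-in for RepScoreScraper: same event names, canned data.
public class FakeScraper
{
    public event EventHandler<MemberEventArgs> MemberInfoScraped;
    public event EventHandler ScrapeFinished;

    // Blocks until every member has been reported, like the real method.
    public void StartScraping()
    {
        string[] names = { "Alice", "Bob" };
        for (int i = 0; i < names.Length; i++)
        {
            var handler = MemberInfoScraped;
            if (handler != null)
            {
                handler(this, new MemberEventArgs { Id = i + 1, DisplayName = names[i] });
            }
        }

        var finished = ScrapeFinished;
        if (finished != null)
        {
            finished(this, EventArgs.Empty);
        }
    }
}

public static class WorkerSketch
{
    // Runs the blocking scrape on a thread-pool thread and collects rows as they arrive.
    public static Task<ConcurrentQueue<string>> RunAsync(FakeScraper scraper)
    {
        var rows = new ConcurrentQueue<string>();

        // Handlers run on the worker thread, so they only touch a thread-safe queue here.
        scraper.MemberInfoScraped += (sender, e) => rows.Enqueue(e.Id + ":" + e.DisplayName);

        return Task.Run(() =>
        {
            scraper.StartScraping();
            return rows;
        });
    }
}
```

In a Windows Forms application the handler would typically marshal each row back to the UI thread (for example with `Control.BeginInvoke`) before touching the ListView, since the events are raised on the worker thread.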

That's it. If you run into any problems, please report them in the forum below and I will do my best to fix them. Thank you.

Acknowledgements

RaviRanjankr - for beta testing the application and providing useful feedback, which prompted me to improve and simplify the fetching code.

History

November 17, 2010: First published
November 19, 2010
- Added the Rank and Daily Average columns.
- You can now refresh the scores.