实现 TakeLast、TakeLastWhile、SkipLast 和 SkipLastWhile LINQ 运算符

Paulo Morgado

4.12/5 (9投票s)

2010 年 12 月 12 日

CPOL

5分钟阅读

29446

前段时间，我需要检索满足某些条件的序列的最后几项，查看 Enumerable 类中可用的运算符时，我注意到没有这样的运算符。

引言

免费计数器前段时间，我需要检索满足某些条件的序列的最后几项，查看 Enumerable 类中可用的运算符时，我注意到没有这样的运算符。

实现此目的的唯一方法是反转序列，获取满足条件的项，然后反转结果序列。像这样：

sequence.Reverse().TakeWhile(criteria).Reverse();

看起来很简单，对吧？首先，我们调用 Reverse 方法来生成一个新序列，其中包含与原始序列相同的项，但顺序相反，然后我们调用 TakeWhile 方法来获取满足条件的第一个项，然后再次调用 Reverse 方法来恢复项的原始顺序。

这种方法的缺点是 Reverse 方法会在以相反顺序迭代其项之前缓冲整个序列——而上面的代码使用了两次。这意味着迭代原始序列中的所有项并全部缓冲它们，迭代满足条件的第一个结果序列项并全部缓冲它们，最后迭代结果以生成最终序列。

如果你在计算，你会得出结论，原始序列中的所有项都会被迭代一次，而结果序列中的项会再次迭代三次。如果原始序列很大，这可能会占用大量内存和时间。

如果你使用在选择条件评估中（>）使用原始序列中项索引的变体，则还有另一个问题。当我们反转项的顺序时，索引也会反转，并且谓词必须考虑这一点，如果你不知道原始序列中的项数，这可能无法实现。

肯定有更好的方法，这就是我实现 Take Last 和 Skip Last 运算符的原因。

TakeLast 运算符

此运算符从序列末尾返回指定数量的连续元素，并作为 TakeLast 扩展方法实现

public static IEnumerable<TSource> 
  TakeLast<TSource>(this IEnumerable<TSource> source, int count);

它的工作原理如下

int[] grades = { 59, 82, 70, 56, 92, 98, 85 };

var topThreeGrades = grades.OrderBy(grade => grade).TakeLast(3);

Console.WriteLine("The top three grades are:");
foreach (int grade in topThreeGrades)
{
    Console.WriteLine(grade);
}
/*
This code produces the following output:

The top three grades are:
98
92
85
*/

实现 TakeLast 运算符

要实现此运算符，我们首先在充当循环缓冲区的数组中最多缓冲 count 个来自 source 序列的项

var sourceEnumerator = source.GetEnumerator();
var buffer = new TSource[count];
var numOfItems = 0;
int idx;

for (idx = 0; (idx < count) && sourceEnumerator.MoveNext(); idx++, numOfItems++)
{
    buffer[idx] = sourceEnumerator.Current;
}

如果缓冲的项数（numOfItems）少于请求的项数（count），我们只产生所有缓冲的项

if (numOfItems < count)
{
    for (idx = 0; idx < numOfItems; idx++)
    {
        yield return buffer[idx];
    }

    yield break;
}

接下来，我们迭代其余的项，循环缓冲它们

for (idx = 0; sourceEnumerator.MoveNext(); idx = (idx + 1) % count)
{
    buffer[idx] = sourceEnumerator.Current;
}

最后，我们只需迭代缓冲的项并产生它们

for (; numOfItems > 0; idx = (idx + 1) % count, numOfItems--)
{
    yield return buffer[idx];
}

这里可以进行两个优化。

第一个很明显；如果请求的项数为 0 (零)，我们只返回一个空序列

if (count <= 0)
{
    return System.Linq.Enumerable.Empty<TSource>();
}

第二个是如果已知 source 序列实现了 IList<T> 接口。实现此接口的对象具有 Count 属性和对其项的索引访问，这使我们能够优化最终序列的生成。

生成最终序列包括从列表末尾生成所需数量的项（如果列表包含的项少于所需项，则生成所有项）

int listCount = list.Count;

for (int idx = listCount - ((count < listCount) ? 
         count : listCount); idx < listCount; idx++)
{
    yield return list[idx];
}

TakeLastWhile 运算符

此运算符从序列末尾返回满足指定条件的所有元素，并作为 TakeLastWhile 扩展方法实现

public static IEnumerable<TSource> TakeLastWhile<TSource>(
       this IEnumerable<TSource> source, Func<TSource, bool> predicate)
public static IEnumerable<TSource> TakeLastWhile<TSource>(
       this IEnumerable<TSource> source, Func<TSource, int, bool> predicate)

它的工作原理如下

string[] fruits = { "apple", "passionfruit", "banana", 
                    "mango", "orange", "blueberry", 
                    "grape", "strawberry" };

var query = 
  fruits.TakeLastWhile(fruit => string.Compare("orange", fruit, true) != 0);

foreach (string fruit in query)
{
    Console.WriteLine(fruit);
}

/*
This code produces the following output:

passionfruit
grape
strawberry
*/

string[] fruits = { "apple", "passionfruit", "banana", 
                    "mango", "orange", "blueberry", 
                    "grape", "strawberry" };

var query = fruits.TakeLastWhile((fruit, index) => fruit.Length >= index);

foreach (string fruit in query)
{
    Console.WriteLine(fruit);
}

/*
This code produces the following output:

strawberry
*/

实现 TakeLastWhile 运算符

要实现此运算符，我们首先使用一个空缓冲区，并缓冲满足谓词实现的条件的每个项。每当一个项不满足条件时，缓冲区就会被清除

var buffer = new List<TSource>();

foreach (var item in source)
{
    if (predicate(item))
    {
        buffer.Add(item);
    }
    else
    {
        buffer.Clear();
    }
}

遍历源序列后，我们只产生缓冲区中的所有项（如果有）

foreach (var item in buffer)
{
    yield return item;
}

考虑项索引的重载仅在调用实现条件的谓词时有所不同

var buffer = new List<TSource>();
var idx = 0;

foreach (var item in source)
{
    if (predicate(item, idx++))
    {
        buffer.Add(item);
    }
    else
    {
        buffer.Clear();
    }
}

foreach (var item in buffer)
{
    yield return item;
}

SkipLast 运算符

此运算符返回序列末尾除指定数量的连续元素之外的所有元素，并作为 SkipLast 扩展方法实现

public static IEnumerable<TSource> SkipLast<TSource>(
              this IEnumerable<TSource> source, int count)

它的工作原理如下

int[] grades = { 59, 82, 70, 56, 92, 98, 85 };

var lowerGrades = grades.OrderBy(g => g).SkipLast(3);

Console.WriteLine("All grades except the top three are:");
foreach (int grade in lowerGrades)
{
    Console.WriteLine(grade);
}

/*
This code produces the following output:

All grades except the top three are:
56
59
70
82
*/

实现 SkipLast 运算符

要实现此运算符，我们首先在充当循环缓冲区的数组中最多缓冲 count 个来自 source 序列的项

var sourceEnumerator = source.GetEnumerator();
var buffer = new TSource[count];
int idx;

for (idx = 0; (idx < count) && sourceEnumerator.MoveNext(); idx++)
{
    buffer[idx] = sourceEnumerator.Current;
}

接下来，我们迭代源序列的其余项，循环缓冲它们，并产生先前缓冲的位于相同位置的项

idx = 0;

while (sourceEnumerator.MoveNext())
{
    var item = buffer[idx];

    buffer[idx] = sourceEnumerator.Current;

    idx = (idx + 1) % count;

    yield return item;
}

如果跳过的项数大于或等于源序列中的项数，则 sourceEnumerator.MoveNext() 将在 while 循环的第一次迭代中返回 false，并生成一个空序列。

与 TakeLast 运算符一样，这里可以进行一些优化。

第一个很明显，如果请求的项数为 0 (零)或更少，我们只返回一个等效序列

if (count <= 0)
{
    return source.Select(i => i);
}

第二个是如果已知 source 序列实现了 IList<T> 接口。实现此接口的对象具有 Count 属性和对其项的索引访问，这使我们能够优化最终序列的生成。

如果跳过的项数大于或等于源序列中的项数，我们只返回一个空序列

var list = source as IList<TSource>;

if (list != null)
{
    if (count >= list.Count)
    {
        return System.Linq.Enumerable.Empty<TSource>();
    }

    // ...
}

如果源序列中的项数大于要跳过的项数，则生成最终序列包括生成源序列中的所有项，但最后 count 个项除外

int returnCount = list.Count - count;

for (int idx = 0; idx < returnCount; idx++)
{
    yield return list[idx];
}

SkipLastWhile 运算符

此运算符返回序列中的所有元素，跳过末尾满足指定条件的那些元素，并作为 SkipLastWhile 扩展方法实现

public static IEnumerable<TSource> SkipLastWhile<TSource>(
       this IEnumerable<TSource> source, Func<TSource, bool> predicate)
public static IEnumerable<TSource> SkipLastWhile<TSource>(
       this IEnumerable<TSource> source, Func<TSource, int, bool> predicate)

它的工作原理如下

int[] grades = { 59, 82, 70, 56, 92, 98, 85 };

var lowerGrades = grades.OrderBy(grade => grade).SkipLastWhile(grade => grade >= 80);

Console.WriteLine("All grades below 80:");
foreach (int grade in lowerGrades)
{
    Console.WriteLine(grade);
}

/*
This code produces the following output:

All grades below 80:
56
59
70
*/

int[] amounts = { 5000, 2500, 5500, 8000, 6500, 4000, 1500, 9000 };

var query = amounts .SkipLastWhile((amount, index) => amount > index * 1000);

foreach (int amount in query)
{
    Console.WriteLine(amount);
}

/*
 This code produces the following output:

9000
*/

实现 SkipLastWhile 运算符

要实现此运算符，我们使用一个空缓冲区，并缓冲满足谓词实现的条件的每个项。每当一个项不满足条件时，所有缓冲的项都会被生成，缓冲区被清除，并且不满足条件的项也会被生成

var buffer = new List<TSource>();

foreach (var item in source)
{
    if (predicate(item))
    {
        buffer.Add(item);
    }
    else
    {
        if (buffer.Count > 0)
        {
            foreach (var bufferedItem in buffer)
            {
                yield return bufferedItem;
            }

            buffer.Clear();
        }

        yield return item;
    }
}

考虑项索引的重载仅在调用实现条件的谓词时有所不同

var buffer = new List<TSource>();
var idx = 0;

foreach (var item in source)
{
    if (predicate(item, idx++))
    {
        buffer.Add(item);
    }
    else
    {
        if (buffer.Count > 0)
        {
            foreach (var bufferedItem in buffer)
            {
                yield return bufferedItem;
            }

            buffer.Clear();
        }

        yield return item;
    }
}

结论

如您所见，实现这种运算符非常容易。没有理由在需要时不去实现这样的运算符。实现这样的运算符还可以提高代码的可读性和可维护性。您可以在我的 CodePlex 项目中找到这些（以及更多）LINQ 实用工具和运算符：PauloMorgado.Linq。