STL/CLR 线性容器性能分析

Nish Nishant

4.80/5 (23投票s)

2008年3月8日

CDDL

5分钟阅读

92758

本文将 STL/CLR 序列容器的性能与相应的 BCL 通用集合类进行了比较。

test app

引言

本文旨在将 STL/CLR 序列容器的整体性能与 .NET 通用 List<T> 集合类进行比较。在开始撰写本文之前，我坚信 STL/CLR 容器会快很多。令我非常惊讶的是，我发现并非如此，List<T> 轻松超越了 STL/CLR 容器。

我如何比较性能

我想保持简单，并使用了重复执行特定操作多次的常用技术。为了使设计更顺畅，我有一个如下的接口：

namespace STLCLRTests 
{
    public interface class IMeasurable 
    {
        Int64 RunCode(int iterations);
    };
}

RunCode 将按照 iterations 指定的次数运行一段特定的代码，并返回所花费的时间（以毫秒为单位）。我还有以下实现此接口的 abstract 类。

namespace STLCLRTests 
{
    public ref class MeasurableDoubleOp abstract : IMeasurable
    {
    private:
        static Stopwatch^ stopWatch = gcnew Stopwatch();

    public:
        virtual Int64 RunCode(int iterations)
        {
            Initialize();

            stopWatch->Reset();
            stopWatch->Start();

            RunCodeFirstOp(iterations);
            RunCodeSecondOp(iterations);

            stopWatch->Stop();

            return stopWatch->ElapsedMilliseconds;
        }

    protected:
        virtual void Initialize() {}
        virtual void RunCodeFirstOp(int iterations) abstract;
        virtual void RunCodeSecondOp(int iterations) abstract;
    };
}

要分析某个集合类，我只需继承这个抽象类并实现 RunCodeFirstOp 和 RunCodeSecondOp。我还有一个 MeasurableSingleOp 类用于执行不涉及两步操作的测试。

STL vector 与 List<T> - 基本插入/删除

以下是 vector 特定的和 List<T> 特定的类的实现。

namespace STLCLRTests 
{
    public ref class VectorInsertRemove : MeasurableDoubleOp
    {
    private:
        cliext::vector<int> vector;

    protected:
        IEnumerable<int>^ GetVector()
        {
            return %vector;
        }

    public:
        virtual void RunCodeFirstOp(int iterations) override
        {
            for(int count=0; count<iterations; count++)
            {
                vector.push_back(10);
            }
        }

        virtual void RunCodeSecondOp(int iterations) override 
        {
            for(int count=0; count<iterations; count++)
            {
                vector.pop_back();
            }
        }
    };
}
namespace STLCLRTests 
{
    public ref class GenericListInsertRemove : MeasurableDoubleOp
    {
    private:
        List<int> list;

    protected:
        IEnumerable<int>^ GetList()
        {
            return %list;
        }

    public:
        virtual void RunCodeFirstOp(int iterations) override
        {
            for(int count=0; count<iterations; count++)
            {
                list.Add(10);
            }
        }

        virtual void RunCodeSecondOp(int iterations) override 
        {
            for(int count=0; count<iterations; count++)
            {
                list.RemoveAt(list.Count - 1);
            }
        }
    };
}

以下是我的测试结果。如您所见，BCL 类 (List<T>) 完全胜过了 STL/CLR vector 类。

迭代	STL/CLR	BCL
100000	15	3
500000	63	32
1000000	122	21
10000000	1311	299

这是一张图表，展示了两种容器的性能。很明显，BCL 类的性能远远优于 STL vector。

STL vector vs List - basic insertion/removal

正如您所能想象的，我对这个结果感到非常惊讶。我只是出于好奇，认为我也应该将标准的 STL vector 与 STL/CLR vector 实现进行比较。请注意，我仍然在使用托管代码 (/clr) - 标准 STL 代码也是以 /clr 编译的。这是我令人惊讶的结果。

迭代	STL/CLR	标准 STL
100000	11	39
500000	58	202
1000000	117	391
10000000	1161	3919

STL/CLR vector vs standard vector - basic insertion/removal

基于该结果，您绝对应该避免使用 /clr 编译原生 STL 代码。仅仅移植到 STL/CLR 就能在很大程度上提高性能。您可能会发现，您只需要更改命名空间（从 cliext 到 std），而无需更改其他太多代码。而且，我并非仅仅根据 vector 的测试结果得出此结论，我还比较了标准的 list 和 STL/CLR list 容器，结果如下。

迭代	STL/CLR	Std list
100000	33	101
500000	63	175
1000000	274	349
10000000	2969	3663

STL list vs List - basic insertion/removal

如您所见，性能差异并非微不足道。请注意，我们并非在此比较 STL 的原生性能。我们比较的是在 /clr 下编译的原生实现与 STL 的 CLR 实现之间的对比。

STL list 与 List<T> - 基本插入/删除

以下是 STL list 特定类的实现。

namespace STLCLRTests 
{
    public ref class StlListInsertRemove : MeasurableDoubleOp
    {
    private:
        cliext::list<int> list;

    public:
        virtual void RunCodeFirstOp(int iterations) override
        {
            for(int count=0; count<iterations; count++)
            {
                list.push_back(10);
            }
        }

        virtual void RunCodeSecondOp(int iterations) override 
        {
            for(int count=0; count<iterations; count++)
            {
                list.pop_back();
            }
        }
    };
}

以下是我的测试结果。这里，反差甚至更大 - 并不令人意外，因为对于直接插入和删除，STL list 的速度总是比 STL vector 慢。

迭代	STL/CLR	BCL
100000	32	2
500000	149	11
1000000	332	23
10000000	3719	331

这是结果的图表。

STL list vs List - basic insertion/removal

STL deque 与 List<T> - 基本插入/删除

这是 deque 的实现。

namespace STLCLRTests 
{
    public ref class DequeInsertRemove : MeasurableDoubleOp
    {
    private:
        cliext::deque<int> deque;

    public:
        virtual void RunCodeFirstOp(int iterations) override
        {
            for(int count=0; count<iterations; count++)
            {
                deque.push_back(10);
            }
        }

        virtual void RunCodeSecondOp(int iterations) override 
        {
            for(int count=0; count<iterations; count++)
            {
                deque.pop_back();
            }
        }
    };
}

以下是我的结果。模式没有改变 - 在这里，BCL 类也快得多。

迭代	STL/CLR	BCL
100000	33	2
500000	66	13
1000000	83	26
10000000	1061	251

这是图表。

STL deque vs List - basic insertion/removal

队列的 BCL 等效项是 Queue<T> 类 - 所以为了确保我们是苹果对苹果地比较，我继续进行了测试，比较了 STL/CLR deque 和 BCL Queue<T>。我的结果和相应的图表如下。

迭代	STL/CLR	BCL
100000	12	6
500000	49	15
1000000	89	28
10000000	1044	335

STL deque vs Queue - basic insertion/removal

Queue<T> 类似乎比 List<T> 稍慢，但仍然比 STL/CLR deque 容器快得多。

STL vector 与 List<T> - 基本迭代

这次，我想测试我们迭代线性集合的速度。以下是 vector 和 List<T> 特定的迭代测试实现。

namespace STLCLRTests 
{
    public ref class VectorIterate : MeasurableDoubleOp
    {
    private:
        cliext::vector<int> vector;

    public:
        virtual void RunCodeFirstOp(int iterations) override
        {
            vector.clear();

            for(int count=0; count<iterations; count++)
            {
                vector.push_back(10);
            }
        }

        virtual void RunCodeSecondOp(int iterations) override 
        {
            for(cliext::vector<int>::iterator it = vector.begin(); it != vector.end(); it++)
            {
            }
        }
    };
}
namespace STLCLRTests 
{
    public ref class GenericListIterate : MeasurableDoubleOp
    {
    private:
        List<int> list;

    public:
        virtual void RunCodeFirstOp(int iterations) override
        {
            list.Clear();

            for(int count=0; count<iterations; count++)
            {
                list.Add(10);
            }
        }

        virtual void RunCodeSecondOp(int iterations) override 
        {
            for each(int x in list)
            {
            }
        }
    };
}

以下是我的测试结果。结果进一步证明了 List<T> 类的卓越效率。

迭代	STL/CLR	BCL
100000	24	2
500000	93	16
1000000	194	31
10000000	2009	394

这是相应的图表。

STL vector vs List - basic iteration

STL vector 与 List<T> - Linq 访问 (where)

对于 Linq 测试，我使用了 C# 项目（为了更方便的语法）。我继承了插入测试器，仅重写了 RunCodeSecondOp 方法，因为我想保持插入代码不变。

namespace LinqTests
{
    public class VectorLinqWhere : VectorInsertRemove
    {
        public override void RunCodeSecondOp(int iterations)
        {
            IEnumerable<int> _vector = GetVector();
            var newVector = _vector.Where(x => x % 2 == 0);
        }
    }
}
namespace LinqTests
{
    public class GenericListLinqWhere : GenericListInsertRemove
    {
        public override void RunCodeSecondOp(int iterations)
        {
            IEnumerable<int> _list = GetList();
            var newList = _list.Where(x => x % 2 == 0);
        }
    }
}

以下是我的测试运行结果。这里的测试结果部分受到插入代码速度差异的影响。但是性能差异足够大，可以暂时忽略这一点，而且 LINQ 在 BCL 类上的运行速度也比 STL/CLR 版本快得多。

迭代	STL/CLR	BCL
100000	18	1
500000	44	7
1000000	79	11
10000000	842	168

这是图表。

Linq where test

STL vector 与 List<T> - Linq 访问 (take)

这与上一个类似，只是我使用了 Take 而不是 Where。

namespace LinqTests
{
    public class VectorLinqTake : VectorInsertRemove
    {
        public override void RunCodeSecondOp(int iterations)
        {
            IEnumerable<int> _vector = GetVector();
            var newVector = _vector.Take(_vector.Count() / 2);
        }
    }
}
namespace LinqTests
{
    public class GenericListLinqTake : GenericListInsertRemove
    {
        public override void RunCodeSecondOp(int iterations)
        {
            IEnumerable<int> _list = GetList();
            var newList = _list.Take(_list.Count() / 2);
        }
    }
}

这是我的测试结果。这些结果与之前的测试非常相似。

迭代	STL/CLR	BCL
100000	7	0
500000	35	4
1000000	70	10
10000000	865	205

这是相应的图表。

Linq take test

排序

我运行了测试，比较了 List<T> 类与 STL/CLR vector 和 list 容器的排序速度。使用的代码如下。

namespace STLCLRTests 
{
    public ref class GenericListSort : MeasurableDoubleOp
    {
    private:
        List<int> list;

    protected:
        IEnumerable<int>^ GetList()
        {
            return %list;
        }

    public:
        virtual void RunCodeFirstOp(int iterations) override
        {
            for(int count=0; count<iterations; count++)
            {
                list.Add(10);
            }
        }

        virtual void RunCodeSecondOp(int iterations) override 
        {
            list.Sort();
        }
    };
}

namespace STLCLRTests 
{
    public ref class StlListSort : MeasurableDoubleOp
    {
    private:
        cliext::list<int> list;

    public:
        virtual void RunCodeFirstOp(int iterations) override
        {
            for(int count=0; count<iterations; count++)
            {
                list.push_back(10);
            }
        }

        virtual void RunCodeSecondOp(int iterations) override 
        {
            list.sort();
        }
    };
}
namespace STLCLRTests 
{
    public ref class VectorSort : MeasurableDoubleOp
    {
    private:
        cliext::vector<int> vector;

    protected:
        IEnumerable<int>^ GetVector()
        {
            return %vector;
        }

    public:
        virtual void RunCodeFirstOp(int iterations) override
        {
            for(int count=0; count<iterations; count++)
            {
                vector.push_back(10);
            }
        }

        virtual void RunCodeSecondOp(int iterations) override 
        {
            sort(vector.begin(), vector.end());
        }
    };
}

以下是 vector 与 List<T> 的结果。

迭代	STL/CLR	BCL
100000	37	7
500000	136	53
1000000	325	137
10000000	2695	1088

以下是我的 stl list 与 List<T> 的结果。

迭代	STL/CLR	BCL
100000	138	7
500000	1162	51
1000000	5355	128
10000000	31985	1095

结论

STL/CLR 发布之前大力宣传的功能之一是其相比于常规 .NET 集合的性能优势。但是 .NET 通用 List<T> 似乎快得多。在此阶段，我唯一能想到的使用 STL/CLR 的有效情况是在将现有的 C++ 代码（大量使用 STL）首次移植到托管代码时。