Fast Byte Array Compression Utility in C#

ronnotel

11 Jan 2007

CPOL

A quick but useful utility for compressing and decompressing byte arrays

Download source files - 543 B

Introduction

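While trying to improve network bandwidth utilization for a high-performance application, I found that compressing a large object (6 MB) before transmitting it improved the performance of the network call by roughly a factor of 10. However, the MSDN sample code for the GZipStream class left a lot to be desired. It is cryptic and poorly written, and it took me far too long to understand.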

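When I finally figured out what was going on, I was left with what is, IMHO, a fairly useful utility for general-purpose compression and decompression of byte arrays. It uses the GZipStream class, which ships as a standard part of the System.IO.Compression namespace.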

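My utility consists of a single class, Compressor, with two static methods, Compress() and Decompress(). Both methods take a byte array as a parameter and return a byte array. For Compress(), the parameter is the uncompressed byte array and the return value is the compressed byte array; for Decompress(), it is the other way around.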

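During compression, the compressed bytes are prefixed with an Int32 header that holds the number of bytes in the uncompressed byte array. During decompression, this header is used to allocate the byte array that Decompress() returns.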
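
The length-prefixed layout described above can be sketched roughly as follows. This is not the code from the download; it is a minimal illustration that assumes the header is written with BitConverter in the machine's native byte order and that the GZip data follows it immediately. The method bodies and the error message are assumptions made for the example.

```csharp
using System;
using System.IO;
using System.IO.Compression;

// Illustrative sketch only: one possible shape for a Compressor class
// that prefixes the GZip data with an Int32 length header.
public static class Compressor
{
    public static byte[] Compress(byte[] buffer)
    {
        using (MemoryStream ms = new MemoryStream())
        {
            // Header: the uncompressed length as 4 bytes (native byte order).
            ms.Write(BitConverter.GetBytes(buffer.Length), 0, sizeof(int));

            // leaveOpen = true so the MemoryStream survives disposing the
            // GZipStream, which is what flushes the final compressed block.
            using (GZipStream gz = new GZipStream(ms, CompressionMode.Compress, true))
            {
                gz.Write(buffer, 0, buffer.Length);
            }
            return ms.ToArray();
        }
    }

    public static byte[] Decompress(byte[] gzBuffer)
    {
        // Read the header and allocate the result array up front.
        int length = BitConverter.ToInt32(gzBuffer, 0);
        byte[] buffer = new byte[length];

        using (MemoryStream ms = new MemoryStream(gzBuffer, sizeof(int), gzBuffer.Length - sizeof(int)))
        using (GZipStream gz = new GZipStream(ms, CompressionMode.Decompress))
        {
            // Read() may return fewer bytes than requested, so loop until full.
            int offset = 0;
            while (offset < length)
            {
                int read = gz.Read(buffer, offset, length - offset);
                if (read == 0)
                    throw new InvalidDataException("Compressed data ended before the declared length.");
                offset += read;
            }
        }
        return buffer;
    }
}
```

Because the header uses the native byte order, this sketch assumes that sender and receiver share the same endianness.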

Using the Code

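Simply convert the object (or collection of objects) you want to compress into a byte array. I have found that a little custom serialization using the BitConverter and/or Buffer classes works well. For classes with a fixed record size (i.e., containing only value types and no strings), you can also dip into the Marshal class (as in the sample below) to convert the object to a pointer and then copy the memory it points to into your buffer.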
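
As a rough illustration of the BitConverter/Buffer approach, the sketch below packs a fixed-size record into a byte array at a given offset. The QuoteSerializer class, its field layout, and its method names are hypothetical and exist only for this example.

```csharp
using System;

// Hypothetical helper: writes and reads a fixed-size (int, double) record
// using BitConverter and Buffer.BlockCopy.
public static class QuoteSerializer
{
    // 4 bytes for the ID plus 8 bytes for the price.
    public const int RecordSize = sizeof(int) + sizeof(double);

    public static void Write(byte[] buffer, int offset, int id, double price)
    {
        // GetBytes allocates a small temporary array per field; BlockCopy
        // then moves those bytes into the shared buffer.
        Buffer.BlockCopy(BitConverter.GetBytes(id), 0, buffer, offset, sizeof(int));
        Buffer.BlockCopy(BitConverter.GetBytes(price), 0, buffer, offset + sizeof(int), sizeof(double));
    }

    public static int ReadId(byte[] buffer, int offset)
    {
        return BitConverter.ToInt32(buffer, offset);
    }

    public static double ReadPrice(byte[] buffer, int offset)
    {
        return BitConverter.ToDouble(buffer, offset + sizeof(int));
    }
}
```

Records are laid out back to back, so record n starts at offset n * QuoteSerializer.RecordSize.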

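Once you have the byte array, just pass it to Compressor.Compress() to get a compressed array ready for transmission. On the other end, pass the compressed byte array to Decompress() to recover the original byte array. Done!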

//
// Sample Compression - how to send 100,000 stock prices across town in 1 second.
//
using System;
using System.Collections.Generic;
using System.Runtime.InteropServices;

  public struct StockPrice
  {
    public int ID;
    public double bidPrice;
    public double askPrice;
    public double lastPrice;

    public static int sz = Marshal.SizeOf(typeof(StockPrice));
    public void CopyToBuffer(byte[] buffer, int startIndex)
    {
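      IntPtr ptr = Marshal.AllocHGlobal(sz);
      try
      {
        // Marshal the struct into unmanaged memory, then copy it into the buffer
        Marshal.StructureToPtr(this, ptr, false);
        Marshal.Copy(ptr, buffer, startIndex, sz);
      }
      finally
      {
        // Release the unmanaged block even if a copy throws
        Marshal.FreeHGlobal(ptr);
      }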
    }

    public static StockPrice CopyFromBuffer(byte[] buffer, int startIndex)
    {
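      IntPtr ptr = Marshal.AllocHGlobal(sz);
      try
      {
        // Copy one record into unmanaged memory and marshal it back to a struct
        Marshal.Copy(buffer, startIndex, ptr, sz);
        return (StockPrice)Marshal.PtrToStructure(ptr, typeof(StockPrice));
      }
      finally
      {
        // Release the unmanaged block even if a copy throws
        Marshal.FreeHGlobal(ptr);
      }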
    }
  }

  class Program
  {
    // Assume that you are starting with a populated dictionary of StockPrice objects,
    // which is an instance of Dictionary<int, StockPrice> and is keyed by the ID field.
    // It is declared empty here only so that the sample compiles.
    static Dictionary<int, StockPrice> StockPriceDict = new Dictionary<int, StockPrice>();

    static void Main()
    {
      byte[] buffer = new byte[StockPriceDict.Count * StockPrice.sz];
      int startIndex = 0;
      foreach (StockPrice price in StockPriceDict.Values)
      {
        price.CopyToBuffer(buffer, startIndex);
        startIndex += StockPrice.sz;
      }

      byte[] gzBuffer = Compressor.Compress(buffer);

      // now uncompress the bytes and recover the original dictionary.
      // This is *much* faster than
      // using .NET Remoting or similar techniques

      Dictionary<int, StockPrice> newStockPriceDict = new Dictionary<int, StockPrice>();
      byte[] buffer1 = Compressor.Decompress(gzBuffer);
      startIndex = 0;
      while (startIndex < buffer1.Length)
      {
        StockPrice stockPrice = StockPrice.CopyFromBuffer(buffer1, startIndex);
        newStockPriceDict[stockPrice.ID] = stockPrice;
        // Advance to the next record; without this the loop never terminates
        startIndex += StockPrice.sz;
      }
    }
  }

Points of Interest

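If there is one thing I could improve about C#, it would be its ability to treat objects as byte arrays. This capability is critical for high-performance computing, yet the C# product team has not given it enough attention. It looks as though the functionality was included only for COM compatibility. Even so, it is probably the code I rely on most when working in high-performance areas normally reserved for C++.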

History

v1.0 - January 10th, 2007