通过 ASP.NET MVC 从 SQL Server 下载和上传图像

rusanu

4.89/5 (14投票s)

2010年12月29日

Apache

9分钟阅读

109936

4182

在通过 HTTP 与 SQL Server 数据库传输大型文件时进行高效流式处理。

下载源文件 - 270.93 KB

引言

在讨论论坛上经常出现一个问题，那就是如何从 ASP.NET 应用程序中提供存储在 SQL Server 表中的图像。不幸的是，答案几乎总是错误的，因为普遍的解决方案涉及在将**整个**图像文件返回给客户端之前将其复制到内存中。当用少量数据和几个小图像进行测试时，此解决方案可以正常工作。但在生产环境中，所有这些存储在内存中作为字节数组的图像文件所需的内存消耗会导致严重的性能下降。一个好的解决方案必须使用流式处理语义，将数据以小块的形式从 SQL Server 传输到 HTTP 返回的结果中。

SqlClient 组件确实为大型结果集（包括大型 BLOB 字段）提供了流式处理语义，但客户端必须明确要求。‘秘密配料’是在 `SqlCommand.ExecuteReader` 中传递 `CommandBehavior.SequentialAccess` 标志。

提供了一种方法，让 `DataReader` 处理包含具有大型二进制值的列的行。`SequentialAccess` 不会加载整个行，而是允许 `DataReader` 将数据加载为 `stream`。然后，您可以使用 `GetBytes` 或 `GetChars` 方法指定一个字节位置来开始读取操作，并为返回的数据指定一个有限的缓冲区大小。

一个由 SQL Server 支持的 ASP.NET MVC 虚拟媒体文件夹

假设我们想在 ASP.NET MVC 站点中拥有一个虚拟媒体文件夹，从 SQL Server 数据库提供文件。一个 URL 的 GET 请求，例如"http://site/Media/IMG0042.JPG"应该返回数据库中名为 IMG0042.JPG 的文件的内容。一个 URL 的 POST 请求"http://site/Media"其中包含嵌入的文件，应该将此新文件插入数据库，并将响应重定向到新添加文件的虚拟路径。这就是我们的上传 HTML 表单的样子

<form method="post" action="/Media" enctype="multipart/form-data">
<input type="file" name="file" id="file"/>
<input type="submit" name="Submit" value="Submit"/>
</form>

我们将从在 Global.asax.cs 中添加一个特殊路由开始，该路由将像一个虚拟文件夹一样处理文件下载请求。

routes.MapRoute(
	"Media",
	"Media/{filename}",
	new { controller = "Media", action = "GetFile" },
	new { filename = @"[^/?*:;{}\\]+" });

请注意，如果我们向控制器添加一个处理 POST 请求的 Index() 方法，上传将由默认的 MVC 路由处理。

对于我们的 GET 请求，我们需要一个 FileDownloadModel 类来表示请求的文件的属性。对于此示例，我们不需要 POST 模型，因为我们只有一个输入字段，即上传的文件。我们将使用一个 Repository 接口，该接口声明了两个方法：`GetFileByName` 根据文件名从存储库返回一个 FileDownloadModel，而 `PutFile` 将接受一个上传的文件并将其放入存储库。

public interface IMediaRepository
{
    bool GetFileByName(
        string fileName,
        out FileDownloadModel file);

    void PostFile(
        HttpPostedFileBase file,
        out string fileName);
}

有了这个 Repository 接口，我们就可以编写 MediaController 类了。

public class MediaController : Controller
{
    public IMediaRepository Repository { get; set; }
    
    public MediaController()
    {
        Repository = new SqlMediaRepository();
    }
    
    [HttpPost]
    public ActionResult Index()
    {
        string fileName;
        Repository.PostFile(Request.Files[0], out fileName);
        return new RedirectResult("/Media/" + fileName);
    }
    
    [HttpGet]
    public ActionResult GetFile(string fileName)
    {
        FileDownloadModel model;
        if (false == Repository.GetFileByName(
            fileName,
            out model))
        {
            return new HttpNotFoundResult
            {
                StatusDescription = String.Format(
                    "File {0} not found",
                    fileName)
            };
        }
        
        if (null != model.ContentCoding)
        {
            Response.AddHeader(
                "Content-Encoding",
                model.ContentCoding);
        }
        Response.AddHeader(
            "Content-Length",
            model.ContentLength.ToString ());
            
        Response.BufferOutput = false;
        
        return new FileStreamResult(
            model.Content,
            model.ContentType);
    }
}

我已将 Repository 实现硬编码为 SqlMediaRepository，这是一个我们稍后将创建的类。实际项目可能会使用依赖注入或控制反转模式，例如可能使用 Castle Windsor。为简洁起见，我将省略这些细节，有很多博客和文章对此进行了描述。

注意使用了 FileStreamResult 返回，这是一个 MVC 提供的用于返回任意 Stream 对象下载的操作。这也引出了下一个问题，我们需要实现一个 Stream 来从 SqlDataReader 读取内容。

基于 SqlDataReader 的 Stream

现在我们需要一个 `abstract Stream` 类的实现，该类可以从 `SqlDataReader` 流式传输 `BLOB` 列。等等，你说，`SqlBytes` 难道没有一个 `Stream` 属性来将结果中的 `BLOB` 读取为 `Stream` 吗？不幸的是，这个小小的说明使得这个类对我们来说毫无用处。

获取或设置 `Stream` 属性会将所有数据加载到内存中。与大型值数据一起使用它可能会导致 `OutOfMemoryException`。

因此，我们剩下的是实现一个基于 `SqlDataReader BLOB` 字段的 `Stream`，一个使用适当的 `GetBytes` 调用返回 `BLOB` 内容的 `Stream`，并且不将整个 `BLOB` 加载到内存中。幸运的是，这相当简单，因为我们只需要实现少量方法。

public class SqlReaderStream: Stream
{
    private SqlDataReader reader;
    private int columnIndex;
    private long position;
    
    public SqlReaderStream(
        SqlDataReader reader,
        int columnIndex)
    {
        this.reader = reader;
        this.columnIndex = columnIndex;
    }
    
    public override long Position
    {
        get { return position; }
        set { throw new NotImplementedException(); }
    }
    
    public override int Read(byte[] buffer, int offset, int count)
    {
        long bytesRead = reader.GetBytes(columnIndex, position, buffer, offset, count);
        position += bytesRead;
        return (int)bytesRead;
    }
    
    public override bool CanRead
    {
        get { return true; }
    }
    
    public override bool CanSeek
    {
        get { return false; }
    }
    
    public override bool CanWrite
    {
        get { return false; }
    }
    
    public override void Flush()
    {
        throw new NotImplementedException();
    }
    
    public override long Length
    {
        get { throw new NotImplementedException(); }
    }
    
    public override long Seek(long offset, SeekOrigin origin)
    {
        throw new NotImplementedException();
    }
    
    public override void SetLength(long value)
    {
        throw new NotImplementedException();
    }
    
    public override void Write(byte[] buffer, int offset, int count)
    {
        throw new NotImplementedException();
    }
    
    protected override void Dispose(bool disposing)
    {
        if (disposing && null != reader)
        {
            reader.Dispose();
            reader = null;
        }
        base.Dispose(disposing);
    }
}

正如您所见，我们只需要返回 `CanRead`（是）、`CanWrite`（否）和 `CanSeek`（也否）的正确响应，跟踪我们的当前位置，并且我们需要实现 `Read` 以从读取器中获取更多字节，使用 `GetReader`。

我们还覆盖了Dispose(bool disposing)方法。这是因为当流式传输完成内容流时，我们将不得不关闭 `SqlDataReader`。如果这是您第一次看到此 `Dispose` 方法的签名，那么您必须阅读 Implementing a Dispose Method。

BLOB 数据流式上传

就像从 SQL Server 检索大型 BLOB 数据会带来避免创建完整 BLOB 内存副本的挑战一样，在尝试插入 BLOB 时也会出现类似的问题。最好的解决方案实际上非常复杂。它涉及将数据分块发送到服务器，并使用就地 BLOB UPDATE 语法。MSDN 在 Remarks 部分有如下说明：

使用 `.WRITE` (expression, @Offset, @Length) 子句来执行 `varchar(max)`、`nvarchar(max)` 和 `varbinary(max)` 数据类型的部分或完整更新。例如，`varchar(max)` 列的部分更新可能只删除或修改该列的前 200 个字符，而完整更新则删除或修改该列中的所有数据。为了获得最佳性能，我们建议以 8040 字节的倍数插入或更新数据。

为了实现这种语义，我们将编写一个*第二个* `Stream` 实现，这次用于上传。

    public class SqlStreamUpload: Stream
    {
        public SqlCommand InsertCommand { get; set; }
        public SqlCommand UpdateCommand { get; set; }
        public SqlParameter InsertDataParam { get; set; }
        public SqlParameter UpdateDataParam { get; set; }

        public override bool CanRead
        {
            get { return false; }
        }

        public override bool CanSeek
        {
            get { return false; }
        }

        public override bool CanWrite
        {
            get { return true; }
        }

        public override void Flush()
        {
        }

        public override long Length
        {
            get { throw new NotImplementedException(); }
        }

        public override long Position
        {
            get; set;
        }

        public override int Read(byte[] buffer, int offset, int count)
        {
            throw new NotImplementedException();
        }

        public override long Seek(long offset, SeekOrigin origin)
        {
            throw new NotImplementedException();
        }

        public override void SetLength(long value)
        {
            throw new NotImplementedException();
        }

        public override void Write(byte[] buffer, int offset, int count)
        {
            byte[] data = buffer;
            if (offset != 0 ||
                count != buffer.Length)
            {
                data = new byte[count];
                Array.Copy(buffer, offset, data, 0, count);
            }
            if (0 == Position &&
                null != InsertCommand)
            {
                InsertDataParam.Value = data;
                InsertCommand.ExecuteNonQuery();
            }
            else
            {
                UpdateDataParam.Value = data;
                UpdateCommand.ExecuteNonQuery();
            }
            Position += count;
        }
    }

此 `Stream` 实现使用了两个 `SqlCommand` 对象：一个 `InsertCommand` 用于保存第一个块，一个 `UpdateCommand` 用于保存后续块。请注意，块大小（最佳的 8040 字节）并未在任何地方指定，这可以通过将 `SqlStreamUpload` 包装在 `BufferedStream` 实例中轻松实现。

MEDIA 表

create table media (
	[media_id] int not null identity(1,1),
	[file_name] varchar(256),
	[content_type] varchar(256),
	[content_coding] varchar(256),
	[content] varbinary(max),
        constraint pk_media_id primary key([media_id]),
	constraint unique_file_name unique ([file_name]));

此表包含可下载的媒体文件。文件由名称标识，因此名称具有唯一约束。我添加了一个 `IDENTITY` 主键，因为在 CMS 中，这些文件经常被应用程序的其他部分引用，而 `INT` 比文件名更短的引用键。`content_type` 字段是必需的，以了解文件的类型：“image/png”、“image/jpg”等（请参阅官方注册的 IANA Media types。`content_coding` 字段是必需的，如果文件存储为压缩文件，则必须将带有“gzip”或“deflate”标签的 `Content-Encoding` HTTP 标头添加到响应中。请注意，大多数图像类型（JPG、PNG）的压缩效果不佳，因为这些文件格式已经包含压缩算法，并且当使用常见的 HTTP 传输算法（gzip、deflate、compress）再次压缩时，它们通常会*增加*大小。content_typefield is needed to know the type of file: “image/png”, “image/jpg”, etc. (see the officially registered IANA Media types. Thecontent_codingfield is needed if the files are stored compressed and aContent-EncodingHTTP header with the tag “gzip” or “deflate” has to be added to the response. Note that most image types (JPG, PNG) don’t compress well, as these file formats already include a compression algorithm and when compressed again with the common HTTP transfer algorithms (gzip, deflate, compress) they usually increase in size.

SqlMediaRepository

拼图的最后一块：一个使用 SQL Server 后端进行文件存储的 `IMediaRepository` 接口的实现。

public class SqlMediaRepository: IMediaRepository
{
    private SqlConnection GetConnection()
    {
        SqlConnectionStringBuilder scsb = new SqlConnectionStringBuilder(
            ConfigurationManager.ConnectionStrings["Images"].ConnectionString);
        scsb.Pooling = true;
        SqlConnection conn = new SqlConnection(scsb.ConnectionString);
        conn.Open();
        return conn;
    }
    
    public void PostFile(
        HttpPostedFileBase file,
        out string fileName)
    {
        fileName = Path.GetFileName(file.FileName);
        
        using (SqlConnection conn = GetConnection())
        {
            using (SqlTransaction trn = conn.BeginTransaction())
            {
                SqlCommand cmdInsert = new SqlCommand(
                    @"INSERT INTO media (
                        file_name,
                        content_type,
                        content_coding,
                        content)
                    values (
                        @content_disposition,
                        @content_type,
                        @content_coding,
                        @data);", conn, trn);
                cmdInsert.Parameters.Add("@data", SqlDbType.VarBinary, -1);
                cmdInsert.Parameters.Add("@content_disposition", SqlDbType.VarChar, 256);
                cmdInsert.Parameters["@content_disposition"].Value = fileName;
                cmdInsert.Parameters.Add("@content_type", SqlDbType.VarChar, 256);
                cmdInsert.Parameters["@content_type"].Value = file.ContentType;
                cmdInsert.Parameters.Add("@content_coding", SqlDbType.VarChar, 256);
                cmdInsert.Parameters["@content_coding"].Value = DBNull.Value;
                
                SqlCommand cmdUpdate = new SqlCommand(
                        @"UPDATE media
                        SET content.write (@data, NULL, NULL)
                        WHERE file_name = @content_disposition;", conn, trn);
                cmdUpdate.Parameters.Add("@data", SqlDbType.VarBinary, -1);
                cmdUpdate.Parameters.Add("@content_disposition", SqlDbType.VarChar, 256);
                cmdUpdate.Parameters["@content_disposition"].Value = fileName;
                
                using (Stream uploadStream = new BufferedStream(
                    new SqlStreamUpload
                    {
                        InsertCommand = cmdInsert,
                        UpdateCommand = cmdUpdate,
                        InsertDataParam = cmdInsert.Parameters["@data"],
                        UpdateDataParam = cmdUpdate.Parameters["@data"]
                    }, 8040))
                {
                    file.InputStream.CopyTo(uploadStream);
                }
                trn.Commit();
            }
        }
    }
    
    public bool GetFileByName(string fileName, out FileDownloadModel file)
    {
        SqlConnection conn = GetConnection();
        try
        {
            SqlCommand cmd = new SqlCommand(
                @"SELECT file_name,
                 content_type,
                    content_coding,
                    DATALENGTH (content) as content_length,
                 content
                FROM media
                WHERE file_name = @fileName;", conn);
            SqlParameter paramFilename = new SqlParameter(
                  @"fileName", SqlDbType.VarChar, 256);
            paramFilename.Value = fileName;
            cmd.Parameters.Add(paramFilename);
            SqlDataReader reader = cmd.ExecuteReader(
                CommandBehavior.SequentialAccess |
                CommandBehavior.SingleResult |
                CommandBehavior.SingleRow |
                CommandBehavior.CloseConnection);
            if (false == reader.Read())
            {
                reader.Dispose();
                conn = null;
                file = null;
                return false;
            }
            
            string contentDisposition = reader.GetString(0);
            string contentType = reader.GetString(1);
            string contentCoding = reader.IsDBNull(2) ? null : reader.GetString(2);
            long contentLength = reader.GetInt64(3);
            Stream content = new SqlReaderStream(reader, 4);
            
            file = new FileDownloadModel
            {
                FileName = contentDisposition,
                ContentCoding = contentCoding,
                ContentType = contentType,
                ContentLength = contentLength,
                Content = content
            };
            conn = null; // ownership transferred to the reader/stream
            return true;
        }
        finally
        {
            if (null != conn)
            {
                conn.Dispose();
            }
        }
    }
}

通常，MVC 应用程序倾向于使用基于 LINQ 的 Repository。在这种情况下，由于实现高效的大型 `BLOB` 流式处理的特殊要求，我无法利用 LINQ。因此，此 Repository 使用纯粹的 `SqlClient` 代码。

`GetFileByName` 方法从媒体table 中获取一行，并返回一个 `FileDownloadModel`，其中包含一个包装了 `SELECT` 命令结果的 `SqlReaderStream` 对象。请注意，我无法部署典型的“`using`”模式来处理可处置的 `SqlConnection`，因为连接必须保持打开状态，直到命令结果 `SqlDataReader` 完成 BLOB 的流式传输为止，并且流式传输将由 MVC 框架在 `GetFileByName` 完成*之后*执行我们的 `ActionResult` 来控制。连接将在 `SqlReaderStream` 被处置时关闭，因为 `CommandBehavior.CloseConnection` 标志。流将由 FileStreamResult.WriteFile 方法处置。

`PostFile` 方法创建两个 `SqlCommand` 语句：一个用于插入第一块数据以及其他相关列，另一个用于使用 `BLOB` 部分更新写入后续块的更新命令。然后，`SqlStreamUpload` 对象使用这两个 `SqlCommand` 来有效地流式传输上传的文件。中间的 `BufferedStream` 用于创建关键大小为 8040 字节的上传块（见上文）。如果内容需要压缩，则在此处进行，将 `GZipStream` 放在 `Bufferedstream` 前面以压缩上传的文件，并且“`@content_coding`”参数必须设置为“`gzip”。

HTTP 缓存

在此实现中，我省略了适当的 HTTP 缓存控制。HTTP 缓存对于高流量网站至关重要，优化 HTTP 请求处理的最佳方法是根本不接收该请求，而是让用户代理缓存或某个中间代理提供响应。我们的应用程序将必须添加适当的 `ETag` 和 Cache-Control HTTP 标头，并且 `MediaController` 需要一个用于 `HEAD` 请求的动作。`IMediaRepository` 接口需要一个新方法来获取所有文件属性而不获取实际内容。目前，我将将其留给读者作为练习……

BLOB 或不 BLOB

本文并不试图解决是否应该一开始就在数据库中存储图像这一根本问题。无论哪种情况都有论据支持。Russel Sears、Catherine van Ingen 和 Jim Gray 在 2006 年发表了一篇研究论文，比较了在数据库中存储文件和在文件系统中存储文件的性能：To BLOB or Not To BLOB: Large Object Storage in a Database or a Filesystem?。他们得出结论：

研究表明，如果对象平均大于 1 MB，NTFS 相对于 SQL Server 具有明显优势。如果对象小于 256 KB，数据库具有明显优势。在此范围之内，取决于工作负载的写入密集程度以及系统中典型副本的存储年龄。

然而，他们的研究并未比较如何将文件作为 HTTP 响应提供。Web 服务器可以直接从文件系统高效地提供文件而无需在 Web 应用程序中运行任何代码，这一事实改变了等式，并且在性能上极大地倾向于文件系统。但是，如果您已经考虑了优缺点并决定一致的备份/还原和强大的引用完整性优势值得数据库存储的 BLOB，那么我希望本文能突出一种高效的方法来将这些 BLOB 作为 HTTP 响应返回。