在存储过程中创建动态 SQL 的两个实用技巧

DaveDbViewSharp

4.91/5 (6投票s)

2012年2月14日

CPOL

6分钟阅读

33472

201

描述了在存储过程中生成动态 SQL 时使用的两种技术，并提供了一个示例，演示了它们组合时会发生什么

下载源代码 - 2.05 KB

引言

我访问 CodeProject 已经很多年了，并且从许多优秀的文章中受益匪浅。我一直想回馈一些东西，但总是找不到一个好主题。然而，今天，我结合了两个巧妙的技术，在存储过程中创建动态 SQL，其最终结果的简洁性令我惊讶，促使我分享它。

背景

虽然这常常被视为不当，但有时在存储过程中只能使用动态 SQL。对于非琐碎的情况，这可能导致一些非常混乱的代码，难以理解和调试。我想介绍一种方法，它几乎总是能生成更短、更清晰、更容易扩展的代码。

常规方法

在存储过程中开发动态查询，最常规的方法是声明一个大的 string 变量，并通过添加和连接字面量文本和变量的混合物来逐步构建查询。例如

declare @query varchar(max)
set @query = 'select top '+str(@top)+' * from '+@table
if @sort_key is not null
    set @query = @query + ' order by '+@sort_key
print @query
exec (@query)

这个简单的例子看起来还不错，但即使在这里，字面量文本与 SQL 函数和变量的混合也降低了查询的可理解性。此外，以这种方式构建查询往往会生成一个脚本字符串，而对其中 SQL 的格式几乎不加考虑。在上面的例子中，虽然代码看起来很简单，但最终的 SQL 中没有换行符。随着动态查询变得越来越复杂，缺乏格式化使得调试它们更加恼人。

模板方法

我更喜欢模板方法。我在许多编程环境中都使用过模板，几乎总是享受这种体验。通过这种技术，起点是写下最终的查询在一个模板字符串中，在变量应该出现的地方插入一个标记。通过在模板中用变量值替换标记来生成要执行的查询。例如

declare @query_template varchar(max)
declare @query varchar(max)

set @query_template = '
select top %top *
  from %table
  %sort'

set @sort = case when @sort_key is null 
                 then '' 
                 else 'order by '+@sort_key end

set @query = replace(@query_template, '%top', @top)
set @query = replace(@query,    '%table', @table)
set @query = replace(@query,    '%sort',  @sort)

print @query

您可以立即看到，这是一个从某个表、以某种顺序选择前 N 行的查询。选择的标记名称使得在用参数值替换它们时，很容易弄清楚查询的外观。在这个简单的例子中，额外的代码可能看起来不值得，但在更复杂的例子中，好处会体现在更清晰的代码和更好的关注点分离上。模板化是我的第一个实用技术。

通过查询表构建字符串

我的第二种技术是使用对表的查询作为一种手段，生成一个包含表中每行数据的单个 string。你可能遇到过将其用于生成逗号分隔的 string。这是一个绝妙的技巧（不是我发明的：在 Google 上搜索“SQL Server coalesce comma”可以找到参考），因为它几乎不可能通过研究 SQL 语言来理解。对于不熟悉的人来说，这里有一个快速回顾。

典型问题是获得一个包含表中所有行中的一列数据的单个 string。所以如果你的表是

id    name
1    John
2    James
3    Jason

任务是获得字符串 "John, James, Jason"。有两个问题需要解决；首先是连接姓名；其次是正确地标点它们。这是解决方案

declare @names varchar(max)
select @names = coalesce(@names+', ','') + name 
  from names
 order by id
print @names

在处理每一行时，string 会通过添加逗号和当前姓名而修改。在第一行，变量 @name 是 null，所以 ', ' 连接到 null 也是 null，这导致 coalesce 函数选择空 string。如果你用 select @names = coalesce(@names+', ','The names are: ') + name 替换顶行，你将得到 "The names are: John, James, Jason"。

这最大的优点是它可能使您避免在存储过程代码中使用循环结构。SQL 中的游标语法确实丑陋且冗长。任何可以将其替换为更优雅的替代方案的机会都应该被考虑。

协同工作

本节介绍一个结合了这两种技术的示例。假设您被要求报告数据库中每个表中每个字段的 null 值所占的百分比。这里有一个解决方案。如果您选择 count(pkfield)，其中 pkfield 是主键字段，那么这将计算表中的所有记录。如果您计算任何其他字段，它只计算具有非 null 数据的行。两者的差值就是该字段的 null 值计数，只需稍加计算，就可以用来计算百分比。所以查询的大致思路是

select count(ID), 
       count(ID)-count(name) as name_empty,
       (count(ID)-count(name))/count(ID) as name_empty_pc, 
       count(ID)-count(addr1) as addr1_empty,
       (count(ID)-count(addr1))/count(ID) as addr1_empty_pc
       -- repeat for all other fields
  from Customers

这需要转换为一个模板 string，以便它适用于任何表。第一次尝试是

select count(%key), 
       count(%key)-count(%field) as %field_empty,
       (count(%key)-count(%field))/count(%key) as %field_empty_pc
  from %table

这个模板将无法工作，因为它只适合一个字段。需要的是一个内部的、或字段的模板，应用于每个字段，并将结果连接成一个单一的 string。这个 string 然后会替换外部的、或主模板中的一个标记。所以将原始模板分解成一个字段模板

count(%key)-count(%field) as %field_empty,
(count(%key)-count(%field))/count(%key) as %field_empty_pc

和一个主模板

select count(%key), %fields
  from %table

现在查询将分两个阶段构建。首先，每个字段依次应用到字段模板，并将所有这些连接成一个单一的 string。然后，这个 string 将替换主模板中的 %fields 标记，并且所有其他标记都将被正确的参数值替换。

现在您应该有足够背景来理解下面代码中的内容。我创建了一个函数，它返回一个用于单个表的查询。然后可以使用它来构建一个单一的动态脚本来查询每个表。

create function dbo.ScriptCountNulls(@table varchar(120), @key_field varchar(120))
    returns varchar(max) 
as
begin
    declare @template varchar(max),         -- main template
            @field_template varchar(max),   -- template for each field
            @sql varchar(max),              -- to hold final script with all template 
                                            -- tags replaced
            @fields varchar(max)            -- to hold all field sql with tags replaced
    
    -- the main template is somewhat simple. All the interesting work is with the fields
    -- %fields is a placemarker for the string generated from the field query
    set @template = '
    select count(%key) as [%table_rowcount], %fields
      from %table'
    
    -- the field template generates 2 columns for each field: 
    -- the empty count and the percentage
    set @field_template = '
            count(%key)-count(%field) as %field_empty,
            case when count(%key) > 0 then _
        cast(100.0*(count(%key)-count(%field))/count(%key) as decimal(6,2)) 
                 else 0 end as %field_empty_pc'
    
    -- build the field string replacing the tag in the template with the field name from 
    -- the current row and then appending it to the field string. 
    select @fields = coalesce(@fields+', ','') + 
                     replace(@field_template, '%field', name) 
     from sys.columns 
    where object_id = object_id(@table) 
 order by column_id
    
    -- Assemble the final template
    -- Replace the fields template first as the value itself contains tags
    set @sql = replace(@template, '%fields', @fields) 
    set @sql = replace(@sql, '%table', @table)
    set @sql = replace(@sql, '%key', @key_field)

    return @sql -- this line for debugging 
end

并进行测试

create table t1 (id int primary key, n25 int, n50 int, n75 int, n100 int)
go
insert into t1 (id, n25, n50, n75, n100)
select 1, NULL, NULL, NULL, NULL union
select 2,    1, NULL, NULL, NULL union
select 3,    1,    1, NULL, NULL union
select 4,    1,    1,    1, NULL
go
declare @script varchar(max)
set @script = dbo.ScriptCountNulls('t1', 'id')
exec (@script)go
drop table t1
go

结果是

t1_rowcount id_empty  id_empty_pc n25_empty  n25_empty_pc n50_empty   
----------- --------- ----------- ---------- ------------ ---------   
4           0         .00         1          25.00        2          

n50_empty_pc n75_empty   n75_empty_pc n100_empty  n100_empty_pc
------------ ----------- ------------ ----------- -------------
50.00        3           75.00        4           100.00

为了完成您的任务，您需要将此函数应用于数据库中的所有表。这也可以变得通用。

declare @script varchar(max)

select @script = coalesce(@script, '')+dbo.ScriptCountNulls(o.name, c.name)
  from sys.objects o
  join (select object_id, min(column_id) as keyid 
          from sys.columns 
         where is_nullable = 0 
      group by object_id) sq on sq.object_id = o.object_id
  join sys.columns c on c.object_id = o.object_id and c.column_id = keyid
 where type = 'U'
   
exec (@script)

脚本假定每个表字段列表中遇到的第一个非空字段是主键。如果不是，也没关系，因为任何非空字段都可以提供表的完整行计数。注意这里再次使用了 string 连接技术。运行此脚本会为数据库中的每个表生成一个单独的结果行。表名包含在每个结果行的第一个列名中。

注意事项

始终使用 varchar(max)（或 nvarchar(max)）来保存脚本。在 SQL Server 2005 之前的时代，varchar 的限制是 8000 个字符。当脚本溢出一个 string 时，它会被截断而不会生成错误，直到您尝试 exec 截断的脚本。除非万不得已，否则不值得在代码中放入这样的定时炸弹。Varchar(max) 可以避免这种情况。
如果您尝试用包含 NULL 的参数替换标记值，那么这将使整个脚本 string 变为 NULL。这是一个相当出乎意料的后果，如果您第一次遇到它，将很难诊断。下面是一个简短的示例来说明这一点

declare @script varchar(max), 
        @template varchar(max),
        @data varchar(20)

set @template = 'Data goes here [ %t ]'
print '@data is NULL'
set @script = replace(@template, '%t', @data)
print @script

set @data = 'This is the data'
print '@data is not null'
set @script = replace(@template, '%t', @data)
print @script
print '-------------------------------------'

生成输出

@data is NULL
 
@data is not null
Data goes here [ This is the data ]
-------------------------------------

如果您需要用字符串 "NULL" 替换标记，请使用 ISNULL() 函数。

结论

我希望这个过程能展示每种技术的潜力。单独使用它们可以产生一些优雅的代码，但结合使用我认为它们是无敌的。