使用 Schema Views 根据预定义的质量规则验证数据库设计问题
本文有助于在生产环境中,使用 SQL Server 模式视图识别数据库设计问题,例如缺少某些字段、可为空字段、未找到主键等问题。
引言
为了避免由于客户端环境中的糟糕数据库设计而导致的性能问题,在调查性能问题时,我们应该对高端产品的基线数据库进行审查。为了避免混淆,基线数据库是为每个客户实现量身定制的产品数据库。为了给您一个相当直观的理解,如果我们的一款产品数据库缺少关键的性能指标,那么这个指标将会在以后的每个新实现中都丢失。这些审查流程确保了实施过程中的质量标准得到了满足。
这个技巧可以作为一个基线脚本,执行后可以突出显示违反既定标准的项。因此,规则很重要。让我们看看如何使用 SQL Server/Oracle 元视图创建规则。
规则
规则只是在设计数据库时必须遵循的已定义标准。例如,大多数应用程序都支持审计报告,这些报告要求每个表都包含某些审计字段,如 createdOn
、CreatedBy
、UpdatedOn
、UpdatedBy
。此处突出显示的字段仅用于理解目的,用户当然可以使用自己的字段来定义强制性规则。
为了便于解释,我们将讨论以下规则。这些想法可以扩展到任何类型的规则。
- 所有表都必须存在审计字段(
CREATED_ON
、CREATED_BY
、UPDATED_ON
、UPDATED_BY
) - 审计字段不能标记为 Nullable
Created_ON
/UPDATED_ON
应具有默认日期约束CREATED_BY
/UPDATED_BY
字段必须是大小为30
的nvarchar
类型- 相同列名和数据类型和大小应匹配
- 所有表都必须有主键约束
- 在相同列名上识别到数据类型不匹配
创建测试
SQL 可以用作验证的规则引擎,因为 SQL Server 的 sys 视图包含有关 SQL Server 所有内容的丰富信息。因此,我们使用相同的 sys 视图编写了测试用例来验证各个规则。让我们逐一来看每个规则。
所有表都必须存在审计字段
查找不包含特定列的表很简单,我们只需要查看 sys.tables
/columns 视图并应用适当的过滤器即可获得所需结果。以下方法不是理想的方法,但可以完成工作!
select distinct 'Audit Field Not Found' as reason, t.name, 'UPDATED_BY' as fieldname
from sys.tables t
left outer join
(
select distinct t.name
from sys.tables t
inner join sys.columns c on c.object_id = t.object_id
where c.name = 'UPDATED_BY'
) tmp on t.name = tmp.name
where tmp.name is null
union all
select 'Audit Field Not Found' as reason, t.name, 'CREATED_BY' as fieldname
from sys.tables t
left outer join
(
select distinct t.name
from sys.tables t
inner join sys.columns c on c.object_id = t.object_id
where c.name = 'CREATED_BY'
) tmp on t.name = tmp.name
where tmp.name is null
union all
select 'Audit Field Not Found' as reason, t.name, 'CREATED_ON' as fieldname
from sys.tables t
left outer join
(
select distinct t.name
from sys.tables t
inner join sys.columns c on c.object_id = t.object_id
where c.name = 'CREATED_ON'
) tmp on t.name = tmp.name
where tmp.name is null
union all
select 'Audit Field Not Found' as reason, t.name, 'UPDATED_ON' as fieldname
from sys.tables t
left outer join
(
select distinct t.name
from sys.tables t
inner join sys.columns c on c.object_id = t.object_id
where c.name = 'UPDATED_ON'
) tmp on t.name = tmp.name
where tmp.name is null
select distinct 'Audit Field Not Found' as reason, t.table_name, 'updated_by' as fieldname
from user_tables t
left outer join user_tab_columns c on t.table_name = c.table_name and c.column_name in _
('UPDATED_BY')
where c.column_name is null
union all
select distinct 'Audit Field Not Found' as reason, t.table_name, 'created_by' as fieldname
from user_tables t
left outer join user_tab_columns c on t.table_name = c.table_name and c.column_name in _
('CREATED_BY')
where c.column_name is null
union all
select distinct 'Audit Field Not Found' as reason, t.table_name, 'created_on' as fieldname
from user_tables t
left outer join user_tab_columns c on t.table_name = c.table_name and c.column_name in _
('CREATED_ON')
where c.column_name is null
union all
select distinct 'Audit Field Not Found' as reason, t.table_name, 'updated_on' as fieldname
from user_tables t
left outer join user_tab_columns c on t.table_name = c.table_name and c.column_name in _
('UPDATED_ON')
where c.column_name is null
所有审计字段都不能标记为 Nullable
审计字段在本质上是强制性的,因此不能为 Nullable
。以下查询识别出所有违反此规则的表。
SELECT distinct 'Nullable field found' as Reason, T.NAME, c.name, c.is_nullable
FROM sys.tables t
INNER JOIN sys.columns c ON c.object_id = t.object_id
INNER JOIN sys.types ty on c.system_type_id = ty.system_type_id
WHERE c.name in ('CREATED_ON', 'UPDATED_ON', 'CREATED_BY', 'UPDATED_BY')
and c.is_nullable = 1
order by 1
SELECT distinct 'Nullable field found' as Reason, T.TABLE_NAME, c.COLUMN_NAME, c.NULLABLE
FROM USER_TABLES T
INNER JOIN USER_TAB_COLUMNS C ON T.TABLE_NAME = C.TABLE_NAME AND C.NULLABLE = 'Y'
WHERE C.COLUMN_NAME IN ('CREATED_ON', 'UPDATED_ON', 'CREATED_BY', 'UPDATED_BY')
ORDER BY T.TABLE_NAME
CREATED/UPDATED ON 字段应具有 getdate() 作为默认约束<o:p>
由于 CREATED
/UPDATED ON
字段是强制性字段,提供默认值的理想方法是具有默认的 getdate()
约束。我们可以使用以下查询识别出所有在 Created
/Updated On
字段上没有默认约束的表。
SELECT 'Default Constraint Not Found' as Reason, T.NAME, c.name, c.is_nullable
FROM sys.tables t
INNER JOIN sys.columns c ON c.object_id = t.object_id
INNER JOIN sys.types ty on c.system_type_id = ty.system_type_id
WHERE c.name in ('CREATED_ON', 'UPDATED_ON')
and c.default_object_id = 0
order by 1
SELECT 'Default Constraint Not Found' AS REASON, T.TABLE_NAME, C.COLUMN_NAME
FROM USER_TABLES T
LEFT OUTER JOIN user_tab_columns c ON T.TABLE_NAME = C.TABLE_NAME AND C.DATA_DEFAULT is not null
WHERE C.COLUMN_NAME IN ('CREATED_ON', 'UPDATED_ON')
AND C.TABLE_NAME IS NULL
ORDER BY T.TABLE_NAME
CREATED/UPDATED BY 字段必须是 NVARCHAR 类型
在任何时候,当看到生产数据库时,您会惊讶地发现审计字段具有不同的数据类型,例如在极少数情况下,您会发现 created
/updated by
字段的类型为 char
、nchar
、varchar
等。下面的查询将识别出所有这些罕见情况。
select distinct 'DataType difference Found' as reason, t.name, c.name, ty.name
from sys.tables t
inner join sys.columns c on c.object_id = t.object_id
inner join sys.types ty on c.system_type_id = ty.user_type_id
where c.name in ('CREATED_BY', 'UPDATED_BY')
and ty.name != 'nvarchar'
order by 1
select distinct 'DataType difference Found' as reason, t.table_name, c.column_name, c.data_type
from user_tables t
left outer join user_tab_columns c on t.table_name = c.table_name
where c.column_name in ('CREATED_BY', 'UPDATED_BY')
and c.data_type != 'VARCHAR2'
order by t.table_name;
CREATED/UPDATED BY 字段长度应为 30 个字符<o:p>
类似地,有时 Created
/Updated by
字段的大小会有所不同,有些是长度 50,有些是 70,在极少数情况下是 10。我们使用以下查询来识别所有大小不同的实例。
select distinct 'DataType size difference Found' as reason, t.name, c.name, ty.name, c.max_length
from sys.tables t
inner join sys.columns c on c.object_id = t.object_id
inner join sys.types ty on c.system_type_id = ty.user_type_id
where c.name in ('CREATED_BY', 'UPDATED_BY')
and ty.name = 'nvarchar'
and c.max_length != 60
order by 1
select distinct 'DataType size difference Found' as reason, t.table_name, _
c.column_name, c.data_type, c.data_length
from user_tables t
left outer join user_tab_columns c on t.table_name = c.table_name
where c.column_name in ('CREATED_BY', 'UPDATED_BY')
and c.data_type = 'VARCHAR2'
and c.data_length != 250
order by t.table_name;
未找到主键
经验表明,有时开发人员也会错过主键的创建。识别缺少主键的表有点棘手,但可以使用以下 SQL 来识别这些表。
select 'Primary Key Not Found' as reason, t.name
from sys.tables t
left outer join
(
select t.name
from sys.indexes i
inner join sys.index_columns ic on i.object_id = ic.object_id and i.index_id = ic.index_id
inner join sys.columns c on ic.object_id = c.object_id and c.column_id = ic.column_id
inner join sys.tables t on c.object_id = t.object_id
where i.is_primary_key = 1
) t2 on t.name = t2.name
where t2.name is null
order by 2
select 'Primary Key Not Found' as reason, t.table_name
from user_tables t
inner join user_constraints c on t.table_name = c.table_name and c.constraint_type = 'P'
where c.table_name is null
order by t.table_name
找到标识主键
有些产品不鼓励使用基于标识的主键。以下查询可用于识别基于标识的主键。
select 'Identity Primary Key Found' as reason,t.name,
c.name as column_name,
c.is_identity
from sys.indexes i
inner join sys.index_columns ic on i.object_id = ic.object_id and i.index_id = ic.index_id
inner join sys.columns c on ic.object_id = c.object_id and c.column_id = ic.column_id
inner join sys.tables t on c.object_id = t.object_id
where c.is_identity = 1
order by 2
在相同列名上识别到数据类型不匹配
这是最常见的设计问题,即开发人员忘记在子表上强制执行外键约束。识别这些列很棘手;为了简化这个问题,我们假设两个或多个表中的相同列名应该链接在一起。下面的查询使用上述假设来突出显示/建议所有应该具有外键约束的此类列。
declare @tbl table
(
pk_column_name varchar(100),
pk_table_name varchar(100),
pk_user_type_id varchar(100),
pk_type_length int,
fk_column_name varchar(100),
fk_table_name varchar(100),
fk_user_type_id varchar(100),
fk_type_length int
)
insert into @tbl (pk_column_name, pk_table_name, pk_user_type_id, pk_type_length, _
fk_column_name, fk_table_name, fk_user_type_id, fk_type_length)
select tk.pk_column_name, tk.table_name, tk.user_type_id, _
tk.max_length, c1.name, t.name, c1.user_type_id, c1.max_length
from sys.columns c1
inner join
(
select c.name as pk_column_name,
t.name as table_name,
c.user_type_id,
c.max_length
from sys.indexes i
inner join sys.index_columns ic on i.object_id = ic.object_id and i.index_id = ic.index_id
inner join sys.columns c on ic.object_id = c.object_id and c.column_id = ic.column_id
inner join sys.tables t on c.object_id = t.object_id
where i.is_primary_key = 1
and c.name not in ('CREATED_ON', _
'UPDATED_ON', 'CREATED_BY', 'UPDATED_BY')
and t.name not like 'QRTZ_%'
) tk on c1.name = tk.pk_column_name
inner join sys.tables t on c1.object_id = t.object_id
where t.name != tk.table_name
delete from @tbl
where fk_column_name in
(select fk_column_name
from @tbl
group by fk_column_name
having count(*) = 1
)
select 'Data Type mismatch Identified over SAME COLUMN name',
t.pk_column_name,
t.pk_table_name,
ty1.name,
t.pk_type_length,
t.fk_column_name,
t.fk_table_name,
ty2.name,
t.fk_type_length
from @tbl t
inner join sys.types ty1 on t.pk_user_type_id = ty1.user_type_id
inner join sys.types ty2 on t.fk_user_type_id = ty2.user_type_id
where t.pk_user_type_id != fk_user_type_id
order by 2
select 'Data Type mismatch Identified over SAME COLUMN name' as reason, _
p.pk_column_name, p.table_name, tc.column_name as fk_column_name, _
tc.table_name as fk_table_name, p.data_type, p.data_length
from user_tab_columns tc
inner join
(
select distinct ucc.column_name as pk_column_name,
t.table_name as table_name,
c.data_type,
c.data_length
from user_tables t
inner join user_tab_columns c on t.table_name = c.table_name
inner join user_constraints uc on t.table_name = uc.table_name
inner join user_cons_columns ucc on uc.constraint_name = ucc.constraint_name
where uc.constraint_type = 'P'
and ucc.column_name not in ('CREATED_ON', 'UPDATED_ON', 'CREATED_BY', 'UPDATED_BY')
and ucc.column_name not like 'QRTZ_%'
) p on tc.column_name = p.pk_column_name
where tc.table_name != p.table_name;
关注点
这些测试用例不仅有助于我们标准化数据库设计,在某些情况下还有助于优化我们产品的性能。根据业务需求,添加新测试用例也非常容易。
历史
- 版本 1.0:第一个版本
- 版本 1.1:增加了 Oracle 支持