CQRS on Windows Azure - The Command Handler

Duncan Edwards Jones

2 February 2014

CPOL

Following on from "CQRS on Windows Azure - The Command side", this article shows how commands are processed by the command handler.

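Introduction

At the end of the article CQRS on Windows Azure - The Command side, we had completed the architecture and code for defining a specific command and passing it along, using a queue to hold the command and a table to hold its parameters. In this article I look at the other end of the command processing flow: receiving and handling the command.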

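Background

At its simplest, the flow of the command handler is:

Take the next command off the queue
Find the handler that applies to that type of command
Read the parameters of the command
Pass those parameters to the command handler
Wait for the command to complete

Then take the next command off the queue.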
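As a minimal sketch of that simple flow, the class below dispatches commands from an in-memory queue to handlers registered by command type. The names QueuedCommand, NaiveCommandDispatcher, RegisterHandler and ProcessAll are hypothetical and chosen for illustration only, and System.Collections.Generic.Queue stands in for the Azure queue.

```vbnet
Imports System
Imports System.Collections.Generic

' Hypothetical types for illustration only; none of these names come from the article
Public Class QueuedCommand
    Public Property CommandType As String
    Public Property Parameters As Dictionary(Of String, String)
End Class

Public Class NaiveCommandDispatcher

    ' Maps a command type name to the routine that handles it
    Private ReadOnly m_handlers As New Dictionary(Of String, Action(Of Dictionary(Of String, String)))

    Public Sub RegisterHandler(ByVal commandType As String,
                               ByVal handler As Action(Of Dictionary(Of String, String)))
        m_handlers(commandType) = handler
    End Sub

    ' Takes commands off the queue one at a time until it is empty and
    ' returns the number of commands that were handled
    Public Function ProcessAll(ByVal pending As Queue(Of QueuedCommand)) As Integer
        Dim handled As Integer = 0
        While pending.Count > 0
            Dim nextCommand As QueuedCommand = pending.Dequeue()
            Dim handler As Action(Of Dictionary(Of String, String)) = Nothing
            If m_handlers.TryGetValue(nextCommand.CommandType, handler) Then
                ' Invoke is synchronous, so this waits for the command to complete
                handler.Invoke(nextCommand.Parameters)
                handled += 1
            End If
            ' Assumption: a command with no registered handler is simply skipped
        End While
        Return handled
    End Function

End Class
```

This version has no batching, no delay between reads and no retry handling, which are the gaps the rest of the article addresses.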

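If processing were unlimited and free, commands always completed, and hardware were 100% reliable, that would be all that is needed. Reality is otherwise, so some extra thought and code is required.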

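Processing is not unlimited, and not free

On Windows Azure there are limits that affect how you set up your application. The first is that, at the time of writing, queue storage is limited to 500 reads per second, and if your application is judged to be too much of a burden it may be throttled more severely.

You therefore cannot have the queue-reading code sit in a tight loop; it is better to set your own minimum interval between reads.

In addition, processing is not free. On Windows Azure there is a small charge for every access to the queue, even when there are no messages on it.

You should therefore read commands in batches rather than one at a time, and dynamically increase the interval between reads when several consecutive polling cycles find no commands to process.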

Public Interface IPullModeCommandProcessor

    ''' <summary>
    ''' The maximum number of commands to pull in one iteration
    ''' </summary>
    ''' <value>
    ''' Greater than or equal to 1 - the higher the number the larger the set of 
    ''' commands the processor will take in each cycle
    ''' </value>
    ''' <remarks>
    ''' This is used for tuning the processor to reduce the wait time for new commands while
    ''' existing commands execute while also reducing the cost of "nothing to do" cycles
    ''' </remarks>
    Property BatchSize As Integer

    ''' <summary>
    ''' The amount of time to wait between polling the command list
    ''' </summary>
    ''' <remarks>
    ''' Along with batch size, this allows you to reactively tune the command processor
    ''' to the workload it is experiencing
    ''' </remarks>
    Property InterPollingDelay As TimeSpan

End Interface
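One way the polling delay might be tuned is sketched below. It assumes a simple doubling strategy: every empty polling cycle doubles the delay up to a maximum, and any cycle that finds work resets it to the minimum. PollingDelayTuner and its members are hypothetical names, not part of the interface above, and the minimum delay is assumed to be greater than zero.

```vbnet
Imports System

' A hypothetical tuner; the doubling strategy and every name here are assumptions
Public Class PollingDelayTuner

    Private ReadOnly m_minimumDelay As TimeSpan
    Private ReadOnly m_maximumDelay As TimeSpan
    Private m_currentDelay As TimeSpan

    Public Sub New(ByVal minimumDelay As TimeSpan, ByVal maximumDelay As TimeSpan)
        m_minimumDelay = minimumDelay
        m_maximumDelay = maximumDelay
        m_currentDelay = minimumDelay
    End Sub

    ' The delay to apply before the next read of the queue
    Public ReadOnly Property CurrentDelay As TimeSpan
        Get
            Return m_currentDelay
        End Get
    End Property

    ' Call after each polling cycle with the number of commands that were read
    Public Sub RecordPoll(ByVal commandsRead As Integer)
        If commandsRead > 0 Then
            ' Work arrived, so return to the fastest permitted polling rate
            m_currentDelay = m_minimumDelay
        Else
            ' Nothing to do, so double the wait up to the configured maximum
            Dim doubledTicks As Long = Math.Min(m_currentDelay.Ticks * 2, m_maximumDelay.Ticks)
            m_currentDelay = TimeSpan.FromTicks(doubledTicks)
        End If
    End Sub

End Class
```

A processor could copy CurrentDelay into its InterPollingDelay property after each cycle. Backing off only after several consecutive empty cycles, as the text suggests, would need a small counter in addition to this.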

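Commands do not always complete

There are two categories of reason why a command might not complete: transient problems, such as the hardware being busy, and permanent problems, such as a command that is impossible to carry out. In most business scenarios it is not sensible to try to catalogue and classify every possible cause of failure, so a compromise is adopted: retry a few times and, if the command still fails, set it aside as impossible to complete. I prefer to set this retry limit as part of each command definition, because some types of command are more prone to repeatable transient errors than others.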

    ''' <summary>
    ''' If this command is retried this many times or more, consider it a poison message
    ''' </summary>
    ''' <remarks>
    ''' This may be overridden by the command processor, or a specific command definition
    ''' </remarks>
    ReadOnly Property RecomendedPoisonMessageCeiling As Integer

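Whenever a command is read from the queue, its DequeueCount property is checked. If it is greater than or equal to the poison message ceiling, the command is added to a "poison message" table and deleted from the queue, which prevents any further attempts.

If the message's DequeueCount is below the ceiling, the message is not deleted from the queue until the command has completed successfully. This means that if the command fails to complete for any reason, the message becomes visible again once its visibility timeout expires and the command is presented for a retry.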
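The check described above can be sketched as a small decision function. It deliberately takes plain integers rather than a storage client message type, so it makes no assumption about any particular SDK version; DequeuedCommandAction, PoisonMessagePolicy and Decide are hypothetical names.

```vbnet
Imports System

' Hypothetical names for illustration; only the comparison itself comes from the text
Public Enum DequeuedCommandAction
    ' Hand the command to its handler and delete it from the queue only on success
    AttemptProcessing
    ' Record the command in the poison message table and delete it from the queue
    MoveToPoisonTable
End Enum

Public Module PoisonMessagePolicy

    ' dequeueCount is the value the queue reports for the message;
    ' poisonMessageCeiling is the retry limit that applies to this command type
    Public Function Decide(ByVal dequeueCount As Integer,
                           ByVal poisonMessageCeiling As Integer) As DequeuedCommandAction
        If dequeueCount >= poisonMessageCeiling Then
            Return DequeuedCommandAction.MoveToPoisonTable
        End If
        Return DequeuedCommandAction.AttemptProcessing
    End Function

End Module
```

With a ceiling of 3, for example, a message is attempted on its first and second reads and moved to the poison message table on its third.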

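However, some commands can fail in a partially completed state. For example, if a command converts a hundred files from one format to another and fails on the 30th file, then a choice has to be made when it comes round again: roll back the changes made by the failed run, or carry on from the point the failed run reached. In either case we need to track the progress of the command by means of a "step completed" event.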

...
''' <summary>
''' Event arguments for an event raised when a multi-step command has been 
''' partially processed
''' </summary>
''' <remarks>
''' This allows any re-run of the command to pick up where the failed run left off,
''' or allows the prior command execution's effects to be undone before the new
''' command executes
''' </remarks>
Public NotInheritable Class CommandStepCompletedEventArgs
    Inherits EventArgs

    ReadOnly m_instanceIdentifier As Guid
    ReadOnly m_stepNumber As Integer


    ''' <summary>
    ''' The unique identifier of the command instance for which the step completed
    ''' </summary>
    Public ReadOnly Property CommandInstanceIdentifier As Guid
        Get
            Return m_instanceIdentifier
        End Get
    End Property

    ''' <summary>
    ''' The step that was completed
    ''' </summary>
    ''' <remarks>
    ''' This only has to have meaning to the command handler itself
    ''' </remarks>
    Public ReadOnly Property StepNumber As Integer
        Get
            Return m_stepNumber
        End Get
    End Property

    Public Sub New(ByVal commandInstanceIdentifierIn As Guid, ByVal stepCompletedIn As Integer)
        m_instanceIdentifier = commandInstanceIdentifierIn
        m_stepNumber = stepCompletedIn
    End Sub

End Class
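To illustrate how a re-run might pick up where a failed run left off, here is a minimal sketch of a tracker fed by the values carried in CommandStepCompletedEventArgs. CommandProgressTracker and its members are hypothetical, steps are assumed to be numbered from 1, and the state is held in memory only, whereas a real processor would need to store it durably.

```vbnet
Imports System
Imports System.Collections.Generic

' Hypothetical in-memory tracker; a real processor would need durable storage
Public Class CommandProgressTracker

    ' Highest step number recorded as completed for each command instance
    Private ReadOnly m_lastCompletedStep As New Dictionary(Of Guid, Integer)

    ' Call this from the handler for the "step completed" event
    Public Sub RecordStepCompleted(ByVal commandInstanceIdentifier As Guid,
                                   ByVal stepNumber As Integer)
        Dim previousStep As Integer = 0
        m_lastCompletedStep.TryGetValue(commandInstanceIdentifier, previousStep)
        m_lastCompletedStep(commandInstanceIdentifier) = Math.Max(previousStep, stepNumber)
    End Sub

    ' The step a re-run should start from, assuming steps are numbered from 1
    Public Function NextStepToRun(ByVal commandInstanceIdentifier As Guid) As Integer
        Dim lastStep As Integer = 0
        If m_lastCompletedStep.TryGetValue(commandInstanceIdentifier, lastStep) Then
            Return lastStep + 1
        End If
        Return 1
    End Function

End Class
```

In the file conversion example, if steps 1 to 29 were recorded before the failure, NextStepToRun returns 30 for that command instance. The same record could instead be used to decide which steps to undo before starting again.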

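Hardware is not 100% reliable

The command handler is implemented as a worker role, and that worker role may fail at any time. Although Azure will start a new instance when it detects that a worker role has failed, this takes a short while and processing is affected in the meantime.

The solution is to scale the worker role out horizontally (elastically), running at least one more worker role instance than is strictly required, so that if one node fails the spare can take over its work.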

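Status notification

A common question is: how do I give feedback to the user as to whether the command they issued has completed? My suggestion is to use the unique identifier of the command and provide a query that returns the status of that command. This query can then be polled, either explicitly by the client or by a server-side task, and any change of status sent out accordingly.

This preserves the purity of the separation between commands and queries, which allows them to be scaled independently.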

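Implementing commands as event streams

An interesting way of handling the command side of a CQRS system is to implement the life cycle of each command as an event stream, and to run a projection over that event stream to get the current state of the command.

For example, you might have command events for "command created", "command execution started", "fatal error", "command completed" and so on. Each event carries properties describing what happened to the command, and by "playing" the command's event stream you can derive its current state.

This also allows the command handler to maintain a set of "pending" commands by playing their event streams, so that you do not have to explicitly reissue a command when it is re-queued after a transient failure.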
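A projection of this kind can be sketched as a simple replay over the events. The event kinds below mirror the examples in the text, but CommandEventKind, CommandStatus and CommandStatusProjection are hypothetical names, and real events would carry properties (timestamps, error details and so on) that are left out here.

```vbnet
Imports System
Imports System.Collections.Generic

' Hypothetical event and status names, modelled on the examples in the text
Public Enum CommandEventKind
    Created
    ExecutionStarted
    FatalError
    Completed
End Enum

Public Enum CommandStatus
    Unknown
    Pending
    Running
    Failed
    Completed
End Enum

Public Module CommandStatusProjection

    ' Replays the events in the order they were recorded; the last event
    ' applied determines the status that is returned
    Public Function Project(ByVal eventStream As IEnumerable(Of CommandEventKind)) As CommandStatus
        Dim status As CommandStatus = CommandStatus.Unknown
        For Each commandEvent As CommandEventKind In eventStream
            Select Case commandEvent
                Case CommandEventKind.Created
                    status = CommandStatus.Pending
                Case CommandEventKind.ExecutionStarted
                    status = CommandStatus.Running
                Case CommandEventKind.FatalError
                    status = CommandStatus.Failed
                Case CommandEventKind.Completed
                    status = CommandStatus.Completed
            End Select
        Next
        Return status
    End Function

End Module
```

The same projection could back the status query described under Status notification, and commands whose projected status is Pending or Running would make up the "pending" set.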

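References

Microsoft's documentation, in particular How to use Queues, was the main reference for this article.

History

2014-02-02 Initial version
2016-04-15 Added status notification
2017-01-08 Added thoughts on implementing commands as event streams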