65.9K
CodeProject 正在变化。 阅读更多。
Home

MS Chart Control 和 VB.NET 在统计中的应用:绘制 Kaplan-Meier 估计。

starIconstarIconstarIconstarIcon
emptyStarIcon
starIcon

4.90/5 (15投票s)

2012年11月9日

CPOL

4分钟阅读

viewsIcon

62349

我们在 Web 应用程序中使用 VB.NET 和 MS Chart,从用户上传的或在线数据库中提取数据,使用 Kaplan-Meier 公式创建生命表并绘制生存曲线。整个操作通过用户友好的基于 Web 的向导完成。

引言

自从 17 世纪以来,随着精算科学和人口统计学的发展,生存分析开始出现。[1] 生存分析,今天被广泛应用于临床研究,不仅限于对实际生存率的分析,测量死亡率,而且具有更广泛的意义,包括临床试验研究者可能想要研究的任何“事件”。 Kaplan 和 Meier 在 1958 年发表的一篇研究论文,被引用超过 33,000 次[2],提出了他们“著名的”生存曲线估计器,这导致了生存分析的重大进展。

 

Kaplan 和 Meier 提出了经典生命表的“连续时间”版本,后者基于将时间划分为固定间隔[1]。

 

有很多商业上可用的桌面软件,用于计算总体或无事件生存率并绘制 Kaplan-Meier 估计;例如 IBM SPSS ® 和 MedCalc ®。“R”等开源统计软件也可使用,但是,“标准”用户仍然需要接受专门的培训才能有效使用该软件。虽然可以实现 Web 启用的数据收集和分析界面,但要执行它并非易事。

我们在 Web 应用程序中使用 VB.Net 和 MS Chart,从用户上传的或在线数据库中提取数据,使用 Kaplan-Meier 公式创建生命表并绘制生存曲线。整个操作通过用户友好的基于 Web 的向导完成。

向导首先要求用户选择要分析的数据库表。选择后,用户会将“开始日期”、“结束日期”、“事件”和“分层依据”与他上传表中的字段进行匹配。

 

然后将绘制曲线以及生命表。以编程方式生成的图表将通过 MS Chart 的工具提示属性描述指定时间点的生存数据。

  

我们的工具提供了一种“实时”分析生存数据的方法,可以对代码进行操作以符合研究方案的统计考量部分(例如,设置最佳的随访期)。可以导出生成的生命表以进行进一步的研究。 作为基于 Web 的应用程序,不需要客户端设置。 与其他统计软件相比,无需在服务器上安装其他软件。

背景 

S(t) 为给定研究人群中的患者的寿命超过 t [3,4] 的概率。对于来自该人群的大小为 N 的样本,令观察到的 N 个样本成员的死亡时间为

 

在下面的示例中,有十名患者参加了一项研究,“持续时间”表示 [结束日期 - 开始日期],其中开始日期是研究或临床试验的登记日期。 如果发生“死亡”或“事件”,结束日期分别代表死亡日期事件日期。 如果患者仍然存活或未发生指定的“事件”,则结束日期最后随访日期

  

对于每个 ti,对应的是 ni (时间 ti 之前“有风险”的数量),以及 di (时间 ti 发生的死亡或事件的数量)。

 

为了将其应用于前面的示例,让我们设置“事件”= 1 的时间点

第一个时间点 @ 3 个月

ni = 10

di = 1 

第二个时间点 @ 11 个月

ni = 10(初始)– 1(患者在时间点 1 死亡)– 1(患者在持续时间 9 个月时存活)= 8

di = 2(两名患者在持续时间 11 个月时死亡)

此示例中 11 个月的生存率 = 67.5 %

使用代码 

该项目由两个网页(向导绘图)以及放置在 "App_Code" 文件夹中的 VB.Net 代码文件 (SurvivalCurve.vb) 组成。 以下代码以内联方式记录。

向导页面

以下是构建向导所需的 ASP.Net 控件

<asp:Label ID="Label1" runat="server" Text="Label"> </asp:Label> 
<asp:Label ID="Label2" runat="server" Text="Label"> </asp:Label> 
<asp:DropDownList ID="myList" runat="server"> </asp:DropDownList>
<asp:Button ID="Button3" runat="server" Text="Select Table" />
<asp:TextBox ID="choosetbl" runat="server">Table / Query Name</asp:TextBox>
<asp:Button ID="Button2" runat="server" Text="Cancel" />
<asp:Button ID="Button1" runat="server" Text="Next" /> 

 以下是代码隐藏文件

 Imports System.Data
Imports System.Data.OleDb
Partial Class DynamicKaplan
    Inherits System.Web.UI.Page
    Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load
        If Session("ChosenTable") = "" Then InitMyDB()
    End Sub
    Private Sub InitMyDB()
        Dim myDB As New OleDbConnection
        Try
            myDB = New OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + Session("CurrentDB") + ";Persist Security Info=True")
            myDB.Open()
        Catch
            Threading.Thread.Sleep(1)
            Response.Redirect("Default.aspx")
        End Try
        'Retrieve the name of the current database
        Dim tmpString = myDB.DataSource.ToString
        Dim tmpInd = tmpString.LastIndexOf("|")
        tmpString = Right(tmpString, Len(tmpString) - (tmpInd + 1))
        Label1.Text = "The currently used Database is " & tmpString
        Label2.Text = "Please select a table or query from the database to estimate Kaplan Meier curves from: "
        'Retrieve the tables & queries schema
        Dim mytables = myDB.GetOleDbSchemaTable(OleDbSchemaGuid.Tables, New Object() {})
        myDB.Close()
        'Iterate through the schema & fill the names to the dynamically created list
        '  myList.Items.Clear()
        
        For i = 1 To mytables.Rows.Count
            'to clean the list from annoying system tables
            If mytables.Rows(i - 1).Item(3).ToString = "VIEW" Then
                'Item(2 is the third column containing table name from the schema)
                myList.Items.Add(mytables.Rows(i - 1).Item(2).ToString)
            End If
        Next i
    End Sub
    Protected Sub Button1_Click(ByVal sender As Object, ByVal e As System.EventArgs) Handles Button1.Click
        If Session("ChosenTable") = "" Then
            'this is the first time click
            Session("ChosenTable") = choosetbl.Text
            Session("CurrentLoadState") = "Tables"
        End If
        LoadCases()
    End Sub
    Private Sub LoadCases()

        Dim CurrentLoadStateEx = Session("CurrentLoadState")
        Select Case CurrentLoadStateEx
            Case "Tables"
                'The user has pressed the "Next" Button for the first time, and chosen a table. It's time to fill the list with the available columns in the chosen table.
                InitSurvivalStatus()
                Session("CurrentLoadState") = "SurvivalStatus"
            Case "SurvivalStatus"
                Session("SurvivalStatusColumn") = myList.SelectedItem.Text
                Label1.Text = Label1.Text & ", 'EVENT STATUS' Column is '" & Session("SurvivalStatusColumn") & "' "
                Label2.Text = "Please select the field of 'REGISTRATION DATE' (The date of first contact) : "
                myList.Items.Remove(myList.SelectedItem)
                Session("CurrentLoadState") = "RegDate"
            Case "RegDate"
                Session("RegDateColumn") = myList.SelectedItem.Text
                Label1.Text = Label1.Text & "& the 'REGISTRATION DATE' Column is '" & Session("RegDateColumn") & "' "
                Label2.Text = "Please select the field of 'LAST CONTACT DATE' : "
                myList.Items.Remove(myList.SelectedItem)
                Session("CurrentLoadState") = "LastContact"
            Case "LastContact"
                Session("LastContactColumn") = myList.SelectedItem.Text
                Label1.Text = Label1.Text & "& the 'LAST CONTACT DATE' Column is '" & Session("LastContactColumn") & "'"
                Label2.Text = "Please select the field of 'STRATIFICATION' : "
                myList.Items.Remove(myList.SelectedItem)
                myList.Items.Insert(0, "None")
                Session("CurrentLoadState") = "Strata"
            Case "Strata"
                Session("Strata") = myList.SelectedItem.Text
                If Session("Strata") <> "None" Then
                    CheckStrata()
                Else
                    Session("StrataCount") = 1
                End If
                Response.Redirect("KaplanAccess.aspx")
        End Select
    End Sub
    Private Sub InitSurvivalStatus()
        Dim myDB = New OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & Session("CurrentDB") & ";Persist Security Info=True")
        myDB.Open()
        Label1.Text = "The currently chosen table or view is '" & Session("ChosenTable") & "'"
        Label2.Text = "Please select the field of 'SURVIVAL STATUS' where 1 means occurence of the event while 0 means no event : "
        'Retrieve the tables & queries schema
        Dim SqlString = "SELECT * FROM [" & Session("ChosenTable") & "];"
        Dim myDBCommand = New OleDbCommand(SqlString, myDB)
        Dim myReader As IDataReader = myDBCommand.ExecuteReader()
        Dim myColumns = myReader.GetSchemaTable
        myDB.Close()
        myList.Items.Clear()
        For I = 1 To myColumns.Rows.Count
            myList.Items.Add(myColumns.Rows(I - 1).Item(0).ToString)
        Next I
    End Sub
    Private Sub CheckStrata()
        Dim myDB = New OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" + Session("CurrentDB") + ";Persist Security Info=True")
        myDB.Open()
        Dim SqlString = "SELECT DISTINCT [" + Session("Strata") + "] FROM [" + Session("ChosenTable") + "] WHERE ([" + Session("Strata") + "] Is Not Null);"
        Dim myDBCommand = New OleDbCommand(SqlString, myDB)
        Dim myReader As IDataReader = myDBCommand.ExecuteReader()
        Dim StrataCount As Integer
        Dim StrataArray As New ArrayList
        Do While myReader.Read
            StrataArray.Add(myReader.GetValue(0))
            StrataCount = StrataCount + 1
        Loop
        Session("StrataCount") = StrataCount
        Session("StrataArray") = StrataArray
    End Sub
    Protected Sub Button2_Click(ByVal sender As Object, ByVal e As System.EventArgs) Handles Button2.Click
        Response.Redirect("default.aspx")
    End Sub
    Protected Sub Button3_Click(ByVal sender As Object, ByVal e As System.EventArgs) Handles Button3.Click
        choosetbl.Text = myList.SelectedItem.Text
    End Sub
<pre lang="vb.net">End Class  

 

 绘图页面 

在这里,我们添加 MS Chart 如下

<asp:Chart ID="Chart1" runat="server" Height="600px" Width="742px">
<pre lang="aspnet">     <ChartAreas>
            <asp:ChartArea Name="ChartArea1">
             </asp:ChartArea>
     </ChartAreas>
</asp:Chart> 
         

 以及代码隐藏文件

 Imports System.Web.UI.DataVisualization.Charting
Imports Microsoft.VisualBasic
Imports System.Data
Imports System.Data.OleDb
Imports System.Math

Partial Class KaplanAccess
    Inherits System.Web.UI.Page

    Protected Sub Page_Load(ByVal sender As Object, ByVal e As System.EventArgs) Handles Me.Load

        LoadKaplanData()
        'Chart visual settings
        Chart1.ChartAreas(0).AxisY.Maximum = 1
        Chart1.ChartAreas(0).AxisY.Interval = 0.1
        Chart1.ChartAreas(0).AxisX.MajorGrid.Enabled = False
        Chart1.ChartAreas(0).AxisY.MajorGrid.Enabled = False
        Chart1.ChartAreas(0).AxisX.Title = "Duration of Survival in months"
        Chart1.ChartAreas(0).AxisY.Title = "Cumulative Survival Propabaility"
        Dim myTitle As New Title
       
        myTitle.Text = "Kaplan Meier Survival Curve(s) for total of " & Session("TotalCasesCount").ToString & " cases"
        Chart1.Titles.Add(myTitle)

    End Sub
    Private Sub LoadKaplanData()
        'Get all settings from the session
        Dim CurrentDB = Session("CurrentDB")
        Dim ChosenTable = Session("ChosenTable")
        Dim SurvivalStatusColumn = Session("SurvivalStatusColumn")
        Dim RegDateColumn = Session("RegDateColumn")
        Dim LastContactColumn = Session("LastContactColumn")
        Dim StrataColumn = Session("Strata")
        Dim StrataCount = CInt(Session("StrataCount"))
        Dim StrataArray = CType(Session("StrataArray"), ArrayList)
        'Open the Database
        Dim myDB = New OleDbConnection("Provider=Microsoft.Jet.OLEDB.4.0;Data Source=" & CurrentDB & ";Persist Security Info=True")
        myDB.Open()
        'Base of the SQL String, Note that you can manipulate the DateDiff to add the optimum followup period recommended by the study protocol statistical consideration
        Dim SQLString = "SELECT [" & RegDateColumn & "], [" & LastContactColumn & "], [" & SurvivalStatusColumn & "] , DateDiff('m',[" & RegDateColumn & "],[" & LastContactColumn & "]) AS survivalduration" _
                        & " FROM([" & ChosenTable _
                        & "]) WHERE ((([" & RegDateColumn & "]) Is Not Null) AND (([" & LastContactColumn & "]) Is Not Null) AND (([" & SurvivalStatusColumn & "]) Is Not Null)) and ([" & RegDateColumn & "] < [" & LastContactColumn & "])"
        'Do this as many times as there is strata
        For I = 1 To StrataCount
            Dim FinalSQLString, SQLString2, Stratum As String
            SQLString2 = SQLString
            Stratum = "All cases"
            'Append strata selection criteria if there is any strata
            If StrataCount > 1 Then
                If IsNumeric(StrataArray.Item(I - 1)) Then
                    SQLString2 = SQLString & "and ([" & StrataColumn & "]=" & StrataArray.Item(I - 1).ToString & ")"
                Else
                    SQLString2 = SQLString & "and ([" & StrataColumn & "]='" & StrataArray.Item(I - 1).ToString & "')"
                End If
                Stratum = StrataArray.Item(I - 1).ToString
            End If
            'Formulate the final SQL String
            FinalSQLString = SQLString2 & "ORDER BY DateDiff('m',[" & RegDateColumn & "],[" & LastContactColumn & "]), " & SurvivalStatusColumn & " DESC;"
            Dim myDBCommand = New OleDbCommand(FinalSQLString, myDB)
            'Generate a temporary reader to get the number of cases
            Dim myReader As IDataReader = myDBCommand.ExecuteReader()
            Dim CasesCount As Integer = 0
            Do While myReader.Read
                CasesCount = CasesCount + 1
            Loop
            myReader.Close()
            'Generate the main reader
            myReader = myDBCommand.ExecuteReader()
            'Call the function which performs calculations
            Session("TotalCasesCount") = CInt(Session("TotalCasesCount")) + Kaplan.KaplanTables(Me, myReader, CasesCount, Stratum, Chart1)
            myReader.Close()
        Next I
        myDB.Close()
    End Sub
End Class  

 

 SurvivalCurve.vb

Imports Microsoft.VisualBasic
Imports System.Web.UI.DataVisualization.Charting
Imports System.Data
Imports System.Data.OleDb
Imports System.Math
Public Class Kaplan
    Public Shared Function KaplanTables(ByRef myPage As Page, ByRef myReader As IDataReader, ByVal CasesCount As Integer, ByVal Stratum As String, ByRef myChart As Chart) As Int32
        'This function will take two variables, an OLE Database Connection with a table named "Query1", _
        'This query should contain the Survival status as the third column, and the survival duration as the fourth colum,
        'the other variable passed to this function should be a gridview to produce the lifetables in.
        'The function will bind the generated lifetables to the gridview & return the total number of processed cases as Int32


        'Start generating the life tables
        Dim myTable As New DataTable
        myTable.Columns.Add("Survial Status")
        myTable.Columns.Add("Remaining Patients")
        myTable.Columns.Add("Survial Duration")
        myTable.Columns.Add("Survial Propability")
        myTable.Columns.Add("Cumulative Survial Propability")
        Dim myFirstRow As DataRow = myTable.NewRow
        myFirstRow.Item(0) = 1
        myFirstRow.Item(1) = CasesCount
        myFirstRow.Item(2) = 0
        myFirstRow.Item(3) = 1
        myFirstRow.Item(4) = 1
        myTable.Rows.Add(myFirstRow)
        Dim Ptnseq = CasesCount
        For I = 1 To CasesCount
            Dim myRow As DataRow = myTable.NewRow
            'Get only one record  from KaplanTable
            Dim Kaplantmp = myReader.Read
            Ptnseq = Ptnseq - 1
            myRow.Item(0) = myReader.GetValue(2)
            myRow.Item(1) = Ptnseq 'Sets the total number of remaining patients
            myRow.Item(2) = myReader.GetValue(3)
            If myRow.Item(0) = 0 Then
                myRow.Item(3) = myTable.Rows(I - 1).Item(3)
                myRow.Item(4) = myTable.Rows(I - 1).Item(4)
            ElseIf myRow.Item(0) = 1 Then
                myRow.Item(3) = myRow.Item(1) / myTable.Rows(I - 1).Item(1)
                myRow.Item(4) = myRow.Item(3) * myTable.Rows(I - 1).Item(4)
            End If
            myTable.Rows.Add(myRow)
        Next I
        'Finished generating the lifetables, bind it to a grid
        Dim myGrid As New GridView 'Create a new dynamic Grid
        Dim myLabel As New Label 'Create a new dynamic label for this grid
        myPage.Form.Controls.Add(myLabel) 'add the label, then
        myPage.Form.Controls.Add(myGrid) 'add the grid
        myGrid.DataSource = myTable 'Bind the grid to the calculated lifetables
        myGrid.DataBind()
        DrawKaplanCurve(myTable, myChart, Stratum)
        myLabel.Text = "Current Stratum is: " & Stratum & "<br/>" & "Total Number of cases is: " & (myTable.Rows.Count - 1).ToString & " Cases" '(if  strata)
        Return myTable.Rows.Count - 1
    End Function
    Public Shared Sub DrawKaplanCurve(ByVal myTable As DataTable, ByVal myChart As Chart, ByVal Stratum As String)
        Dim KaplanSeries As New Series
        KaplanSeries.ChartType = SeriesChartType.StepLine
        KaplanSeries.Name = Stratum
        Dim CensoredSeries As New Series
        CensoredSeries.ChartType = SeriesChartType.Stock
        CensoredSeries.Name = "Censored " & Stratum
        For I = 1 To myTable.Rows.Count - 1
            Dim myPoint As New DataPoint
            Dim xval As Double = myTable.Rows(I).Item(2)
            Dim yval As Double = myTable.Rows(I).Item(4)
            myPoint.SetValueXY(xval, yval)

            ' If alive case, then add to censored data
            If myTable.Rows(I).Item(0) = 0 Then
                Dim CensoredPoint As New DataPoint
                CensoredPoint.SetValueXY(myPoint.XValue, yval - 0.01, yval + 0.01)
                CensoredPoint.ToolTip = "Censored Case Number " & myTable.Rows(I).Item(1).ToString & vbNewLine & "Survival Duration = " & myTable.Rows(I).Item(2).ToString & " months" & vbNewLine & "Cumulative Survival Propability = " & Round(yval * 100, 2).ToString & "%"
                CensoredPoint.Color = myPoint.Color
                If I <> myTable.Rows.Count - 1 Then CensoredSeries.Points.Add(CensoredPoint) 'add all except the last point because it shouldn't be censored
            End If
            
            If I = myTable.Rows.Count - 1 Then myPoint.Label = Round(yval * 100, 2).ToString & "%"
            KaplanSeries.Points.Add(myPoint)
        Next
        myChart.Series.Add(KaplanSeries)
        myChart.Series.Add(CensoredSeries)
       
        Dim myLegend As New Legend
        myLegend.TitleForeColor = myChart.Series(myChart.Series.Count - 1).Color
        myChart.Legends.Add(myLegend)
    End Sub
End Class 

参考文献 

1. martingales 在生存分析中的应用历史。 Odd O.AALEN, PerKragh ANDERSEN , Ørnulf BORGAN, Richard D. GILL 和 Niels KEIDING。 2009 年 6 月,概率和统计历史电子杂志,第 5 卷。

2. 来自不完整观测的非参数估计。 Kaplan, E. L. 和 Meier。 1958 年,美国统计协会杂志,pp. P.53, 457–481 和 562–563。

3. 社区,维基。 Kaplan–Meier 估计器。 wikipedia.org。 [在线] 2012。

4. Dunn, Steve。 生存曲线:应计和 Kaplan-Meier 估计。 cancerguide.org。 [在线] 2002。 http://www.cancerguide.org/scurve_km.html.

© . All rights reserved.