65.9K
CodeProject 正在变化。 阅读更多。
Home

SQLXAgent - SQL Express 的作业 - 第 3 部分(共 6 部分)

starIconstarIconstarIconstarIconstarIcon

5.00/5 (2投票s)

2017 年 9 月 30 日

CPOL

10分钟阅读

viewsIcon

9161

downloadIcon

54

创建和运行类似 SQL Server Enterprise 的作业 - CSV 和 Excel 导入器代码。

第 1 部分 - 使用 SQLXAgent
第 2 部分 - 架构和设计决策
第 3 部分 - CSV 和 Excel 导入器代码 (本文)
第 4 部分 - 作业调度代码
第 5 部分 - 包如何运行
第 6 部分 - 有趣的编码

引言

本文是描述 SQLXAgent 实用程序的系列文章中的第三篇,专门讨论处理导入 Excel 和 CSV 文件的代码。不会有很多屏幕截图,只有一堆代码片段和枯燥的叙述。如果您不关心本文的主题,请随意忽略它并去做其他事情。其他编码挑战将在本系列文章的第 4 部分中讨论。

注意: - 文件中的代码与本文中的代码之间的唯一区别与消除(通过行格式化)行长导致的水平滚动条有关。

在这些代码中,您可能会注意到调用了不熟悉的方法(例如,string.ReplaceNonAlphaNumeric())。我大量使用了扩展方法,这可能就是您所看到的。如果您真的好奇,可以在 SQLXCommon 项目中的 ObjectExtensions 文件夹中找到它们。

注意: 文章中提供的代码片段可能反映或不反映代码的绝对最新版本。如果它与实际代码不完全匹配,它也会非常接近。这是在撰写文章时发现问题的性质。

总体概述

此处描述的类位于 SQLXPkgBase 程序集中。类继承和依赖层次结构如下所示。

Importing class hierarchy

我想提一下,CSV 导入器松散地基于我 2016 年撰写的文章 - CSV 文件解析器 。我说“松散地基于”是因为编写此代码迫使我重新评估了使导入数据可供父应用程序/程序集访问的方式。

代码

在本文中,我们将讨论主要涉及数据导入的类。它们是

  • ImporterBase - CSV 和 Excel 导入器类的绝对(抽象)基类。
     
  • CSVImportBase - CSV 导入器类的(抽象)基类
     
  • ExcelImportBase - Excel 导入器类的(抽象)基类
     
  • HintItem 和 HintItemList - 描述导入的列
     

ImporterBase

这是一个抽象类,包含每个 CSV 或 Excel 导入器的大部分属性和功能。大多数方法都受到保护,以便继承类可以使用它们。继承类负责实际加载原始数据,并创建一个 DataTable 对象,此类的处理在该对象上执行。一旦加载了所需文件,继承类将调用 ProcessData() 方法。

/// <summary>
/// Calls all of the methods involved in processing data loaded from the file.
/// All of the methods called here operate on a DataTable.
///
</summary>
protected virtual void ProcessData()
{
    // remove empty rows (if configured to do so)
    this.RemoveEmptyRows();
    // remove linefeeds and spaces from column names
    this.NormalizeNames();
    // Create column hints list if necessary. If you want to use your own,
    // instantiate it in your derived class BEFORE calling the ProcessData()
    // method.
    this.CreateColumnHints();
    // determine the max width of each column
    this.EstablishStringLengths();
    // builds a default database prep query (which can be overridden from the
    // derived class)
    this.BuildPrepQuery();
    // build the query that creates the table (which can be overridden from the
    // derived class)
    this.BuildCreateTableQuery();
    // build the query the inserts the data (which can be overridden from the
    // derived class)
    this.BuildInsertQuery();
}

方法描述如下。我保留了注释,试图(可能徒劳地)将叙述文本保持在最低限度。

  • RemoveEmptyRows() - 如果 DeleteEmptyRows 属性为 true,则数据表对象中所有字段为空/null 的任何行都将从数据表中删除。
    /// <summary>
    /// If DeleteEmptyRows property is true, removes empty rows (all columns = DBNull) 
    /// from the loaded excel data.
    /// </summary>
    protected virtual void RemoveEmptyRows()
    {
        this.RemovedRowCount = 0;
        if (this.DeleteEmptyRows)
        {
            for (int i = this.ImportedData.Rows.Count-1; i >= 0; i--)
            {
                DataRow row = this.ImportedData.Rows[i];
                if (row.ItemArray.All(x=>x is DBNull || string.IsNullOrEmpty(x.ToString())))
                {
                    this.ImportedData.Rows.Remove(row);
                    this.RemovedRowCount++;
                }
            }
        }
    }
    
  • NormalizeNames() - 此方法将规范化列名和表名。如果源数据文件中没有标题行,则将使用“Column_n”格式自动创建列,其中“n”是源数据中列的序号位置。如果程序员未明确指定表名,则导入数据的表的名称由文件名确定。
    /// <summary>
    /// If the first row is a header row, this method changes ExcelData.Columns to the 
    /// contents of the first row of the table, and the first row is deleted. The 
    /// column names (and table name) are massaged to replace all non-alphanumeric 
    /// characters with underscores, as well as trimming leading/trailing spaces.
    /// </summary>
    /// <param name="dt"></param>
    protected virtual void NormalizeNames()
    {
        // The ExcelDataReader object does not appear to have a property we can use to 
        // determine if the first row is a header, so we use our own property (set by 
        // the programmer).
        // If the first row is the header, pull the names from the first row, and 
        // delete the row.
        if (this.FirstRowIsHeader)
        {
            // first row
            DataRow headerRow = this.ImportedData.Rows[0];
            // processe each column
            for (int i = 0; i < headerRow.ItemArray.Count(); i++)
            {
                // to ease typing
                DataColumn excelColumn = this.ImportedData.Columns[i];
                string rowColumn = (string)headerRow.ItemArray[i];
                // Set the column name from the first row of the data
                                            // if the column in the row is null/empty
                excelColumn.ColumnName = (string.IsNullOrEmpty(rowColumn) 
                                            // keep the name we already have
                                            ? excelColumn.ColumnName 
                                            // otherwise set the excel column to whatever 
                                            // the row says it should be
                                            : rowColumn
                                            // trim leading/trailing spaces, and 
                                            // replace linefeeds and embedded spaces 
                                            // with underscores
                                            ).Trim().ReplaceNonAlphaNumeric('_');
            }
            // delete the header row
            this.ImportedData.Rows.RemoveAt(0);
        }
        // We use either the specified ImportedTableName, or we build the table name 
        // based on the filename and (if this is an excel file) sheet name. I try to 
        // avoid nested ternary conditions, but sometimes, it just makes sense to use 
        // them. In these situations, code formatting goes a long way toward assisting 
        // a programmer who is not familiar with the code base.
        string newTableName = (string.IsNullOrEmpty(this.ImportedTableName)) 
                                ? string.Concat("Imported_", 
                                                Path.GetFileName(this.FileName).Trim(), 
                                                ((this.FileType == ImportFileType.EXCEL) 
                                                 ? "_" 
                                                 : ""), 
                                                ((this.FileType == ImportFileType.EXCEL) 
                                                 ? this.SheetName.Trim() 
                                                 : "")).ReplaceNonAlphaNumeric('_')
                                : this.ImportedTableName;
        // prepend the word "Imported_" and set the excel table name 
        this.ImportedData.TableName = string.Format("Imported_{0}", newTableName);
    }
    
  • CreateColumnHints() - 此方法负责自动构建“提示”列表,SQL 查询代码使用该列表创建带有字段名和适当数据类型的导入表。如果继承类已经创建了提示列表(通常是通过此方法的重写版本),此方法将简单地退出而不执行任何操作。列提示相关类将在下一节中描述。
    /// <summary>
    /// Instantiate the hint list with one hint item for every column specified in the 
    /// data table columns. Then, the hint item's type is determined by iterating all 
    /// values for a given row/column to find the most appropriate type.
    /// </summary>
    /// <param name="fromCSV">True if importing from CSV file.</param>
    protected virtual void CreateColumnHints(bool fromCSV=false)
    {
        // if the programmer hasn't already specified a hints list
        if (this.ColumnHints == null || this.ColumnHints.Count == 0)
        {
            // instantiate
            this.ColumnHints = new ItemHints();
            // for each column in the Columns collection
            for (int i = 0; i < this.ImportedData.Columns.Count; i++)
            {
                // get the column
                DataColumn col = this.ImportedData.Columns[i];
                // if the name isn't null/empty (theoretically impossible, but we 
                // check simply because it's a good idea)
                if (!string.IsNullOrEmpty(col.ColumnName))
                {
                    // create a new hint item
                    HintItem hint = new HintItem(){ Name = col.ColumnName, ItemType=null};
                    // iterate each row
                    foreach(DataRow row in this.ImportedData.Rows)
                    {
                        // try to determine the best data type based on all of the 
                        // possible values
                        hint.DetermineType(row[col], fromCSV);
                        // if we determine at any point that the column should be a 
                        // string, we can quit because a string type is our ultimate 
                        // fallback data type.
                        if (hint.ItemType.Name.IsLike("String"))
                        {
                            break;
                        }
                    }
                    // add the hint to our list
                    this.ColumnHints.Add(hint);
                }
            }
        }
    }
    
  • EstablishStringLengths() - 此方法尝试确定 string 类型列的最大长度。它使用 VarcharLengthMultiple 属性来确定最大长度。因此,使用默认的 255 个字符的倍数,并且给定列所有值的最大长度为 300 个字符,则该列的最终最大字符串大小将为 510 个字符。
    /// <summary>
    /// Determines the max necessary string length for all columns in the column hints 
    /// list. Lengths will be multiples of the value of VarcharLengthMultiple. By 
    /// default, this value is 255.
    /// </summary>
    protected virtual void EstablishStringLengths()
    {
        // if we have a hint list
        if (this.ColumnHints != null && this.ColumnHints.Count > 0)
        {
            // for each hint in the list
            for (int i = 1; i < this.ImportedData.Columns.Count; i++)
            {
                // it the hint represents a string
                if (this.ImportedData.Columns[i].DataType.Name.IsLike("string"))
                {
                    // calculate the length of largest string in the data
                    int maxSize = 0;
                    foreach(DataRow row in this.ImportedData.Rows)
                    {
                        maxSize = Math.Max(maxSize, (row.ItemArray[i] is DBNull) 
                                                    ? 0 
                                                    : ((string)(row.ItemArray[i])).Length);
                    }
                    // Determine how many times we have to mulitply our varchar 
                    // multiple value 
                    long rem = 0;
                    int factor = (int)(Math.DivRem((long)maxSize, (long)this.VarcharLengthMultiple, out rem));
                    factor += ((rem > 0) ? 1 : 0);
                    // Find the appropriate hint (we could assume an ordinal position, 
                    // but that's just crazy talk, and besides, there's no point in 
                    // tempting fate)
                    HintItem hint = this.ColumnHints.FirstOrDefault
                                    (x=>x.Name.ToUpper() == this.ImportedData.Columns[i].ColumnName.ToUpper());
                    // if we found the appropriatre hint, set the size (this is used 
                    // to build the CreateQuery)
                    if (hint != null)
                    {
                        hint.VarcharSize = this.VarcharLengthMultiple * factor;
                    }
                }
            }
        }
    }
    
  • BuildPrepQuery() - 此方法构建数据库准备查询,换句话说,是完全删除导入表还是仅截断它。
    /// <summary>
    /// Builds the database prep query. An empty string indicates append and nothing is done 
    /// to the table before-hand. Otherwise the table is either dropped and recreated, 
    /// or truncated.
    /// </summary>
    protected virtual void BuildPrepQuery()
    {
        // we're going to prep the table based on the programmer-specified setting
        switch (this.PrepAction)
        {
            case TablePrepAction.Drop :
                {
                    this.PrepQuery = string.Format("DROP TABLE [{0}].[dbo].[{1}];", 
                                                    this.InitialCatalog, 
                                                    this.ImportedData.TableName);
                }
                break;
            case TablePrepAction.Truncate :
                {
                    this.PrepQuery = string.Format("TRUNCATE TABLE [{0}].[dbo].[{1}];", 
                                                    this.InitialCatalog, 
                                                    this.ImportedData.TableName);
                }
                break;
        }
    }
    
  • BuildCreateTableQuery() - 此方法构建创建表的查询。它使用上面讨论的列提示列表来确定列名和数据类型。此方法有两个重载。一个创建包含导入数据中所有列的表,另一个创建基于 SqlParameter 数组中指定列的表。
    /// <summary>
    /// Builds the create table query (if table does not exist, create it).This 
    /// method includes ALL fields from the loaded worksheet.
    /// </summary>
    protected virtual void BuildCreateTableQuery()
    {
        StringBuilder query = new StringBuilder();
        StringBuilder fields = new StringBuilder();
        query.AppendFormat("IF (OBJECT_ID('{0}','U') IS NULL ",
                             this.ImportedData.TableName).AppendLine();
        query.AppendFormat("    CREATE TABLE {0}.dbo.{1}", 
                            this.InitialCatalog, 
                            this.ImportedData.TableName).AppendLine();
        query.AppendLine("(");
        for (int i = 0; i < ImportedData.Columns.Count; i++)
        {
            string colName = Convert.ToString(this.ImportedData.Columns[i]);
            if (!string.IsNullOrEmpty(colName))
            {
                HintItem hint = this.ColumnHints.FirstOrDefault(x=>x.Name == colName);
                string typeName = (hint == null) 
                                    ? this.GetDbDataType(this.ImportedData.Columns[i]) 
                                    : this.GetDbDataType(hint.ItemType.Name);
                fields.AppendFormat("{0}[{1}] {2} NULL", 
                                    (fields.Length>0) ? "," : "", 
                                    colName, 
                                    typeName).AppendLine();
            }
        }
        query.Append(fields);
        query.Append(");");
        this.CreateQuery = query.ToString();
    }
    /// <summary>
    /// Builds the CREATE TABLE query using the fields as specified by the paramaters 
    /// list.
    /// </summary>
    /// <param name="parameters">List of sql parameters</param>
    protected virtual void BuildCreateTableQuery(SqlParameter[] parameters)
    {
        StringBuilder query = new StringBuilder();
        StringBuilder fields = new StringBuilder();
        query.AppendFormat("IF (OBJECT_ID('{0}','U') IS NULL ", 
                            this.ImportedData.TableName).AppendLine();
        query.AppendFormat("    CREATE TABLE {0}.dbo.{1}", 
                            this.InitialCatalog, 
                            this.ImportedData.TableName).AppendLine();
        query.AppendLine("(");
        // find all hints that are represented in the parameters list, and include 
        // them in the query
        List<hintitem> hints = this.ColumnHints.Where
                                     (a => parameters.Any(b => b.ParameterName == a.DbValue)).ToList();
        foreach(HintItem hint in hints)
        {
            string typeName = this.GetDbDataType(hint.ItemType.Name);
            fields.AppendFormat("{0}{1} {2} NULL", 
                                (fields.Length>0) ? "," : "", 
                                hint.DbName, 
                                typeName).AppendLine();
        }
        query.Append(fields);
        query.Append(");");
        this.CreateQuery = query.ToString();
    }
    
  • BuildInsertQuery() - 此方法构建将导入数据插入到导入表中的插入查询。与上面的“创建表”方法一样,有两个重载,一个插入所有列,另一个插入 SqlParameter 数组中指定的列。
    /// <summary>
    /// Constructs the insert query we need to insert the imorted data into the sql 
    /// table. This method includes ALL fields from the loaded worksheet.
    /// </summary>
    protected virtual void BuildInsertQuery()
    {
    	StringBuilder query = new StringBuilder();
    	StringBuilder fields = new StringBuilder();
    	StringBuilder values = new StringBuilder();
    	// use the existing hints collection to determine the columns to be inserted.
    	foreach(HintItem hint in this.ColumnHints)
    	{
    		fields.AppendFormat("{0}{1}", 
    							(fields.Length>0) ? "," : "", 
    							hint.DbName).AppendLine();
    		values.AppendFormat ("{0}{1}", 
    								(fields.Length>0) ? "," : "", 
    								hint.DbValue).AppendLine();
    	}
    	query.AppendFormat("INSERT INTO [{0}].[dbo].[{1}]", 
    						this.InitialCatalog, 
    						this.ImportedData.TableName).AppendLine();
    	query.AppendLine("(");
    	query.Append(fields);
    	query.AppendLine(")").AppendLine("VALUES").AppendLine("(");
    	query.Append(values);
    	query.AppendLine(");");
    	this.InsertQuery = query.ToString();
    }
    /// <summary>
    /// Builds the INSERT query using the fields as specified by the paramaters list.
    /// </summary>
    /// <param name="parameters">List of sql parameters</param>
    protected virtual void BuildInsertQuery(SqlParameter[] parameters)
    {
    	StringBuilder query = new StringBuilder();
    	StringBuilder fields = new StringBuilder();
    	StringBuilder values = new StringBuilder();
    	// find all hints that are represented in the parameters list, and include 
    	// them in the query
    	List<hintitem> hints = this.ColumnHints.Where
                                    (a => parameters.Any(b => b.ParameterName == a.DbValue)).ToList();
        foreach(HintItem hint in hints)
    	{
    		fields.AppendFormat("{0}{1}", (fields.Length>0)?",":"", hint.DbName).AppendLine();
    		values.AppendFormat ("{0}{1}", (fields.Length>0)?",":"", hint.DbValue).AppendLine();
    	}
    	query.AppendFormat("INSERT INTO [{0}].[dbo].[{1}]", 
                            this.InitialCatalog, 
                            this.ImportedData.TableName).AppendLine();
        query.AppendLine("(");
    	query.Append(fields);
    	query.AppendLine(")").AppendLine("VALUES").AppendLine("(");
    	query.Append(values);
    	query.AppendLine(");");
    	this.InsertQuery = query.ToString();
    }
    

类中剩余的方法是帮助器方法,它们确定提示列的数据库等效数据类型,并实际调用执行数据库功能的 SQL 查询。

HintItem 和 HintItemList

HintItem 类允许导入器类自动为导入的列建立适当的数据类型。该类最有趣的部分是 DetermineType() 方法。

如果类型尚未确定为字符串,我们将继续尝试将其解析为合理的类型。正如您所看到的,我们是导入 Excel 文件还是 CSV 文件在类型确定中起着作用。此外,此代码仅支持一种整数类型 (Int32)。这可能是一个最终会给我带来麻烦的缺陷,但如果需要,应该很容易解决。

/// <summary>
/// Determines the data type of the column.
/// </summary>
/// <param name="value">The value to be examined</param>
/// <param name="fromCSV">True if this is being dne in a CSV importer.</param>
public void DetermineType(object value, bool fromCSV=false)
{
    // if the type isn't already a string
    if (this.ItemType == null || !this.ItemType.Name.IsLike("String"))
    {
        Type theType = this.ItemType;
        // we have to check if its from a CSV import becaus everything imported 
        // from a CSV defaults to string.
        if (value is string && !fromCSV)
        {
            theType = typeof(String);
        }
        else
        {
            string valueStr = value.ToString();
            DateTime parsedDateTime;
            Int32    parsedInt32;
            double   parsedDouble;
            bool     parsedBool;
            // We try to parse the value using all standard datetime formats 
            // (custom formats are not supported unless you want to modify the 
            // DateFormatStrings object.).
            if (DateTime.TryParseExact(valueStr, DateFormatStrings.Formats, CultureInfo.CurrentUICulture, DateTimeStyles.None, out parsedDateTime))
            {
                theType = typeof(DateTime);
            }
            else if (Int32.TryParse(valueStr, out parsedInt32))
            {
                // we only want to change the type if it's not already a double
                if (theType == null || !theType.Name.IsLike("Double"))
                {
                    theType = typeof(Int32);
                }
            }
            else if (double.TryParse(valueStr, out parsedDouble))
            {
                theType = typeof(Double);
            }
            else if (bool.TryParse(valueStr, out parsedBool))
            {
                theType = typeof(Boolean);
            }
            else 
            {
                theType = typeof(String);
            }
        }
        this.ItemType = theType;
    }
}

ExcelImportBase

此类的原始实现包含 ImportBase 类中的所有内容,因为我当时只打算为 Excel 做一个导入器。

(直言不讳部分) - 然后,我琢磨着我也应该为 CSV 文件做一个导入器,心想,“我只要参考去年写的 CSV 文件解析器文章,让程序员当个程序员,把那篇文章的代码修改成他自己的需求就行了”。在编写了 Excel 导入器类之后,我发现某些程序员会因为我的懒惰态度(以及他们必须做一些实际“工作”的事实)而感到冒犯,所以我决定自己修改那篇文章(在下一节中描述)。这就是为什么我们现在有三个抽象类而不是一个。

这个类也是抽象的,这意味着您必须使用另一个类来继承它,这个类是必须编写的。由于这一切都是为了支持程序员为 SQLXAgent 编写的包,我假设这就是您要这样做的原因,但它当然也可以用于您可能想到的其他项目。

由于所需的大部分代码都在 ImportBase 类中,因此在此类中需要做的事情很少。此类最有趣的部分是,我使用了一个 NuGet 可用的库,名为 ExcelDataReader(及其配套扩展库 ExcelDataReader.DataSet),以促进实际的文件加载需求。这个库将 Excel 文件读取到一个 DataTable 对象中。但它只做这些。它似乎没有尝试确定适当的数据类型,如果我没记错的话,所有内容都以 object 形式存储在表中。这完全没有帮助,导致我开发了上面讨论的列提示代码。以下是类的全部内容。

/// <summary>
/// Base class for importing excel spreadsheets
/// </summary>
public abstract class ExcelImportBase : ImporterBase
{
    public ExcelImportBase(string filename, string sheetName, string connectionString):base(filename, sheetName, connectionString)
    {
        this.FileType  = ImportFileType.EXCEL;
        this.FileName  = filename;
        this.SheetName = sheetName;
        this.Init();
    }
    ~ExcelImportBase()
    {
    }
    /// <summary>
    /// Initialize this object with default property values.
    /// </summary>
    protected override void Init()
    {
        base.Init();
        this.FileType = ImportFileType.EXCEL;
    }
    /// <summary>
    /// Import the specified sheet from the file.
    /// </summary>
    public override void Import()
    {
        if (this.CanImport)
        {
            // load the worksheet from the file
            this.ImportedData  = this.ReadSheet(this.FileName, this.SheetName);
            // process the data
            base.ProcessData();
            // update the database
            this.UpdateDatabase();
        }
    }
    /// <summary>
    /// Reads the specified sheet and returns a DataTable object.
    /// </summary>
    /// <param name="filename">The name of the file conbtaining the desired worksheet</param>
    /// <param name="sheetName">The name of the worksheet from which to import</param>
    /// <returns>Datatable with content from source</returns>
    protected virtual DataTable ReadSheet(string filename, string sheetName)
    {
        DataSet ds = null;
        DataTable data  = null;
        try
        {
            using (System.IO.FileStream stream = File.Open(filename, FileMode.Open, FileAccess.Read))
            {
                using (var reader = ExcelDataReader.ExcelReaderFactory.CreateReader(stream))
                {
                    do
                    {
                        if (reader.Name.IsLike(sheetName))
                        {
                            ds = reader.AsDataSet();
                        }
                    } while (reader.NextResult());
                }
                if (ds != null)
                {
                    data = ds.Tables[0];
                }
            }
        }
        catch (Exception ex)
        {
            throw new SQLXAgentException(SQLXExceptionEnum.ExcelDataReaderError, ex);
        }
        return data;
    }
    /// <summary>
    /// Abstract method that gets the data for the specified row
    /// </summary>
    /// <param name="rowIndex">The 0-based row index</param>
    /// <returns>An array of SqParameter objects</returns>
    protected abstract override SqlParameter[] GetRowParameters(int rowIndex);
}

该类中唯一抽象的组件是 GetRowParameters() 方法。在这里您将指定实际要导入的列。使用空方法覆盖此方法应导致导入所有列。

CSVImportBase

这个类与我最初的 CSV 解析器文章非常相似,并经过修改以适应新的特定于应用程序的范例。

注意:新程序员 - 这就是您如何利用其他人编写的 CodeProject 文章。您检查代码以评估其在您的项目中的适用性,如果必要,进行更改以使其适应您自己的用法。一篇写得好的文章将为您提供所需的所有指导。即使写得不好,也要花时间评估代码,添加必要的注释,然后继续您的生活。仅仅通过自己多做一点努力,您就会成为一个更好的程序员。 现在,回到我们的故事...

我必须保留负责加载和解析数据的代码,但修改了代码以将结果存储在 DataTable 对象中,供 ImportBase 类使用。由于加载文件涉及很多(我依赖自己的代码,而不是让库为我完成),因此此类的体积明显大于 ExcelImportBase 类,并且涉及额外的 CSV 专用属性和方法(这些方法来自我获得代码的文章)。我不会用旧文章中写过的废话来烦您,我只会简单地提一下我必须编写的新方法,用于将导入的数据存储到 DataTable 对象中。

/// <summary>
/// Adds a row to the datatable (initializes the columns if the datatable is new).
/// </summary>
/// <param name="parts"></param>
protected virtual void AddToDataTable(string[] parts)
{
    // Assume there's no header row, and default the column names to "Column_n" 
    // of type "object". We need to have columns specified before we can add 
    // new rows.
    if (this.ImportedData.Columns.Count == 0)
    {
        for (int i = 0; i < parts.Length; i++)
        {
            this.ImportedData.Columns.Add(string.Format("Column_{0}", i+1), typeof(object));
        }
    }
    // Create a new row. New rows contain n items in the ItemArray property. The 
    // number of items is determined by the number of columns the datatable has.
    DataRow row = this.ImportedData.NewRow();
    // now add our parts to the row
    for (int i = 0; i < parts.Length; i++)
    {
        string part = parts[i];
        if (!string.IsNullOrEmpty(this.CurrencySymbol))
        {
            part = (part.StartsWith(this.CurrencySymbol) || part.EndsWith(this.CurrencySymbol)) ? part.Replace(this.CurrencySymbol, "").Replace(",", "") : part;
        }
        row[i] = parts[i].Replace("\"","").Trim();
    }
    // add the new row to the datatable.
    this.ImportedData.Rows.Add(row);
}

一旦数据表填充完毕,ImportBase 代码就开始执行其余的导入功能。

编写您自己的导入器

由于上面讨论的三个导入类都是抽象的,这意味着您需要编写自己的导入器类来执行导入操作。由于大部分工作已经为您完成,这需要程序员付出微不足道的努力。

尽管以下两个类几乎相同,但为了完整性,我包含了每个类的所有代码。

示例 Excel 导入器

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Data.SqlClient;
using System.Data;
using SQLXCommon;
using SQLXPkgBase;
namespace PkgSampleImportExcel2
{
    /// <summary>
    /// This class represents a sample of how you would code a class that inherits from 
    /// the abstract ExcelImportBase class (which is the class that does all of the heavy 
    /// lifting for the import excel worksheet process). All methods in the base class 
    /// are virtual, so you can override any of them that you deem necessary. It is the 
    /// responsibility of this derived class to provide success/failure information to 
    /// the calling DLL.
    /// </summary>
    public class SampleExcelImporter : ExcelImportBase
    {
        public SampleExcelImporter(string filename, string sheetName, string connectionstring)
               : base(filename, sheetName, connectionstring)
        {
            this.Init();
        }
        /// <summary>
        /// The Init method initializes all of the properties that determines how the 
        /// importer does its work. After calling base.Init(), set applicable properties 
        /// to desired values.
        /// </summary>
        protected override void Init()
        {
            // By default, the base importer code creates a table called "Imported_XXXXXX" 
            // (where "XXXXXX" is the name of the file). If you want to give it a name 
            // yourself, specify it AFTER the call to base.Init().
            base.Init();
            // TO-DO: Change the property values that make the importer behave as desired.
        }
        /// <summary>
        /// Creates the column hits used to define what columns are imported from the file 
        /// into the database. Of course, doing this manually requires knowledge of the 
        /// content of the file being imported. If 
        /// </summary>
        /// <param name="fromCSV">True if this is importing from a CSV file. Default value is false.</param>
        protected override void CreateColumnHints(bool fromCSV=false)
        {
            base.CreateColumnHints(fromCSV);
            // TO-DO: If you want to omit columns being imported, remove the hint items 
            //        here. You can remove columns beither by name, or by column index, 
            //        using code like you see below. If the indicated column doesn't 
            //        exist, no exception will be thrown.
            //this.ColumnHints.Remove("my column name");
        }
        /// <summary>
        /// This is an abstract method override which pulls the desired columns for 
        /// saving into the database.
        /// </summary>
        /// <param name="rowIndex">The index of the row for which data is being retrieved.</param>
        /// <returns>An array of SqlParameter objects</returns>
        /// <remarks>Do not put a try/catch block in this method. Instead, catch all 
        /// exceptions in the calling code so the return value and fail reason can 
        /// be set with a reasonable value.</remarks>
        protected override SqlParameter[] GetRowParameters(int rowIndex)
        {
            int      id;           
            string   comment;   
            DateTime theDate; 
            double   floatValue;
            int      intValue;
            string   ipAddr;
            // get your column values
            id         = (int)     this.GetColumnValue(rowIndex, "ID");
            comment    = (string)  this.GetColumnValue(rowIndex, "Comment");
            theDate    = (DateTime)this.GetColumnValue(rowIndex, "TheDate");
            floatValue = (double)  this.GetColumnValue(rowIndex, "FloatValue");
            intValue   = (int)     this.GetColumnValue(rowIndex, "IntValue");
            ipAddr     = (string)  this.GetColumnValue(rowIndex, "IPAddress");
            // Create the SqlParameter array (this is where you choose which fields 
            // will be used. This will affect the CREATE and INSERT queries.
            SqlParameter[] parameters = new SqlParameter[]
            {
                new SqlParameter(this.ColumnHints.First(x=>x.Name == "ID"        ).DbValue, id),
                new SqlParameter(this.ColumnHints.First(x=>x.Name == "Comment"   ).DbValue, comment),
                new SqlParameter(this.ColumnHints.First(x=>x.Name == "TheDate"   ).DbValue, theDate),
                new SqlParameter(this.ColumnHints.First(x=>x.Name == "FloatValue").DbValue, floatValue),
                new SqlParameter(this.ColumnHints.First(x=>x.Name == "IntValue"  ).DbValue, intValue),
                new SqlParameter(this.ColumnHints.First(x=>x.Name == "IPAddress" ).DbValue, ipAddr),
            };
            return parameters;
        }
    }
}

示例 CSV 导入器

using SQLXPkgBase;
using System;
using System.Collections.Generic;
using System.Data.SqlClient;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
namespace PkgSampleImportCSV
{
    /// <summary>
    /// This class represents a sample of how you would code a class that inherits from 
    /// the abstract CSVImportBase class (which is the class that does all of the heavy 
    /// lifting for the import excel worksheet process). All methods in the base class 
    /// are virtual, so you can override any of them that you deem necessary. It is the 
    /// responsibility of this derived class to provide success/failure information to 
    /// the calling DLL.
    /// </summary>
    public class SampleCSVImporter : CsvImportBase
    {
        public SampleCSVImporter(string filename, string connectionstring)
               : base(filename, connectionstring)
        {
            this.Init();			
        }
        /// <summary>
        /// The Init method initializes all of the properties that determines how the 
        /// importer does its work. After calling base.Init(), set applicable properties 
        /// to desired values.
        /// </summary>
        protected override void Init()
        {
            // By default, the base importer code creates a table called "Imported_XXXXXX" 
            // (where "XXXXXX" is the name of the file). If you want to give it a name 
            // yourself, specify it AFTER the call to base.Init().
            base.Init();
            // TO-DO: Change the property values that make the importer behave as desired.
        }
        /// <summary>
        /// Creates the column hits used to define what columns are imported from the file 
        /// into the database. Of course, doing this manually requires knowledge of the 
        /// content of the file being imported. If 
        /// </summary>
        /// <param name="fromCSV">True if this is importing from a CSV file. Default value is false.</param>
        protected override void CreateColumnHints(bool fromCSV=false)
        {
            base.CreateColumnHints(fromCSV);
            // TO-DO: If you want to omit columns being imported, remove the hint items 
            //        here. You can remove columns beither by name, or by column index, 
            //        using code like you see below. If the indicated column doesn't 
            //        exist, no exception will be thrown.
            //this.ColumnHints.Remove("my column name");
        }
        /// <summary>
        /// This is an abstract method override which pulls the desired columns for 
        /// saving into the database.
        /// </summary>
        /// <param name="rowIndex">The index of the row for which data is being retrieved.</param>
        /// <returns>An array of SqlParameter objects</returns>
        /// <remarks>Do not put a try/catch block in this method. Instead, catch all 
        /// exceptions in the calling code so the return value and fail reason can 
        /// be set with a reasonable value.</remarks>
        protected override SqlParameter[] GetRowParameters(int rowIndex)
        {
            int      id;           
            string   comment;   
            DateTime theDate; 
            double   floatValue;
            int      intValue;
            string   ipAddr;
            // get your column values
            id         = (int)     this.GetColumnValue(rowIndex, "ID");
            comment    = (string)  this.GetColumnValue(rowIndex, "Comment");
            theDate    = (DateTime)this.GetColumnValue(rowIndex, "TheDate");
            floatValue = (double)  this.GetColumnValue(rowIndex, "FloatValue");
            intValue   = (int)     this.GetColumnValue(rowIndex, "IntValue");
            ipAddr     = (string)  this.GetColumnValue(rowIndex, "IPAddress");
            // Create the SqlParameter array (this is where you choose which fields 
            // will be used. This will affect the CREATE and INSERT queries.
            SqlParameter[] parameters = new SqlParameter[]
            {
                new SqlParameter(this.ColumnHints.First(x=>x.Name == "ID"        ).DbValue, id),
                new SqlParameter(this.ColumnHints.First(x=>x.Name == "Comment"   ).DbValue, comment),
                new SqlParameter(this.ColumnHints.First(x=>x.Name == "TheDate"   ).DbValue, theDate),
                new SqlParameter(this.ColumnHints.First(x=>x.Name == "FloatValue").DbValue, floatValue),
                new SqlParameter(this.ColumnHints.First(x=>x.Name == "IntValue"  ).DbValue, intValue),
                new SqlParameter(this.ColumnHints.First(x=>x.Name == "IPAddress" ).DbValue, ipAddr),
            };
            return parameters;
        }
    }
}

如何在您自己的项目中包含

我怀疑你们中的一些人会决定在自己的项目中使用此代码。为此,本节是为您准备的。要在您自己的项目中使用这些文件,最简单的方法如下。

在 Windows 资源管理器中

  • SQLXPkgBase 文件夹复制到您的解决方案文件夹。
     
  • SQLXCommon 文件夹复制到您的解决方案文件夹。
     
  • 将文件夹和文件(可选)重命名为自己的命名方案/偏好
     
  • 将 3rdParty 文件夹复制到您的解决方案文件夹。
     
  • 您只需要 3rdParty 文件夹中的 ExcelDataReader*.* 文件(用于此任务)。
     

在 Visual Studio 中

  • CommonPkgBase 项目添加到您的解决方案。
     
  • 解决(已重命名的)PkgBase 项目中损坏的引用。ExcelDataReaderSQLXCommon 引用将是损坏的)。
     
  • 解决命名问题。如果您重命名了 Common 项目中的命名空间,则会出现大多数问题。将所有对 SQLXCommon 的引用更改为您公共命名空间的名称。
     
  • 您可以安全地将 SQLXAgentPkgBase 类从(已重命名的)PkgBase 项目中排除。
     
  • 还有其他方法可以将代码精简到最低限度,但我将其留作程序员的练习。
     

最后

正如我上面引用的,我只编写了 Excel 导入器,因为考虑到 SQLXAgent 包的目的,以及企业环境中许多典型的 DTSX 包都涉及从 Excel 文件导入数据,这样做是有意义的。 (更残酷的实话) 另一方面,我编写 CSV 导入器代码仅仅是为了防止观众中那些不知感恩的混蛋们因为我提供了他们可能认为是“不完整的解决方案”而对我喋喋不休。对此我的态度绝不道歉。我的普遍感觉是,如果你想被人溺爱和手把手教导,那就回到幼儿园去吧。

历史

  • 2017 年 9 月 29 日 - 首次发布。
     
© . All rights reserved.