简单快速的 C# CSV 库






4.80/5 (59投票s)
这个库允许使用很少的代码将对象序列化和反序列化为 CSV。
引言
此库允许将对象的集合保存到 CSV 文件。
它还允许使用 LINQ 表达式从 CSV 文件加载数据。
它:
- 易于添加到您的应用程序(单个 C# 文件)
- 易于使用(加载和保存只需一行代码)
- 速度快(在我的笔记本电脑上,演示在 2 秒内创建了一百万条客户记录)
- 内存占用小(已测试超过 10,000,000 条记录)。
代码使用:保存到 CSV
使用代码非常简单。
如果您在数组中有一系列客户,则将其保存到 CSV 只需一行代码
clients.ToCsv("client.csv");
生成的 CSV 将列出集合中对象的public
属性(在此示例中为client
类)
ClientId;FirstName;MiddleName;LastName;Occupation;City
100;Michael;Son;Ray;Actuary;Fort Worth
101;Emmanuel;Nick;Williams;Earth Moving Equipment Mechanic;Ontario
102;Phyllis;Melanie;Fox;Knitter;Fresno
103;Ricardo;Percy;Beasley;Labourer;Columbus
104;Kerri;Jewell;Bennett;Production Manager;Jacksonville
105;Carlos;Luis;Hart;Locksmith;Buffalo
106;Heath;Herbert;Glass;Aeronautical Engineer;Seattle
107;Randall;Hans;Tanner;Air Traffic Controller;Corona
108;Damien;;Washington;Chartered Secretary;Allentown
109;Charlene;Alice;Thompson;Committee Clerk;Salem
110;Ann;Alicia;Potts;Actor;Raleigh
111;Richard;Gerardo;Riddle;Assayer Sampler;Akron
112;Pierre;Adolfo;Mendez;Oceanographer;Birmingham
如果您没有一次将所有对象都加载到内存中,也可以使用这种替代语法保存到 CSV
using (var csvFile = new CsvFile<Client>("clients.csv"))
{
for (int i = 0 ; i < 1000000 ; i++)
{
var client = new Client() { id = i, name = "Client " + i };
csvFile.Append(user);
}
}
代码使用:从 CSV 加载
要加载 CSV 文件,您可以使用CsvFile.Read
方法。
此命令接受一个类类型(在此示例中为client
类)和一个文件名。
foreach (var c in from CsvFile.Read<Client>("clients.csv"))
{
// client can be used here
}
CsvFile.Read
方法直接从磁盘逐条读取记录,而不会将其保存在内存中。
这允许您加载巨大的文件而不会出现内存问题。
请注意,在加载 CSV 文件时可以使用 LINQ 表达式。
foreach (var c in from client in CsvFile.Read<Client>("clients.csv")
where client.FirstName.StartsWith("J")
&& client.MiddleName.StartsWith("F")
&& client.LastName.StartsWith("K")
select client)
{
// client can be used here
}
进一步
该库能够使用流和流读取器/写入器而不是文件名。
如果您想要解析未存储在文件中的数据或来自网络的数据,这将非常有用。
我使用“;
”字符作为字段分隔符,默认情况下与 Excel 兼容。您可以更改分隔符或列列表,但可扩展性相当有限。
在 1.1 版中,我计划添加定义计算列的功能,这将允许您完成所有所需的格式化,而无需在库端添加太多代码。
如果您觉得这个库不能完全满足您的需求,我并不责怪您,它是为我的需求而设计的。
我鼓励您修改它,如果您愿意,请在下面的留言中或通过 CodeProject 邮件系统给我一些反馈。
目前还不支持追加到 CSV 文件。
完整源代码
库的全部源代码在这里
using System;
using System.Collections;
using System.Collections.Generic;
using System.Globalization;
using System.IO;
using System.Linq;
using System.Linq.Expressions;
using System.Reflection;
using System.Text;
#if !NET_3_5
using System.Threading;
using System.Collections.Concurrent;
using System.Threading.Tasks;
#endif
namespace CsvFiles
{
public class CsvFile : IDisposable
{
#region Static Members
public static CsvDefinition DefaultCsvDefinition { get; set; }
public static bool UseLambdas { get; set; }
public static bool UseTasks { get; set; }
public static bool FastIndexOfAny { get; set; }
static CsvFile()
{
// Choosing default Field Separator is a hard decision
// In theory the C of CSV means Comma separated
// In Windows though many people will try to open the csv with Excel
// which is horrid with real CSV.
// As the target for this library is Windows we go for Semi Colon.
DefaultCsvDefinition = new CsvDefinition
{
EndOfLine = "\r\n",
FieldSeparator = ';',
TextQualifier = '"'
};
UseLambdas = true;
UseTasks = true;
FastIndexOfAny = true;
}
#endregion
internal protected Stream BaseStream;
protected static DateTime DateTimeZero = new DateTime();
public static IEnumerable<T> Read<T>(CsvSource csvSource) where T : new()
{
var csvFileReader = new CsvFileReader<T>(csvSource);
return (IEnumerable<T>)csvFileReader;
}
public char FieldSeparator { get; private set; }
public char TextQualifier { get; private set; }
public IEnumerable<String> Columns { get; private set; }
public void Dispose()
{
Dispose(true);
}
protected virtual void Dispose(bool disposing)
{
// overridden in derived classes
}
}
public class CsvFile<T> : CsvFile
{
private readonly char fieldSeparator;
private readonly string fieldSeparatorAsString;
private readonly char[] invalidCharsInFields;
private readonly StreamWriter streamWriter;
private readonly char textQualifier;
private readonly String[] columns;
private Func<T, object>[] getters;
readonly bool[] isInvalidCharInFields;
#if !NET_3_5
private int linesToWrite;
private readonly BlockingCollection<string> csvLinesToWrite =
new BlockingCollection<string>(5000);
private readonly Thread writeCsvLinesTask;
private Task addAsyncTask;
#endif
public CsvFile(CsvDestination csvDestination)
: this(csvDestination, null)
{
}
public CsvFile()
{
}
public CsvFile(CsvDestination csvDestination, CsvDefinition csvDefinition)
{
if (csvDefinition == null)
csvDefinition = DefaultCsvDefinition;
this.columns = (csvDefinition.Columns ?? InferColumns(typeof(T))).ToArray();
this.fieldSeparator = csvDefinition.FieldSeparator;
this.fieldSeparatorAsString = this.fieldSeparator.ToString(CultureInfo.InvariantCulture);
this.textQualifier = csvDefinition.TextQualifier;
this.streamWriter = csvDestination.StreamWriter;
this.invalidCharsInFields = new[] { '\r', '\n', this.textQualifier, this.fieldSeparator };
this.isInvalidCharInFields = new bool[256];
foreach (var c in this.invalidCharsInFields)
{
this.isInvalidCharInFields[c] = true;
}
this.WriteHeader();
this.CreateGetters();
#if !NET_3_5
if (CsvFile.UseTasks)
{
writeCsvLinesTask = new Thread((o) => this.WriteCsvLines());
writeCsvLinesTask.Start();
}
this.addAsyncTask = Task.Factory.StartNew(() => { });
#endif
}
protected override void Dispose(bool disposing)
{
if (disposing)
{
// free managed resources
#if !NET_3_5
addAsyncTask.Wait();
if (csvLinesToWrite != null)
{
csvLinesToWrite.CompleteAdding();
}
if (writeCsvLinesTask != null)
writeCsvLinesTask.Join();
#endif
this.streamWriter.Close();
}
}
protected static IEnumerable<string> InferColumns(Type recordType)
{
var columns = recordType
.GetProperties(BindingFlags.Public | BindingFlags.Instance)
.Where(pi => pi.GetIndexParameters().Length == 0
&& pi.GetSetMethod() != null
&& !Attribute.IsDefined(pi, typeof(CsvIgnorePropertyAttribute)))
.Select(pi => pi.Name)
.Concat(recordType
.GetFields(BindingFlags.Public | BindingFlags.Instance)
.Where(fi => !Attribute.IsDefined(fi, typeof(CsvIgnorePropertyAttribute)))
.Select(fi => fi.Name))
.ToList();
return columns;
}
#if !NET_3_5
private void WriteCsvLines()
{
int written = 0;
foreach (var csvLine in csvLinesToWrite.GetConsumingEnumerable())
{
this.streamWriter.WriteLine(csvLine);
written++;
}
Interlocked.Add(ref this.linesToWrite, -written);
}
#endif
public void Append(T record)
{
if (CsvFile.UseTasks)
{
#if !NET_3_5
var linesWaiting = Interlocked.Increment(ref this.linesToWrite);
Action<Task> addRecord = (t) =>
{
var csvLine = this.ToCsv(record);
this.csvLinesToWrite.Add(csvLine);
};
if (linesWaiting < 10000)
this.addAsyncTask = this.addAsyncTask.ContinueWith(addRecord);
else
addRecord(null);
#else
throw new NotImplementedException("Tasks");
#endif
}
else
{
var csvLine = this.ToCsv(record);
this.streamWriter.WriteLine(csvLine);
}
}
private static Func<T, object> FindGetter(string c, bool staticMember)
{
var flags = BindingFlags.Public | BindingFlags.NonPublic |
BindingFlags.IgnoreCase | (staticMember ? BindingFlags.Static : BindingFlags.Instance);
Func<T, object> func = null;
PropertyInfo pi = typeof(T).GetProperty(c, flags);
FieldInfo fi = typeof(T).GetField(c, flags);
if (CsvFile.UseLambdas)
{
Expression expr = null;
ParameterExpression parameter = Expression.Parameter(typeof(T), "r");
Type type = null;
if (pi != null)
{
type = pi.PropertyType;
expr = Expression.Property(parameter, pi.Name);
}
else if (fi != null)
{
type = fi.FieldType;
expr = Expression.Field(parameter, fi.Name);
}
if (expr != null)
{
Expression<Func<T, object>> lambda;
if (type.IsValueType)
{
lambda = Expression.Lambda<Func<T, object>>
(Expression.TypeAs(expr, typeof(object)), parameter);
}
else
{
lambda = Expression.Lambda<Func<T, object>>(expr, parameter);
}
func = lambda.Compile();
}
}
else
{
if (pi != null)
func = o => pi.GetValue(o, null);
else if (fi != null)
func = o => fi.GetValue(o);
}
return func;
}
private void CreateGetters()
{
var list = new List<Func<T, object>>();
foreach (var columnName in columns)
{
Func<T, Object> func = null;
var propertyName = (columnName.IndexOf(' ') < 0 ?
columnName : columnName.Replace(" ", ""));
func = FindGetter(columnName, false) ?? FindGetter(columnName, true);
list.Add(func);
}
this.getters = list.ToArray();
}
private string ToCsv(T record)
{
if (record == null)
throw new ArgumentException("Cannot be null", "record");
string[] csvStrings = new string[getters.Length];
for (int i = 0 ; i < getters.Length ; i++)
{
var getter = getters[i];
object fieldValue = getter == null ? null : getter(record);
csvStrings[i] = this.ToCsvString(fieldValue);
}
return string.Join(this.fieldSeparatorAsString, csvStrings);
}
private string ToCsvString(object o)
{
if (o != null)
{
string valueString = o as string ?? Convert.ToString(o, CultureInfo.CurrentUICulture);
if (RequiresQuotes(valueString))
{
var csvLine = new StringBuilder();
csvLine.Append(this.textQualifier);
foreach (char c in valueString)
{
if (c == this.textQualifier)
csvLine.Append(c); // double the double quotes
csvLine.Append(c);
}
csvLine.Append(this.textQualifier);
return csvLine.ToString();
}
else
return valueString;
}
return string.Empty;
}
private bool RequiresQuotes(string valueString)
{
if (CsvFile.FastIndexOfAny)
{
var len = valueString.Length;
for (int i = 0 ; i < len ; i++)
{
char c = valueString[i];
if (c <= 255 && this.isInvalidCharInFields[c])
return true;
}
return false;
}
else
{
return valueString.IndexOfAny(this.invalidCharsInFields) >= 0;
}
}
private void WriteHeader()
{
var csvLine = new StringBuilder();
for (int i = 0 ; i < this.columns.Length ; i++)
{
if (i > 0)
csvLine.Append(this.fieldSeparator);
csvLine.Append(this.ToCsvString(this.columns[i]));
}
this.streamWriter.WriteLine(csvLine.ToString());
}
}
internal class CsvFileReader<T> : CsvFile, IEnumerable<T>, IEnumerator<T>
where T : new()
{
private readonly Dictionary<Type, List<Action<T, String>>> allSetters =
new Dictionary<Type, List<Action<T, String>>>();
private string[] columns;
private char curChar;
private int len;
private string line;
private int pos;
private T record;
private readonly char fieldSeparator;
private readonly TextReader textReader;
private readonly char textQualifier;
private readonly StringBuilder parseFieldResult = new StringBuilder();
public CsvFileReader(CsvSource csvSource)
: this(csvSource, null)
{
}
public CsvFileReader(CsvSource csvSource, CsvDefinition csvDefinition)
{
var streamReader = csvSource.TextReader as StreamReader;
if (streamReader != null)
this.BaseStream = streamReader.BaseStream;
if (csvDefinition == null)
csvDefinition = DefaultCsvDefinition;
this.fieldSeparator = csvDefinition.FieldSeparator;
this.textQualifier = csvDefinition.TextQualifier;
this.textReader =
csvSource.TextReader;// new FileStream(csvSource.TextReader, FileMode.Open);
this.ReadHeader(csvDefinition.Header);
}
public T Current
{
get { return this.record; }
}
public bool Eof
{
get { return this.line == null; }
}
object IEnumerator.Current
{
get { return this.Current; }
}
protected override void Dispose(bool disposing)
{
if (disposing)
{
// free managed resources
this.textReader.Dispose();
}
}
IEnumerator IEnumerable.GetEnumerator()
{
return this.GetEnumerator();
}
public IEnumerator<T> GetEnumerator()
{
return this;
}
public bool MoveNext()
{
this.ReadNextLine();
if (this.line == null && (this.line = this.textReader.ReadLine()) == null)
{
this.record = default(T);
}
else
{
this.record = new T();
Type recordType = typeof(T);
List<Action<T, String>> setters;
if (!this.allSetters.TryGetValue(recordType, out setters))
{
setters = this.CreateSetters();
this.allSetters[recordType] = setters;
}
var fieldValues = new string[setters.Count];
for (int i = 0 ; i < setters.Count ; i++)
{
fieldValues[i] = this.ParseField();
if (this.curChar == this.fieldSeparator)
this.NextChar();
else
break;
}
for (int i = 0 ; i < setters.Count ; i++)
{
var setter = setters[i];
if (setter != null)
{
setter(this.record, fieldValues[i]);
}
}
}
return (this.record != null);
}
public void Reset()
{
throw new NotImplementedException("Cannot reset CsvFileReader enumeration.");
}
private static Action<T, string> EmitSetValueAction(MemberInfo mi, Func<string, object> func)
{
ParameterExpression paramExpObj = Expression.Parameter(typeof(object), "obj");
ParameterExpression paramExpT = Expression.Parameter(typeof(T), "instance");
{
var pi = mi as PropertyInfo;
if (pi != null)
{
#if !NET_3_5
if (CsvFile.UseLambdas)
{
var callExpr = Expression.Call(
paramExpT,
pi.GetSetMethod(),
Expression.ConvertChecked(paramExpObj, pi.PropertyType));
var setter = Expression.Lambda<Action<T, object>>(
callExpr,
paramExpT,
paramExpObj).Compile();
return (o, s) => setter(o, func(s));
}
#endif
return (o, v) => pi.SetValue(o, (object)func(v), null);
}
}
{
var fi = mi as FieldInfo;
if (fi != null)
{
#if !NET_3_5
if (CsvFile.UseLambdas)
{
//ParameterExpression valueExp = Expression.Parameter(typeof(string), "value");
var valueExp = Expression.ConvertChecked(paramExpObj, fi.FieldType);
// Expression.Property can be used here as well
MemberExpression fieldExp = Expression.Field(paramExpT, fi);
BinaryExpression assignExp = Expression.Assign(fieldExp, valueExp);
var setter = Expression.Lambda<Action<T, object>>
(assignExp, paramExpT, paramExpObj).Compile();
return (o, s) => setter(o, func(s));
}
#endif
return ((o, v) => fi.SetValue(o, func(v)));
}
}
throw new NotImplementedException();
}
private static Action<T, string> FindSetter(string c, bool staticMember)
{
var flags = BindingFlags.Public | BindingFlags.NonPublic |
BindingFlags.IgnoreCase | (staticMember ? BindingFlags.Static : BindingFlags.Instance);
Action<T, string> action = null;
PropertyInfo pi = typeof(T).GetProperty(c, flags);
if (pi != null)
{
var pFunc = StringToObject(pi.PropertyType);
action = EmitSetValueAction(pi, pFunc);
}
FieldInfo fi = typeof(T).GetField(c, flags);
if (fi != null)
{
var fFunc = StringToObject(fi.FieldType);
action = EmitSetValueAction(fi, fFunc);
}
return action;
}
private static Func<string, object> StringToObject(Type propertyType)
{
if (propertyType == typeof(string))
return (s) => s ?? String.Empty;
else if (propertyType == typeof(Int32))
return (s) => String.IsNullOrEmpty(s) ? 0 : Int32.Parse(s);
if (propertyType == typeof(DateTime))
return (s) => String.IsNullOrEmpty(s) ? DateTimeZero : DateTime.Parse(s);
else
throw new NotImplementedException();
}
private List<Action<T, string>> CreateSetters()
{
var list = new List<Action<T, string>>();
for (int i = 0 ; i < this.columns.Length ; i++)
{
string columnName = this.columns[i];
Action<T, string> action = null;
if (columnName.IndexOf(' ') >= 0)
columnName = columnName.Replace(" ", "");
action = FindSetter(columnName, false) ?? FindSetter(columnName, true);
list.Add(action);
}
return list;
}
private void NextChar()
{
if (this.pos < this.len)
{
this.pos++;
this.curChar = this.pos < this.len ? this.line[this.pos] : '\0';
}
}
private void ParseEndOfLine()
{
throw new NotImplementedException();
}
private string ParseField()
{
parseFieldResult.Length = 0;
if (this.line == null || this.pos >= this.len)
return null;
while (this.curChar == ' ' || this.curChar == '\t')
{
this.NextChar();
}
if (this.curChar == this.textQualifier)
{
this.NextChar();
while (this.curChar != 0)
{
if (this.curChar == this.textQualifier)
{
this.NextChar();
if (this.curChar == this.textQualifier)
{
this.NextChar();
parseFieldResult.Append(this.textQualifier);
}
else
return parseFieldResult.ToString();
}
else if (this.curChar == '\0')
{
if (this.line == null)
return parseFieldResult.ToString();
this.ReadNextLine();
}
else
{
parseFieldResult.Append(this.curChar);
this.NextChar();
}
}
}
else
{
while (this.curChar != 0 && this.curChar != this.fieldSeparator &&
this.curChar != '\r' && this.curChar != '\n')
{
parseFieldResult.Append(this.curChar);
this.NextChar();
}
}
return parseFieldResult.ToString();
}
private void ReadHeader(string header)
{
if (header == null)
{
this.ReadNextLine();
}
else
{
// we read the first line from the given header
this.line = header;
this.pos = -1;
this.len = this.line.Length;
this.NextChar();
}
var readColumns = new List<string>();
string columnName;
while ((columnName = this.ParseField()) != null)
{
readColumns.Add(columnName);
if (this.curChar == this.fieldSeparator)
this.NextChar();
else
break;
}
this.columns = readColumns.ToArray();
}
private void ReadNextLine()
{
this.line = this.textReader.ReadLine();
this.pos = -1;
if (this.line == null)
{
this.len = 0;
this.curChar = '\0';
}
else
{
this.len = this.line.Length;
this.NextChar();
}
}
}
public class CsvDefinition
{
public string Header { get; set; }
public char FieldSeparator { get; set; }
public char TextQualifier { get; set; }
public IEnumerable<String> Columns { get; set; }
public string EndOfLine { get; set; }
public CsvDefinition()
{
if (CsvFile.DefaultCsvDefinition != null)
{
FieldSeparator = CsvFile.DefaultCsvDefinition.FieldSeparator;
TextQualifier = CsvFile.DefaultCsvDefinition.TextQualifier;
EndOfLine = CsvFile.DefaultCsvDefinition.EndOfLine;
}
}
}
public class CsvSource
{
public readonly TextReader TextReader;
public static implicit operator CsvSource(CsvFile csvFile)
{
return new CsvSource(csvFile);
}
public static implicit operator CsvSource(string path)
{
return new CsvSource(path);
}
public static implicit operator CsvSource(TextReader textReader)
{
return new CsvSource(textReader);
}
public CsvSource(TextReader textReader)
{
this.TextReader = textReader;
}
public CsvSource(Stream stream)
{
this.TextReader = new StreamReader(stream);
}
public CsvSource(string path)
{
this.TextReader = new StreamReader(path);
}
public CsvSource(CsvFile csvFile)
{
this.TextReader = new StreamReader(csvFile.BaseStream);
}
}
public class CsvDestination
{
public StreamWriter StreamWriter;
public static implicit operator CsvDestination(string path)
{
return new CsvDestination(path);
}
private CsvDestination(StreamWriter streamWriter)
{
this.StreamWriter = streamWriter;
}
private CsvDestination(Stream stream)
{
this.StreamWriter = new StreamWriter(stream);
}
public CsvDestination(string fullName)
{
FixCsvFileName(ref fullName);
this.StreamWriter = new StreamWriter(fullName);
}
private static void FixCsvFileName(ref string fullName)
{
fullName = Path.GetFullPath(fullName);
var path = Path.GetDirectoryName(fullName);
if (path != null && !Directory.Exists(path))
Directory.CreateDirectory(path);
if (!String.Equals(Path.GetExtension(fullName), ".csv"))
fullName += ".csv";
}
}
public static class CsvFileLinqExtensions
{
public static void ToCsv<T>(this IEnumerable<T> source, CsvDestination csvDestination)
{
source.ToCsv(csvDestination, null);
}
public static void ToCsv<T>(this IEnumerable<T> source,
CsvDestination csvDestination, CsvDefinition csvDefinition)
{
using (var csvFile = new CsvFile<T>(csvDestination, csvDefinition))
{
foreach (var record in source)
{
csvFile.Append(record);
}
}
}
}
public class CsvIgnorePropertyAttribute : Attribute
{
public override string ToString()
{
return "Ignore Property";
}
}
// 2013-11-29 Version 1
// 2014-01-06 Version 2: add CoryLuLu suggestions
}
关注点
在可用时,我使用System.Task
库及其附带的一些高级集合。在多核机器上,速度提升非常显著。
历史
- 2013年11月20日
- 第一版
- 2014年3月21日
- 允许没有标题的文件
- 添加了对任何
textReader
的支持 - 添加了一个使用内存文件和自定义分隔符的示例