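North American Phone Number Format Provider (IFormatProvider Implementation)
An implementation of the IFormatProvider interface that formats "dirty data" phone numbers into a uniform string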
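Introduction
In many of my identity management (IdM) projects, I have faced the "dirty data" dilemma. The term "dirty data" describes erroneous or misleading data in a data source.
Self-service data sources (web portals, phone directories, and so on) are the biggest source of inconsistent data entry. That is understandable: wherever users are allowed to edit their own data by hand, with few guidelines and little validation, inconsistencies follow.
Identity management professionals face many kinds of user-supplied data; one of the data types most commonly "outsourced" to end users is the user's phone number. Eventually this data becomes available to every synchronized data source, which can cause processing difficulties if and when an application expects a more consistent format.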
Problem Addressed by This Solution
Reformatting and normalizing irregularly entered or stored North American phone numbers.
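How This Helps
Having uniform phone numbers allows a programmer, system administrator, or any other IT professional to store clean data in the receiving data source.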
Using the Code
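In North America, we are fortunate (from a programmer's point of view) to have a unified telephone numbering plan, the North American Numbering Plan (NANP), which makes phone numbers relatively easy to parse. This article covers only North American phone number formats and does not attempt to parse the formats of any other telephone system. Applying this custom format provider directly to other kinds of phone numbers may produce unpredictable results. You can, however, extend the code by adding methods that recognize the phone number formats of a local telephone system. A good example is the French telephone system, which is consistent in its numbering rules and can therefore be handled by a format provider relatively easily.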
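As an illustration of such an extension, here is a minimal sketch for French national numbers. It is not part of the article's provider: the class and method names are hypothetical, and it assumes only the common national form (ten digits starting with 0, written as five pairs). Numbers carrying an international prefix such as +33 are deliberately left untouched.

```csharp
using System;
using System.Text.RegularExpressions;

public static class FrenchPhoneSketch
{
    // Hypothetical helper, not part of LafiPhoneFormatProvider.
    // Assumption: a national French number has ten digits and starts with 0.
    public static string FormatFrenchPhone(string value)
    {
        if (value == null)
        {
            return null;
        }

        // Keep ASCII digits only.
        string digits = Regex.Replace(value, "[^0-9]", string.Empty);
        if (digits.Length != 10 || digits[0] != '0')
        {
            // Not recognized: return the input unchanged, as the
            // North American formatter does for unknown lengths.
            return value;
        }

        // Group the ten digits into five pairs separated by spaces.
        return string.Join(
            " ",
            digits.Substring(0, 2),
            digits.Substring(2, 2),
            digits.Substring(4, 2),
            digits.Substring(6, 2),
            digits.Substring(8, 2));
    }

    public static void Main()
    {
        Console.WriteLine(FormatFrenchPhone("01.23.45.67.89")); // 01 23 45 67 89
    }
}
```

Returning unrecognized input unchanged mirrors the behavior of the North American formatter, so a caller can still compare input and output to see whether anything was modified.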
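How the Code Actually Works
Implementing the IFormatProvider interface is well documented on MSDN. This particular application of IFormatProvider works with several predefined "codes" that select among the desired output formats of the string.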
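The general pattern can be sketched with a much smaller provider. The class below is hypothetical and understands a single made-up code, "n" (digits only); it is meant only to show how string.Format hands the text after the colon in "{0:n}" to ICustomFormatter.Format.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical provider for illustration; not the article's class.
public sealed class DigitsOnlyFormatProvider : IFormatProvider, ICustomFormatter
{
    // string.Format asks the provider for an ICustomFormatter;
    // returning 'this' routes each format item to Format below.
    public object GetFormat(Type formatType)
    {
        return formatType == typeof(ICustomFormatter) ? this : null;
    }

    // 'format' is the code after the colon, or null for a plain "{0}".
    public string Format(string format, object arg, IFormatProvider formatProvider)
    {
        string text = arg == null ? string.Empty : arg.ToString();
        if (string.Equals(format, "n", StringComparison.OrdinalIgnoreCase))
        {
            // Strip everything except ASCII digits.
            return Regex.Replace(text, "[^0-9]", string.Empty);
        }

        // Unknown or missing code: return the text as-is.
        return text;
    }
}

public static class DigitsOnlyDemo
{
    public static void Main()
    {
        string result = string.Format(
            new DigitsOnlyFormatProvider(), "{0:n}", "(555) 123-4567");
        Console.WriteLine(result); // 5551234567
    }
}
```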
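Formats Understood
{0:a}
Example: 1-555-563-3434 (hyphenated)
{0:c}
Example: 15555633434 (digits only)
{0:d}
Example: +1 (555) 123-4567 (default)
{0:de}
Example: +1 (555) 563-3434 Ext. 5555 (default with extension)
{0:e}
Example: 1-555-563-3434 Ext. 5555 (hyphenated with extension)
{0:s}
Example: 1 555 563 3434 (spaces)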
The code below demonstrates how to use the Lost and Found Identity phone formatter.
using LostAndFoundIdentity.Text;

namespace Console
{
    class Program
    {
        static void Main(string[] args)
        {
            // Sample inputs in a variety of "dirty" formats.
            string[] values = new string[] {
                "1 555 123-4567",
                "555 123-4567",
                "1234567",
                "+1 555 543-22-34",
                "1(555)5633-434",
                "555-5555 ext55",
                "1 564 6654634 ex 5555"
            };

            foreach (string value in values)
            {
                System.Console.WriteLine("Input string: " + value);

                // Format the same input with each supported format code.
                string resultA = string.Format(new LafiPhoneFormatProvider(), "{0:a}", value);
                System.Console.WriteLine("{:a} " + resultA + "\t\t" + "isModified: " +
                    !value.Equals(resultA));

                string resultC = string.Format(new LafiPhoneFormatProvider(), "{0:c}", value);
                System.Console.WriteLine("{:c} " + resultC + "\t\t" + "isModified: " +
                    !value.Equals(resultC));

                string resultD = string.Format(new LafiPhoneFormatProvider(), "{0:d}", value);
                System.Console.WriteLine("{:d} " + resultD + "\t\t" + "isModified: " +
                    !value.Equals(resultD));

                string resultDE = string.Format(new LafiPhoneFormatProvider(), "{0:de}", value);
                System.Console.WriteLine("{:de} " + resultDE + "\t\t" + "isModified: " +
                    !value.Equals(resultDE));

                string resultE = string.Format(new LafiPhoneFormatProvider(), "{0:e}", value);
                System.Console.WriteLine("{:e} " + resultE + "\t\t" + "isModified: " +
                    !value.Equals(resultE));

                string resultS = string.Format(new LafiPhoneFormatProvider(), "{0:s}", value);
                System.Console.WriteLine("{:s} " + resultS + "\t\t" + "isModified: " +
                    !value.Equals(resultS));

                System.Console.WriteLine("- - - - - ");
            }
        }
    }
}
Resulting Output
Input string: 1 555 123-4567
{:a} 1-555-123-4567 isModified: True
{:c} 15551234567 isModified: True
{:d} +1 (555) 123-4567 isModified: True
{:de} +1 (555) 123-4567 isModified: True
{:e} 1-555-123-4567 isModified: True
{:s} 1 555 123 4567 isModified: True
- - - - -
Input string: 555 123-4567
{:a} 555-123-4567 isModified: True
{:c} 5551234567 isModified: True
{:d} (555) 123-4567 isModified: True
{:de} (555) 123-4567 isModified: True
{:e} 555-123-4567 isModified: True
{:s} 555 123 4567 isModified: True
- - - - -
Input string: 1234567
{:a} 123-4567 isModified: True
{:c} 1234567 isModified: False
{:d} 123-4567 isModified: True
{:de} 123-4567 isModified: True
{:e} 123-4567 isModified: True
{:s} 123 4567 isModified: True
- - - - -
Input string: +1 555 543-22-34
{:a} 1-555-543-2234 isModified: True
{:c} 15555432234 isModified: True
{:d} +1 (555) 543-2234 isModified: True
{:de} +1 (555) 543-2234 isModified: True
{:e} 1-555-543-2234 isModified: True
{:s} 1 555 543 2234 isModified: True
- - - - -
Input string: 1(555)5633-434
{:a} 1-555-563-3434 isModified: True
{:c} 15555633434 isModified: True
{:d} +1 (555) 563-3434 isModified: True
{:de} +1 (555) 563-3434 isModified: True
{:e} 1-555-563-3434 isModified: True
{:s} 1 555 563 3434 isModified: True
- - - - -
Input string: 555-5555 ext55
{:a} 555-5555 ext55 isModified: False
{:c} 555-5555 ext55 isModified: False
{:d} 555-5555 ext55 isModified: False
{:de} 555-5555 Ext. 55 isModified: True
{:e} 555-5555 Ext. 55 isModified: True
{:s} 555-5555 ext55 isModified: False
- - - - -
Input string: 1 564 6654634 ex 5555
{:a} 1 564 6654634 ex 5555 isModified: False
{:c} 1 564 6654634 ex 5555 isModified: False
{:d} 1 564 6654634 ex 5555 isModified: False
{:de} +1 (564) 665-4634 Ext. 5555 isModified: True
{:e} 1-564-665-4634 Ext. 5555 isModified: True
{:s} 1 564 6654634 ex 5555 isModified: False
- - - - -
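In the output above, the two inputs that carry an extension come back unchanged under a, c, d, and s. Those formats do not strip the extension, so its digits are counted together with the phone number, and the total is no longer 7, 10, or 11; FormatPhone then falls through to its default branch and returns the input as-is. The sketch below (the class and method names are hypothetical) reproduces that digit count:

```csharp
using System;
using System.Text.RegularExpressions;

public static class DigitCountSketch
{
    // Counts the digits left after stripping every non-digit character,
    // which is the quantity FormatPhone switches on.
    public static int CountDigits(string value)
    {
        return Regex.Replace(value, "\\D", string.Empty).Length;
    }

    public static void Main()
    {
        Console.WriteLine(CountDigits("555-5555 ext55"));        // 9
        Console.WriteLine(CountDigits("1 564 6654634 ex 5555")); // 15
        Console.WriteLine(CountDigits("+1 555 543-22-34"));      // 11
    }
}
```

Only the de and e formats remove the extension before counting, which is why they are the ones that modify these inputs.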
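Points of Interest
The "Lost And Found Identity" phone format provider is implemented as an IFormatProvider, which allows it to be used in many applications beyond my original intent: an add-on for Microsoft Identity Lifecycle Manager 2007 / Forefront Identity Manager 2010. I can see this format provider being easily adopted for PowerShell data processing or any other data processing or "scrubbing" task.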
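What Happens in the Code Snippet?
The snippet calls the LafiPhoneFormatProvider class (provided below), which is responsible for recognizing the user's input and the requested format, and for reformatting the supplied string.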
The LafiPhoneFormatProvider Class
//-----------------------------------------------------------------------
// <copyright file="LafiPhoneFormatProvider.cs" company="LostAndFoundIdentity">
// Copyright (c) 2007 LostAndFoundIdentity.com | All rights reserved.
// </copyright>
// <author>Dmitry Kazantsev</author>
//-----------------------------------------------------------------------
[assembly: System.CLSCompliant(true)]
namespace LostAndFoundIdentity.Text
{
    using System;
    using System.Diagnostics.CodeAnalysis;
    using System.Globalization;
    using System.Text.RegularExpressions;

    /// <summary>
    /// Represents the LafiPhoneFormatProvider class, which implements
    /// IFormatProvider and ICustomFormatter
    /// </summary>
    [SuppressMessage("Microsoft.Naming", "CA1704:IdentifiersShouldBeSpelledCorrectly",
        MessageId = "Lafi", Justification = "'Lafi' stands for Lost And Found Identity")]
    public class LafiPhoneFormatProvider : ICustomFormatter, IFormatProvider
    {
        /// <summary>
        /// Regular expression "formula" designed to catch phone number extensions
        /// </summary>
        private const string ExtensionFormula =
            "((\\s{1,2})?(e|ext|ex|extn|extension|x)(\\.)?(\\s{1,2})?)(\\d+)";

        /// <summary>
        /// Gets type of the format
        /// </summary>
        /// <param name="formatType">Format in question</param>
        /// <returns>Type of the format in question</returns>
        public object GetFormat(Type formatType)
        {
            if (formatType == typeof(ICustomFormatter))
            {
                return this;
            }
            else
            {
                return null;
            }
        }

        /// <summary>
        /// Formats provided string with a pre-defined template
        /// </summary>
        /// <param name="format">Name of the format presented as
        /// {0:x}, where 'x' can be 'a', 'c', 'd',
        /// 'de', 'e', or 's' </param>
        /// <param name="arg">Value to be formatted</param>
        /// <param name="formatProvider">The format provider class</param>
        /// <returns>Formatted string</returns>
        public string Format(string format, object arg, IFormatProvider formatProvider)
        {
            // A null argument has nothing to format.
            if (arg == null)
            {
                return string.Empty;
            }

            // Convert argument to a string.
            string result = arg.ToString();

            // No format code was supplied (for example "{0}"): return the value
            // unchanged. This check must come before ToUpperInvariant(), which
            // would otherwise throw on a null format.
            if (format == null)
            {
                return result;
            }

            switch (format.ToUpperInvariant())
            {
                case "A":
                    {
                        return FormatPhone(result, "-");
                    }
                case "C":
                    {
                        return FormatPhone(result, string.Empty);
                    }
                case "D": // Default
                    {
                        return FormatPhone(result);
                    }
                case "DE": // Default + Extension
                    {
                        if (HasExtension(result))
                        {
                            string extension = GetExtension(result);
                            string phone = SubstructExtension(result);
                            phone = FormatPhone(phone);
                            phone = string.Format(CultureInfo.CurrentCulture,
                                "{0} Ext. {1}", phone, extension);
                            return phone;
                        }
                        return FormatPhone(result);
                    }
                case "E": // Extension
                    {
                        if (HasExtension(result))
                        {
                            string extension = GetExtension(result);
                            string phone = SubstructExtension(result);
                            phone = FormatPhone(phone, "-");
                            phone = string.Format(CultureInfo.CurrentCulture,
                                "{0} Ext. {1}", phone, extension);
                            return phone;
                        }
                        return FormatPhone(result, "-");
                    }
                case "S": // Space
                    {
                        return FormatPhone(result, " ");
                    }
                default:
                    {
                        throw new FormatException(
                            "'" + format + "' is not a supported format type.");
                    }
            }
        }

        /// <summary>
        /// Formats string representation of North American telephone number;
        /// Inserts provided separator
        /// </summary>
        /// <param name="value">String containing North American telephone number</param>
        /// <param name="separator">String containing separator character</param>
        /// <returns>xxx.xxxx or xxx.xxx.xxxx or x.xxx.xxx.xxxx</returns>
        private static string FormatPhone(string value, string separator)
        {
            string tempString = GetNumericValue(value);
            string countryCode = string.Empty;
            string areaCode = string.Empty;
            string firstThree = string.Empty;
            string lastFour = string.Empty;
            switch (tempString.Length)
            {
                case 7: //// nnn.nnnn
                    {
                        firstThree = tempString.Substring(0, 3);
                        lastFour = tempString.Substring(3, 4);
                        return string.Format(CultureInfo.CurrentCulture, "{0}{2}{1}",
                            firstThree, lastFour, separator);
                    }
                case 10: //// nnn.nnn.nnnn
                    {
                        areaCode = tempString.Substring(0, 3);
                        firstThree = tempString.Substring(3, 3);
                        lastFour = tempString.Substring(6, 4);
                        return string.Format(CultureInfo.CurrentCulture,
                            "{0}{3}{1}{3}{2}", areaCode, firstThree,
                            lastFour, separator);
                    }
                case 11: //// n.nnn.nnn.nnnn
                    {
                        countryCode = tempString.Substring(0, 1);
                        areaCode = tempString.Substring(1, 3);
                        firstThree = tempString.Substring(4, 3);
                        lastFour = tempString.Substring(7, 4);
                        return string.Format(CultureInfo.CurrentCulture,
                            "{0}{4}{1}{4}{2}{4}{3}", countryCode, areaCode, firstThree,
                            lastFour, separator);
                    }
                default:
                    {
                        // Unrecognized digit count: return the input unchanged.
                        return value;
                    }
            }
        }

        /// <summary>
        /// Formats a string representing a North American phone number in
        /// the "default" format
        /// </summary>
        /// <param name="value">The "phone number" to be formatted</param>
        /// <returns>Formatted phone number</returns>
        private static string FormatPhone(string value)
        {
            string tempString = GetNumericValue(value);
            string countryCode = string.Empty;
            string areaCode = string.Empty;
            string firstThree = string.Empty;
            string lastFour = string.Empty;
            switch (tempString.Length)
            {
                case 7: //// nnn-nnnn
                    {
                        firstThree = tempString.Substring(0, 3);
                        lastFour = tempString.Substring(3, 4);
                        return string.Format(CultureInfo.CurrentCulture, "{0}-{1}",
                            firstThree, lastFour);
                    }
                case 10: //// (nnn) nnn-nnnn
                    {
                        areaCode = tempString.Substring(0, 3);
                        firstThree = tempString.Substring(3, 3);
                        lastFour = tempString.Substring(6, 4);
                        return string.Format(CultureInfo.CurrentCulture,
                            "({0}) {1}-{2}", areaCode, firstThree, lastFour);
                    }
                case 11: //// +n (nnn) nnn-nnnn
                    {
                        countryCode = tempString.Substring(0, 1);
                        areaCode = tempString.Substring(1, 3);
                        firstThree = tempString.Substring(4, 3);
                        lastFour = tempString.Substring(7, 4);
                        return string.Format(CultureInfo.CurrentCulture,
                            "+{0} ({1}) {2}-{3}", countryCode, areaCode, firstThree,
                            lastFour);
                    }
                default:
                    {
                        // Unrecognized digit count: return the input unchanged.
                        return value;
                    }
            }
        }

        /// <summary>
        /// Strips all non-numerical characters from provided string
        /// </summary>
        /// <param name="value">String containing North American telephone number</param>
        /// <returns>Numerical values of provided string</returns>
        private static string GetNumericValue(string value)
        {
            // Remove every non-digit character in a single pass.
            return Regex.Replace(value, "[\\D]", string.Empty);
        }

        /// <summary>
        /// Determines whether provided string contains a phone "extension" and
        /// returns numerical value of that extension
        /// </summary>
        /// <param name="value">Phone number with extension</param>
        /// <returns>Telephone number extension</returns>
        private static string GetExtension(string value)
        {
            Regex extension = new Regex(ExtensionFormula,
                RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);
            MatchCollection matches = extension.Matches(value);
            // Group 6 of ExtensionFormula captures the extension digits.
            if (0 == matches.Count || string.IsNullOrEmpty(matches[0].Groups[6].Value))
            {
                return string.Empty;
            }
            return matches[0].Groups[6].Value;
        }

        /// <summary>
        /// Removes the phone "extension" (keyword and digits) from the
        /// provided string
        /// </summary>
        /// <param name="value">Phone number with extension</param>
        /// <returns>Telephone number without the extension</returns>
        private static string SubstructExtension(string value)
        {
            Regex extension = new Regex(ExtensionFormula,
                RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);
            value = extension.Replace(value, string.Empty);
            return value;
        }

        /// <summary>
        /// Determines whether the string in question contains the phone number
        /// with extension or not
        /// </summary>
        /// <param name="value">The phone number in question</param>
        /// <returns>'True' when extension is found and 'false'
        /// when it is not found</returns>
        private static bool HasExtension(string value)
        {
            Regex extension = new Regex(ExtensionFormula,
                RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);
            return extension.IsMatch(value);
        }
    }
}
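The class builds a new Regex in each of its three extension helpers. As a possible alternative, the sketch below does the detection, extraction, and removal with one shared Regex. It is an illustration rather than the author's design: the class name, the TrySplit method, and the named group "ext" are hypothetical, and the pattern is a rewrite of ExtensionFormula that is assumed to match the same inputs.

```csharp
using System;
using System.Text.RegularExpressions;

public static class ExtensionSketch
{
    // Same alternatives as ExtensionFormula, with the extension digits
    // captured in a named group instead of numbered group 6.
    private static readonly Regex Extension = new Regex(
        @"\s{0,2}(?:e|ext|ex|extn|extension|x)\.?\s{0,2}(?<ext>\d+)",
        RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

    // Splits 'value' into the phone part and the extension digits.
    // Returns false (and the input unchanged) when no extension is found.
    public static bool TrySplit(string value, out string phone, out string extension)
    {
        Match match = Extension.Match(value);
        if (!match.Success)
        {
            phone = value;
            extension = string.Empty;
            return false;
        }

        phone = Extension.Replace(value, string.Empty);
        extension = match.Groups["ext"].Value;
        return true;
    }

    public static void Main()
    {
        string phone;
        string extension;
        if (TrySplit("1 564 6654634 ex 5555", out phone, out extension))
        {
            Console.WriteLine(phone);     // 1 564 6654634
            Console.WriteLine(extension); // 5555
        }
    }
}
```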
History
- June 15, 2009: Initial release