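At my office, we use an old third-party tool to handle some data processing and export work. Unfortunately, the tool's output format is very clunky, so to turn it into something meaningful that we can work with, we need an intermediate step between the raw export of the data and whatever we do with it next.
I solved this fairly concisely in Python some time ago using itertools, but for various reasons I need to move that work into an existing C# application.
I have heavily generalized and simplified the sample data (and the corresponding code) that I'm posting here, but it is representative of how the real data is laid out. The raw data the tool spits out looks like this, with some caveats (which I'll explain):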
Zip Code: 11111
First Name: Joe
Last Name: Smith
ID: 1
Phone Number: 555-555-1111
Zip Code: 11111
First Name: John
Last Name: Doe
ID: 2
Phone Number: 555-555-1112
Zip Code: 11111
First Name: Mike
Last Name: Jones
ID: 3
Phone Number: 555-555-1113
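There is no unique delimiter between records. They are simply listed one after another. A valid, actionable record contains all five items ("Zip Code", "First Name", "Last Name", "ID", "Phone Number").
We only need the first/last name, ID, and phone number. Every unique record always begins with a zip code, but because of some quirks in the underlying process and the third-party tool, I need to account for a few things.
My approach to solving this in C# is to start at the first line and check whether it begins with "Zip Code". If it does, I drop into another loop where I build a dictionary of keys and values (splitting on the first ":") until I hit the next "Zip Code" line. It then repeats, rolling through the whole process again while current line < (line count - 5).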
private void CrappilyHandleExportLines(List<string> RawExportLines)
{
    int lineNumber = 0;
    while (lineNumber < (RawExportLines.Count - 5))
    {
        // The lineGroup dict will represent the record we're currently processing
        Dictionary<string, string> lineGroup = new Dictionary<string, string>();
        // If the current line begins with "Zip Code", this means we've reached another record to process
        if (RawExportLines[lineNumber++].StartsWith("Zip Code"))
        {
            // If the line does NOT begin with "Zip Code", we assume it's another part of the record we're already
            // working on. The bounds check keeps the scan from running past the end of the list.
            while (lineNumber < RawExportLines.Count && !RawExportLines[lineNumber].StartsWith("Zip Code"))
            {
                // Append everything except "Error" lines to the record we're working on, as stored in lineGroup
                if (!RawExportLines[lineNumber].StartsWith("Error"))
                {
                    string[] splitLine = RawExportLines[lineNumber].Split(new[] { ":" }, 2, StringSplitOptions.None);
                    lineGroup[splitLine[0].Trim()] = splitLine[1].Trim();
                }
                lineNumber++;
            }
        }
        // Validate the record before continuing. verifyAllKeys is just a method that does a check of the key list
        // against a list of expected keys using Except to make sure all of the items that we require are present.
        // && (not ||) so that "Phone Number" is only looked up once we know the key exists.
        if (verifyAllKeys(new List<string>(lineGroup.Keys)) && (lineGroup["Phone Number"] != "(n/a)"))
        {
            // The record is good! Now we can do something with it:
            WorkOnProcessedRecord(lineGroup);
        }
    }
}
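The verifyAllKeys helper mentioned in the comments is not shown in the post. A minimal sketch of what such a check might look like, assuming the required keys are the four fields that end up in lineGroup (the "Zip Code" line itself is consumed before the dictionary is filled). The class name RecordChecks and the RequiredKeys array are hypothetical, not taken from the original code:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class RecordChecks
{
    // Hypothetical list of keys a record must contain to be usable.
    private static readonly string[] RequiredKeys =
        { "First Name", "Last Name", "ID", "Phone Number" };

    // Returns true when none of the required keys is missing from `keys`.
    public static bool VerifyAllKeys(IEnumerable<string> keys)
    {
        // Except yields the required keys that do not appear in `keys`;
        // an empty result means every required key is present.
        return !RequiredKeys.Except(keys).Any();
    }
}
```

Extra keys in the record do not affect the result, since only missing required keys are counted.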
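This works (at least from my initial testing). The problem is that I really don't like this code. I know there is a better way to do it, but I'm not as strong in C# as I'd like to be, so I think I'm missing some approaches that would let me get the desired result more elegantly and safely.
Can anyone lend a hand and point me in the right direction for implementing a better solution? Thanks!
I would try a factory-like pattern to tackle this in a more object-oriented way.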
// Define a class to hold all people we get, which might be empty or have problems in them.
public class PersonText
{
    public string FirstName { get; set; }
    public string LastName { get; set; }
    public string PhoneNumber { get; set; }
    public string ID { get; set; }
    public string ZipCode { get; set; }
    public bool Error { get; set; }
    public bool Anything { get; set; }
}
// A class to hold a key ("First Name"), and a way to set the respective item on the PersonText class correctly.
public class PersonItemGetSets
{
    public string Key { get; }
    public Func<PersonText, string> Getter { get; }
    public Action<PersonText, string> Setter { get; }

    public PersonItemGetSets(string key, Action<PersonText, string> setter, Func<PersonText, string> getter)
    {
        Getter = getter;
        Key = key;
        Setter = setter;
    }
}
// This will get people from the lines of text
public static IEnumerable<PersonText> GetPeople(IEnumerable<string> lines)
{
    var itemGetSets = new List<PersonItemGetSets>()
    {
        new PersonItemGetSets("First Name", (p, s) => p.FirstName = s, p => p.FirstName),
        new PersonItemGetSets("Last Name", (p, s) => p.LastName = s, p => p.LastName),
        new PersonItemGetSets("Phone Number", (p, s) => p.PhoneNumber = s, p => p.PhoneNumber),
        new PersonItemGetSets("ID", (p, s) => p.ID = s, p => p.ID),
        new PersonItemGetSets("Zip Code", (p, s) => p.ZipCode = s, p => p.ZipCode),
    };

    foreach (var person in GetRawPeople(lines, itemGetSets, "Error"))
    {
        if (IsValidPerson(person, itemGetSets))
            yield return person;
    }
}
// Used to determine if a PersonText is valid and if it is worth processing.
private static bool IsValidPerson(PersonText p, IReadOnlyList<PersonItemGetSets> itemGetSets)
{
    if (itemGetSets.Any(x => x.Getter(p) == null))
        return false;
    if (p.Error)
        return false;
    if (!p.Anything)
        return false;
    if (p.PhoneNumber.Length != 12) // "555-555-5555".Length = 12
        return false;
    return true;
}
// Read through each line, and return all potential people, but don't validate whether they're correct at this time.
private static IEnumerable<PersonText> GetRawPeople(IEnumerable<string> lines, IReadOnlyList<PersonItemGetSets> itemGetSets, string errorToken)
{
    var person = new PersonText();
    foreach (var line in lines)
    {
        var parts = line.Split(':');
        bool valid = false;
        if (parts.Length == 2)
        {
            var left = parts[0];
            var right = parts[1].Trim();
            foreach (var igs in itemGetSets)
            {
                if (left.Equals(igs.Key, StringComparison.OrdinalIgnoreCase))
                {
                    valid = true;
                    person.Anything = true;
                    // Seeing a key that is already set means a new record has started.
                    if (igs.Getter(person) != null)
                    {
                        yield return person;
                        person = new PersonText();
                    }
                    igs.Setter(person, right);
                }
            }
        }
        else if (parts.Length == 1)
        {
            if (parts[0].Trim().Equals(errorToken, StringComparison.OrdinalIgnoreCase))
            {
                person.Error = true;
            }
        }
        // An unrecognized line ends the record in progress, if there is one.
        if (!valid)
        {
            if (person.Anything)
            {
                yield return person;
                person = new PersonText();
            }
            continue;
        }
    }
    if (person.Anything)
        yield return person;
}
See the code working here: https://dotnetfiddle.net/xVnATX
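If the full class-based approach feels heavier than needed, the grouping step on its own can be written as a small iterator, closer in spirit to the itertools version the question mentions. This is only a sketch under a few assumptions: every useful line has the form "Key: Value", a record starts at each "Zip Code" line, and lines without a colon (such as a bare "Error" line) can simply be skipped. The names RecordGrouping and GroupRecords are made up for illustration, and validation is left to the caller:

```csharp
using System;
using System.Collections.Generic;

public static class RecordGrouping
{
    // Splits a flat sequence of "Key: Value" lines into one dictionary per record.
    // A new record begins at every line whose key equals startKey.
    public static IEnumerable<Dictionary<string, string>> GroupRecords(
        IEnumerable<string> lines, string startKey = "Zip Code")
    {
        Dictionary<string, string> current = null;
        foreach (var line in lines)
        {
            // Split on the first ':' only, so values that contain ':' stay intact.
            var parts = line.Split(new[] { ':' }, 2);
            if (parts.Length != 2)
                continue; // not a "Key: Value" line, skipped in this sketch

            var key = parts[0].Trim();
            if (key == startKey)
            {
                // Hand back the finished record before starting the next one.
                if (current != null)
                    yield return current;
                current = new Dictionary<string, string>();
            }

            // Lines that appear before the first start line are ignored.
            if (current != null)
                current[key] = parts[1].Trim();
        }

        // Emit the last record, which has no following start line to close it.
        if (current != null)
            yield return current;
    }
}
```

Each dictionary returned by GroupRecords can then be checked for the required keys and a usable phone number before it is processed, so incomplete records are filtered out in one place rather than inside the parsing loop.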