Multi Line Record

September 02, 2021

I was solving a puzzle at https://adventofcode.com/2020/day/4 and needed to read multiple lines to build each record (a passport).

After reading the file with System.IO.File.ReadLines, I found a bug: the last record was missing.

Example input:

ecl:gry pid:860033327 eyr:2020 hcl:#fffffd
byr:1937 iyr:2017 cid:147 hgt:183cm

iyr:2013 ecl:amb cid:350 eyr:2023 pid:028048884
hcl:#cfa07d byr:1929

hcl:#ae17e1 iyr:2013
eyr:2024
ecl:brn pid:760753108 byr:1931
hgt:179cm

hcl:#cfa07d eyr:2025 pid:166559648
iyr:2011 ecl:brn hgt:59in

Here’s the initial function. It reads the text above but returns only three of the four passport records.

        private IEnumerable<...> GetPassports()
        {
            var passport = new ...;

            foreach (var line in File.ReadLines(@".\2020\day4\test.txt"))
            {
                if (line.Length == 0) // empty line separates record
                {
                    yield return passport;
                    passport = new ...;
                    continue;
                }

                // parse line to build passport
            }
        }

The code emits a passport whenever it reads a blank line. The last record, however, has no blank line after it, so it is dropped. The fix looks simple: add one more yield return after the loop.

        private IEnumerable<...> GetPassports()
        {
            var passport = new ...;

            foreach (var line in File.ReadLines(@".\2020\day4\test.txt"))
            {
                if (line.Length == 0) // empty line separates record
                {
                    yield return passport;
                    passport = new ...;
                    continue;
                }

                // parse line to build passport
            }

            yield return passport; // poor fix... I introduced a bug.
        }

However, the change introduces a new bug: when the source contains no records, the function returns one empty passport instead of an empty sequence (what Enumerable.Empty<T>() would give).

This wasn’t an issue for the puzzle, because the input always has at least one record. Still, I wanted the function to be robust. It would also be nice not to have to remember the yield return in two places.
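
To make the empty-input case concrete, here is a minimal sketch of the same function shape. It makes two assumptions that are not in the original code: it takes the lines as a parameter instead of reading a file, and it models a passport as the list of its raw lines. The class name ExtraYieldDemo is hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ExtraYieldDemo
{
    // Same structure as the function above, with the trailing yield return.
    // Assumption: a passport is just the list of its raw lines.
    public static IEnumerable<List<string>> GetPassports(IEnumerable<string> lines)
    {
        var passport = new List<string>();

        foreach (var line in lines)
        {
            if (line.Length == 0) // empty line separates records
            {
                yield return passport;
                passport = new List<string>();
                continue;
            }

            passport.Add(line);
        }

        yield return passport; // runs even when the loop body never executed
    }
}
```

Calling ExtraYieldDemo.GetPassports(Enumerable.Empty<string>()).ToList() gives a list with one element, and that element is an empty passport. A caller counting records would see 1 where 0 is expected.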

I reverted the change.

Then there were two likely options to pursue:

  1. control the input file

Always end the file with a blank line. That is, make the separator mandatory: every record, including the last one, must be followed by a blank line.

  2. append the separator in code

Instead of changing the file, I can append the record separator (a blank line) in code, so the loop always sees one at the end of the input.

        private IEnumerable<...> GetPassports()
        {
            var passport = new ...;

            foreach (var line in
                File.ReadLines(@".\2020\day4\test.txt")
                .Concat(new string[] { string.Empty })) // better, but can be improved
            {
                if (line.Length == 0) // empty line separates record
                {
                    yield return passport;
                    passport = new ...;
                    continue;
                }

                // parse line to build passport
            }
        }

More edge cases to ponder.

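  • What if the ending is inconsistent? Some inputs may already end with a blank line, in which case appending another one produces an extra, empty record. Perhaps a better version collects the lines of each record and ignores any extra blank line during parsing.
  • This problem is specific to reading records that span several lines of a file: reaching EOF simply ends the enumeration and exits the loop, with no separator to signal that the last record is complete. By contrast, when a single line holds a whole record, parsing that line gives you all of its data from beginning to end.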
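
One way to cover these edge cases is to emit a record only when it actually holds data. The following is a sketch under stated assumptions, not the original solution: the lines are passed in as a parameter, a passport is modeled as a Dictionary of field name to value, and the names PassportReader and the "key:value" parsing are illustrative.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PassportReader
{
    // Assumption: a passport is a map of field name -> value.
    public static IEnumerable<Dictionary<string, string>> GetPassports(IEnumerable<string> lines)
    {
        var passport = new Dictionary<string, string>();

        // Append one separator so the last record is flushed by the same branch.
        foreach (var line in lines.Concat(new[] { string.Empty }))
        {
            if (string.IsNullOrWhiteSpace(line)) // blank line separates records
            {
                if (passport.Count > 0) // skip leading, repeated, or trailing blanks
                {
                    yield return passport;
                    passport = new Dictionary<string, string>();
                }
                continue;
            }

            // Parse whitespace-separated "key:value" pairs into the current record.
            foreach (var pair in line.Split(new[] { ' ', '\t' }, StringSplitOptions.RemoveEmptyEntries))
            {
                var parts = pair.Split(new[] { ':' }, 2);
                passport[parts[0]] = parts.Length > 1 ? parts[1] : string.Empty;
            }
        }
    }
}
```

With this shape, PassportReader.GetPassports(File.ReadLines(path)) behaves the same whether or not the file ends with a blank line, and an empty file produces an empty sequence. The yield return still lives in only one place.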

© 2023 Kee Nam