Beautifying Loops

August 16, 2020

Loops can be difficult to maintain because their state changes on every iteration.

A typical loop involves the following:

  1. set initial state
  2. check state condition
  3. do something with the state
  4. set state for the next operation
  5. jump to step 2
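
To make the five steps concrete before moving to file reading, here is a minimal sketch that labels each step in a plain while loop. The names LoopSteps and SumUpTo are made up for illustration and are not part of the article's code.

```csharp
using System;

public static class LoopSteps
{
    // Hypothetical example: sums the integers from 1 to limit.
    public static int SumUpTo(int limit)
    {
        var total = 0;
        var current = 1;              // 1. set initial state
        while (current <= limit)      // 2. check state condition
        {
            total += current;         // 3. do something with the state
            current++;                // 4. set state for the next operation
        }                             // 5. jump to step 2
        return total;
    }

    public static void Main()
    {
        Console.WriteLine(SumUpTo(4)); // prints 10
    }
}
```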

Let’s see it in action by writing code that reads a file 4KB at a time. The 4KB size is chosen because NTFS volumes are typically formatted with 4KB clusters (the allocation unit size).

    public static class FileExtensions
    {
        /// <summary>
        /// Gets 4KB blocks of the file
        /// </summary>
        /// <param name="filePath">The file path to read file from</param>
        /// <returns>4KB or less data in sequence</returns>
        /// <remarks>
        /// The file is locked until it's completely read or enumeration stops. Full blocks reuse one buffer, so consume each block before requesting the next.
        /// </remarks>
        public static IEnumerable<byte[]> Read4KBBlocks(this string filePath)
        {
            const int bufferSize_4KB = 4 * 1024;
            using (var fileStream = File.OpenRead(filePath))
            {
                var buffer = new byte[bufferSize_4KB];

                var bytesRead = fileStream.Read(buffer, 0, buffer.Length);
                while (bytesRead > 0)
                {
                    if (bytesRead == buffer.Length)
                    {
                        yield return buffer;
                    }
                    else
                    {
                        yield return buffer.Take(bytesRead).ToArray();
                    }
                    bytesRead = fileStream.Read(buffer, 0, buffer.Length);
                }
            }
        }
    }

Focus on the two Read calls at lines 18 and 29 of the listing.

Line 18: the initial state, where the first block is read before the loop starts

Line 29: the next state, read at the end of each iteration before jumping back to the condition

It works, but the state handling can be simpler if the Read operation appears on a single line. One option is to use a do-while loop.

A do-while loop is typically used for an operation that involves retries, but it might help us improve this loop.
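
For comparison, the retry use of do-while might look like the sketch below. RetryExample and TryWithRetry are hypothetical names, and the failing operation is simulated with a counter; the point is only that the body always runs at least once before the condition is checked.

```csharp
using System;

public static class RetryExample
{
    // Hypothetical helper: runs the operation until it succeeds
    // or maxAttempts is reached.
    public static bool TryWithRetry(Func<bool> operation, int maxAttempts)
    {
        var attempts = 0;
        bool succeeded;
        do
        {
            attempts++;
            succeeded = operation();   // the body always runs at least once
        } while (!succeeded && attempts < maxAttempts);
        return succeeded;
    }

    public static void Main()
    {
        var calls = 0;
        // Simulated operation that fails twice and succeeds on the third call.
        var result = TryWithRetry(() => ++calls == 3, maxAttempts: 5);
        Console.WriteLine($"{result} after {calls} calls"); // prints "True after 3 calls"
    }
}
```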

Let’s take a look at Read4KBBlocks rewritten with do-while.

        public static IEnumerable<byte[]> Read4KBBlocks(this string filePath)
        {
            const int bufferSize_4KB = 4 * 1024;
            using (var fileStream = File.OpenRead(filePath))
            {
                var buffer = new byte[bufferSize_4KB];

                int bytesRead;
                do
                {
                    bytesRead = fileStream.Read(buffer, 0, buffer.Length);
                    if (bytesRead == 0)
                    {
                        break;
                    }

                    if (bytesRead == buffer.Length)
                    {
                        yield return buffer;
                    }
                    else
                    {
                        yield return buffer.Take(bytesRead).ToArray();
                    }
                } while (bytesRead > 0);
            }
        }

Yay, we reduced it to a single Read call. However, this raises some other concerns at lines 8, 12, and 25.

  • Line 8: bytesRead is declared outside the loop.
  • Line 12: an additional condition is needed to stop the enumeration when 0 bytes are read, to avoid returning an empty array.
  • Line 25: this condition is redundant; when bytesRead is 0, the break exits the loop before execution reaches it.

Let’s address the additional conditional checks.

        public static IEnumerable<byte[]> Read4KBBlocks(this string filePath)
        {
            const int bufferSize_4KB = 4 * 1024;
            using (var fileStream = File.OpenRead(filePath))
            {
                var buffer = new byte[bufferSize_4KB];

                int bytesRead;
                while ((bytesRead = fileStream.Read(buffer, 0, buffer.Length)) > 0)
                {
                    if (bytesRead == buffer.Length)
                    {
                        yield return buffer;
                    }
                    else
                    {
                        yield return buffer.Take(bytesRead).ToArray();
                    }
                }
            }
        }

Yay, we are left with a single concern: bytesRead is still declared outside the loop (line 8). However, we added a new one:

  • Line 9: the while statement both performs the read operation and checks the condition.

This may not be a concern unless the coding standard (for maintainability) forbids operations inside a loop condition. Still, the line takes longer to read than a simple true/false condition.
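
The same read-inside-the-condition shape also shows up when reading text line by line. The sketch below is an illustration only, with the hypothetical names ReadLineExample and CountLines; it shows how the assignment and the end-of-input check share one line, which is exactly what some coding standards object to.

```csharp
using System;
using System.IO;

public static class ReadLineExample
{
    // Hypothetical example: counts lines. The read and the
    // end-of-input check both live in the loop condition.
    public static int CountLines(TextReader reader)
    {
        var count = 0;
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            count++;
        }
        return count;
    }

    public static void Main()
    {
        using (var reader = new StringReader("first\nsecond\nthird"))
        {
            Console.WriteLine(CountLines(reader)); // prints 3
        }
    }
}
```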

Let’s handle both concerns.

    public static IEnumerable<byte[]> Read4KBBlocks(this string filePath)
    {
        const int bufferSize_4KB = 4 * 1024;
        using (var fileStream = File.OpenRead(filePath))
        {
            var buffer = new byte[bufferSize_4KB];

            while (true)
            {
                var bytesRead = fileStream.Read(buffer, 0, buffer.Length);
                if (bytesRead == 0)
                {
                    break;
                }

                if (bytesRead == buffer.Length)
                {
                    yield return buffer;
                }
                else
                {
                    yield return buffer.Take(bytesRead).ToArray();
                }
            }
        }
    }

The unconditional loop (line 8) may be frowned upon. However, this is beautiful. We have a single operation each for

  • setting the state, at line 10
  • checking the exit condition, at line 11
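
As a rough usage sketch, the final version can be consumed with a foreach. The Program class, the CountBytes helper, and the 10,000-byte temporary file are assumptions made for this example, and the method is repeated so the sketch compiles on its own. Each block is used inside the loop body because full blocks are yielded from the same buffer array.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class FileExtensions
{
    // Final version from the article, repeated so the sketch is self-contained.
    public static IEnumerable<byte[]> Read4KBBlocks(this string filePath)
    {
        const int bufferSize_4KB = 4 * 1024;
        using (var fileStream = File.OpenRead(filePath))
        {
            var buffer = new byte[bufferSize_4KB];

            while (true)
            {
                var bytesRead = fileStream.Read(buffer, 0, buffer.Length);
                if (bytesRead == 0)
                {
                    break;
                }

                if (bytesRead == buffer.Length)
                {
                    yield return buffer;
                }
                else
                {
                    yield return buffer.Take(bytesRead).ToArray();
                }
            }
        }
    }
}

public static class Program
{
    // Hypothetical helper: totals the bytes yielded for a file.
    public static long CountBytes(string filePath)
    {
        long total = 0;
        foreach (var block in filePath.Read4KBBlocks())
        {
            // Use each block before moving on; full blocks reuse the same array.
            total += block.Length;
        }
        return total;
    }

    public static void Main()
    {
        var path = Path.GetTempFileName();
        try
        {
            File.WriteAllBytes(path, new byte[10000]);
            Console.WriteLine(CountBytes(path)); // prints 10000
        }
        finally
        {
            File.Delete(path);
        }
    }
}
```

Storing the yielded arrays (for example with ToList()) would keep several references to the same buffer, so a caller that needs to retain blocks should copy them first.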

The function is available on GitHub: https://github.com/keenam/Codyssey/blob/master/Codyssey.Extensions/FileExtensions.cs


© 2023 Kee Nam