Mixing multiple level of abstractions makes code harder to read, therefore harder to maintain. Having a single level of abstraction per function is one of the techniques that reduce the complexity of the code.

A lot has been already written on this subject, so I am not going to repeat the details (see 1 p.36, 2, 3, etc).

The basic idea is, a function should operate on a single level of abstraction.

I was writing recently a small command line tool, QDVC, that was manipulating a bunch of files, transferring them between the local disk and a remote service. Being a simple console application, its entry point, the Main function (in C#) contained the entire workflow.

As it happens most of the time, the Main function started small, just a few files, download a file from a server and writing to it.

Then, more things were added: local configuration, caching, authentication with the server, various commands (not only download, but upload, status, clean, etc). And the Main function grew wildly until it became hard painful to read.

This explosion of a function is quite common in software development, especially with (very) long living applications. A function starts small, abiding Single Responsibility and other classic principles, then a dev adds a bit more code, a year after that another dev adds another small block, and shortly after someone adds an if or a for increasing just a bit the indentation. The function becomes harder and harder to read, understand and maintain and making other devs contribute the same way.

Here’s where Main got after just a few weeks of working on this little baby:

static async Task<int> Main(string[] args)
{
    SystemContext.Initialize();

    IOContext.Initialize();

    CommandLineArguments Args = CommandLineArguments.ParseUsingSCL(args);

    if (Args.Command == null)
        return 0;

    if (!Args.Paths.Any())
    {
        Console.WriteLine("ERROR: No paths provided.");
        return 2;
    }

    var paths = Args.Paths.Select(Path.GetFullPath);
    var currentDirectory = FileSystem.Directory.GetCurrentDirectory();

    var dvcFolder = DvcCache.FindDvcRootForRepositorySubPath(currentDirectory) ??
                    DvcCache.FindDvcRootForRepositorySubPath(paths.First());

    if (dvcFolder == null)
    {
        Console.WriteLine("ERROR: No DVC repository found.");
        return 3;
    }

    var dvcConfig = DvcConfig.ReadConfigFromFolder(dvcFolder);

    var credentials = Credentials.DetectFrom(Args, dvcConfig);
    if (credentials == null)
    {
        Console.WriteLine("Failed to detect credentials.");
        return 4;
    }
    else
        Console.WriteLine($"Credentials loaded from {credentials.Source}");

    var files = paths.SelectMany(FilesEnumerator.EnumerateFilesFromPath);

    if (files.Any() == false)
    {
        Console.WriteLine("No files found.");
        return 5;
    }

    var cacheDir = dvcConfig.GetCacheDirAbsolutePath();
    var dvcCache = DvcCache.CreateFromFolder(cacheDir) ??
                    DvcCache.CreateFromRepositorySubFolder(files.First());

    Console.WriteLine($"DVC cache folder: {dvcCache?.DvcCacheFolder}");

    using var httpClient = new HttpClient();
    httpClient.Timeout = TimeSpan.FromMinutes(10);

    if (credentials != null)
        httpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Basic", Convert.ToBase64String(Encoding.ASCII.GetBytes($"{credentials.Username}:{credentials.Password}")));

    switch (Args.Command)
    {
        case "pull":
            await new PullCommand(dvcCache, httpClient).ExecuteAsync(files);
            break;
        case "push":
            await new PushCommand(dvcCache, httpClient).ExecuteAsync(files);
            break;
        case "add":
            await new AddCommand(dvcCache).ExecuteAsync(files);
            break;
        case "status":
            await new StatusCommand(dvcCache).ExecuteAsync(files);
            break;
        case "status--repo":
            await new StatusRepoCommand(dvcCache, httpClient).ExecuteAsync(files);
            break;
        case "clean":
            await new CleanCommand(Args.Force).ExecuteAsync(files);
            break;
        default:
            Console.WriteLine($"Invalid command '{Args.Command}'");
            break;
    }

    Console.WriteLine($"Finished in {sw.Elapsed}");

    return 0;
}

Can you spot the components that handle caching, configuration, authentication? Can you tell what are the dependencies of the authentication component?

I wasn’t happy at all with this code and some refactoring was a must.

Code should be easy to read and elegant. And if it’s about putting components to work together, then it should clearly point out the relations between these components.

Refactoring

I’ve already mentioned the purpose of this Main function is to put head-to-head the building blocks (components) that contribute to the applications workflow. Inside them, the components themself deal with their own details of which the Main function doesn’t really care. So, this could be a good line of refactoring.

Look inside this Main where the HttpClient is created:

    ...

    using var httpClient = new HttpClient();
    httpClient.Timeout = TimeSpan.FromMinutes(10);

    if (credentials != null)
        httpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Basic", Convert.ToBase64String(Encoding.ASCII.GetBytes($"{credentials.Username}:{credentials.Password}")));
    
    ...

Main should not be very concerned about how the timeout interval is configured, or how the authentication is set. All Main needs is a component to make HTTP requests. How this HTTP client component is built is at a lower level of abstraction, Main works with components, Main works at a higher level of abstraction.

Therefore, the code that creates the HTTP Client inside the Main method can be extracted to a separate function:

private static HttpClient CreateHttpClient(Credentials? credentials)
{
    var httpClient = new HttpClient();
    httpClient.Timeout = TimeSpan.FromMinutes(10);

    if (credentials != null)
        httpClient.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue("Basic",
            Convert.ToBase64String(Encoding.ASCII.GetBytes($"{credentials.Username}:{credentials.Password}")));

    return httpClient;
}

And the only thing that’s left inside Main is getting and using this component:

    ...

    using var httpClient = CreateHttpClient(credentials);

    var command = CreateCommand(arguments, dvcCache, httpClient);

    ...

Looks better now. I can see the httpClient needs only the credentials, and then it’s used further down the workflow to create a command (to download/upload files from the server).

Extracting code from Main in separate functions for the other components I’ve mentioned, makes the Main now look quite decent:

static async Task<int> Main(string[] args)
{
    SystemContext.Initialize();
    IOContext.Initialize();

    try
    {
        var arguments = ParseCommandLineArguments(args);

        var dvcFolder = FindDvcFolder(FileSystem.Directory.GetCurrentDirectory(), arguments);

        var dvcConfig = DvcConfig.ReadConfigFromFolder(dvcFolder);

        var credentials = FindCredentials(arguments, dvcConfig);

        var files = EnumerateFiles(arguments.Paths);

        var dvcCache = GetDvcCache(dvcConfig, files.First());

        using var httpClient = CreateHttpClient(credentials);

        var command = CreateCommand(arguments, dvcCache, httpClient);

        await command.ExecuteAsync(files);

        return 0;
    }
    catch (ExitException e)
    {
        if (e.ExitCode != 0)
            SystemContext.Console.StdErrWriteLine(e.Message);
        return e.ExitCode;
    }
    catch (Exception e)
    {
        SystemContext.Console.StdErrWriteLine($"ERROR: Unexpected error.{Environment.NewLine}{e.Message}{Environment.StackTrace}");
        return 99;
    }
}

Give it a try and read the content of the try/catch block. The code now reads as the workflow it represents:

  • arguments are read from the command line
  • dvcFolder is created based on CurrentDirectory and the arguments
  • dvcConfig is created from the dvcFolder from above
  • credentials are read from either arguments or dvcConfig
  • files to be copied are obtained from the provided arguments
  • the dvcCache is created based on the dvcConfig and the first provided file
  • an httpClient is created based on the credentials read earlier
  • a command is created based on the arguments, dvcCache and the httpClient
  • and in the end, the command is executed

Now the components are obvious inside the Main. The dependencies of each component are clear. The way Main puts together the components, organizing the workflow, is popping out.

Of course, I had to get rid of the “intermediate” exit points with return 3;, return 4 when some component failed to create that were also contributing to the overall headache. That’s a good case for exceptions (that’s why exceptions were invented after all).

Yes, Main might still be a bit long but it’s good as it is. It’s readable, it shows its intention. It does one thing, puts head-to-head the building blocks of the program.

Can it be made better than this? Probably, but for sure the improvement won’t represent such a big jump like this refactoring did, based on the Single Level of Abstraction Principle.

I am now quite satisfied with how Main reads.