Breaking Up Code: Thoughts

It’s always useful to break up your code into smaller, more manageable chunks. And it’s also useful to abstract away your vendor or 3rd-party dependencies where applicable. I think sometimes you can go too far, or not do this enough.

I’m outlining the various areas of implementation I need to perform to make a Swift library for SFTP, and I’m choosing to use SwiftNIO as the networking layer underneath the protocol implementation.

Dependency Abstraction/Separation

There’s the obvious choice of how to keep SwiftNIO separate from the rest of the library, or how to make it so that you can drop in a replacement for SwiftNIO in the future, if needed. For instance, what if SwiftNIO ends active development and a nasty CVE is found, and there won’t be a fix? Well, you’d want to get a new networking stack. Or, what if you found something even faster or fancy? Then replace it all you want.

But if your code is so tightly coupled to the rest of your code, either in terms of types it is using or the natural structure of the code, then you might be in trouble. Detangling all of that will be a tough exercise, and might involve re-writing huge parts of your application. The natural tendency is to write your code up-front in such a way that is generic and can handle component swaps. Monolithic classes that do everything and everything from the networking stack to handling the SFTP requests and responses is certainly possible, but it’ll be a pain in the ass to ever change significantly.

With SwiftNIO, I think there is the possibility for a clean divide between networking code and application code. The SwiftNIO portion is only for setting up the server and listening ports (which could be hidden behind a protocol and class, connecting app/lib configuration to the implementation, by passing in only the needed information), and also to translate the incoming network stream into requests (as well as the sink for outgoing requests). In order to resolve this latter part, you only need one final handler that converts the last bit of SwiftNIO types and information into pure app/lib types that get sent and handled by a class object that represents the state of the server. Replacing SwiftNIO, then, would be a matter of replacing only that glue that transforms the dependency’s data types and constructing requests with new code that doesn’t need to touch any of the handling code at all.

There is the possibility that the new dependency works so differently from SwiftNIO that it changes the overall structure of the application. SwiftNIO is an asynchronous library, and switching to a synchronous one (or more likely in the other direction) could reveal important areas that need refactoring. Think of, say, a push configuration to a pull configuration, or vice versa. This is a bit more unavoidable than just converting one type to another, and needs more of a case-by-case analysis which is difficult to generalize.

How far you want to go with abstraction is up to you. For something like ASP.NET MVC/Core, or SwiftNIO which are backed by large corporations and will likely continue to evolve and be maintained, replacing this is rarely needed. But, it is still a good idea to keep it separated from the rest of your code in such a way that unit testing and simpler classes lead to cleaner and more bug-free code.

Where to Put Helpers

However, another interesting problem domain I’ve been thinking of is extra components or helpers on top of dependencies.

SwiftNIO is a great abstraction on top of low-level networking APIs such as BSD sockets or Networking.Framework. It provides a plethora of great classes and handlers to make a quasi-framework. For example, it handles all of the socket connection setups as well as implementing the socket reads and asynchronous waiting (via epoll/kqueue) correctly for you. Doing this yourself would be…a pain, to say the least.

When I first started making an SFTP implementation, I think I started from the wrong direction. I began with writing the deserializers/parsers, working with Data slices or streams to get and parse the bytes into better POCOs. I soon ran into the problem: SFTP allow for reading/writing an entire file, instead of chunks that can be handled in small packets. This means a 1GB file could be sent in a single packet.

The obvious question is: do we really want to deal with a 1GB (or…1TB, depending on size bounds for the uint32) array of bytes in memory? No, god no. No no no. That’s so bad and could lead to really bad performance. Would the networking have to wait for the entire file contents before passing it to the application and handling the data? What if the server could have stopped the transfer earlier if the header was invalid based on permissions or something?

And, on a slightly different subject, I began moving my code to using streams, such as NSStreams (when I didn’t know Apple’s options as well). That code was synchronous. While researching streams and the common read function, I learned of a bunch of “gotchas”. You can’t just read for 4 bytes to get the SFTP packet length, and then read the next 1GB for the data. These particular streams are too low level, and the read (or equivalent) wouldn’t wait for the full gigabyte to be available before returning it to your application. It would only return up to the size of the TCP socket’s buffer, which might be on the order of kilobytes. I would have had to write glue code that collects the data reads into the full packet buffer before deserializing it. Honestly, that didn’t sound fun at all, and would have been bug-prone.

Also, let’s say SFTP didn’t have a data body, but it was just small requests of fixed sizes. You also can’t ensure that the request would come in all at once in a single read operation. the MTU of the different machines or network might be too small, or perhaps 1.5 packets came in a single TCP segment, etc. Even though we read data from the socket in buffers, it still has to be treated as a stream.

Enter SwiftNIO, stage left. It solves these problems! Well, that second one at least, but it solves it really well. Aside from setting up connections and posting ByteBuffers to your application (the output of a read operation, essentially), it also has built-in helpers such as ByteToMessageHandler and ByteToMessageDecoder, which converts the incoming stream of mis-aligned data buffers into aligned data buffers. See LengthFieldBasedFrameDecoder. This one is perfect for SFTP because there’s a 4-byte length prepended field for all SFTP requests and responses, and SwiftNIO’s code does the tough work of collecting bytes into buffers that actually represent a full message. This is so helpful.

But for the first problem domain I mentioned, it doesn’t help quite as much. SwiftNIO is still somewhat low-level, providing a nice layer on top of raw sockets and suspending the thread for incoming and outgoing data transfers, and setting up run loops/event loops to handle that efficiently and quickly. It also provides promises and futures and some other niceties.

In addition to the base SwiftNIO offerings are other modules such as HTTP 1.1 support, TLS support, WebSockets, and some more. Additionally, there is the SwiftNIO Extras repository, which further adds to the base offering with some helper constructs.

I already pointed out one of these earlier: LengthFieldBasedFrameDecoder. This repo also has a line-based decoder (LineBasedFrameDecoder) along with a couple more, a RequestResponseHandler to keep requests and responses paired, HTTP helpers for compression with GZip, and more.

So, where am I going with this. There is one thing that is a bit too high level for SwiftNIO’s current offerings. HTTP and SFTP (as well as many more protocols, I’m sure) deal partly with small, mostly fixed-size requests. But sometimes the request or response also contains a large body. It’s very often that the request is prepended with the header, and in the header contains information for how long the body of data is. HTTP has a header line Content-Length for any kind of request or response, while for certain SFTP requests and responses, there is a specific field for the data’s length. In both instances, there is nothing else at the end of the message.

We come back to the original problem: streaming this data. We don’t want to wait for the entire, possibly large body of the message to finish being transferred in the server memory. It just raises a lot of problems. So we’d like to keep the main request or header in memory, and treat the body as a stream. If you’ve come from the ASP.NET environment, you’ll know that the HttpRequest and HttpResponse treats the response as a stream, not as a string or byte[] array.

SwiftNIO doesn’t really have anything to handle this easily. It makes some sense, however. SwiftNIO’s purpose isn’t to help facilitate implementing every network protocol with total ease. More so, it provides a better platform on which to implement protocols. That’s not to say it doesn’t leak into that domain, though. It has several helpers, and an HTTP and TLS implementation (probably due to popularity).

Coming back to my project, now. Without a helper for request/body semantic, I would have to make my own ad define the logic for it. I did a lot of research into how this can work, and efficiently. I also posted on the Swift forums about this and got some great responses from @lukasa and @joannasweiss.

But there is also the incessant nagging voice in my head saying, “Why isn’t this implemented more generically? A header + body paradigm is common in tons of protocols. We shouldn’t have to re-invent the wheel so many times.” Indeed, SwiftNIO’s HTTP1 implementation bakes in an implementation of this rather obliquely. The core of it is an enumeration with .header, .body, and .end, where the header is emitted once (from the ByteToMessageDecoder, I believe) with all of bytes for it, then zero or more body segments, each with a chunk of the available data from the socket read operations, and finally an end enum marking that the body and message have concluded.

Additionally, Vapor’s implementation on top of SwiftNIO’s HTTP1 core then has to still deal with these chunks of headers and bodies, and not a single, centralized stream mechanic. And because of this, Vapor’s stream seemed to not handle it correctly. Looking back at the implementation, I now see the ChannelOutboundHandler.read() implementation correctly handles this per my forum post.

Both the HTTP1 and Vapor handlers seem rather…large and heavy. A lot of it deals with handling the stream state, whether it’s awaiting header or body or end states, and whether the connection is idle or not, etc. SwiftNIO’s own readme states:

In general, ChannelHandlers are designed to be highly re-usable components. This means they tend to be designed to be as small as possible, performing one specific data transformation. This allows handlers to be composed together in novel and flexible ways, which helps with code reuse and encapsulation.

So, I think it would be a good opportunity to remove this potentially re-usable section of protocol implementations into its own module, freeing up the complexity of higher-level handlers and decoders, and making the code easier to read, maintain, and unit test.

This brings us back to the main point of this article, finally (sorry for rambling): where do we put it? I see several possible areas, in order of escalation:

  1. Baked into my SFTP code (what we’re avoiding)
  2. Alongside my SFTP-specific decoders and handlers
  3. In a “Utils” folder within my SFTP library
  4. In an infrastructure/utils library in my solution (more applicable for large applications with many broken-out libraries, common more in C# enterprise websites)
  5. As a separate, one-off library imported as a dependency that others could potentially depend on from the internet
  6. In the SwiftNIO Extras library, as a pull request
  7. In the SwiftNIO library, as a pull request

The most common scenario would be #2, in a Utils folder within libraries. It’s probably a good mix between keeping it separate from your main application code and code that is just adding extensions of functionality off of other dependency code. But, I’m irked. Irked that this isn’t available to others. It seems like such a core part of writing many protocols, and I feel like there should be a more robust solutions others can use, much like the ByteToMessage(Handler|Decoder), so that users don’t shoot themselves in the foot re-implementing the wheel. So, I want to go a step further.

Number 4 would, then, be the next choice. I make a new repo dedicated solely to solving this problem. That would work well for a lot of people, but I can’t help but be reminded of NPM, “dependency hell”, and the “pad left” debacle. It leaves a rather dull taste in my mouth. We can do better.

Going to the extreme, #6, seems like it might be too far, in light of SwiftNIO Extras' existence. The NIO team have already made a split between different modules and parts of SwiftNIO. I do find it odd that they have a full HTTP and TLS implementations in the core library, without using and of the niceties and in the Extras library. I feel like HTTP, TLS, et al could be in a third library depending on both Core and Extras, but I can see how that could be taken to an extreme. I digress.

Extras seems like the perfect fit, considering all of the other helpers already in this location. But why would this be a good location?

So, I want to implement this project the right way. I’ll make a local implementation of this new handler and decoder myself outside of my SFTP project, and see what I can do to get it up to SwiftNIO and OSS standards, and submit a PR. Is this procastinating on my side project? A bit. Am I keeping myself busy and contributing to OSS and getting a deep knowledge of the technology relevant to my side project? You betcha.