PowerShell String.Split() Off-by-Method-Overload Error

This seemed to me an error, and I was on the point of raising it as a bug on the PowerShell GitHub repo:

PS> "\this".Split( [char]'\', [StringSplitOptions]::RemoveEmptyEntries).Length
# >> 2

Presumably it is because [StringSplitOptions]::RemoveEmptyEntries (underlying value 1) is coerced to a [char], so that both arguments bind to the Params char[] overload and the line is treated as:

PS> "\this".Split( ([char]'\', [StringSplitOptions]::RemoveEmptyEntries) ).Length

instead of as:

PS> "\this".Split( (,[char]'\'), [StringSplitOptions]::RemoveEmptyEntries).Length

If the first parameter is a string rather than a character, it works as expected:

PS> "\this".Split( '\', [StringSplitOptions]::RemoveEmptyEntries).Length
# >> 1

But the really unfortunate case is:

PS> "\this".Split( [System.IO.Path]::DirectorySeparatorChar, [StringSplitOptions]::RemoveEmptyEntries).Length
# >> 2

which means that

PS> "\this".Split( [System.IO.Path]::DirectorySeparatorChar, [StringSplitOptions]::RemoveEmptyEntries)[0]
# >> "" (an empty string)
# instead of
# >> "this"

It turns out that it's fixed in the PowerShell 6 beta; or, to be more precise, it doesn't happen in PowerShell 6. What changed is that the underlying .NET runtime (PowerShell 6 runs on .NET Core rather than the .NET Framework) has added new overloads to String.Split():

string[] Split(char separator, System.StringSplitOptions options)
string[] Split(char separator, int count, System.StringSplitOptions options)
string[] Split(string separator, System.StringSplitOptions options)
string[] Split(string separator, int count, System.StringSplitOptions options)

Whereas PowerShell 5 only has these overloads available:

string[] Split(Params char[] separator)
string[] Split(char[] separator, int count)
string[] Split(char[] separator, System.StringSplitOptions options)
string[] Split(char[] separator, int count, System.StringSplitOptions options)
string[] Split(string[] separator, System.StringSplitOptions options)
string[] Split(string[] separator, int count, System.StringSplitOptions options)

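And so the best-match overload that PowerShell 6 chooses is different to PowerShell 5's best match. Given a [char] and a [StringSplitOptions], PowerShell 6 has an exact match in Split(char separator, System.StringSplitOptions options), whereas PowerShell 5 has no exact match and ends up converting both arguments for Split(Params char[] separator).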

Three Historical Definitions of the Open/Closed Principle and a Claim that it’s Pointless

Bertrand Meyer first published the OCP in his influential 1980s book Object-Oriented Software Construction:

  • “A module is said to be open if it is still available for extension. For example, adding new fields, or performing new functions.
  • A module will be said to be closed if it is available for use by other modules.”

It's a neat double-definition, not least because the definition of Closed is both useful, and one that might not contradict the definition of Open. But Meyer's proposed technique—use subclassing to achieve Openness of a Closed module—is widely ignored. Many of us have discovered the pain of working with inheritance hierarchies, so we savour the Gang of Four's sage dictum: “prefer composition over inheritance.”
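For readers who have not met Meyer's technique, here is a minimal sketch of what "extend a closed module by subclassing" might look like. The class and function names (Account, InterestAccount, depositTwice) are hypothetical and chosen only for illustration; they are not taken from Meyer's book.

```typescript
// Hypothetical example; the names are assumptions, not Meyer's.

// Version 1 is "closed": it is published and other code already calls it.
class Account {
  protected balance: number;

  constructor(balance: number) {
    this.balance = balance;
  }

  deposit(amount: number): void {
    this.balance += amount;
  }

  getBalance(): number {
    return this.balance;
  }
}

// Version 2 is added by inheritance; the Account source is not edited.
class InterestAccount extends Account {
  private readonly rate: number;

  constructor(balance: number, rate: number) {
    super(balance);
    this.rate = rate;
  }

  addInterest(): void {
    this.balance += this.balance * this.rate;
  }
}

// A version 1 client: it knows only about Account, yet accepts either class.
function depositTwice(account: Account, amount: number): number {
  account.deposit(amount);
  account.deposit(amount);
  return account.getBalance();
}

const plainBalance = depositTwice(new Account(100), 10);

// A version 2 client can also use the extra capability.
const savings = new InterestAccount(100, 0.5);
savings.addInterest();
const savingsBalance = depositTwice(savings, 10);
```

Even in a toy like this, InterestAccount depends on the protected balance field of its parent, which hints at how inheritance hierarchies become painful as they grow.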

Dynamic languages like JavaScript can do the open/closed trick quite easily. The danger is that, in doing so, you develop inscrutable code and are left with a system that, when it works, you don't know how it works; and so when it breaks you don't know how to fix it.
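As a rough sketch of that trick, the snippet below swaps a method on a prototype at runtime. It is written in TypeScript, but the mechanism is plain JavaScript. Invoice and the handling fee are hypothetical names invented for the example.

```typescript
// Hypothetical Invoice class standing in for any module already in use.
class Invoice {
  private readonly amount: number;

  constructor(amount: number) {
    this.amount = amount;
  }

  total(): number {
    return this.amount;
  }
}

const invoice = new Invoice(100);
const totalBeforePatch = invoice.total();

// Imagine this lives in some other file. It "extends" Invoice at runtime by
// replacing the method on the prototype; nothing in the class body shows it.
const originalTotal = Invoice.prototype.total;
Invoice.prototype.total = function (this: Invoice): number {
  return originalTotal.call(this) + 5; // hypothetical handling fee
};

// The instance created before the patch changes behaviour too.
const totalAfterPatch = invoice.total();
```

The class was extended without editing it, but anyone reading only the Invoice source would have no way to explain the second result.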

Uncle Bob all-but-redefined the Open/Closed principle by using Interfaces as his technique. The interface is fixed and Closed; modules that depend on it can rely on it not changing. But the Implementation is Open: it can be changed without breaking the interface.
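A small sketch of the interface-based reading, with hypothetical names (Greeter, PlainGreeter, ShoutingGreeter, welcome) that are not drawn from Bob Martin's writing: the caller is compiled against the interface only, and implementations can be added or swapped behind it.

```typescript
// Closed: callers depend on this contract and nothing else (hypothetical name).
interface Greeter {
  greet(name: string): string;
}

// Open: implementations can be added or rewritten without touching the contract.
class PlainGreeter implements Greeter {
  greet(name: string): string {
    return "Hello, " + name;
  }
}

class ShoutingGreeter implements Greeter {
  greet(name: string): string {
    return "HELLO, " + name.toUpperCase() + "!";
  }
}

// This function never changes when a new Greeter implementation appears.
function welcome(greeter: Greeter, names: string[]): string[] {
  return names.map((name) => greeter.greet(name));
}

const plainWelcome = welcome(new PlainGreeter(), ["Ada", "Grace"]);
const loudWelcome = welcome(new ShoutingGreeter(), ["Ada"]);
```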

It is worth mentioning a word of wisdom from the .NET team's Framework Design Guidelines: the weakness of interfaces in Java and .NET is precisely that they are 100% closed. There can be no version 2. Or rather, if there is an InterfaceV2 then it can usually have no useful relationship to InterfaceV1. You might as well call it ICompletelyUnrelatedInterface. (Or perhaps one could put the versioning at the namespace level.)
This versioning problem is widely felt in service-oriented systems with public interfaces. It is often addressed by creating a new endpoint for a new version of the service. Offering two versions of a service becomes on the whole precisely as expensive as offering two services, which is to say twice as expensive. This is unfortunate.

Contrast this with Meyer's vision of OCP: on Meyer's subclassing approach, version 1 clients and version 2 clients would call the same service and get the same responses. Version 2 clients would recognise, and so be able to use, the enhanced v2 capabilities; whereas version 1 clients would only recognise the version 1 capabilities. But here I see a second problem with Meyer's vision: I've almost never seen systems (or even parts of systems) that can achieve it in practice. It's a beautiful dream, but unachievable: a pipe dream.

More recently (Dec 2016), Michael Feathers has offered an updated version, towards the bottom of the page at “Towards a galvanizing definition of technical debt”:
“our code is better to the degree that we don’t have to change it much when we add features. We should be able to make modifications primarily by adding new classes and functions rather than changing existing ones”
This is a much 'softer' formulation than Meyer's or Bob Martin's and you could take it as just a rule of thumb; something to weigh in the balance against other factors. Feathers's implementation in this case (and I'm left with the impression that in a different codebase he'd be happy with a different implementation) is doing event-driven code as most people think it should be done: use an AddEventListener() interface, which makes the code Open to all kinds of extension.
This AddEventListener() is exactly the approach used in the HTML spec and other GUI frameworks of the past 20 years. The downside is that the 'closed' bit of the interface is so small and weakly-typed that it's almost non-existent. The interface tells you nothing about the semantics. (What kind of events can I listen to? What information do I get about each event? What can I do with them? I can only find out by reading the HTML spec, which turns out to be quite hard going, or turning to MDN, or, the first port of call for many, StackOverflow. In a bespoke codebase replace this with “ask for documentation; find it is incomplete; and then hunt through the code for examples of how I can use it”).
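To make the complaint concrete, here is a minimal sketch of a weakly-typed listener registry. EventHub, its methods, and the event names are hypothetical; this is neither the DOM API nor Feathers's code.

```typescript
// The whole 'closed' contract: a name and a callback. Nothing more is promised.
type Listener = (event: unknown) => void;

class EventHub {
  // Object.create(null) avoids clashes with inherited keys such as "toString".
  private readonly listeners: { [eventName: string]: Listener[] } = Object.create(null);

  addEventListener(eventName: string, listener: Listener): void {
    const existing = this.listeners[eventName] || [];
    existing.push(listener);
    this.listeners[eventName] = existing;
  }

  dispatch(eventName: string, event: unknown): void {
    for (const listener of this.listeners[eventName] || []) {
      listener(event);
    }
  }
}

const hub = new EventHub();
const auditLog: string[] = [];

// New behaviour is added without editing EventHub: that is the Open part.
hub.addEventListener("order-placed", (event) => {
  auditLog.push("audit: " + JSON.stringify(event));
});

// A misspelt event name compiles and registers, but silently never fires.
hub.addEventListener("order-placd", () => {
  auditLog.push("never reached");
});

hub.dispatch("order-placed", { id: 42 });
```

Nothing in the signature says which event names exist or what shape the event has; the listener receives unknown and has to find out some other way.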
Strongly typed interfaces are at least somewhat self-documenting—they offer a definitive list of all syntactically valid calls to the service—even if that documentation depends heavily on how well the developers chose their method and parameter names.
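By way of illustration, a strongly typed take on an event interface might look something like the sketch below. All the names (OrderEvents, OrderEventHub, the two event shapes) are hypothetical and invented for this example.

```typescript
// Hypothetical event payloads: the shape of each event is spelled out.
interface OrderPlaced {
  orderId: number;
  total: number;
}

interface OrderCancelled {
  orderId: number;
  reason: string;
}

// The interface is the definitive list of what a client may subscribe to.
interface OrderEvents {
  onOrderPlaced(listener: (event: OrderPlaced) => void): void;
  onOrderCancelled(listener: (event: OrderCancelled) => void): void;
}

class OrderEventHub implements OrderEvents {
  private readonly placed: Array<(event: OrderPlaced) => void> = [];
  private readonly cancelled: Array<(event: OrderCancelled) => void> = [];

  onOrderPlaced(listener: (event: OrderPlaced) => void): void {
    this.placed.push(listener);
  }

  onOrderCancelled(listener: (event: OrderCancelled) => void): void {
    this.cancelled.push(listener);
  }

  publishPlaced(event: OrderPlaced): void {
    for (const listener of this.placed) {
      listener(event);
    }
  }

  publishCancelled(event: OrderCancelled): void {
    for (const listener of this.cancelled) {
      listener(event);
    }
  }
}

const orders = new OrderEventHub();
const totals: number[] = [];
const reasons: string[] = [];

orders.onOrderPlaced((event) => {
  totals.push(event.total); // the compiler knows event.total is a number
});
orders.onOrderCancelled((event) => {
  reasons.push(event.reason);
});

orders.publishPlaced({ orderId: 1, total: 25 });
orders.publishCancelled({ orderId: 1, reason: "changed my mind" });
// orders.onOrderShipped(...) would not compile: OrderEvents has no such member.
```

The price is that every new kind of event means editing OrderEvents, so this version is rather less Open than a single catch-all registration method.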

These three examples leave me with mixed feelings. OCP seems like trying to square the circle, and Meyer's choice of name was a well-chosen contradiction. Yet the goals—Openness for extension, Closedness for reliability—are unavoidable.

Dan North, amongst others, has suggested that OCP, and indeed all the SOLID principles, are of limited value and we should drop them in favour of something else. I sympathise (I think that SOLID is a mishmash of mixed value), but I'm willing to wrestle for a couple more years with OCP before I admit defeat.

I'd rather have the above three techniques, and others, in my toolkit because my software design still has to address the two contradictory requirements that Meyer identified in the 80s:
–Because my software is still evolving, it has to be open for evolution: it has to change.
–Because my software is already in use, and hence being depended on by some other software or person, it has to be reliable and therefore can't change.