Oct 7, 2014

10 ways of implementing Polymorphism

Firstly what is Polymorphism and why is it so important? Polymorphism is the ability to have a many implementations of a behavior that conform to a single interface. Put in perhaps slightly better, pragmatic terms, you have one implementations of a caller, that can operate on many implementations of a "parameter", without conditionals, or changing the callers code. For instance the following, pseudo?, Perl 6-ism method handler( $obj ) { $obj.execute() }. As you can imagine $obj can be anything that has an execute method. For this Article I'll give you two implementations and one caller, in either Perl 5/6 or Java 7/8, boilerplate will be excluded for brevity.

Inheritance

Single Inheritance

Single inheritance is the most simple and well understood form of Polymorphism.

Multiple Inheritance

Multiple inheritance is often considered dangerous, is unavailable in Java and suffers from the The diamond problem. You should really only use this with a C3 MRO.

Flat Composition

Interfaces

Interfaces are probably the third most common form of Polymorhism, they are essentially codified contracts.

Traits

These are just the same as Interfaces in Java 8 you say? well yes, that's what Java 8 calls them, Traits are a list of methods flattened into a class, but they cannot access state. This basically describes what Java 8 is doing, as you can't access properties from within the interface, well.. at least not unless you do what I show here, which is basically access state through getters and setters.

Mixins

Mixins are basically traits that can access state, though some mixins (AFAIK Ruby) are implemented sneakily as multiple inheritance, rather than flat list composition. IMHO, Mixins should be implemented using flat list composition.

Typeless

Duck Typing

The has $!log in the Mixin is actually a pretty good example of duck typing, we don't check for debug we are just calling it. Java is basically incapable of doing this, except, you can treat everything as an Object (if that's all you need).

Function References

references to functions may or may not be allowed to have varied signatures depending on the language, but so long as they have the same signature they are interchangeable, and thus polymorphic. So why aren't normal functions (procedures), for example, Polymormphic, the problem with procedures is that you have to import the implementation from outside the file, where with polymorphic code, you can create your instance outside the file, pass it into code that's in the file, without changing the code, pass in a different implementation, and it'll continue to work. To modify procedural code, you'd have to modify at least the import, and in compiled code that means a rebuild. It's worth noting these aren't so much typeless as their is only one type to be concerned with, a function.

Miscellaneous

I'm personally skeptical of whether these actually fit the definition of Polymorphism, but they sort of do, just in completely different ways from the above

Method Overloading

Method overloading is called ad hoc polymorphism and is kind of weird in that what it's really doing is hiding the type change from the programmer. Reality is you're kind of asking for different behavior, but you want to hide that it's different in the caller. However since it means you wouldn't have to change the caller, it counts.

Generics

I describe generics as class templates, because they remind me of having an HTML template, and then filling in the blanks by passing in variables, the variable happens to be a Type. Perl doesn't have Generics, and I'm not aware of plans for it in Perl 6.

Reflection

Reflection is sort of polymorphic in that you can essentially treat all objects the same, via a single standard API. I don't know that I want to show the kind of Reflective code because it get's real complicated fast, but for example, @Inject can be annotated in systems with CDI compliant injector, they will reflectivly treat all objects with this the same, and then set the annotated property.

Sep 20, 2014

Celebrity nude scandal, on security, an analogy

Though I won't say they aren't victims of a crime... What the victims did is fundamentally the equivalent of using skeleton keys in the modern day. What apple did or rather didn't do, is prevent that. Apple could have used a tool like cracklib, and said at the time of password creation, this is too short, this is not random enough, we are refusing to allow you to put this skeleton key lock on your front door. So while I think that the perp should be prosecuted to the full extend of the law, it should be like a Breaking & Entering where the door was left unlocked. Apple should be sued for not requiring secure passwords. Imagine if your lock company installed them wrong, and because of that you got broken into, they didn't do their job correctly. Would people just stand for that? No, I don't think so. Somehow physical locks are seen as easier to understand, and all this computer mumbo jumbo is hard, event though I suspect most people can't tell you why a deadbolt is a better lock. People should realize Skeleton keys are no longer secure, even if they look cool, and are easy to use, it's better to use a password manager (http://lastpass.com is what I use) with a randomly generated password for all other sites (I'd say 16 characters, though I think 12 is the current suggested). Fundamentally this setup is a deadbolt with a different key required for each door, but one keychain. You can also do multifactor, which is like a key with a chip in it that will refuse to start your car if it's the wrong chip, so making a physical copy of the key (password) isn't enough.

Sep 2, 2014

Using Spring to create a full REST API in less than 60 lines of code

Spring with Spring Data is awesome. Seriously, I've never been able to throw up a full HATEOAS REST web service this fast. To start, I'll admit my headliner lie, I'm not counting the pom.xml.


cloc .                                                                 slave-vi
       5 text files.
       5 unique files.
       2 files ignored.

http://cloc.sourceforge.net v 1.62  T=0.04 s (104.8 files/s, 3930.8 lines/s)
-------------------------------------------------------------------------------
Language                     files          blank        comment           code
-------------------------------------------------------------------------------
Maven                            1              6              7             65
Java                             3             15              0             57
-------------------------------------------------------------------------------
SUM:                             4             21              7            122
-------------------------------------------------------------------------------

The basics of the web service is we want to be able to create tasks, like those on a todo list, for now we want the simplest tasks possible, in as little code possible. We should use UUID's so that our service can scale horizontally, so that we can easily generate known test ID's and we know that no two entities will share an id if we ever wanted to flatten things. We need to be able to perform basic CRUD on all of our entities as well as list them.

First let's create our Task. As you can see it's incredibly simple, we have our UUID identity, the uuid and uuid2 basically are telling Hibernate and H2/PostgreSQL to use UUID's. You might ask why limit description to 100 characters, well, since these are quick tasks, I might want to share them in a tweet, and this allows enough room for a url shortner plus the description. I think the rest is pretty self explanatory.

Now let's create our Repository. Well that doesn't do anything... oh but it does, and although it doesn't show it, because this application doesn't need it, there's a nifty method signature parser dsl that allows you to build queries just by writing a method signature.

Here's our Application. ... and pom for dependencies and stuff.

Here's the output of some curl commands I ran.

For a slightly more in depth tutorial you can see the official spring date rest getting started page. In the future I'll try to write about how to actually connect to PostgreSQL and set up API Authentication and Authorization

People are always telling me how verbose Java is, how much less typing their language (especially Perl is). I'd love to see a Perl app that can do all this in fewer lines of Perl (restriction, no line may be longer than 120 characters, and must be humanly readable), I personally don't think it can be done at this time (not with full HATEOAS and as many response codes), but I'm waiting for the day it can, and can be structured this simply.

Jul 1, 2014

Writing deprecation notices in perl, optionally with Moose

Sometimes you want to remove behavior from your code in a future version, here's the right way to do it.

Here's the quick of how it works, the before has to come after attributes because the methods aren't yet created. Using before also means it'll always run with your method, without actually touching your method, insuring no accidental consequences to your method. The @CARP_NOT ensures that the warning thrown doesn't show a line number in your package, or from within where Method Modifiers are actually run. warnings::warnif( 'deprecated', ensures that these warnings are only emitted if you have the deprecated category enabled. But what if people don't have warnings enabled? um... oh well? that's there problem because what if people do and they want to silence these until they can get to them. I highly suggest putting the name of the method being called and it's successor into the message so that people know how to correct their code.

If you don't want Moose, just don't use the method modifier and put warnings:warnif directly in your code. if you're using a different AOP before, modify @CARP_NOT to have the correct module.

Jun 3, 2014

Java Privacy, broken by design

It is worth prefixing that none of the following arguments apply to anything using the keyword static which makes things more procedural (or in some cases functional, than Object Oriented.

The suggestion in Java is to give the least required permission, but this, in my humble opinion, violates the Open-Closed Principle. Java has four privacy levels. Giving something the least permission required to function is fine in a Security context, privacy in programming however is simply there to discourage developers from doing stupid things. In most cases, unlike security, it only makes them difficult, not impossible. I believe that any SOLID principle should make your code more easily extensible, so while in fact Java's privacy is not in literal violation of Open-Closed, it does make extension more difficult than it otherwise should be, thus violating the spirit of the principle.

Before I continue on to how I think Java's design, and common usage, violates the Open Closed Principle, I should explain how I interpret the Principle, as my interpretation appears to be slightly different from what's on Wikipedia. The Principle as described on Wikipedia appears to be combining it with two other SOLID Principles, namely Liskov Substitution and Interface Segregation. So first let's assume that The principle stands alone, and that although it'd be bad design to not be completely SOLID, Open-Closed by itself does not require a subclass to support the same interface. Let's also assume that Not modifying the source to add features is also an unrealistic expectation. The purpose of Open-Closed is to ensure that your subclasses are not modifying the the structure or data of their child classes and that a child may easily add to, or change the behavior it got from its parent (Liskov says that it must be substitutable for its parent).

First let's talk about final, marking a class as final, means you can't extend it. This by the very definition is in violation of Open-Closed, because the class is not Open for extension. Classes such as UUID are marked final, you might ask, why would I want to extend a UUID? maybe I want to give it a toURISafeBase64 method. That wouldn't break any of the orignal behavior, and is almost as legitimately belonging as representing the UUID as hex. What if I wanted to extend a nested final class like an Iterator on a Map? I can't do that, which means I have to completely reimplement the Iterator to add simple functionality. In fact the way those are implemented I have to implement much more than just the Iterator.

It is recommended by the official Java Docs, and the community, to make member variables private unless otherwise necessary. Private variables are only accessible to the current class and nested classes, they are not visible to subclasses, in or out of the package. In my opinion this violates Open-Closed because now, if I subclass I need to reimplement all the fields, or use getters/setters. Getters and Setters for every single attribute are actually almost no better than the attribute itself, and an object that is nothing more than those is an Anemic. Now it could be argued that making subclasses call methods makes them more... impervious to change, because if you change the data structure you can preserve the methods. The problem is that most classes wouldn't use their own getters internally, and thus break this, because then extending that getter won't actually modify the class as completely as desired. Also remember that subclasses are by definition, tightly coupled, usually changes to the superclass require taking a look at the subclasses. So if you are using getters and setters to ensure extensibility and preserve internal/external interface changes, use them exclusively, meaning only they can have raw access, all constructors, and business logic methods must go through them. At that point they are the replacment for direct member access and private won't matter as much (I will probably advocate a variant of this in the next article). However if you still want to access some member data hidden by the class directly, you should ensure that your subclasses can easily do so as well. You should only make a member private if it would actually cause a bug in any subclass.

So if we go on to assume that all subclasses, even ones in a different package (because you know people using your code are going to extend things) then we should be making all members protected. This would mean that all subclasses could reuse the member variables. Of course the problem is now your data is not encapsulated in your package, once a member variable is not private, is is available to your entire package. To me this also seems like a bad idea, other classes in my package don't need to see my objects internals unless they're a subclass. So now you have to choose, make all classes easily extended? or protect people who are programming in your package from themselves. You can probably control who's modifying your package and how, and have static code analysis to check that you're not calling obj.foo only this.foo. But nothing can give you back extensibility you've taken away (outside of adding it back).

So let's look at interfaces, interfaces generally have two options, public, or protected. This is fine, but has a problem, protected interfaces are only applicable to the package that has the interface defined. Methods implementing the interface must have the same privacy level. Most of the time what I actually want is an interface which I've defined globally as a contract, but I want the implementations to only be called by their package. For example, a DAO (Data Access Object) might be able to share the same interface (with judicious generic usage), between entities. However if you do this, you may find that your interface must be public, so it can be between packages, now the DAO itself must have these methods as public, even if it's being called only by something in the same package, because the interface was public so that the interface could be shared. I don't see that you can get away with this whether you use package by feature or package by layer. If you follow this through with previous design thoughts such as everything is an Interface, and those end up being public, and you want nice subclassibility, whether through protected members or through interfaced getters/setters, now everything is public, and we've completely lost any real encapsulation.

So how could it be done better? have a privacy type subclass which makes the method or member available to only subclasses and not throughout the package. Allow interfaces that have global definitions, but implementations of the methods can be at a package or subclass level. I feel like this could still be accomplished, perhaps by creating an interface type that is a "contract", and a new privacy keyword for "subclass". Contracts could define that methods be subclass, or protected, in their implementation. At that point you could have all kinds of methods that are still hidden to the general world. You could then build package by feature, have all methods that are required within the package have contracts, but share contracts between features, so all CRUD controllers would have the same method signatures, all repositories would share signatures, etc, etc.

What if I actually want more privacy? well you could not share interfaces between packages, and then have interfaces not be public. You could also not use an interface at all unless it's for a method on your bounded context that must be public. You can also say that ease of extensibility is not a goal and continue to not use your getters/setters internally, and yet make your members private.

You could also say, privacy is irrelevant, if the language is then preventing good, SOLID, design. Specifically here, Open-Closed, Liskov Substitution, and Interface Segregation. If you go this route you'll need conventions, and to trust other developers, because a lot of things will be public or protected. I recommend Perl's convention of prefixing subclass private methods with _ and assuming that all member fields are subclass/trait private and should never be called outside of their inheritance hierarchy.

May 6, 2014

Two Hundred Posts

My blog is 6 years old and 200 posts, and over 120k hits, Probably my first interesting post is when decided I was switching to git from svn, and it's not very interesting, and I think much more poorly written than I write things now. Since then I've re-skinned the blog to new templates at least twice. I now list books that I recommend on the right side of my blog, and I've ensured that all content is clearly licensed under the creative commons. Personally I've moved from being a student, to system administrator, to Perl developer, and am now building things with Java and potentially Ruby.

Given that I'm now building things in Java and Ruby their may be posts that are about those technologies and the good and bad things I've found out about them. One thing I'll say is that some of the Java as a language hate is as unfounded as the hate for Perl. All languages have good and bad things about them, even Perl 6 has warts in its design.

Since Java is my full time job now, and I have little reason to be doing Perl as I've been unhappy with the Perl 5 Framework landscape, I'm unlikely to continue developing features for my Perl 5 modules. If you're interested in becoming a comaintainer on any of my modules, my requirement is that you show interest in the module by contributing high quality patches to that module. I'd like to see evidence that I won't have to come back and fix things later, and that your interest is sincere. If you're not interested in being a comaint patches are still welcome.

I haven't found frameworks that I'm completely happy with in any other language either, at this point I'm considering making a very minor project developing a full framework for Perl 6. This framework (probably split into components) would be built on a new Dependency Injection module, using what I've learned from Bread::Board, AngularJS, and Java's CDI. It would also include an ORM that is based on Data Mapper principles and make high use of introspection. I would like to mention I have some doubt in myself making serious traction, but we'll see

Apr 2, 2014

REST, ROA, and HATEOAS often leads to bad webservice design

This is not to say that they are bad, but I find that all too frequently the resulting API's are poorly designed due to forgetting one thing, RPC (Remote Procedure Call) is expensive. Now by RPC, I do not mean custom messaging formats such as SOAP, or XML-RPC, I mean calling a method on a remote server. Do not think that just because you are using HTTP as the message format with something like XML or JSON, that calling GET /resource, is significantly all that different from calling get_resource in a SOAP call. The frequent idempotence also does not mean that you're not actually doing RPC as often good method design server side also implies idempotence, e.g. adding an object to a Set in Java will not result in the object being added twice if you add it twice. All calls to a remote is a form of RPC. The most expensive part of RPC is creating a new connection, just how depends on the protocol. This is why web sockets, for instance, is much cheaper than repeated calls (there are other reasons and expenses too, like maintaining many connections).

I've worked with a few Resource Oriented Architecture (ROA) web services, and they each suffered from the same flawed design, an excessive number of RPC calls was required to do seemingly simple tasks. This is caused by the, misguided, belief that every single aggregate should be it's own resource and that components of the aggregate should also have it's own resource, and that those should be the only access to the underlying aggregate. In one case working with an ROA we had to do about 5 RPC calls for every single product we wanted to create, and we were bulk creating. This problem was aggravated by the lack of an idempotent PUT for most resources.

The reality is, with a good API design we could have created all, of the objects we needed with a single API call to a bulk interface. I'm talking the RESTful equivalent to a Java Collection.addAll( objs[] ). In fact if you use addAll on a Set, the result of multiple same calls is idempotent, the same object will not be added twice. It would be really easy to write this given a good ORM, and a good interface so that you could do a POST or PUT to /entities. this is a significant improvement to a design where you'd have to do a PUT or POST for every single item you wanted to create. DELETE may be the only place where I'd consider not doing a bulk request, and it is generally able to be completed asynchronously. You may of course consider limiting the number of entities acted on in a request, so if you need to create 1000 entities, it might take 10 requests doing 100 at a time, this is still better for both the client and the server than doing 1000 requests.

The choice between PUT and POST depends on whether you believe that the call to GET must return the exact same view as PUT, meaning that a PUT would delete resources not included (for a single aggregate that's probably true), or should the behavior be equivalent to addAll or replacing the reference to the collection with a new one. Remember PUT must be idempotent, this only means that subsequent calls using the exact same arguments should result in the exact same result. You may want to consider using a different URI for manipulating your entity collections in these ways.

Another problem that was encountered with a web service we encountered is it had sub resources, that had to exist prior to creating the resource we needed to create, akin to tags. Not having a idempotent put to that resource meant we were doing create on exception update. But given the simplicity of this resource it would have been even better to just allow the api to take the final object representation of that resource, instead of requiring the id, and done a lookup by name, or a create or update, under the hood. Doing this is more difficult logic wise, and impossible if there's no natural key (because you can't look it up).

You probably are asking yourself, but how do I handle errors for these things. Well, the way I see it you have three options. One requests are a transaction, so you wrap your database code with a transaction, and it either succeeds or fails, you can return a 200 on success, ensure HATEOAS, with links to any new resources in the response. Two, you could allow partial success, and return the successful objects. Three you could return a custom message envelope payload, this isn't very RESTful because it's a protocol on top of HTTP (it's more like SOAP).

I'm currently working on designing a new REST Web Service, and I've decided that no page load, or "single conceptual action" should take more than 6 API requests. This number is not arbitrary, it's the median concurrent connection amount, per host name, for consumer web browsers. Even that number is too many, but I felt that I needed to alot more than one request allowed due to some completely different actions that may need to occur on a page load.

Keep on with the Resource Oriented REST with HATEOAS, just try to think of how to minify the number of calls could by designing less granular resources

Mar 13, 2014

Matching Hex characters in a Regex

I've noticed a common problem with regular expressions and Hex Characters, so I thought I'd blog about it. The most common way to regex a UUID, or SHA1 or some other hex encoded binary value is this (and I've seen this in Perl libraries and StackOverflow answers).

[a-f0-9] or [A-F0-9]

Neither of these are correct as Hex is case insensitive and both of these regex's are. Hex is most commonly lowercase (unless you're Data::UUID), but that's an aesthetic, not a requirement. The best way to match Hex is using a POSIX character class.

[[:xdigit:]] or \x

which matches this in a more readable manner, and intent driven manner

[A-Fa-f0-9]

as a side note it's this in a regex string in Java

"\\p{XDigit}"

Feb 27, 2014

The ShareDir Problem

Some of you may have noticed a while back that converted Pod::Spell to the use of File::ShareDir::ProjectDistDir instead of keeping the wordlist in Pod::Wordlist::__DATA__. This move was made in conjunction with making Pod::Wordlist an Object, and in preparation for a time when you'll be able to specify your own wordlist file. It was also made so that non technical contributors could more easily update the wordlist without going near anything that looked like code.

So why shouldn't you put them in __DATA__? According to File::ShareDir

Quite often you want or need your Perl module (CPAN or otherwise) to have access to a large amount of read-only data that is stored on the file-system at run-time. On a linux-like system, this would be in a place such as /usr/share, however Perl runs on a wide variety of different systems, and so the use of any one location is unreliable. Perl provides a little-known method for doing this, but almost nobody is aware that it exists. As a result, module authors often go through some very strange ways to make the data available to their code.

The most common of these is to dump the data out to an enormous Perl data structure and save it into the module itself. The result are enormous multi-megabyte .pm files that chew up a lot of memory needlessly.

Another method is to put the data "file" after the __DATA__ compiler tag and limit yourself to access as a filehandle.

The problem to solve is really quite simple.

1. Write the data files to the system at install time.
 
2. Know where you put them at run-time.

Knowing where you put them at run-time is actually still a problem, because, we don't develop in the same spot that perl installs stuff. The first portion of the problem is, "my tests can't find my sharedir file". So Test::File::ShareDir, which overrides the File::ShareDir method. People say, use Test::File::ShareDir, it solves the pain, well that's not true, they're missing a different pain. What happens if you're trying to run, say bin/podspell from the git directory? oh right now it can't find the sharedir file again. In that case I could probably work around it, but it's a mild symptom of a greater problem I've encountered, people aren't deploying CPAN modules, they're deploying from git. Now I could say, "not supported", but unfortunately I'd usually have to say that to my current boss, or coworker, whomever that may be (and I tried it, didn't work). This isn't actually the root of the problem with Pod::Spell, but I guarantee it was a problem with Business::CyberSource. Mostly I feel like leaving Pod::Spell this way is helping to weed out the issues people will have with File::ShareDir::ProjectDistDir

So what do I think the solution is? There are obviously numerous "social" problems here, that I don't think can be easily solved. I'm sure that Kent Fredric, has a better grasp than I of the technical solutions. Though I have had one reoccurring idea which is apparently not tangible without significant effort. Have a searchable sharedir path, like in unix PERL5_SHAREDIR="./share:$DETECTED_DEV_DIR:$PERL5_LIB...", and try looking for the "file in the path" until you find it, then cache that location in memory so you only have to search once per run. This is probably not a good solution for various reasons, or perhaps it's certainly grossly oversimplified in how it could work.

Ultimately, there isn't a good solution right now, and I'm not sure we've actually thought of one.

Dec 1, 2013

Advent, good idea, but problematic execution

So advent is 24 days of high quality tutorials, and it's great, and ++ too all the people who make articles. But I've got a problem... it never shows up in my feed that I read in Feedly (formerly read in Google reader). This is compounded by the fact that there are many advents, each with there own yearly feed... so each year I have to poke around at the various projects to see if they're doing advent, and if so to subscribe to the feed. The solution... we just have the advents aggregated by ironman This is a really cheap hack, but would allow the advents that are being created a greater distribution than it is probably now getting. We could also just patch the various advent sofwares to provide a feed that continues eternally year after year, instead of a new feed each year which seems not so useful, and make sure we provide that link instead of the "just this year" link, in the UI. I suppose I could go fix it... but at this time I'm not sure where the source code for advent is, nor whether each advent has it's own software backing it, Perl 6 is using Wordpress which doesn't have this problem. I suppose I could add the ones I find to ironman but maybe the advent creator's don't do that for a reason, so I'd rather not step on toes.