Sep 2, 2014

Using Spring to create a full REST API in less than 60 lines of code

Spring with Spring Data is awesome. Seriously, I've never been able to throw up a full HATEOAS REST web service this fast. To start, I'll admit my headline lie: I'm not counting the pom.xml.

cloc .
       5 text files.
       5 unique files.
       2 files ignored.

v 1.62  T=0.04 s (104.8 files/s, 3930.8 lines/s)
Language                     files          blank        comment           code
Maven                            1              6              7             65
Java                             3             15              0             57
SUM:                             4             21              7            122

The basics of the web service are that we want to be able to create tasks, like those on a todo list. For now we want the simplest tasks possible, in as little code as possible. We should use UUIDs so that our service can scale horizontally, so that we can easily generate known test IDs, and so that we know no two entities will share an id if we ever wanted to flatten things. We need to be able to perform basic CRUD on all of our entities as well as list them.
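To make the UUID argument concrete, here is a minimal sketch using only `java.util.UUID`. The class name `TaskIds` and the `test-task:` prefix are hypothetical and not part of the service; the point is only that random ids need no coordination between nodes, while name-based ids give repeatable values for test fixtures.

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;

class TaskIds {
    // Random (version 4) id for new entities; nodes never need to coordinate.
    static UUID newId() {
        return UUID.randomUUID();
    }

    // Name-based (version 3) id: the same name always yields the same UUID,
    // which is handy for known test ids. The prefix is a made-up convention.
    static UUID testId(String name) {
        return UUID.nameUUIDFromBytes(("test-task:" + name).getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        System.out.println("random id: " + newId());
        System.out.println("known id:  " + testId("buy-milk"));
    }
}
```

Running it twice prints a different random id each time, but the same known id.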

First let's create our Task. As you can see it's incredibly simple. We have our UUID identity; the uuid and uuid2 are basically telling Hibernate and H2/PostgreSQL to use UUIDs. You might ask why limit description to 100 characters. Well, since these are quick tasks, I might want to share them in a tweet, and this allows enough room for a URL shortener plus the description. I think the rest is pretty self-explanatory.

Now let's create our Repository. Well, that doesn't do anything... oh, but it does. And although it doesn't show it, because this application doesn't need it, there's a nifty method signature parser DSL that allows you to build queries just by writing a method signature.

Here's our Application. ... and pom for dependencies and stuff.

Here's the output of some curl commands I ran.

For a slightly more in-depth tutorial you can see the official Spring Data REST getting started page. In the future I'll try to write about how to actually connect to PostgreSQL and set up API Authentication and Authorization.

People are always telling me how verbose Java is, and how much less typing their language (especially Perl) is. I'd love to see a Perl app that can do all this in fewer lines of Perl (restrictions: no line may be longer than 120 characters, and it must be humanly readable). I personally don't think it can be done at this time (not with full HATEOAS and as many response codes), but I'm waiting for the day it can, and can be structured this simply.

Jul 1, 2014

Writing deprecation notices in perl, optionally with Moose

Sometimes you want to remove behavior from your code in a future version. Here's the right way to do it.

Here's the quick of how it works. The before has to come after the attributes because the methods aren't yet created. Using before also means it'll always run with your method, without actually touching your method, ensuring no accidental consequences to your method. The @CARP_NOT ensures that the warning thrown doesn't show a line number in your package, or from within where Method Modifiers are actually run. warnings::warnif( 'deprecated', ... ) ensures that these warnings are only emitted if you have the deprecated category enabled. But what if people don't have warnings enabled? Um... oh well? That's their problem, because what if people do and they want to silence these until they can get to them? I highly suggest putting the name of the method being called and its successor into the message so that people know how to correct their code.

If you don't want Moose, just don't use the method modifier and put warnings::warnif directly in your code. If you're using a different AOP before, modify @CARP_NOT to have the correct module.

Jun 3, 2014

Java Privacy, broken by design

It is worth prefacing that none of the following arguments apply to anything using the keyword static, which makes things more procedural (or in some cases functional) than Object Oriented.

The suggestion in Java is to give the least required permission, but this, in my humble opinion, violates the Open-Closed Principle. Java has four privacy levels. Giving something the least permission required to function is fine in a Security context, privacy in programming however is simply there to discourage developers from doing stupid things. In most cases, unlike security, it only makes them difficult, not impossible. I believe that any SOLID principle should make your code more easily extensible, so while in fact Java's privacy is not in literal violation of Open-Closed, it does make extension more difficult than it otherwise should be, thus violating the spirit of the principle.

Before I continue on to how I think Java's design, and common usage, violates the Open-Closed Principle, I should explain how I interpret the Principle, as my interpretation appears to be slightly different from what's on Wikipedia. The Principle as described on Wikipedia appears to be combining it with two other SOLID Principles, namely Liskov Substitution and Interface Segregation. So first let's assume that the principle stands alone, and that although it'd be bad design to not be completely SOLID, Open-Closed by itself does not require a subclass to support the same interface. Let's also assume that never modifying the source to add features is an unrealistic expectation. The purpose of Open-Closed is to ensure that your subclasses are not modifying the structure or data of their parent classes, and that a child may easily add to, or change, the behavior it got from its parent (Liskov says that it must be substitutable for its parent).

First let's talk about final. Marking a class as final means you can't extend it. This, by the very definition, is in violation of Open-Closed, because the class is not Open for extension. Classes such as UUID are marked final. You might ask, why would I want to extend a UUID? Maybe I want to give it a toURISafeBase64 method. That wouldn't break any of the original behavior, and it belongs there almost as legitimately as representing the UUID as hex. What if I wanted to extend a nested final class like an Iterator on a Map? I can't do that, which means I have to completely reimplement the Iterator to add simple functionality. In fact, the way those are implemented, I have to implement much more than just the Iterator.
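Because UUID is final, the closest you can get today is a separate helper class. A minimal sketch of what that might look like follows; `UuidCodec` and both method names are hypothetical (only `toURISafeBase64` is borrowed from the text), and only the standard `java.util.Base64` and `java.nio.ByteBuffer` APIs are used.

```java
import java.nio.ByteBuffer;
import java.util.Base64;
import java.util.UUID;

// Hypothetical helper: since UUID cannot be subclassed, the behavior lives outside the type.
class UuidCodec {
    // Encode the 128 bits as URL-safe Base64 without padding (22 characters).
    static String toURISafeBase64(UUID id) {
        ByteBuffer buf = ByteBuffer.allocate(16);
        buf.putLong(id.getMostSignificantBits());
        buf.putLong(id.getLeastSignificantBits());
        return Base64.getUrlEncoder().withoutPadding().encodeToString(buf.array());
    }

    // Reverse of the above; the decoder does not require padding characters.
    static UUID fromURISafeBase64(String text) {
        ByteBuffer buf = ByteBuffer.wrap(Base64.getUrlDecoder().decode(text));
        return new UUID(buf.getLong(), buf.getLong());
    }

    public static void main(String[] args) {
        UUID id = UUID.randomUUID();
        String encoded = toURISafeBase64(id);
        System.out.println(id + " -> " + encoded + " -> " + fromURISafeBase64(encoded));
    }
}
```

Callers have to write `UuidCodec.toURISafeBase64(id)` rather than `id.toURISafeBase64()`, which is exactly the friction the paragraph above complains about.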

It is recommended by the official Java Docs, and the community, to make member variables private unless otherwise necessary. Private variables are only accessible to the current class and nested classes; they are not visible to subclasses, in or out of the package. In my opinion this violates Open-Closed because now, if I subclass, I need to reimplement all the fields, or use getters/setters. Getters and Setters for every single attribute are actually almost no better than the attribute itself, and an object that is nothing more than those is an Anemic Domain Model. Now it could be argued that making subclasses call methods makes them more... impervious to change, because if you change the data structure you can preserve the methods. The problem is that most classes wouldn't use their own getters internally, and this breaks the argument, because then overriding that getter won't actually modify the class as completely as desired. Also remember that subclasses are, by definition, tightly coupled; usually changes to the superclass require taking a look at the subclasses. So if you are using getters and setters to ensure extensibility and to insulate against internal/external interface changes, use them exclusively, meaning only they can have raw access; all constructors and business logic methods must go through them. At that point they are the replacement for direct member access and private won't matter as much (I will probably advocate a variant of this in the next article). However, if you still want to access some member data hidden by the class directly, you should ensure that your subclasses can easily do so as well. You should only make a member private if access to it would actually cause a bug in a subclass.
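The point about classes not using their own getters can be seen in a small sketch. `Account` and `OverdraftAccount` are hypothetical classes invented for illustration; one method reads the private field directly and the other goes through the getter.

```java
class Account {
    private long balanceCents = 10_000;

    long getBalanceCents() {
        return balanceCents;
    }

    // Reads the field directly, so an overridden getter is silently ignored.
    boolean canAffordDirect(long priceCents) {
        return balanceCents >= priceCents;
    }

    // Goes through the getter, so a subclass can change the behavior.
    boolean canAffordViaGetter(long priceCents) {
        return getBalanceCents() >= priceCents;
    }
}

class OverdraftAccount extends Account {
    // Hypothetical rule: this account may overdraw by 5000 cents.
    @Override
    long getBalanceCents() {
        return super.getBalanceCents() + 5_000;
    }

    public static void main(String[] args) {
        Account account = new OverdraftAccount();
        System.out.println("direct:     " + account.canAffordDirect(12_000));
        System.out.println("via getter: " + account.canAffordViaGetter(12_000));
    }
}
```

The subclass overrides the getter, yet only the method that actually calls the getter picks up the change, so the extension is incomplete unless the parent uses its getters exclusively.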

So if we go on to assume that all subclasses, even ones in a different package (because you know people using your code are going to extend things), should be able to reuse the member variables, then we should be making all members protected. Of course the problem is now your data is not encapsulated in your package: once a member variable is not private, it is available to your entire package. To me this also seems like a bad idea; other classes in my package don't need to see my object's internals unless they're a subclass. So now you have to choose: make all classes easily extended, or protect people who are programming in your package from themselves? You can probably control who's modifying your package and how, and have static code analysis to check that you're not calling members you shouldn't. But nothing can give you back extensibility you've taken away (outside of adding it back).

So let's look at interfaces. Interfaces generally have two options, public or package-private. This is fine, but has a problem: package-private interfaces are only applicable to the package that has the interface defined. Methods implementing the interface must have the same privacy level as the interface's methods, which are implicitly public. Most of the time what I actually want is an interface which I've defined globally as a contract, but I want the implementations to only be called by their package. For example, a DAO (Data Access Object) might be able to share the same interface (with judicious generic usage) between entities. However, if you do this, you may find that your interface must be public so it can be shared between packages, and now the DAO itself must have these methods as public, even if it's being called only by something in the same package. I don't see how you can get away from this whether you use package by feature or package by layer. If you follow this through with previous design thoughts such as everything is an Interface, and those end up being public, and you want nice subclassability, whether through protected members or through interfaced getters/setters, now everything is public, and we've completely lost any real encapsulation.
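A minimal sketch of the DAO situation, with hypothetical names (`Dao`, `TaskDao`) and an in-memory map standing in for real storage. Everything sits in one file here so it compiles on its own; in a real project the interface would be public and live in a shared package.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Shared contract. Its methods are implicitly public, whatever the interface's own visibility.
interface Dao<T> {
    void save(UUID id, T entity);

    T find(UUID id);
}

// The class itself is package-private, but the implementing methods must still be public.
// Removing "public" below is a compile error (weaker access than the interface method).
class TaskDao implements Dao<String> {
    private final Map<UUID, String> rows = new HashMap<>();

    @Override
    public void save(UUID id, String entity) {
        rows.put(id, entity);
    }

    @Override
    public String find(UUID id) {
        return rows.get(id);
    }

    public static void main(String[] args) {
        Dao<String> dao = new TaskDao();
        UUID id = UUID.randomUUID();
        dao.save(id, "write blog post");
        System.out.println(dao.find(id));
    }
}
```

If `TaskDao` were a public class, `save` and `find` would be callable from every package, even though only its own package was ever meant to use them.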

So how could it be done better? have a privacy type subclass which makes the method or member available to only subclasses and not throughout the package. Allow interfaces that have global definitions, but implementations of the methods can be at a package or subclass level. I feel like this could still be accomplished, perhaps by creating an interface type that is a "contract", and a new privacy keyword for "subclass". Contracts could define that methods be subclass, or protected, in their implementation. At that point you could have all kinds of methods that are still hidden to the general world. You could then build package by feature, have all methods that are required within the package have contracts, but share contracts between features, so all CRUD controllers would have the same method signatures, all repositories would share signatures, etc, etc.

What if I actually want more privacy? well you could not share interfaces between packages, and then have interfaces not be public. You could also not use an interface at all unless it's for a method on your bounded context that must be public. You can also say that ease of extensibility is not a goal and continue to not use your getters/setters internally, and yet make your members private.

You could also say, privacy is irrelevant, if the language is then preventing good, SOLID, design. Specifically here, Open-Closed, Liskov Substitution, and Interface Segregation. If you go this route you'll need conventions, and to trust other developers, because a lot of things will be public or protected. I recommend Perl's convention of prefixing subclass private methods with _ and assuming that all member fields are subclass/trait private and should never be called outside of their inheritance hierarchy.

May 6, 2014

Two Hundred Posts

My blog is 6 years old, with 200 posts and over 120k hits. Probably my first interesting post is the one where I decided I was switching to git from svn, and it's not very interesting, and I think much more poorly written than what I write now. Since then I've re-skinned the blog to new templates at least twice. I now list books that I recommend on the right side of my blog, and I've ensured that all content is clearly licensed under the Creative Commons. Personally I've moved from being a student, to system administrator, to Perl developer, and am now building things with Java and potentially Ruby.

Given that I'm now building things in Java and Ruby, there may be posts that are about those technologies and the good and bad things I've found out about them. One thing I'll say is that some of the Java as a language hate is as unfounded as the hate for Perl. All languages have good and bad things about them; even Perl 6 has warts in its design.

Since Java is my full time job now, and I have little reason to be doing Perl as I've been unhappy with the Perl 5 Framework landscape, I'm unlikely to continue developing features for my Perl 5 modules. If you're interested in becoming a comaintainer on any of my modules, my requirement is that you show interest in the module by contributing high quality patches to that module. I'd like to see evidence that I won't have to come back and fix things later, and that your interest is sincere. If you're not interested in being a comaint, patches are still welcome.

I haven't found frameworks that I'm completely happy with in any other language either. At this point I'm considering making a very minor project of developing a full framework for Perl 6. This framework (probably split into components) would be built on a new Dependency Injection module, using what I've learned from Bread::Board, AngularJS, and Java's CDI. It would also include an ORM that is based on Data Mapper principles and makes heavy use of introspection. I would like to mention I have some doubt about myself gaining serious traction, but we'll see.

Apr 2, 2014

REST, ROA, and HATEOAS often leads to bad webservice design

This is not to say that they are bad, but I find that all too frequently the resulting APIs are poorly designed due to forgetting one thing: RPC (Remote Procedure Call) is expensive. Now by RPC, I do not mean custom messaging formats such as SOAP or XML-RPC, I mean calling a method on a remote server. Do not think that just because you are using HTTP as the message format with something like XML or JSON, calling GET /resource is significantly different from calling get_resource in a SOAP call. The frequent idempotence also does not mean that you're not actually doing RPC, as good server-side method design often also implies idempotence, e.g. adding an object to a Set in Java will not result in the object being added twice if you add it twice. All calls to a remote are a form of RPC. The most expensive part of RPC is creating a new connection; just how expensive depends on the protocol. This is why web sockets, for instance, are much cheaper than repeated calls (there are other reasons and expenses too, like maintaining many connections).

I've worked with a few Resource Oriented Architecture (ROA) web services, and they each suffered from the same flawed design: an excessive number of RPC calls was required to do seemingly simple tasks. This is caused by the misguided belief that every single aggregate should be its own resource, that components of the aggregate should also have their own resources, and that those should be the only access to the underlying aggregate. In one case working with an ROA we had to do about 5 RPC calls for every single product we wanted to create, and we were bulk creating. This problem was aggravated by the lack of an idempotent PUT for most resources.

The reality is, with a good API design we could have created all of the objects we needed with a single API call to a bulk interface. I'm talking about the RESTful equivalent to a Java Collection.addAll( objs[] ). In fact if you use addAll on a Set, the result of multiple identical calls is idempotent; the same object will not be added twice. It would be really easy to write this given a good ORM, and a good interface so that you could do a POST or PUT to /entities. This is a significant improvement over a design where you'd have to do a PUT or POST for every single item you wanted to create. DELETE may be the only place where I'd consider not doing a bulk request, and it is generally able to be completed asynchronously. You may of course consider limiting the number of entities acted on in a request, so if you need to create 1000 entities, it might take 10 requests doing 100 at a time; this is still better for both the client and the server than doing 1000 requests.
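The addAll analogy and the batching arithmetic can be sketched in plain Java. This is not a web service, just an in-memory illustration; `BulkCreate`, `addAll`, and `batches` are hypothetical names, and a `Set` stands in for the server-side store.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class BulkCreate {
    // Server side: a Set-backed store turns a repeated bulk add into a no-op.
    // Returns true only if the store actually changed.
    static <T> boolean addAll(Set<T> store, List<T> incoming) {
        return store.addAll(incoming);
    }

    // Client side: split a large job into requests of at most batchSize entities.
    static <T> List<List<T>> batches(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> out = new ArrayList<>();
        for (int i = 0; i < items.size(); i += batchSize) {
            out.add(new ArrayList<>(items.subList(i, Math.min(i + batchSize, items.size()))));
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> entities = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            entities.add(i);
        }
        Set<Integer> store = new LinkedHashSet<>();
        List<List<Integer>> requests = batches(entities, 100);
        for (List<Integer> request : requests) {
            addAll(store, request);
            addAll(store, request); // a retry of the same request changes nothing
        }
        System.out.println(requests.size() + " requests, " + store.size() + " entities stored");
    }
}
```

Each "request" here carries 100 entities, and replaying a request leaves the store unchanged, which is the property a client needs in order to retry safely.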

The choice between PUT and POST depends on whether you believe that the call to GET must return the exact same view as PUT, meaning that a PUT would delete resources not included (for a single aggregate that's probably true), or should the behavior be equivalent to addAll or replacing the reference to the collection with a new one. Remember PUT must be idempotent, this only means that subsequent calls using the exact same arguments should result in the exact same result. You may want to consider using a different URI for manipulating your entity collections in these ways.
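The two readings of a collection-level PUT described above can be compared with a small in-memory sketch. `CollectionPut`, `replace`, and `merge` are hypothetical names; sets stand in for the resource collection and the request body.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

class CollectionPut {
    // Reading one: PUT replaces the collection, so a later GET returns exactly the body.
    // Anything in current that is absent from body is dropped.
    static <T> Set<T> replace(Set<T> current, Set<T> body) {
        return new LinkedHashSet<>(body);
    }

    // Reading two: addAll semantics, existing members are kept and new ones are added.
    static <T> Set<T> merge(Set<T> current, Set<T> body) {
        Set<T> result = new LinkedHashSet<>(current);
        result.addAll(body);
        return result;
    }

    public static void main(String[] args) {
        Set<String> current = new LinkedHashSet<>(Arrays.asList("a", "b"));
        Set<String> body = new LinkedHashSet<>(Arrays.asList("b", "c"));
        System.out.println("replace: " + replace(current, body));
        System.out.println("merge:   " + merge(current, body));
    }
}
```

Both operations give the same result when the same body is sent again, so either can sit behind an idempotent PUT; they differ only in whether resources missing from the body survive.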

Another problem we encountered with a web service is that it had sub resources, akin to tags, that had to exist prior to creating the resource we needed to create. Not having an idempotent PUT to that resource meant we were doing create and, on exception, update. But given the simplicity of this resource it would have been even better to just allow the API to take the final object representation of that resource, instead of requiring the id, and do a lookup by name, or a create or update, under the hood. Doing this is more difficult logic-wise, and impossible if there's no natural key (because you can't look it up).

You are probably asking yourself, but how do I handle errors for these things? Well, the way I see it you have three options. One, requests are a transaction, so you wrap your database code with a transaction, and it either succeeds or fails; you can return a 200 on success and ensure HATEOAS with links to any new resources in the response. Two, you could allow partial success, and return the successful objects. Three, you could return a custom message envelope payload; this isn't very RESTful because it's a protocol on top of HTTP (it's more like SOAP).

I'm currently working on designing a new REST Web Service, and I've decided that no page load, or "single conceptual action", should take more than 6 API requests. This number is not arbitrary, it's the median concurrent connection amount, per host name, for consumer web browsers. Even that number is too many, but I felt that I needed to allot more than one request due to some completely different actions that may need to occur on a page load.

Keep on with the Resource Oriented REST with HATEOAS, just try to think of how you could minimize the number of calls by designing less granular resources.

Mar 13, 2014

Matching Hex characters in a Regex

I've noticed a common problem with regular expressions and Hex Characters, so I thought I'd blog about it. The most common way to regex a UUID, or SHA1 or some other hex encoded binary value is this (and I've seen this in Perl libraries and StackOverflow answers).

[a-f0-9] or [A-F0-9]

Neither of these is correct, as Hex is case insensitive and both of these regexes are case sensitive. Hex is most commonly lowercase (unless you're Data::UUID), but that's an aesthetic, not a requirement. The best way to match Hex is using a POSIX character class.

[[:xdigit:]] or \x

which matches this in a more readable manner, and intent driven manner


as a side note it's this in a regex string in Java
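For the Java case, `java.util.regex.Pattern` supports the POSIX class `\p{XDigit}`, which matches `[0-9a-fA-F]`; inside a Java string literal the backslash has to be doubled. A minimal sketch follows, with hypothetical class and method names, and with the 8-4-4-4-12 UUID text layout as the example value.

```java
import java.util.regex.Pattern;

class HexMatch {
    // \p{XDigit} is the POSIX hexadecimal digit class: [0-9a-fA-F].
    static final Pattern HEX = Pattern.compile("\\p{XDigit}+");

    // Canonical 8-4-4-4-12 UUID text form, in either case.
    static final Pattern UUID_TEXT = Pattern.compile(
            "\\p{XDigit}{8}-\\p{XDigit}{4}-\\p{XDigit}{4}-\\p{XDigit}{4}-\\p{XDigit}{12}");

    static boolean isHex(String s) {
        return HEX.matcher(s).matches();
    }

    static boolean isUuid(String s) {
        return UUID_TEXT.matcher(s).matches();
    }

    public static void main(String[] args) {
        System.out.println(isHex("DEADbeef01"));
        System.out.println(isUuid("123E4567-e89b-12d3-A456-426614174000"));
    }
}
```

Mixed-case input matches without needing a case-insensitive flag, because the class already covers both cases.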


Feb 27, 2014

The ShareDir Problem

Some of you may have noticed a while back that I converted Pod::Spell to the use of File::ShareDir::ProjectDistDir instead of keeping the wordlist in Pod::Wordlist::__DATA__. This move was made in conjunction with making Pod::Wordlist an Object, and in preparation for a time when you'll be able to specify your own wordlist file. It was also made so that non-technical contributors could more easily update the wordlist without going near anything that looked like code.

So why shouldn't you put them in __DATA__? According to File::ShareDir

Quite often you want or need your Perl module (CPAN or otherwise) to have access to a large amount of read-only data that is stored on the file-system at run-time. On a linux-like system, this would be in a place such as /usr/share, however Perl runs on a wide variety of different systems, and so the use of any one location is unreliable. Perl provides a little-known method for doing this, but almost nobody is aware that it exists. As a result, module authors often go through some very strange ways to make the data available to their code.

The most common of these is to dump the data out to an enormous Perl data structure and save it into the module itself. The result are enormous multi-megabyte .pm files that chew up a lot of memory needlessly.

Another method is to put the data "file" after the __DATA__ compiler tag and limit yourself to access as a filehandle.

The problem to solve is really quite simple.

1. Write the data files to the system at install time.
2. Know where you put them at run-time.

Knowing where you put them at run-time is actually still a problem, because we don't develop in the same spot that perl installs stuff. The first portion of the problem is, "my tests can't find my sharedir file". So there's Test::File::ShareDir, which overrides the File::ShareDir method. People say, use Test::File::ShareDir, it solves the pain. Well, that's not true, they're missing a different pain. What happens if you're trying to run, say, bin/podspell from the git directory? Oh right, now it can't find the sharedir file again. In that case I could probably work around it, but it's a mild symptom of a greater problem I've encountered: people aren't deploying CPAN modules, they're deploying from git. Now I could say, "not supported", but unfortunately I'd usually have to say that to my current boss, or coworker, whomever that may be (and I tried it, didn't work). This isn't actually the root of the problem with Pod::Spell, but I guarantee it was a problem with Business::CyberSource. Mostly I feel like leaving Pod::Spell this way is helping to weed out the issues people will have with File::ShareDir::ProjectDistDir.

So what do I think the solution is? There are obviously numerous "social" problems here that I don't think can be easily solved. I'm sure that Kent Fredric has a better grasp than I of the technical solutions. Though I have had one recurring idea, which is apparently not achievable without significant effort: have a searchable sharedir path, like in unix, PERL5_SHAREDIR="./share:$DETECTED_DEV_DIR:$PERL5_LIB...", and try looking for the file in the path until you find it, then cache that location in memory so you only have to search once per run. This is probably not a good solution for various reasons, and it's certainly grossly oversimplified in how it could work.
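The search-and-cache idea is language independent, so here is a rough sketch of it in Java rather than Perl, to stay consistent with the other examples on this blog. Everything in it is hypothetical: `ShareDirPath` is an invented class, and the search path is simply a string of directories joined by the platform's path separator.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

class ShareDirPath {
    private final String searchPath;
    // Remembers both hits and misses, so each file name is searched once per run.
    private final Map<String, Optional<Path>> cache = new ConcurrentHashMap<>();

    ShareDirPath(String searchPath) {
        this.searchPath = searchPath;
    }

    Optional<Path> find(String fileName) {
        return cache.computeIfAbsent(fileName, this::search);
    }

    // Walk the directories in order and return the first regular file that exists.
    private Optional<Path> search(String fileName) {
        for (String dir : searchPath.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            Path candidate = Paths.get(dir, fileName);
            if (Files.isRegularFile(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        ShareDirPath share = new ShareDirPath("./share" + File.pathSeparator + ".");
        System.out.println(share.find("wordlist"));
    }
}
```

Caching misses as well as hits keeps repeated lookups cheap, at the cost of not noticing a file that appears later in the same run, which is one of the ways the idea is oversimplified.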

Ultimately, there isn't a good solution right now, and I'm not sure we've actually thought of one.

Dec 1, 2013

Advent, good idea, but problematic execution

So advent is 24 days of high quality tutorials, and it's great, and ++ to all the people who write articles. But I've got a problem... it never shows up in my feed that I read in Feedly (formerly read in Google Reader). This is compounded by the fact that there are many advents, each with their own yearly feed... so each year I have to poke around at the various projects to see if they're doing advent, and if so, subscribe to the feed. The solution... we just have the advents aggregated by ironman. This is a really cheap hack, but would allow the advents that are being created a greater distribution than they are probably now getting. We could also just patch the various advent software to provide a feed that continues eternally year after year, instead of a new feed each year, which seems not so useful, and make sure we provide that link instead of the "just this year" link in the UI. I suppose I could go fix it... but at this time I'm not sure where the source code for advent is, nor whether each advent has its own software backing it. Perl 6 is using Wordpress, which doesn't have this problem. I suppose I could add the ones I find to ironman, but maybe the advent creators don't do that for a reason, so I'd rather not step on toes.

Nov 2, 2013

Would You Miss Autoderef in 5.20? solutions in search of a problem

This is a response to chromatic's blog post Would You Miss Autoderef in 5.20?, because I haven't been able to get comments to work on his MT for something like a year (500, or some blogger openid incompat).

In all honesty I don't find either particularly interesting. I've too often been targeting 5.8 or 5.10 for syntax... @{ $foo } is really the most I've ever needed, @$foo is nicer, but beyond that don't need it. I can't figure out the value of either autoderef or postfix deref, neither of these seem to be solving actual pain points, I think perhaps they're a solution in search of a problem. Maybe I just need someone to point out a good use case that this stuff is solving.

Where are the things I actually need? Here's hoping that 5.20 will get method signatures, or exception handling or maybe figure out how to get given/when out of experimental, something useful.

I really do appreciate all the hard work the people who are improving core perl are doing, and it's all needed. Things like __SUB__ and my sub {} are absolutely awesome, as well as all the work on unicode, and other general improvements. Maybe lexical subs will be moved to stable? but I doubt it. Basically I want something that I can point to my friends outside of the echo chamber, something they could look at and say, yeah that's cool, Perl is moving forward.

Oct 23, 2013

Providing with Providers and Bread::Board

So when I started using Dependency Injection the following problem happened: how do I Inject this dependency when the container is not accessible at this point? Ok, that sentence even confused me a little bit, so what do I mean? Let's say I have a Repository for Products that is injected into my controller. Each Product stored has one or more ProductVariants that are part of its aggregate, each of which has Nested Categories. Loading this entire graph at once would be relatively expensive, so we decide to do some lazy loading via DBI in the classes. One problem: how on earth do we Inject a Database Handle all the way down to Categories? Most of the following ways are against DI, but they are solutions to the problem, and there are also ways to combine them. Also, your model class having a database handle is probably bad design itself, but I'm not going to get into that. Sadly I've done every one of these.


Well, at least you aren't hard coding the way to read your config file, or your database driver. You're smart enough to rely on an Interface rather than an Implementation. This is fraught with so many problems. Firstly, if your web server (assuming it's a web application) is getting any kind of traffic at all you'll end up creating tons of database connections. You'll also be reading that config file every time (ok, I forget if Config::Merge caches to memory, it might, but often when I see people design this way, they are basically slurping the file every time). Someday, 5 years from now, someone is going to hate you because now they need to support replicas... and the config needs to support more connection strings, which means modifying every place you've done this. Also, you've completely lost the ability to inject your dependencies for whatever reason you may want to.


Ok, this is a little bit better than before, at least now you have Inverted your dependencies, you could provide the config or the database handle to the class. You've also put the code in a centralized place so it's easy to change when you need to. You're still reading the file fairly often, though perhaps less because it now depends on how long Product variant is alive. So what happens if your connection is lost? We still have a connection for each class, a connection that may now be held much longer. Why does Product Variant need access to the config? this is a violation of the Law of Demeter.

Naive Service Locator

We need to get rid of knowledge of the config. We can do this by using a Service Locator, which is simply a well known service to retrieve other services, usually a global singleton. In our example we're at least smart enough to allow ourselves to change the class out via injection for testing. We no longer have tons of connections or config reads. However, we now have new problems. What happens when our Application Server forks a process and we lose the database connection? What about when our locator gets more complex, like nested containers, that could change or access, specifically with replication? Also our class is now directly dependent on Bread::Board, and its interface. At least we've stopped caring how our database handle is built. Our locator is a global singleton, and we can't change our Container class for testing.

Robust Service Locator

Ok, so this is much better: we can now configure which locator instance we use at runtime. We have removed the dependency on the Bread::Board interface. There is no longer a problem with database connections being dropped. However, our container is still a global singleton, and our class still knows about it, which again violates the Law of Demeter.

Dependency Injection and Pass it down

For now I've been basically ignoring other classes, because with all of these other approaches they aren't really a concern: you would do the same thing in every class, fetch your service. Much of the code is required here anyway; we would always have to do the SQL, the transforms, the loops. Dependency inversion is the opposite: do not think of how to retrieve the dependency, instead have the dependency provided. But this becomes tricky to think of when you're 3 or more levels deep in your hierarchy. One way to do it is to simply pass the reference. We create a specific problem here: our Repository lifecycle is a singleton, so we need to ensure re-connection, thus we must inject the connector, which means we are immediately dependent on the DBIx::Connector interface. This doesn't seem that tricky until you add more than one service, which still may not seem that bad, until you have to add one later, and oh my god, now you're modifying several classes.

Dependency Injection with Providers

This next and final sample shows one way of doing this with Providers. A little context on a Provider first: a Provider is simply an object that can be used to retrieve an instance of an object you need. It's really just a kind of factory, but tends to be specific to dependency injection, in scenarios where you need a new instance of an object each time. It seems that it might also work well for other cases, such as objects with a longer lifespan than a new instance on every request from the injector, but shorter than a permanent singleton. In short, a provider should be able to provide you with an instance on request, without requiring you to depend on retrieval.
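The Provider idea itself is not Perl specific, so here is a minimal sketch of the concept in Java. It is not the Bread::Board code this post discusses: the `Provider` interface below is a hypothetical stand-in that mirrors the single `get()` method of `javax.inject.Provider`, and `Handle` and `CategoryLoader` are invented placeholders for a database handle and a deeply nested model class.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical minimal provider contract, shaped like javax.inject.Provider.
interface Provider<T> {
    T get();
}

// Placeholder for a database handle; the counter only makes instances distinguishable.
class Handle {
    private static final AtomicInteger NEXT = new AtomicInteger();
    final int id = NEXT.incrementAndGet();
}

// The model class depends on "something that can hand me a handle",
// not on a container, a locator, or a config file.
class CategoryLoader {
    private final Provider<Handle> handles;

    CategoryLoader(Provider<Handle> handles) {
        this.handles = handles;
    }

    // Asks the provider for an instance at the moment one is needed.
    int loadWithFreshHandle() {
        Handle handle = handles.get();
        return handle.id;
    }

    public static void main(String[] args) {
        CategoryLoader loader = new CategoryLoader(Handle::new);
        System.out.println(loader.loadWithFreshHandle());
        System.out.println(loader.loadWithFreshHandle());
    }
}
```

Here the provider creates a new handle on every call, but the loader would not change if the provider instead handed back a pooled or per-request instance, which is the flexibility described above.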

The code that I'm demonstrating will not currently work in a practical scenario, meaning one where variant parameters are required. I've opened a bug about resolving the issue. In the meantime, the patch is simple and you could apply it yourself. You could use BUILDARGS to rename an alternate key to the primary hash key in your models. You could also just define each model service one at a time instead of looping over them, and actually validate their parameters.

You may note that I've removed the config; this was simply so I could build the code out so that it works in its entirety. It may be advantageous not to put config processing code in the Dependency injector, but rather provide the config to Bread::Board::Declare at the constructor via required services. This way of doing things requires much more code, but is also much more flexible. Every piece of the model, even those that could not normally be accessed by the injector, can now have its dependencies injected into it.