Setup procedure

Install git; this provides git-http-backend, the CGI program that serves repositories over HTTP: yum install git

Install httpd: yum install httpd. Then add the following to your Apache configuration (for example, a file under /etc/httpd/conf.d/):

SetEnv GIT_PROJECT_ROOT /srv/git
SetEnv GIT_HTTP_EXPORT_ALL
ScriptAlias /git/ /usr/libexec/git-core/git-http-backend/

<Files "git-http-backend">
    AuthType Basic
    AuthName "Git Access"
    AuthUserFile /srv/git/.htpasswd
    Require expr !(%{QUERY_STRING} -strmatch '*service=git-receive-pack*' || %{REQUEST_URI} =~ m#/git-receive-pack$#)
    Require valid-user
</Files>

Create the git root and configure the users

mkdir /srv/git
htpasswd -c /srv/git/.htpasswd db57

Test procedure

On the server:

cd /srv/git
mkdir my-test-repository
cd my-test-repository
git init --bare
chown -R apache:apache /srv/git

On a client (here the same machine, hence the localhost remote URL):

mkdir my-test-repository
cd my-test-repository
git init
echo "test" > README.md
git add -A
git commit -m 'initial import'
git remote add origin http://localhost/git/my-test-repository
git push -u origin master

As far as I know, all repositories need to be created server-side before you are allowed to push to them.

From outside

Assuming the IP of your server is 10.179.127.226:

git clone http://10.179.127.226/git/my-test-repository

Cloning is unauthenticated; only pushing is authenticated.

Posted 2018-11-29

An extremely minimal way to publish your Vue component. You already have your project; I assume and hope you are using vue-cli 3.0. If not, switch to it immediately: run vue create mypackage and port your project into the new structure.

To get the build for your project, you use the special lib build target of the vue-cli-service build subcommand. You can invoke it like this:

amoe@cslp019129 $ ./node_modules/.bin/vue-cli-service build --target lib --name amoe-butterworth-widgets src/components/TaxonSelect.vue

Here, amoe-butterworth-widgets is the name of the library that you intend to publish. In this case, I'm publishing it as an unscoped public package; this is just the regular form of npm publishing that you all know and (hah) love.

TaxonSelect.vue will be exposed as the default export of the built module.

The build will produce several files under the dist subdirectory. You are looking for the UMD build. You'll find a file dist/amoe-butterworth-widgets.umd.js. Now you need to add a key to package.json.

{
...
    "name": "amoe-butterworth-widgets",
    "main": "dist/amoe-butterworth-widgets.umd.js",
    "version": "0.1.0",
    "license": "MIT",
...
}

It's wise to set a license and to obey semver as appropriate.

Now you need to be logged in before you can actually publish. Run npm adduser.

Once you've done this, simply run npm publish. Your package will be pushed up and made available at https://www.npmjs.com/package/mypackage, where mypackage was specified in the name field of package.json.

When someone runs npm install mypackage, they'll get what is more-or-less a copy of your source tree. As far as I can see, npm doesn't attempt to clean your tree or reproduce your build in any way. So make sure that anything you don't want to be public is scrubbed before running npm publish.

When the user wants to actually use your component, TaxonSelect.vue is the default export, as mentioned above. So to use it, they just type import TaxonSelect from 'mypackage', and TaxonSelect is then a component that they can register in their components object. There are ways to export multiple components from a module, but that's outside the scope of this article.
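As a sketch, assuming the placeholder names used above (mypackage as the published name and TaxonSelect as the component), the consuming side might look like this:

import TaxonSelect from 'mypackage';

export default {
    name: 'SomeConsumer',   // a hypothetical component that uses the import
    components: { TaxonSelect },
    // ... the template can now contain <TaxonSelect/>
};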

Posted 2018-11-21

Suppose you want to encode the Australian flag; you may consider this to be one simple emoji character. Actually you're in for a surprise, perhaps, because emojis aren't always represented as a single character: many emojis are combinations of multiple Unicode code points.

For instance, the Australian flag may be represented as two code points from outside the Basic Multilingual Plane (the regional indicator symbols for A and U).

U+0001F1E6
U+0001F1FA

This happens to display as a single glyph, the Australian flag, on some platforms, but may also display as two separate glyphs.

However, some ways of representing text only support escaping code points in the range U+0000 to U+FFFF, the Basic Multilingual Plane. JSON's \uXXXX escape is one of these. When we attempt to escape characters outside that range (which, note, do not need to be escaped under the JSON spec at all), each one comes out looking like two different code points. Quoth RFC 7159:

To escape an extended character that is not in the Basic Multilingual
Plane, the character is represented as a 12-character sequence,
encoding the UTF-16 surrogate pair.  So, for example, a string
containing only the G clef character (U+1D11E) may be represented as
"\uD834\uDD1E".

So in fact, the Australian flag may be concretely represented in escaped JSON as this string:

"\ud83c\udde6\ud83c\uddfa"

As you can see, the last two hex digits of each escape pair (e6, fa) match the last two hex digits of the code points above.
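If you want to convince yourself of the correspondence, here's a quick check that can be pasted into Node.js (any reasonably recent version):

// The flag written as two explicit code point escapes (U+1F1E6, U+1F1FA).
const flag = "\u{1F1E6}\u{1F1FA}";

// Four UTF-16 code units, i.e. two surrogate pairs...
console.log(flag.length);                                        // 4
// ...but still two code points.
console.log([...flag].map(c => c.codePointAt(0).toString(16)));  // [ '1f1e6', '1f1fa' ]

// Parsing the escaped JSON string from above yields the same two code points.
console.log(JSON.parse('"\\ud83c\\udde6\\ud83c\\uddfa"') === flag);  // true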

Posted 2018-09-14

I've been holding off a bit on the food posts of late, because when I enter the kitchen these days it's often in a flurry of inspiration and I don't have the wherewithal to fetch the camera. I have been continuing to document, though. I've been repeating several recipes multiple times in different variations, particularly the cheesesteak, which I've probably made about 4 or 5 times by now. I've tried it with various cuts, from the prohibitively expensive, authentic but delicious rib eye, to the wonderful and reasonably priced rump steak, and the difficult-to-find skirt steak. I've also been experimenting with using beef seasoning, particularly the Rajah brand, which deepens the flavour immensely. It actually deepens the flavour so much I wonder if it might be too much, if it's disguising the flavour of the cut itself. But I can't be sure without more testing.

The biggest discovery that I made was to actually cook the steak properly as steak rather than the sautéed-beef method that's popular and probably more authentic. This means the standard steak cooking method of browning in a hot pan. I cook it until medium rare. You can feel free to do it as rare as you like, because you can easily cook it more at the sauté stage. Then remove from the pan and deglaze, reserving the fond. You let the steak cool and slice it into whatever consistency you like. I do chunky pieces. Then cook the rest of the recipe with a covered pan and lastly add in the steak chunks with the various juices. You'll have incredibly juicy cheesesteak mix and chunks that yield to the teeth.

I've found that toasting the ciabatta lightly in the grill before putting in the filling enhances the comfort-feeling of the sandwich. It's controversial but I've started adding cold mayonnaise after the steak mixture. I think this adds a lovely sharpness which offsets the giant umami hit from the brown stuff. This does require a bit more care in adding the cheese, though, because you have to ensure the cheese is melted properly before adding it to the sandwich, so you need to somehow melt the cheese onto the steak mix beforehand. I'm thinking of even trying a lemony mayonnaise? Sounds crazy but perhaps it could work. I tried gherkins as a condiment and found them rather disappointing; I really expected the pickle flavour to work well but for some reason it didn't. Jalapenos work better than gherkins, although I'm not sure I'd consider this a spicy sandwich.

I've also had some thoughts on soto ayam, having re-cooked it recently with excellent results. Thigh works better than breast for this in my view, although I'd certainly remove the skin next time. Two teaspoons of sambal ulek is enough to cover about one bowl when eating with rice. I found out that I really don't like the Maggi Malaysian sauce and need to stop eating it. However, the notorious Maggi liquid seasoning does go very well with reduced-rice that's destined for a South Asian dish, such as rice that you might drown in a big bowl of soto. I can't tell what it is with these dishes but they just go down with such aplomb, I feel like I could eat bowls and bowls of them without ever being sated. This is a characteristic of both Owen's nasi goreng and Owen's soto ayam, though curiously not the pangek ikan. I think the latter may be because the fat content from the coconut milk tends to fill you up. The only common characteristic between those two dishes that I can see is that they're both lean and pungent, getting their kick from a vinegary sambal.

I got some experience in braising from attempting to cook a Mexican short rib in adobo having been inspired by the menu at a local restaurant. But although I succeeded in making edible food twice, both attempts were almost but not entirely unlike the restaurant version, and both were utterly different from each other to boot. The first was smoky, deep, but slightly bitter, having got burned from excessively high oven heat and having the sauce scraped and melded with it: I tempered it with some muscovado sugar, but it stayed questionable. The second was the opposite: done on a low heat for an extremely long time, it fell off the bone more satisfactorily than the first one, but doesn't seem to have absorbed so much flavour in comparison to the first one. Perhaps the truth lies in the middle. Who knew slow cooking could be so difficult?

Posted 2018-08-29

Goal: perform a basic query of our database through an autogenerated GraphQL backend.

For this task we use Postgraphile, a tool modelled on PostgREST that generates an API server by introspecting the database schema.

We will also use the Apollo GraphQL client and the vue-apollo integration.
These seem to be the most widely used libraries.

Connecting to GraphQL

In your main entry point, you can use this setup code:

import Vue from 'vue';
import { ApolloClient } from 'apollo-client';
import { HttpLink } from 'apollo-link-http';
import { InMemoryCache } from 'apollo-cache-inmemory';
import VueApollo from 'vue-apollo';
// ApplicationRoot below is your own root component; import it from wherever it lives.

const localApi = "/api";

const client = new ApolloClient({
    link: new HttpLink({ uri: localApi }),
    cache: new InMemoryCache()
});

const apolloProvider = new VueApollo({
    defaultClient: client
});

Vue.use(VueApollo);

document.addEventListener("DOMContentLoaded", e => {
    const vueInstance = new Vue({
        render: h => h(ApplicationRoot),
        provide: apolloProvider.provide()
    });
    vueInstance.$mount('#vue-outlet');
});

Your API will be available under the "/api" prefix to avoid CORS issues. You configure this in webpack with the following stanza:

devServer: {
    port: 57242,
    proxy: {
        "/api": {
            target: "http://localhost:5000/graphql",
            pathRewrite: {"^/api": ""}
        }
    }
}

The Postgraphile server will appear on port 5000 by default.

Starting the server

This is extremely simple.

amoe@klauwier $ sudo yarn global add postgraphile
amoe@klauwier $ postgraphile -c postgres://localhost/mol_viewer

PostGraphile server listening on port 5000 🚀

  ‣ Connected to Postgres instance postgres://localhost:5432/mol_viewer
  ‣ Introspected Postgres schema(s) public
  ‣ GraphQL endpoint served at http://localhost:5000/graphql
  ‣ GraphiQL endpoint served at http://localhost:5000/graphiql

* * *

Your database here is mol_viewer. A service should start on port 5000.

Writing a sample query

Using the module graphql-tag, you can define your queries using template literals. This syntax-checks your queries up front, when the tagged template is evaluated, rather than at request time.

import gql from 'graphql-tag';

const demoQuery = gql`
{
  allParticipants {
    nodes {
     reference
    }
  }
}`

This looks slightly weird, but the important thing to know here is that I already have a table in my database schema called participant. Postgraphile has therefore inferred a collection of objects called allParticipants. My table has a field called reference (which can be of any type). nodes is (I believe) a Postgraphile-specific property of the list allParticipants. That is to say, the text nodes above has nothing to do with the database schema of mol_viewer; rather, it's an artifact of using Postgraphile.
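For reference, the data portion of the response mirrors the query, so what comes back for the query above is shaped roughly like this (the reference value is purely illustrative):

// Sketch of the response data for demoQuery.
const exampleData = {
    allParticipants: {
        nodes: [
            { reference: "A14" }
        ]
    }
};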

Link the query to your page

When you have set up vue-apollo, you have access to an apollo property on your Vue instance. In this case it looks like this:

apollo: {
    allParticipants: demoQuery
}

Here, allParticipants is a name which must be the same as the top-level result of the query. It's not just an arbitrary identifier.

Now an allParticipants field exists in your component's data object. The query is made automatically when your page loads, and allParticipants will be populated. You can demonstrate this by simply adding {{allParticipants}} to your template.

To iterate through the query results, you have to consider that the results of the query are always shaped like the query itself. So your basic list of the results will look as follows.

<ol>
  <li v-for="node in allParticipants.nodes">
    {{node.reference}}
  </li>
</ol>

Filtering

Postgraphile supports filtering via the condition argument on the auto-generated allParticipants field. For instance, imagine filtering by a given reference: you'd write allParticipants(condition: {reference: "A14"}). Only simple equality is supported out of the box (!). Plugins are available.
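As an untested sketch, plugging that condition into the graphql-tag form used earlier would look something like this:

const filteredQuery = gql`
{
  allParticipants(condition: {reference: "A14"}) {
    nodes {
      reference
    }
  }
}`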

Posted 2018-07-20

Embedding a version string into a C++ program built with SCons is a bit tricky because multiple ways of doing it are documented. This is the way that eventually worked for me.

The top-level SConstruct is as normal for an out-of-source build; it reads:

SConscript('src/SConscript', variant_dir='build')

You need a header so that your program can refer to the version string. In C++ this is as follows, in src/version.hh:

extern char const* const version_string;

You can define the version that you want to update in a file named version in the root of the repository. It should have no content other than the version number, perhaps followed by a newline.

0.0.1

Now the src/SConscript file should look like this:

env = Environment()

# The version file is located in the file called 'version' in the very root
# of the repository.
VERSION_FILE_PATH = '#version'

# Note: You absolutely need to have the #include below, or you're going to get
# an 'undefined reference' message due to the use of const.  (it's the second
# const in the type declaration that causes this.)
#
# Both the user of the version and this template itself need to include the
# extern declaration first.

def version_action(target, source, env):
    source_path = source[0].path
    target_path = target[0].path

    # read version from plaintext file
    with open(source_path, 'r') as f:
        version = f.read().rstrip()

    version_c_text = """
    #include "version.hh"

    const char* const version_string = "%s";
    """ % version

    with open(target_path, 'w') as f:
        f.write(version_c_text)

    return 0

env.Command(
    target='version.cc',
    source=VERSION_FILE_PATH,
    action=version_action
)

main_binary = env.Program(
    'main', source=['main.cc', 'version.cc']
)

The basic strategy here is to designate the version file as the source file for version.cc, but we just hardcode the template for the actual C++ definition inside the SConscript itself. Note that the include within the template is crucial: a namespace-scope const object has internal linkage by default in C++, so without the extern declaration pulled in from version.hh, the generated definition would not be visible from other translation units and you would get an 'undefined reference' at link time.

Posted 2018-06-08

Getting Leaflet to work under a Webpack and TypeScript build is really tricky. There are several hurdles you face.

First hurdle: importing the Leaflet CSS files from your node_modules folder and incorporating this into your Webpack build.

The canonical form for this is as follows:

@import url("~leaflet/dist/leaflet.css");

The tilde is a documented but obscure shortcut for a vendored module found under node_modules. There's no way to avoid hardcoding the path dist/leaflet.css.

Once you've done this, you'll have a non-broken map view, but you still won't be able to see the marker images. You'll see that the CSS attempts to load the images but can't resolve them. Then you'll try to apply file-loader, but due to an issue similar to one described in the React world, you'll note that file-loader or url-loader generates broken paths with strange hashing symbols in them.

Luckily, there's a fix for this! You'll notice this solution in the thread, from user PThomir:

import L from 'leaflet';

L.Icon.Default.imagePath = '.';
// OR
delete L.Icon.Default.prototype._getIconUrl;

L.Icon.Default.mergeOptions({
  iconRetinaUrl: require('leaflet/dist/images/marker-icon-2x.png'),
  iconUrl: require('leaflet/dist/images/marker-icon.png'),
  shadowUrl: require('leaflet/dist/images/marker-shadow.png'),
});

This is now getting very close. However, you'll try to adapt this, using import instead of require, because TypeScript doesn't know about require.

You'll get errors like this:

Cannot find module 'leaflet/dist/images/marker-icon-2x.png'

But you'll look for the file and it'll clearly be there. Puzzling. Until you realize you've missed a key point: Webpack's require and TypeScript's import are completely different animals. More specifically: only Webpack's require knows about Webpack's loaders. So when you try to import the PNG,

import iconRetinaUrl from 'leaflet/dist/images/marker-icon-2x.png';

this is actually intercepted by the TypeScript compiler and causes a compile error. We need to find some way to use Webpack's require from TypeScript. Luckily this isn't too difficult. You need to create a type signature for this call, like so:

// This is required to use Webpack loaders, cf https://stackoverflow.com/a/36151803/257169

declare function require(path: string): any;

Put this somewhere in your search path for modules, as webpack-require.d.ts. Remember you don't explicitly import .d.ts files. So now just use require in your entry.ts file as before.

My eventual snippet looked as follows:

const leaflet = require('leaflet');

delete leaflet.Icon.Default.prototype._getIconUrl;

const iconRetinaUrl = require('leaflet/dist/images/marker-icon-2x.png');
const iconUrl = require('leaflet/dist/images/marker-icon.png');
const shadowUrl = require('leaflet/dist/images/marker-shadow.png');

leaflet.Icon.Default.mergeOptions({ iconRetinaUrl, iconUrl, shadowUrl })

But remember, none of this will work without that .d.ts file, otherwise tsc is just going to wonder what the hell you mean by require.

Posted 2018-05-22

The basic question is, how do we read an entire graph from a Neo4j store into a NetworkX graph? And another question is, how do we extract subgraphs from Cypher and recreate them in NetworkX, to potentially save memory?

Using a naive query to read all relationships

This is based on the cypher-ipython module. It uses a simple query like the following to obtain all the data:

MATCH (n) OPTIONAL MATCH (n)-[r]->() RETURN n, r

This can be read into a graph using the following code. Note that the rows may duplicate both relationships and nodes, but this is taken care of by the use of neo4j IDs.

import networkx

def rs2graph(rs):
    graph = networkx.MultiDiGraph()

    for record in rs:
        node = record['n']
        if node:
            print("adding node")
            nx_properties = {}
            nx_properties.update(node.properties)
            nx_properties['labels'] = node.labels
            graph.add_node(node.id, **nx_properties)

        relationship = record['r']
        if relationship is not None:   # essential because relationships use hash val
            print("adding edge")
            graph.add_edge(
                relationship.start, relationship.end, key=relationship.type,
                **relationship.properties
            )

    return graph

There's something rather inelegant about this query: the result set is essentially 'denormalized'.

Using aggregation functions

Luckily there's another, more SQL-ish way to do it, which is to COLLECT the relationships of each node into an array. Each returned row then represents a distinct node together with the complete set of relationships for that node, similar to the ARRAY_AGG() and GROUP BY combination in PostgreSQL; a query along the lines of MATCH (n2) OPTIONAL MATCH (n2)-[r]->() RETURN n2, COLLECT(r) AS rels produces this shape. This seems much cleaner to me.

# this version expects a collection of rels in the variable 'rels'
# But, this version doesn't handle dangling references
def rs2graph_v2(rs):
    graph = networkx.MultiDiGraph()

    for record in rs:
        node = record['n2']
        if not node:
            raise Exception('every row should have a node')

        print("adding node")
        nx_properties = {}
        nx_properties.update(node.properties)
        nx_properties['labels'] = list(node.labels)
        graph.add_node(node.id, **nx_properties)

        relationship_list = record['rels']

        for relationship in relationship_list:
            print("adding edge")
            graph.add_edge(
                relationship.start, relationship.end, key=relationship.type,
                **relationship.properties
            )

    return graph

Trying to extend to handle subgraphs

When we have relationship types that define subtrees, which are labelled something like :PRECEDES in this case, we can attempt to materialize this sub-graph selected from a given root in memory. In the query below, the Token node with content nonesuch is taken as the root.

This version can be used with a Cypher query like the following:

MATCH (a:Token {content: "nonesuch"})-[:PRECEDES*]->(t:Token)
WITH COLLECT(a) + COLLECT(DISTINCT t) AS nodes_
UNWIND nodes_ AS n
OPTIONAL MATCH p = (n)-[r]-()
WITH n AS n2, COLLECT(DISTINCT RELATIONSHIPS(p)) AS nestedrel
RETURN n2, REDUCE(output = [], rel in nestedrel | output + rel) AS rels

And the Python code to read the result of this query is as such:

# This version has to materialize the entire node set up front in order
# to check for dangling references.  This may induce memory problems in large
# result sets
def rs2graph_v3(rs):
    graph = networkx.MultiDiGraph()

    materialized_result_set = list(rs)
    node_id_set = set([
        record['n2'].id for record in materialized_result_set
    ])

    for record in materialized_result_set:
        node = record['n2']
        if not node:
            raise Exception('every row should have a node')

        print("adding node")
        nx_properties = {}
        nx_properties.update(node.properties)
        nx_properties['labels'] = list(node.labels)
        graph.add_node(node.id, **nx_properties)

        relationship_list = record['rels']

        for relationship in relationship_list:
            print("adding edge")

            # Bear in mind that when we ask for all relationships on a node,
            # we may find a node that PRECEDES the current node -- i.e. a node
            # whose relationship starts outside the current subgraph returned
            # by this query.
            if relationship.start in node_id_set:
                graph.add_edge(
                    relationship.start, relationship.end, key=relationship.type,
                    **relationship.properties
                )
            else:
                print("ignoring dangling relationship [no need to worry]")

    return graph

Posted 2018-05-09

Installing GitLab from the Ubuntu packages is something of a pain in the arse; there are several main points to remember. These points apply to the version 8.5.8+dfsg-5, from Ubuntu universe.

Note that the Debian-derived gitlab package has been REMOVED in Ubuntu bionic.

Install the packages from universe

Work around bug 1574349

You'll come across a bug: https://bugs.launchpad.net/ubuntu/+source/gitlab/+bug/1574349

The tell-tale sign of this bug is a message about a gem named devise-two-factor. As far as I can tell, there's no way to work around this and stay within the package system.

You have to work around this, but first:

Install bundler build dependencies

apt install cmake libmysqlclient-dev automake autoconf autogen libicu-dev pkg-config

Run bundler

Yes, you're going to have to install gems outside of the package system.

  • # cd /usr/share/gitlab
  • # bundler

And yes, this is a bad situation.

Unmask all gitlab services

[Masking] one or more units... [links] these unit files to /dev/null, making it impossible to start them.

For some reason the apt installation process installs all the gitlab services as masked. No idea why but you'll need to unmask them.

systemctl unmask gitlab-unicorn.service
systemctl unmask gitlab-workhorse.service
systemctl unmask gitlab-sidekiq.service
systemctl unmask gitlab-mailroom.service

Interactive authentication required

You're going to face this error, too. You need to create an override so that gitlab gets started as the correct user. You can do that with systemctl edit gitlab, which will create a local override.

Insert this in the text buffer:

[Service]
User=gitlab

Save and quit and now you need to reload & restart.

systemctl daemon-reload
systemctl start gitlab

Purging debconf answers

Since gitlab is an interactively configured package, stale information can sometimes get stored in the debconf database, which will hinder you. To clear this out and reset the stored answers, do the following:

debconf-show gitlab
echo PURGE | debconf-communicate gitlab

This is the first time I've had to learn about this in a good 10 years of using and developing Debian-derived distributions. That's how successful an abstraction debconf is.

Update 2018-05-04: Also pin ruby-mail package to artful version 2.6.4+dfsg1-1

Posted 2018-03-09

[Originally written 2017-09-22. I don't have time to finish this post now, so I might as well just publish it while it's still not rotted.]

While coding large backend applications in Clojure I noticed a pattern that continued to pop up.

When learning FP, you initially learn the basics: your function should not rely on outside state. It should neither mutate nor observe that state, unless it's explicitly passed in as an argument to the function. This rule generally includes mutable resources in the same namespace, e.g. an atom, although constant values are still allowed. Any atom that you want to access must be passed in to the function.

Now, this makes total sense at first, and it allows us to easily implement the pattern described in Gary Bernhardt's talk "Boundaries", of "Functional Core, Imperative Shell" [FCIS]. This means that we do all I/O at the boundaries.

(defn sweep [db-spec]
  (let [all-users (get-all-users db-spec)]
    (let [expired-users (get-expired-users all-users)]
      (doseq [user expired-users]
        (send-billing-problem-email! user)))))

This is a translation of Gary's example. A few notes on this implementation.

  1. sweep as a whole is considered part of the imperative shell.
  2. get-all-users and send-billing-problem-email! are what we'll loosely refer to as "boundary functions".
  3. get-expired-users is the "functional core".

The difference that Gary stresses is that the get-expired-users function contains all the decisions and no dependencies. That is, all the conditionals are in the get-expired-users function. That function purely operates on a data in, data out basis: it knows nothing about I/O.

This is a small-scale paradigm shift for most hackers, who are used to interspersing their conditionals with output; consider your typical older-school PHP bespoke system, which is bursting with DB queries that have their result spliced directly into pages. But, this works very well for this simple example. It accomplishes the goal of making everything testable pretty well. And you'd be surprised how far overall this method can take you.

It formalizes as this: Whenever you have a function that intersperses I/O with logic, separate out the logic and the I/O, and apply them separately. This is usually harder for output than for input, but it's usually possible to construct some kind of data representation of what output operation should in fact be effected -- what I'll call an "output command" -- and pipe that data to a "dumb" driver that just executes that command.

You can reconstruct most procedures in this way. The majority of problems, particularly in a backend REST system, break down to "do some input operation", "run some logic", "do some output operation". Here I'm referring to the database as the source and target of IO. This is the 3-tier architecture described by Fowler in PoEAA.

However, you probably noticed an inefficiency in the code above. Likely we get all users and then decide within the language runtime whether a given user is expired or not. We've given up the ability of the database to answer this question for us. Now we're reading the entire set of users into memory, and mapping them to objects, before we make any decision about whether they're expired or not.

Realistically, this isn't likely to be a problem, depending on the number of users. Obviously Gmail is going to have a problem with this approach. But surely you're fine until perhaps 10,000 users, assuming that your mapping code is relatively efficient.

Anyway, this isn't the problem that led me to discover this. The problem happened when I was implementing the basics of the REST API and, attempting to be as RESTfully correct as possible, I wanted to use linking. This seems easy when you only need to produce internal links, right? In JSON, we chose a certain representation (the Stormpath representation).

GET /users/1

{
   "name": "Dave",
   "pet": {"href": <url>},
   "age": 31
}

Now, assume we also have a separate resource for a user's pet. In REST, that's represented by the URL /pets/1 for a pet with identifier 1. We have the ability to indicate this pet through either relative or absolute URLs. Assume that our base URL for the API is https://cool-pet-tracker.solasistim.net/api.

  • The relative URL is /pets/1.
  • The absolute URL is https://cool-pet-tracker.solasistim.net/api/pets/1.

If you search around a bit, you'll find that from what small amount of consensus exists, REST URLs that get returned are always required to be absolute. This pretty much makes sense, given that a link represents a concrete resource that is available at a certain point in time, in the sense of "Cool URLs Don't Change".

Now the problem becomes: say we have a function that attempts to implement the /users/:n API. We'll write this specifically NOT in the FCIS style, so we'll entangle the I/O. (Syntax is specific to Rook.)

 (defn show [id ^:injection db-spec]
   (let [result (get-user db-spec {:user (parse-int-strict id)})]
     {:name (:name result)
      :pet nil
      :age (:age result)}))

You'll notice that I left out the formation of the link. Let's add the link.

 (defn show [id request ^:injection db-spec]
   (let [result (get-user db-spec {:user (parse-int-strict id)})]
     {:name (:name result)
      :pet (make-rest-link request "/pet" (:pet_id result))
      :age (:age result)}))

Now, we define make-rest-link naively as something like this.

 (defn make-rest-link [request endpoint id]
    (format "%s/%s/%s" (get-in request [:headers "host"])
                       endpoint
                       id))

Yeah, there's some edges missed here but that's the gist of it. The point is that we use whatever Host URI was requested to send back the linked result. [This has some issues with reverse proxy servers that sometimes calls for a more complicated solution, but that's outside the scope of this document.]

Now did you notice the issue? We had to add the request to the signature of the function. Now, that's a fairly small deal in this case: the use of the request is a key part of the function's purpose, and it makes sense for the function to have knowledge of it. But just imagine that we were dealing with a deeply nested hierarchy.

(defn form-branch []
  {:something-else 44})

(defn form-tree []
  {:something-else 43
   :branch (form-branch)})

(defn form-hole [id]
   {:something 42
    :tree (form-tree)})

(defn show [id ^:injection db-spec]
  (form-hole id))

As you can see, this is a nested structure: a hole has a tree and that tree itself has a branch. That's fine so far, but we don't really want to go any deeper than 3 layers. Now, the branch gets a "limb" (this is a synonym for "bough", a large branch). But we only want to put a link to it.

(defn form-limb [request]
  (make-rest-link request "/limbs" 1))

(defn form-branch [request]
  {:something-else 44
   :limb (form-limb request)})

(defn form-tree [request]
  {:something-else 43
   :branch (form-branch request)})

(defn form-hole [id request]
   {:something 42
    :tree (form-tree request)})

(defn show [id request ^:injection db-spec]
  (form-hole id request))

Now we have a refactoring nightmare. All of the intermediate functions, that mirror the structure of the entity, had to be updated to know about the request. Even though they themselves did not examine the request at all. This isn't bad just because of the manual work involved: it's bad because it clouds the intent of the function.

Now anyone worth their salt will be thinking of ways to improve this. We could cleverly invert control and represent links as functions.

(defn form-branch []
   {:something-else 44
    :limb #(make-rest-link % "/limbs" 1)})

Then, though, we need to run over the entire structure before coercing it to REST and specially treat any functions. This could be accomplished using clojure.walk and it would probably work OK.

What's actually being required here? What's happened is that a function deep in the call stack needs context that's only available at the outside of the stack. But that information is really only peripheral to its purpose. As you can see, we were able to form an adequate representation of the link as a function, which by no means obscures its purpose from the reader. If anything the purpose is clearer.

This problem can also pop up in other circumstances that seem less egregious. In general, any circumstance where you need to use I/O for a small part of the result at a deep level in the stack will result in a refactoring cascade as all intervening functions end up with added parameters. There are several ways to ameliorate this.

1. The "class" method

This method bundles up the context with the functionality as a record. The context then becomes referable by any function within that protocol.

;; Both methods are declared on the protocol so that the record can implement them.
(defprotocol HoleShower
  (show [this id] "Create JSON-able representation of the given hole.")
  (form-tree [this id]))

(defrecord SQLHoleShower [request db-spec]
  HoleShower
  (show [this id]
    {:something 42
     :tree (form-tree this id)})
  (form-tree [this id]
    {:something-else 44
     :branch (make-rest-link request "/branches" 1)}))

As you can see, we don't need to explicitly pass request because every instance of an SQLHoleShower automatically has access to the request that was used to construct it. However, it has the very large downside that these functions then become untestable outside of the context of an SQLHoleShower. They're defined, but not that useful.

2. The method of maker

This is a library by Tamas Jung that implements a kind of implicit dependency resolution algorithm. Presumably it's a topo-sort equivalent to the system logic in Component.

(ns clojure-playground.maker-demo
  (:require [maker.core :as maker]))

(def stop-fns (atom (list)))

(def stop-fn
  (partial swap! stop-fns conj))

(maker/defgoal config []
  (stop-fn #(println "stop the config"))
  "the config")

;; has the more basic 'config' as a dependency
;; You can see that 'defgoal' actually transparently manufactures the dependencies.
;; After calling (make db-conn), (count @stop-fns) = 2:
;; that means that both db-conn AND its dependency config were constructed.

(maker/defgoal db-conn [config]
 (stop-fn #(println "stop the db-conn"))
  (str "the db-conn"))

;; This will fail at runtime with 'Unknown goal', until we also defgoal `foo`
(maker/defgoal my-other-goal [foo]
  (str "somthing else"))

The macro defgoal defines a kind of second class 'goal' which is only known about by the maker machinery. When a single item anywhere in the graph is "made" using the make function, the library knows how to resolve all the intermediaries. It's kind of isomorphic to the approach taken by Claro, although it relies on more magical macrology.

  • https://www.niwi.nz/2016/03/05/fetching-and-aggregating-remote-data-with-urania/
  • https://github.com/kachayev/muse
  • https://github.com/facebook/Haxl
  • https://www.youtube.com/watch?v=VVpmMfT8aYw

See this: Retaking Rules for developers: https://www.youtube.com/watch?v=Z6oVuYmRgkk&feature=youtu.be&t=9m54s

And of course, the "Out of the Tar Pit" paper.

Update 2018-08-14: Two other solutions to this broad problem are the Reader monad (see this great Pascal Hartig article) and what Sinclair refers to as the Effect functor.

Posted 2018-02-27

This blog is powered by coffee and ikiwiki.