Quick & Dirty Forking
I'm left with a decision: do I clone the project locally, fire up my own server, look at it, and then delete it shortly after because I was only planning to browse... or something else?
Sure, I know this isn't really a fork, but 5 minute clone didn't have the same ring to it!
The concept is simple, but making the service public requires a non-blocking server, for reasons I'll explain in a moment. The requirements were:
- It needed to be simple to go from github repo url to this preview system.
- The previewed fork needed to be served from the root of the hostname, because it's likely the resources in the repo would run off absolute paths.
- I wanted the previews to automatically clean themselves up, so I didn't end up chomping loads of disk space.
- Cloning a project should not take down the server!
I knew this should be simple with Node.js, mostly because I needed to send the clone process off into the background (away from the main application) and serve up the forked content only once it has returned.
The way it works
Using the connect library I have my own custom router that does most of the work.
Clones are then spawned out to a separate process, and once a clone completes, a unique hash is created as a new subdomain for the server, with its own router pointing to the new clone. After 5 minutes of idle time, the clone is automatically removed and cleaned up.
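To make the "separate process" part concrete, here is a minimal sketch of how a clone could be pushed into the background with Node's child_process module. It is not the actual 5minfork code: the FORK_ROOT location, the cloneArgs and cloneRepo names, and the shallow --depth 1 clone are all assumptions made for illustration.

```javascript
const { spawn } = require('child_process');
const os = require('os');
const path = require('path');

// Hypothetical location for clones; not taken from the original project.
const FORK_ROOT = path.join(os.tmpdir(), 'forks');

// Build the argument list for `git clone`. The shallow clone (--depth 1)
// is an assumption: a preview only needs the latest files.
function cloneArgs(user, repo, targetDir) {
  return ['clone', '--depth', '1', `https://github.com/${user}/${repo}.git`, targetDir];
}

// Run the clone in a child process so the server's event loop stays free,
// and call back only once git has finished (or failed to start).
function cloneRepo(user, repo, hash, callback) {
  const targetDir = path.join(FORK_ROOT, hash);
  const child = spawn('git', cloneArgs(user, repo, targetDir), { stdio: 'ignore' });

  // 'error' and 'close' can both fire, so make sure we only call back once.
  let finished = false;
  const done = (err, dir) => {
    if (finished) return;
    finished = true;
    callback(err, dir);
  };

  child.on('error', done); // e.g. git is not installed
  child.on('close', (code) => {
    if (code !== 0) return done(new Error(`git clone exited with code ${code}`));
    done(null, targetDir);
  });
}
```

Because spawn returns immediately, other requests keep being served while git does its work, which is the point of the "cloning should not take down the server" requirement.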
When you request anything off the default root of the server, it assumes (unless the resource is found in the public directory) that you're referring to a github repo, ie.
The problem was that usernames and repos are case sensitive, but hostnames aren't. So when you redirect to the subdomain (which I wanted to be remy-5minfork), it wouldn't work, because subdomains are switched to lowercase, and now it doesn't know the difference between
This subdomain then looks up the original github repo url, and kicks off the fork process.
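One way to sidestep the casing problem is to derive a lowercase hash from the exact user/repo string and keep a lookup table from that hash back to the original casing. The sketch below assumes a request path of the form /user/repo, a SHA-1 digest cut to 8 hex characters, and hypothetical names (repos, parseRepoPath, subdomainFor); the real project may build its hash differently.

```javascript
const crypto = require('crypto');

// hash -> { user, repo }, so a lowercase subdomain can recover the exact casing.
const repos = new Map();

// Split a request path such as "/remy/5minfork" into its two parts.
// Returns null when the path does not look like "/user/repo".
function parseRepoPath(urlPath) {
  const parts = urlPath.split('/').filter(Boolean);
  if (parts.length !== 2) return null;
  return { user: parts[0], repo: parts[1] };
}

// Derive a short lowercase hex label that is safe to use as a subdomain,
// and remember which repo it stands for.
function subdomainFor(user, repo) {
  const hash = crypto
    .createHash('sha1')
    .update(`${user}/${repo}`)
    .digest('hex')
    .slice(0, 8);
  repos.set(hash, { user, repo });
  return hash;
}
```

Since hex digits are already lowercase, the browser lowercasing the hostname no longer loses information: the subdomain is only a key, and the case-sensitive names live in the table.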
Forking (actually cloning...)
I apologise for my interchangeable use of fork & clone - blame github (apparently).
This is the "clever bit" (given how short the code is, don't expect to be wowed).
A new router is created for this specific fork and stored against the hashed url (so we know that when we hit http://abc321.5minfork.com, it has a bunch of info associated with it).
The original request and response objects are then passed to that new router, which handles the entire http request and so is able to serve up the static previewed repo.
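As a rough illustration of the per-fork router idea, the sketch below keeps one small static file handler per hash and hands it the original request and response. Everything here is an assumption rather than the project's code: the forks table and its { dir, router, accessed } shape, the function names, and the reliance on earlier middleware having set req.subdomain. It also skips content types and caching to stay short.

```javascript
const fs = require('fs');
const path = require('path');

// hash -> { dir, router, accessed }; the shape is an assumption for this sketch.
const forks = new Map();

// Map a URL path onto a file inside `root`, refusing anything that escapes it.
function resolveInside(root, urlPath) {
  let pathname;
  try {
    pathname = decodeURIComponent(urlPath.split('?')[0]);
  } catch (e) {
    return null; // malformed escape sequence
  }
  if (pathname.endsWith('/')) pathname += 'index.html';
  const file = path.join(root, pathname);
  return file.startsWith(root + path.sep) ? file : null;
}

// Build a tiny static file handler rooted at the cloned directory.
function createForkRouter(dir) {
  return function router(req, res) {
    const file = resolveInside(dir, req.url);
    if (!file) {
      res.statusCode = 403;
      return res.end('Forbidden');
    }
    const stream = fs.createReadStream(file);
    stream.on('error', () => {
      res.statusCode = 404;
      res.end('Not found');
    });
    stream.pipe(res);
  };
}

// Store the router against the hash once the clone has finished.
function registerFork(hash, dir) {
  forks.set(hash, { dir, router: createForkRouter(dir), accessed: Date.now() });
}

// Connect-style middleware: pass the original request and response
// to the router that belongs to this subdomain, if there is one.
function forkMiddleware(req, res, next) {
  const fork = forks.get(req.subdomain);
  if (!fork) return next();
  fork.accessed = Date.now(); // record activity for the idle check
  fork.router(req, res);
}
```

Serving each fork from the root of its own hostname means absolute paths such as /css/site.css resolve inside the clone, which is what the second requirement asks for.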
The final bits
You might think I was using the vhosts connect module, but I'm not. A simple bit of middleware parses the host out of the request header and makes up the req.subdomain property, which I use to match up to the hash that links to the repo data.
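A middleware like that can be very small. The following is a sketch under assumptions, not the original source: BASE_DOMAIN is taken from the example URL earlier in the post, and subdomainOf and subdomainMiddleware are made-up names. It uses the standard connect-style (req, res, next) signature.

```javascript
// Assumed base domain, taken from the example URL in the post.
const BASE_DOMAIN = '5minfork.com';

// Return the part of the host in front of the base domain, or null
// when the request is for the base domain itself (or another host).
function subdomainOf(host, base = BASE_DOMAIN) {
  if (!host) return null;
  const hostname = host.split(':')[0].toLowerCase(); // drop any port
  if (!hostname.endsWith('.' + base)) return null;
  return hostname.slice(0, -(base.length + 1));
}

// Connect-style middleware: expose the subdomain on the request object.
function subdomainMiddleware(req, res, next) {
  req.subdomain = subdomainOf(req.headers.host);
  next();
}
```

With this in front of the other handlers, a request for http://abc321.5minfork.com/ arrives with req.subdomain set to abc321, ready to be looked up against the stored hashes.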
A simple timer fires every 10 seconds to see when the repo was last accessed, and if that difference is more than 5 minutes, it blows away the directory.
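The clean-up timer could be sketched roughly as below. The 10 second interval and 5 minute limit come from the description above; the forks map with its dir and accessed fields, the function names, and the injectable removeDir argument (handy for testing without touching the disk) are assumptions.

```javascript
const fs = require('fs');

const CHECK_EVERY_MS = 10 * 1000;  // the timer fires every 10 seconds
const MAX_IDLE_MS = 5 * 60 * 1000; // forks idle for more than 5 minutes are removed

// Delete a cloned directory without blocking the server.
function defaultRemove(dir) {
  fs.rm(dir, { recursive: true, force: true }, (err) => {
    if (err) console.error(`could not remove ${dir}:`, err.message);
  });
}

// Remove every fork that has been idle for longer than MAX_IDLE_MS.
// `forks` is assumed to be a Map of hash -> { dir, accessed }.
// Returns the hashes that were removed.
function sweepExpired(forks, now = Date.now(), removeDir = defaultRemove) {
  const removed = [];
  for (const [hash, fork] of forks) {
    if (now - fork.accessed > MAX_IDLE_MS) {
      removeDir(fork.dir);
      forks.delete(hash);
      removed.push(hash);
    }
  }
  return removed;
}

// Start the periodic check; call clearInterval on the result to stop it.
function startSweeper(forks) {
  return setInterval(() => sweepExpired(forks), CHECK_EVERY_MS);
}
```

For this to work, whatever serves a fork has to update its accessed timestamp on each request, otherwise a busy preview would be removed while still in use.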
Like I said, quick and dirty.
Of course, all the code is up on github. Hopefully this write-up or the tool itself is useful to you; I know I've already started using it. Now all I need is a little browser plugin to add a preview icon on the github repo page, so it's even faster to get the 5 minute fork!