Author Archives: dnh10

8 June 2012

In attendance: Omri Berenstein, Thomas Helmuth, Zeke Nierenberg, Lee Spector, Emma Tosch

Administrative details and getting started with Clojure with Zeke

Emma’s talk

Lots of talk about Emma’s talk

Progress reports

  • Emma : talk prep, working on web interface for launching and analysing runs
  • Tom :
  • Omri : talk prep, working on scoped clojush
  • Zeke : getting settled into using Clojure
  • Lee :

1 June 2012

In attendance: Lee, Kwaku, Tom, Zeke, Omri, Emma

kwaku’s talk
talking too fast
dynamic stuff – need to be a bit clearer – did he address why bloat is a problem?
too much stuff on the results page
space out slides
practice talking without the slides – maybe write out speech and then add slides into the talk where appropriate, then ditch the talk
if there are words between slides in the talk, make them into two separate slides
focus on why the problem is interesting

tom status
email back from Jensen on evolving classifiers – community not that interested, due to its interest in accuracy rather than interpretability
– what about pitching marginal cases? / outliers
Hod Lipson – Eureqa – scientific theories based on data – ML guys think the overfitting part is uninteresting
modelling individual differences using LDA – connection to EFK

zeke
cycloformation? – formation of new carbon ring something something
rings = important
rings = dangerous to make, trying to make reactions happen in water, rather than something dangerous

25 May 2012

In attendance: Lee, Kyle, Tom, Emma, Kwaku, Omri

three fundamental features of biological evolution (from abstract)
1) particulate genes carry some subtle consequences for biological evolution that have not yet translated to mainstream EC
2) the adaptive properties of the genetic code illustrate how both communities can contribute to a common understanding of appropriate evolutionary abstractions
3) EC exploration of representational language and its role in the genotype/phenotype debate

Regarding #3 – consider some of the points in The Language Instinct. We don’t want language to change even though meanings change – better said, we don’t want a faithful transcription of sound via characters, because these aren’t robust to change. Think about the MIT article I forwarded – these loop thingies aren’t calculating exact solutions, but approximate solutions. Relationship between GP/GA and other stochastic learning systems (and probabilistic systems) and approximation algorithms.

can approximation algorithms make seemingly inherently sequential problems into parallel problems?

“conscious intelligence vs natural selection”
i.e. engineering vs science
i.e. optimal solutions vs approximate solutions

randomness not just of the RV but of the model?
meta-randomness
second-order randomness
probabilistic programming languages as second order randomness?
need padding in randomness, redundancy to ensure robustness

building a system that is sound and consistent but not complete – but GPs are already Turing complete

need probabilistic analysis in the test cases – starting with concise representations of the test cases’ domains – think like an ML – domains are sets that can be represented in closed form or with a generating function

what is the purpose of having EC mimic biology?
(1) to make adaptive programs, following the intuition that evolution is a powerful force we need to come to understand
(2) to better understand biology and complex stochastic systems

this paper references Ostrowski and Reynolds as people who are studying EC from a search perspective

(1) PARTICULATE GENES AND POPULATION GENETICS
(2) THE ADAPTIVE GENETIC CODE
tom’s thoughts:

(3) THE DICHOTOMY OF GENOTYPE AND PHENOTYPE

genetic drift as a desirable characteristic

authors posit that particulate genes will help redefine recombination. i am still curious about the role of redundancy in EC and whether it has been investigated – connections to language, to probabilistic language classes

my angle: redundancy – it’s everywhere! doesn’t conflict with the bayesian stuff either

how to balance small populations, where mutations are more pronounced, with the dilution that occurs over larger populations over many generations – maintaining locality (and perhaps a linearity of fitness) while viewing a population-wide (ie across subpopulations) concentration of parameters/measure/etc.

IMPORTANT: “Unless offspring are infinite in number, their allele frequencies will not accurately mirror those of the parental generation, but instead will show some sampling error (genetic drift).”

“In effect, particulate genes in finite populations improve the evolutionary heuristic from a simple hill climbing algorithm to something closer to simulated annealing under a fluctuating temperature.”

“co-adapted gene complexes” – Fisher, 1930 and O’Reilly 1999

Adaptive cookbook = REDUNDANCY

error minimizing code smoothing the fitness landscape? – comes from unpublished data


lee’s stuff
tag space machine
works with stacks, all code lives in the tag space
ratio tags, denseness of numbers
cucumber – got the RSpec and Cucumber book
smaller GP steps into cycle
some red/green thing?
1) penumbra
2) keep cool? – climate change negotiation game between countries – can we find or quickly write a simulator?
3) Zeke Nierenberg – new lab member – filing Div 3, mostly a natural science guy
4) physical space, course participation – GP and intro class on creativity

tom stuff
1) size-based nodes for tournament selection – turn into journal article? tom’s not that interested, but it is low-hanging fruit; IEEE Transactions on EC – publish more things like case studies
2) evolving classifiers research – emailed Jensen
3) something more substantial with tags – Lee has faith in inexact matching – can we find a way to fit this into probabilistic matching? well, matching is probabilistic anyway via the gp mechanism; maybe the real thing
sean luke – GECCO paper – benchmarks in GP

emma stuff
get something for the evolutionary computation journal
more theoretical people are in that journal
extend to other EC systems
extend beyond EC systems to other stochastic systems

kyle stuff
dissertation – coevolution, lots of pages
kyle’s talk – autoconstruction
can order be reduced to ILP? We know that ILP is intractable.
what about picking a problem that’s easier with probabilistic guarantees?
what about running TSP vs clique
how do you measure “useful genetic material?”
run cosmos on the problems – look at the behavior

Clojush Environments Documentation

Overview

This update introduces environments (previously called “scopes”) to Clojush in an effort to offer encapsulation of functionality. It is often useful to execute instructions and return a value without affecting the rest of the stack space, which is what environments allow you to do.

New Stacks

There are two new stacks, :environment and :return.

 :environment stack

The :environment stack holds environments, which are essentially Push interpreter states that can be saved and later returned to. In fact, since the Push state is stored as a map, this entire map is pushed onto the stack, and later can replace the current state when popping it off the :environment stack.

An item on the :environment stack can be popped in two ways. The first is by the instruction environment_end. The second automatically happens when the :exec stack is empty and there is something on the :environment stack. When the :environment stack is popped, that popped state replaces the current Push interpreter state. Then, the instructions remaining on the old :exec stack (if any) are placed at the beginning of the new :exec stack. Finally, everything on the :return stack is pushed onto the :exec stack, with the top item of the :return stack pushed first (so that the bottom item of the :return stack will be the top item of the :exec stack).

 :return stack

The :return stack allows both literals and instructions to be “returned” to the pushed environment when it is popped (see above). There are instructions for moving the top item of any stack to the :return stack, including code from the :exec stack. Additionally, there are specialized instructions that can be used to “consume” arguments in the pushed environment by popping them, and for tagging literals in the pushed environment (see those instructions below).

New Instructions

environment instructions

  • environment_new – Saves the next instruction or parenthetical grouping on the :exec stack and pushes the state onto the :environment stack, with the saved instruction/grouping popped from the :exec stack of the pushed state. The :exec stack is replaced by this saved instruction/grouping, the :return stack is emptied, and the rest of the state stays the same as it was before this instruction.
  • environment_begin – Pushes the state onto the :environment stack (with an empty :exec stack). The :return stack is emptied in the current state, but all other stacks (including the :exec stack) are left as they were.
  • environment_end – If the :environment stack is empty, no-ops. Otherwise, does the same as what happens when there is an empty :exec stack with something on the :environment stack, which is described above.

return stack instructions

  • return_from<type> – For <type> equivalent to the name of one of the stacks :boolean, :integer, :float, :string, :zip, and :exec, these instructions pop the next item off of the <type> stack and push that item onto the :return stack.
    • return_fromcode – Returning something from the :code stack is different, since normal instructions would just be performed on the :exec stack. So, return_fromcode pushes (code_quote top_code_stack_item) onto the :return stack, which will result in the top item of the current :code stack being quoted (and thus pushed to the :code stack) in the popped environment.
  • return_<type>_pop – For <type> including each of the stacks above (including :code), this instruction places the instruction <type>_pop onto the bottom of the :return stack. The effect is that in the popped environment, these popping instructions will be the first thing that is executed, and will effectively “consume” the “arguments” given on that parent environment’s stacks.
  • return_tagspace – This instruction immediately copies the tagspace of the current environment to the tagspace of the first environment on the :environment stack. This allows child environments to give their tagspace to their parent.
  • return_tag_<type>_<index> – There is a new erc, the return-tag-instruction-erc, that returns instructions of this form. These instructions, when executed, push (<literal> tag_<type>_<index>) onto the :return stack, where <literal> is the top item on the <type> stack (and is popped from that stack). These instructions allow an environment to add tags to their parent environment.
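
Putting these instructions together, here is a minimal sketch of the intended behavior. This is a hypothetical program, not taken from Clojush itself; it assumes a version of Clojush in which these instructions are defined, and uses run-push, make-push-state, and top-item as in the run example later on this page.

(top-item :integer
          (run-push '(100 environment_new (2 3 integer_add return_frominteger) integer_add)
                    (make-push-state)))
;; environment_new runs (2 3 integer_add return_frominteger) in its own environment.
;; return_frominteger moves the 5 computed there onto the :return stack, so when the
;; environment is popped, the 5 is pushed back onto the :exec stack of the saved state
;; and added to the waiting 100, leaving 105 on top of the :integer stack.
;; Nothing else done inside the environment affects the restored state.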

Original Proposal

The original proposal, which is still somewhat accurate, is kept here for reference:

  • Use environment_new (previously scope_enter) to enter a new scope. This instruction pushes a copy of the stacks onto the :environment stack, clears the :exec and :return stacks, and pushes the next item on the old :exec stack onto the new :exec stack, popping it from the old one.
  • Whenever the :exec stack is empty during execution, Push will first check if there is anything on the :environment stack. If so, it is popped off (replacing all the stacks with the ones stored in the top environment), and then everything on the original :return stack is pushed onto the :exec stack.
  • No instructions fetch or push literals between environments on the :environment stack. As of now, the only way to alter the environment stack is either through the instruction environment_push or by having an empty :exec stack.
  • Return instructions will allow environments (scopes) to pass return values, consume arguments, and tag things in other environments.
    • Instructions like return_integer will move the top item of the :integer stack onto the :return stack.
    • Instructions like return_integer_pop will push the instruction integer_pop onto the :return stack, so that the “parent” environment will pop the top integer before executing further. To avoid popping instructions from just popping literals that have been pushed by instructions like return_frominteger, the popping instructions will be inserted at the bottom of the :return stack, and thus executed first in the state once the environment has ended. Since ordering of pop instructions is commutative, it is inconsequential that return_pop_stack instructions that are executed later will result in pop_stack instructions that are executed sooner.
    • Instructions like return_tag_integer_123 (say 2999 is on top of the :integer stack) will push the code ‘(2999 tag_integer_123) onto the :return stack and pop 2999 from the :integer stack. This will cause the tag 123 to be associated with 2999 in the “parent” environment.
    • The instructions return_exec and return_code are legal and pull the top code item off the respective stack to the :return stack. return_code will put a code_quote before the code, so that it will be moved to the :code stack in the “parent” environment. For example, the code ‘(return_code (2 integer_add)) will put (code_quote (2 integer_add)) on the :return stack. return_exec will just pull the next instruction grouping itself, meaning that the code will be executed in the “parent” environment. Note: Since this leads to permeability of the scoping of environments, it may often be best to leave return_exec out of the instruction set to avoid undesirable messing of stacks between environments.

Using leiningen to add libraries to your Clojure project

Leiningen is a command-line tool that can be used to do many things to and with Clojure projects, including creating them and running them.

Here I will explain how to install Leiningen and use it to add libraries to your projects. The explanation here is very detailed, assuming you’ve never done anything like this before.

First, install Leiningen on your system

Leiningen can be obtained from [https://github.com/technomancy/leiningen]

The details for installing leiningen will vary a bit depending on your operating system. I’ll give full details here for how to do it in MacOS, and the process will be similar for Linux. Windows is a bit different; there are Windows instructions on the leiningen site.

The short version of how to install leiningen is something like: It’s just a shell script; put it somewhere on your path and make it executable. Done. But for those of you who haven’t done this sort of thing before, here is one way of doing this in full detail (specified for MacOS).

Go to [https://github.com/technomancy/leiningen]. Click on the relevant link and save the resulting page as lein.txt. Put it on your desktop for now.

You’re going to want to invoke the script that’s in this file from a command line. We’ll also use commands at the command line to move it into the right place and set it up properly.

In MacOS you get a command line by launching the Terminal application which can usually be found in Applications/Utilities. When you launch Terminal you’ll get a command prompt and you’ll be “in” the top level of your user directory. The command “ls” will show you what’s there, which may help to orient you.

The script file needs to be put somewhere on your “path”, which is where the command line interface will look for commands that you type. You could put it anywhere, and add the place you put it to your path, but what I do (and I think is pretty common) is to create a directory (folder) called “bin” in my user directory. You can do this from the finder or like this at the command line:

mkdir bin

Now you need to move the script file (lein.txt) into the bin folder and rename it just “lein”. One way to do this is to use the finder: drag it into the bin folder and then rename it, although the renaming here will be a little annoying because of the way file suffixes are hidden in the finder… you have to do File > Get Info on it and change the name there, and tell it that you really don’t want a suffix, hidden or not. I found it simpler just to do this at the command line:

mv Desktop/lein.txt bin/lein

Now you have to make the script file “executable”. You can do this with the “chmod” command. I do it with:

chmod +x bin/lein

The leiningen instructions say “chmod 755 ~/bin/lein” which will work too. I prefer +x instead of 755 because it’s easier for me to remember, since it means “add execute”. The “~/” in this version means “starting from the user’s home directory”, so no matter what directory you’re currently in it will work on the file that’s in your bin directory. If you’re following my directions above then you’re already in your home directory, so you don’t need that.

Now you have to make sure that the bin folder is actually on your path. If it is, it will probably be because you previously created it and put it there, and you will know that! You can check by typing:

echo $PATH

and looking for your bin directory in the result (it will usually appear in expanded form, e.g. “/Users/yourname/bin”, rather than as “~/bin”).

Let’s assume that it’s not there (which it won’t be unless you’ve done stuff like this before). It can be added to your path with the “export” command, but that will add it only for the current terminal session. We want it to be added automatically every time you open a terminal session, so that you don’t have to think about this again. This can be done by adding a call to the export command in a file in your home directory that gets run every time you launch a terminal session. In MacOS this is a hidden file called “.profile” – it will have a different name in other operating systems.

The command that you want to add to .profile is:

export PATH=~/bin:$PATH

You can do this in a text editor, or from the command line with this command:

echo "export PATH=~/bin:$PATH" >> .profile

This won’t make a difference for the terminal session that you’re in, but open a new terminal window and .profile will be run for it. Then you can type:

lein

And it will do a bunch of downloading and configuration stuff and then, if all goes well, you’ll get a welcome message and a list of lein commands. If you get the welcome message then lein has been installed correctly on your computer.

Add a dependency to your project and use it

  1. Include an additional item in the :dependencies specification in your project’s project.clj file.
  2. Run “lein deps” at the command line (which will pull in the necessary files from the web).
  3. Make sure the new stuff is visible to the environment in which you’re running your code; for clooj I quit/restart clooj after running lein deps.

How do you find the item to include in project.clj? That depends on what dependency you want to pull in. Things that are part of Clojure itself can be found at https://github.com/clojure, and if you look at the readme text for any of the specific libraries you’ll see what you need to add. For example, clicking on math.combinatorics I see that this is the thing to add to the :dependencies specification for leiningen: [org.clojure/math.combinatorics "0.0.3"]

For example, if I make a brand new project (in clooj) called foo then the project.clj that’s made automatically for me will look like this:

(defproject foo "1.0.0-SNAPSHOT"
  :description "FIXME: write"
  :dependencies [[org.clojure/clojure "1.3.0"]])

If I want to add a dependency to math.combinatorics then I should change it to be:

(defproject foo "1.0.0-SNAPSHOT"
  :description "FIXME: write"
  :dependencies [[org.clojure/clojure "1.3.0"]
                 [org.clojure/math.combinatorics "0.0.3"]])

Once I’ve done this I should save the file, switch to Terminal, cd into the foo directory, and do:

lein deps

This will pull in the math.combinatorics library, and restarting clooj will ensure that clooj “sees” that the new file is there.

Next, make sure that your code requires or uses the new library’s namespace. This can be done with the use or require functions, or (preferably) in your code’s ns declaration. For example, in my foo project I put this in my core.clj:

(ns foo.core
  (:use [clojure.math.combinatorics]))

Then, after evaluating this, I can do things like:

(combinations [1 2 3] 2)

to get:

((1 2) (1 3) (2 3))

Find and use other libraries

There are a lot of other things out there that you might want to use, and often the best way to find them is with a general web search. For example, suppose that you want to use the “quil” library to do Processing graphics (this is a version of the library used in Processing — [http://processing.org]). A quick search for “clojure quil” finds [https://github.com/quil/quil] and this includes nice instructions with the thing that you have to add to your project.clj ([quil "1.6.0"]) and also information on how to set up your namespace and some example code.
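
For example (a sketch following the math.combinatorics example above, and reusing the hypothetical foo project), the project.clj would become:

(defproject foo "1.0.0-SNAPSHOT"
  :description "FIXME: write"
  :dependencies [[org.clojure/clojure "1.3.0"]
                 [quil "1.6.0"]])

After saving this, run “lein deps” and restart clooj as before; quil’s README then shows how to set up your namespace and draw something.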

One other specific library that may be of interest is a small library that makes it easy to refer to local files; see [https://github.com/arthuredelstein/local-file/]. (I believe that this functionality will be built in to Clojure 1.4, but as of 1.3 this library is still quite useful.)

One other site to know about is [http://clojars.org] — aka “clojars”. Clojars is a community repository for open source Clojure libraries, and that’s where a lot of commonly used libraries actually reside. You can use the site to look up the dependency information that you need to put in your project.clj for all of the projects that it houses, although clojars doesn’t itself serve documentation (so you usually have to do a general web search to find that).

Clojure Immersion

Introduction to using the cluster with tractor

What is the cluster?

The cluster is our high performance computing facility located in ASH 130, owned by the School of Cognitive Science. You can get more information about its current status by visiting fly.hampshire.edu/ganglia/ – its hardware configuration is changing all the time. The cluster is used for many things, and the nodes (as well as the classroom/ndrome Macs) can be passed arbitrary commands by its job distribution system, currently tractor. This document attempts to explain how to get a quick start with tractor. The full, non-site-specific documentation is available here: http://fly.hampshire.edu/docs/Tractor/.

Note that while the classroom and ndrome Macs are not part of the cluster, they are part of what you may hear referred to as the “farm” or “render farm” – that is, they can be passed jobs by tractor – so they are discussed in this document as well.

The network configuration of the cluster sometimes confuses people. The head node is the only node that is directly accessible from the outside world or elsewhere on campus. Please do not ever run jobs on the head node. On the other hand, it is ok to start tractor jobs on the head node (see below). To get into the head node, you ssh <username>@fly.hampshire.edu. The first time you log in, it will ask you to set a passphrase for your SSH key. I recommend you do not set one – leave it blank. That way you will be able to SSH into the nodes without a password. Once logged into the head node on the command line, should you want to log directly into a compute node, you would do “ssh <nodename>”, for instance “ssh compute-1-1”; your prompt will change, and you are now logged into compute-1-1. For an up-to-date list of nodes, see fly.hampshire.edu/ganglia/, as mentioned above.

What is tractor?

Tractor is a job distribution system, made by Pixar, that is intended primarily for use as a distribution system for software rendering of 3D animated movies. However, it is designed to be highly configurable and flexible, and many of the mechanisms it has in place are useful for arbitrary applications. It has a dependency-tree XML-style job description format, provisions for building many different requirements into what kind of node gets what jobs and when, what order things must be executed in, things that must be done and/or checked before the job can start, and much more.

The idea behind tractor is that there is a central tractor “engine”, in this case the head node of the cluster, that is the job repository and configuration center. When the blades (the computers that will actually be running jobs) start up, they find the engine via DNS entry, ask it for the blade definitions, match themselves against them to decide which definition(s) they fit, and then ask for the list of jobs and decide if they can take any of them. If they do take any of them, they inform the engine that they have, and the engine takes it off the list of available jobs. The blades check in with the engine frequently, informing it of their progress and status. All of this can be seen at the tractor “dashboard” at fly.hampshire.edu:8000/tractor/tv/. The dashboard has no actual authentication – just enter a username that exists on the cluster and it will let you in. All rendering jobs are run as the user “anim”, so that’s probably the most interesting user to log in as to see what’s going on. Please don’t change anything unless you are authorized to do so. You can, however, see the blade status as any valid user.

What is a job?

So tractor (and many other job distribution systems) has a concept of a “job”, which is one big bunch of executable stuff that is interconnected in some way. You might call it a “run”, or any number of other things, but it’s the packet of stuff that you submit to tractor to tell it that you want it to do something. Each job is broken up into tasks, which can then be broken up into subtasks. The full documentation about how to script these is here: http://fly.hampshire.edu/docs/Tractor/. However, I will give you a few simple examples, and then show you which of the commands on that page might be most useful to you in our environment, and then how to actually submit a job in our environment. So, here’s a very simple job that will just return the result of running /bin/date on the blade that it gets executed on:

Job -title {high priority task} -serialsubtasks 0 -subtasks {
    Task -title {/bin/date on Linux} -cmds {
        RemoteCmd {/bin/date} -service {Linux}
    }
}

This just says, “We’ve got a job. We’d like to call the job “high priority task”, its subtasks should run in parallel (aka they are not serial subtasks), and it’s only got one task. The title of this task is “/bin/date on Linux”. The command to be executed, which is a remote command (meaning it doesn’t have to be executed locally, most things will be RemoteCmds), is just “/bin/date”, and that has to run on a blade that matches the service “Linux”. Note that there can be multiple RemoteCmd blocks making up the -cmds {} segment of a Task.

One can submit this job by saving it as “job.alf” on fly’s head node, and running “source /etc/sysconfig/pixar” and then “/opt/pixar/tractor-blade-1.6.3/python/bin/python2.6 /opt/pixar/tractor-blade-1.6.3/tractor-spool.py --engine=fly:8000 job.alf”

You can then go to the tractor dashboard, fly.hampshire.edu:8000/dashboard/, log in as the user you submitted the job as, and see the status of your job. You’ll see the dependency tree laid out visually for you in the right pane of the “jobs” page. You can double-click the one task to open up a new window with its output, which should just be the date on one of the linux nodes. You can click on “blades” to see the status of each blade.

You can, of course, build much more complicated jobs. Mostly, folks who use tractor have their scripts generated for them by some controlling process. When we render 3D animation with tractor we do just that. Here is a snippet from an autogenerated script:

##AlfredToDo 3.0
Job -title {sunrise RENDER c1_08 full 1-239 eyeTracking} -envkey {rms-3.0.1-maya-2009} -serialsubtasks 1 -subtasks {
    Task {Job Preflight} -cmds {
        RemoteCmd { /helga/sunrise/tmp/c1/c1_08/render.111003.124955/jobpreflight.sh } -service {slow}
    }
    Task {Layers Preflight} -serialsubtasks 0 -subtasks {
        Task {Layer eyeTracking} -cmds {
            RemoteCmd { /helga/sunrise/tmp/c1/c1_08/render.111003.124955/eyeTracking.sh } -service {slow}
        }
    }
    Task {Render all layers 1-239} -serialsubtasks 0 -subtasks {
        Task {Render layer eyeTracking 1-239} -subtasks {
            Task {Layer eyeTracking frames 1-1} -cmds {
                RemoteCmd { /helga/sunrise/tmp/c1/c1_08/render.111003.124955/launchRender -batch -command {helgaBatchRenderProcedure("eyeTracking",1,1,1,"main")} -file /helga/sunrise/tmp/c1/c1_08/render.111003.124955/eyeTracking.ma -proj /helga/sunrise/wc } -service {slow}
            }
            Task {Layer eyeTracking frames 2-2} -cmds {
                RemoteCmd { /helga/sunrise/tmp/c1/c1_08/render.111003.124955/launchRender -batch -command {helgaBatchRenderProcedure("eyeTracking",2,2,1,"main")} -file /helga/sunrise/tmp/c1/c1_08/render.111003.124955/eyeTracking.ma -proj /helga/sunrise/wc } -service {slow}
            }
...

Note that this script features nested tasks, cases where serialsubtasks is set to both 0 and 1, and a specific service tag “slow”.

Service tags

You’ll notice that there is a “-service {Linux}” in the first example job above. This is the category of blade that the command in question can be executed on. It is not case-sensitive. We currently have the following services defined, in addition to the standard Pixar services (“PixarRender”, “PixarNRM”, “RfMRender”, “RfMRibGen”, and “PixarMTOR”, which all of our blades currently provide):

  • fast – these are blades that should be free most of the time for executing quick runs. Please do not use this service tag on a job that will take a long time. It is intended to be reserved for jobs with quick turnaround. We are not using this system right now and the fast/slow tags are currently broken and need rethinking.
  • slow – these are blades that are designated OK to submit long grinding overnight or longer jobs to. Use this tag if you don’t need your job to come back quickly and it might take a long time. Again, this idea/system is currently broken.
  • linux – these are blades that are running linux. Use this service tag if you know your job will only run on linux.
  • cluster – these are blades that are nodes in the cluster. If you only want your job to run on cluster nodes, use this tag.
  • rack1 – these are blades in Rack 1 of the cluster, which is mostly whitebox desktop hardware. At this time, they are 4 to 8-core nodes with at least 2GB of RAM per core.
  • rack2 – these are Dell PowerEdge 1855 blade servers with 4GB of RAM each. They are very old and slow, the oldest and slowest machines in our render farm, but they do well on certain tasks that aren’t easily parallelizable, because they have high clock speeds even though they don’t have a lot of cores.
  • rack4 – these are blades in Rack 4 of the cluster, which is full of what I call “cluster-in-a-box” nodes, meaning tons of processor cores and lots of RAM per node. At the time of this writing, we have 3 32GB 48-core nodes, one 24GB 8-core 16-execution-unit node, and one 64GB 64-core node in this rack. This tag is mostly for Lee, but others doing GP runs or other similar things using clojure may also find it useful. Please check in with Lee before you use this service tag.
  • macosx – these are blades that are running OSX. At the time of this writing, this means the classroom and the ndrome, with the exception of the one linux box in the ndrome.
  • shake – these are blades that can run shake. At the time of this writing, that means all the OSX machines, but we separated them in case that was ever not the case.
  • blender – these are the machines that can run blender
  • tom – this is a special service tag for Tom, that selects rack1 and rack4 except compute-4-5

Please note: if you are reading this document and you haven’t already had a conversation about using the cluster with Lee, Chris, and/or Josiah, you should do that. Also, you’ll need a shell account, which you can get from Josiah.

How do I share data?

A common question when wanting to submit jobs to tractor is “How do I give my tasks concurrent access to the same data?” There are several answers to this question. I will discuss five options here.

  • The cluster has cross-mounted home directories. The home directories are stored on the head node and shared out to all the compute nodes. This will take care of most people and you can stop right here (a small sketch of this approach follows this list).
  • There is a RabbitMQ server running on the head node, for use in message passing between running processes. Feel free to use it.
  • There is a large fileserver, keg, that has a share, /helga, mounted on every machine in the farm, Mac and cluster node alike. This is intended primarily for rendering purposes, and is one of the foundations of our considerable rendering infrastructure, but may possibly be able to be used in other ways if necessary. If there was compelling enough reason, we could also mirror this configuration in another way.
  • MySQL. There is a mysql server on the head node, and I (Josiah) would be happy to give you an account.
  • Roll your own. Build into your program that you want to run on the cluster some message-passing or data-sharing method that works for you.
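
As a trivial illustration of the first option above, a run can simply write its results to a file under your home directory, which every node sees. This is a hedged sketch in Clojure; the path and data here are hypothetical:

;; Each run writes to its own file in the cross-mounted home directory,
;; so results from runs on different nodes all land in one place.
(spit (str (System/getProperty "user.home")
           "/clusterdemo/results-"
           (.getHostName (java.net.InetAddress/getLocalHost))
           ".txt")
      (pr-str {:generation 42 :best-error 0.0}))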

 

Where can I store large amounts of local data temporarily?

Each node’s extra space is at /state/partition1. Please clean up after yourself.

Please feel free to contact Josiah with further questions at wjerikson@hampshire.edu or x6091.

How to use single nodes of fly for Clojush runs

0. This is ancient. You probably don’t want to do it this way. See this and this and this.

1. Set up a simple Clojush project using Clooj and Leiningen on a mac:

In Clooj: create a project called clusterdemo with namespace clusterdemo.core

Edit the project’s project.clj to include a dependency for [local-file "0.0.4"]. The one that I’m doing this with right now is a Clojure 1.2 project and my edited project.clj file looks like this:

(defproject clusterdemo "1.0.0-SNAPSHOT"
  :description "FIXME: write"
  :dependencies [[org.clojure/clojure "1.2.1"]
                 [org.clojure/clojure-contrib "1.2.0"]
                 [local-file "0.0.4"]])

In Terminal: cd into the project directory and run “lein deps”

In Finder or Terminal: put a copy of clojush.clj into the project’s src directory

Relaunch Clooj (necessary for Clooj to see all of the files added above).

Open core.clj and add code for a pushgp run; here’s a very simple thing to get things going:

(ns clusterdemo.core
  (:use [clojush]))

(pushgp
  :error-function (fn [program]
                    (let [top-int (top-item :integer
                                            (run-push program (make-push-state)))]
                      (if (number? top-int)
                        [(Math/abs (- top-int 10))]
                        [100])))
  :atom-generators (list 1 'integer_add))

In Clooj: do REPL > “Evaluate entire file” to make sure that it works.

2. Set up a directory on the cluster for your run:

Lees-MacBook-Pro:clusterdemo leespector$ ssh -l lspector fly.hampshire.edu
lspector@fly.hampshire.edu's password:
Last login: Sun Mar 11 19:11:45 2012 from c-71-192-29-61.hsd1.ma.comcast.net
[etc]

We’re going to ssh to compute-1-1 and use it as a single computer.

[lspector@fly ~]$ ssh compute-1-1
Rocks Compute Node
[etc]

[lspector@compute-1-1 ~]$ mkdir clusterdemo

[lspector@compute-1-1 ~]$ logout
Connection to compute-1-1 closed.

[lspector@fly ~]$ logout
Connection to fly.hampshire.edu closed.

We’ve now set up a folder for our runs (which is still empty). Note that the folder is actually on all nodes – your user directory is cross-mounted on all nodes. So you didn’t really have to ssh to compute-1-1 before making the directory, because whether you’re on the head node or sshed to any node you’ll still be (initially, on login) in the same directory. But I sshed to compute-1-1 to make it feel more like I was going to that computer. Note that if I wanted to have another run going at the same time on another node, I should make a separate directory for that (e.g. clusterdemo2 for a run on compute-1-2) and repeat everything here for populating that other directory and starting that other run.

Here’s what’s in my project directory on my mac now:

Lees-MacBook-Pro:clusterdemo leespector$ ls
classes lib project.clj src

Lees-MacBook-Pro:clusterdemo leespector$ ls lib
clojure-1.2.1.jar local-file-0.0.4.jar
clojure-contrib-1.2.0.jar

Lees-MacBook-Pro:clusterdemo leespector$ ls src
clojush.clj clusterdemo

Lees-MacBook-Pro:clusterdemo leespector$ ls src/clusterdemo/
core.clj

Let’s move all of that to the clusterdemo directory on the cluster, but we’ll just put all of the files in the same top-level directory and not worry about recreating the subdirectories there:

Lees-MacBook-Pro:clusterdemo leespector$ scp lib/* lspector@fly.hampshire.edu:clusterdemo/
lspector@fly.hampshire.edu's password:
clojure-1.2.1.jar 100% 3165KB 1.0MB/s 00:03
clojure-contrib-1.2.0.jar 100% 466KB 465.9KB/s 00:00
local-file-0.0.4.jar 100% 2290 2.2KB/s 00:00

Lees-MacBook-Pro:clusterdemo leespector$ scp src/clojush.clj lspector@fly.hampshire.edu:clusterdemo/
lspector@fly.hampshire.edu's password:
clojush.clj 100% 93KB 92.6KB/s 00:00

Lees-MacBook-Pro:clusterdemo leespector$ scp src/clusterdemo/core.clj lspector@fly.hampshire.edu:clusterdemo/
lspector@fly.hampshire.edu's password:
core.clj 100% 390 0.4KB/s 00:00

3. Connect to compute-1-1 again and conduct a run:

Lees-MacBook-Pro:clusterdemo leespector$ ssh -l lspector fly.hampshire.edu
lspector@fly.hampshire.edu's password:
Last login: Sun Mar 18 20:29:43 2012 from c-71-192-29-61.hsd1.ma.comcast.net
[etc.]

[lspector@fly ~]$ ssh compute-1-1
Last login: Sun Mar 18 20:30:31 2012 from fly.local
[etc.]

[lspector@compute-1-1 ~]$ cd clusterdemo

[lspector@compute-1-1 clusterdemo]$ ls
clojure-1.2.1.jar clojush.clj local-file-0.0.4.jar
clojure-contrib-1.2.0.jar core.clj

Let’s make a “command” run script that sets up the java classpath correctly (which is hard to remember), starts a run, and sends output both to the screen and to a file called “out”:

[lspector@compute-1-1 clusterdemo]$ echo "java -cp $PWD:./*: clojure.main -i core.clj | tee out" > command

[lspector@compute-1-1 clusterdemo]$ chmod +x command

Run it:

[lspector@compute-1-1 clusterdemo]$ ./command

That should spew all of the output from a run, which should succeed quickly and leave you hanging after the program has been simplified.

Use Cntrl-C to exit.

Check the “out” file to see that it contains the output.

4. Conduct long and/or multiple simultaneous runs:

If you can keep your terminal session open for an entire run then you don’t need to do any more than what has been described above (but with a different problem file, etc.). If you want to conduct multiple runs on multiple nodes simultaneously then just open a new Terminal window for each, make a directory on your fly account for each, ssh to a different node for launching each, and cd into the node-appropriate directory before launching each run.

If you can’t keep your terminal session open (e.g. because you have to move your computer) then conduct your runs within “screen”. To do this, when you are connected to the node, type “screen” and return to enter a screen session; then start your run, and while it’s running type Cntrl-A-D to disconnect from the screen. You will stop seeing the output, but the process will keep running within screen. Then you can logout or whatever, and when you come back you can type “screen -r” to resume your screen session. When you’re totally done and you want to terminate your screen session you can do that with Cntrl-D.

How to use lein to do Clojush runs on single nodes of fly

Here are some instructions to get a run going on a single node of fly by using leiningen (lein). This page should be enough to get you there if you know what you’re doing; for example, you should know how to do a single Clojush run on your own computer. (See also this for an outdated method using direct java calls.)

First, install leiningen in your fly account if you have not already done so:

  1. cd ~/bin
  2. wget --no-check-certificate https://raw.github.com/technomancy/leiningen/stable/bin/lein
  3. chmod +x lein
  4. lein

and then if everything goes well lein will be set up.

For the rest, two methods are presented, using Github or not:

Using Github:

To use this method, you should be at least somewhat familiar with Git, including creating branches, checking out branches, and pushing to a remote (such as GitHub). Otherwise, you should use the “Not Using GitHub” method below.

  1. Create a Git branch for your code on your local machine. You can use master if you want to use an example as-is; otherwise, let’s suppose your branch is named cool-problem.
  2. (optional) Make sure your code is working on your local machine. Run it by using “lein run clojush.problems.demos.cool-problem” from within your Clojush directory (that’s assuming that your problem file is in src/clojush/problems/demos/cool_problem.clj). If you don’t get errors by the first generation, you should be good and can kill the run.
  3. Push your code to GitHub. Use your own fork. This will entail something like “git push origin cool-problem”.
  4. ssh to fly. You will need a fly account to do so. If you need one, talk to Josiah. (example: “ssh yourname@fly.hampshire.edu”)
  5. ssh to a compute node (example: “ssh compute-4-2”)
  6. Start a screen session with something like “screen -S cool”
  7. (NOTE: read this whole step, including substeps, before you get started) If you already have a git-cloned version of Clojush on fly, cd to that. Otherwise, “git clone git@github.com:lspector/Clojush.git” will clone Clojush from GitHub. Then, “cd Clojush”.
    1. If you are using your fork of Clojush, replace “lspector” with “yourname”.
    2. If you haven’t set up your fly account’s ssh keys with GitHub, you’ll have to do that before this step, but it’s totally worth it. This will entail following the directions here, though DO NOT generate new keys. Instead, just do steps 1, 3, and 4, using the keys you should already have on fly. If you delete your current fly ssh keys, you will not be able to ssh to compute nodes.
    3. If you don’t want to mess with ssh keys as in substep 2, you could instead clone Clojush using the read-only link: git://github.com/lspector/Clojush.git . If you do this, you won’t be able to push changes and new commits back to GitHub, but you won’t have to worry about ssh keys.
  8. Now you should be in your Clojush directory. Get the branch you want to run by “git fetch origin” and then “git branch cool-problem origin/cool-problem” and then “git checkout cool-problem”.
  9. You should now have the code you want to run. You can look around to make sure this is the case.
  10. Run your code. You’ll want to do something like “lein run clojush.problems.demos.cool-problem | tee out.txt”.
  11. Make sure the run has started smoothly.
  12. Detach your screen “C-a d”.
  13. Logout of fly. Later, ssh back into the node you are running on. Resume screen with “screen -r cool”. If things are finished, you can look at the new out.txt file and do whatever else you want. When you’re done with screen, make sure you close it with “C-d”.

Not using Github:

  1. (optional) Make sure your code is working on your local machine. Run it by using “lein run clojush.problems.demos.cool-problem” from within your Clojush directory (that’s assuming that your problem file is in src/clojush/problems/demos/cool_problem.clj). If you don’t get errors by the first generation, you should be good and can kill the run.
  2. Zip up your Clojush directory and scp it to your fly account.
  3. ssh to fly. You will need a fly account to do so. If you need one, talk to Josiah. (example: “ssh yourname@fly.hampshire.edu”)
  4. ssh to a compute node. (example: “ssh compute-4-2”)
  5. Unzip your Clojush directory.
  6. Start a screen session with something like “screen -S cool”
  7. cd into your Clojush directory.
  8. Run your code. You’ll want to do something like “lein run clojush.problems.demos.cool-problem | tee out.txt”.
  9. Make sure the run has started smoothly.
  10. Detach your screen “C-a d”.
  11. Logout of fly. Later, ssh back into the node you are running on. Resume screen with “screen -r cool”. If things are finished, you can look at the new out.txt file and do whatever else you want. When you’re done with screen, make sure you close it with “C-d”.

How to do multiple runs of Clojush using tractor

These instructions will show you how to start many runs of Clojush at once on fly using tractor. These instructions assume you have already gone through the tutorial of how to run a single run on fly using lein, since it requires things like having Clojush already on fly and installing lein. This tutorial does not cover more advanced topics like doing runs with varying parameter settings.

  1. Open your Clojush directory. Create a directory where you want your logs to go. For example, “Clojush/results/odd/”
  2. Copy the fly launcher.py to your Clojush directory. It may already be there if you cloned from GitHub. Make sure you have the latest version or it might not work.
  3. Open launcher.py in your favorite text editor.
  4. You should only need to edit things in the “# Settings” section, not below there. Set your number of runs, the location of your Clojush directory, the relative location of your output directory (e.g. “results/odd/”), and the prefix and postfix that you would like to use for your log files. Also, give your runs a title.
  5. Edit the command to be what you want to call. Make sure you change the lein call to match the location of lein on your fly directory. And change the problem file to something besides “clojush.examples.odd” (unless you want to run the odd problem!).
  6. Back in your Clojush directory, run “python launcher.py”. You should get a message that says something like “OK loaded job /thelmuth/1207180001”.
  7. Point your web browser to http://fly.hampshire.edu:8000/tractor/tv/. Sign in with the username you used to submit the job (no password required). This page should show your new job and info about it. If not, something went wrong.
  8. At this point, you’re done! Wait until the tractor dashboard says that all your runs are done, and then gather your data from the log files.