Tutorial



1 Introduction

These tutorials demonstrate the capabilities of Distem by putting the user in realistic experimental situations. The experiments are meant to be run on the Grid’5000 platform in a Debian-based environment.

As you have probably read in the Distem presentation, the software follows a client-server architecture, with a server exposing a REST interface. Several methods are available to interact with the Distem server:

  1. Using the command-line executable
  2. Using the ruby client library
  3. Contacting the REST API directly
  4. Using a configuration file (XML)

In this tutorial we will present the first two methods. The usage of these different methods is described on the documentation page, and further information is available on the FAQ page. If you have a question, find a bug or simply want to give us some advice, you can contact the Distem team through the mailing list at any time.

1.1 Make a reservation

To start an experiment, you have to use some physical nodes on the testbed. As Distem requires administrator privileges, the nodes must be reserved in deployment mode. Furthermore, on Grid’5000 some IP ranges are dedicated to virtual machine addresses; these ranges must be reserved with g5k-subnets before they can be used. Throughout the tutorial, the values that have to be replaced are written in upper case.

To perform this tutorial, you can reserve your nodes for two hours with the following command:

    frontend> oarsub -t deploy -l slash_22=1+nodes=NUMBER_OF_PHYSICAL_MACHINES, \
              walltime=TIME_OF_THE_RESERVATION -I
  • NUMBER_OF_PHYSICAL_MACHINES is the number of physical machines you want in your reservation; their addresses are stored in the $OAR_NODE_FILE environment variable.
  • TIME_OF_THE_RESERVATION is the duration of the reservation, either a number of hours or a time in HH:MM:SS format.
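
For example, to reserve two physical machines (as needed for the first experiment below) for two hours:

    frontend> oarsub -t deploy -l slash_22=1+nodes=2,walltime=2:00:00 -I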

Once the reservation is performed, you can get the range of virtual IPs associated to your reservation with:

    frontend> g5k-subnets -sp

You will need this address for your experiments.
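
For instance, with the slash_22 reservation above, the command prints a single CIDR range; the exact value depends on your reservation, and we reuse the following one throughout this tutorial:

    10.144.0.0/22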

1.2 Preparing physical machines

Now that you have your reservation, the next step is to install an operating system on the allocated physical machines. You can deploy a Debian/Jessie environment (with NFS support, which avoids having to copy the Distem filesystem images) using:

    frontend> kadeploy3 -f $OAR_NODE_FILE -e jessie-x64-nfs -k

If all goes well, the nodes are now ready for the next step.

1.3 Distem Installation

Distem can easily be installed on the Grid’5000 nodes thanks to a helper script called distem-bootstrap. This script installs the Distem Debian package and all its required dependencies.

distem-bootstrap can be launched as follows:

    frontend> distem-bootstrap

In addition to installing Distem on the physical nodes, distem-bootstrap also starts a coordinator daemon on the first node (griffon-1 for example) and initializes the nodes involved. The coordinator is used to perform all the operations on Distem, whatever the node targeted by the operation.

2 Network experiment: SCP VS RSync

This first tutorial aims at making you familiar with Distem. The final goal is to compare the efficiency of SCP and RSync when transferring several files over the network. We will see that the behavior of each tool differs depending on the network latency.

This tutorial is split into 2 steps:

  1. Simple experiment using the command-line tool
  2. Scripted experiment using the ruby library

To start this experiment, you need two physical nodes on the testbed. Please take a look at Make a reservation, Preparing physical machines and Distem Installation to obtain and configure your physical machines.

For the tutorial, we assume that the reserved nodes are griffon-1 and griffon-2.

2.1 Simple shell experiment

2.1.1 Platform setup

This step is dedicated to the emulated platform configuration. The following operations must be executed on the coordinator node, as the root user.

    frontend> ssh root@griffon-1

First of all, we must create a virtual network. In this tutorial, the virtual network will be named vnetwork and its address corresponds to the output of the g5k-subnets -sp command, previously executed on the frontend. We assume that the network address is 10.144.0.0/22.

    coord> distem --create-vnetwork vnetwork=vnetwork,address=10.144.0.0/22

Then, we must create the virtual nodes. We will create one node on each physical node, called node-1 and node-2. To create a virtual node, you must also provide a root file-system. Finally, to allow password-less connections between the nodes, you can specify a pair of password-less SSH keys. In this example, we use the following root file-system image (on the Nancy Grid’5000 site): /home/ejeanvoine/public/distem/distem-fs-jessie.tar.gz.

Let’s create the virtual nodes:

    coord> distem --create-vnode vnode=node-1,pnode=griffon-1,\
           rootfs=file:///home/ejeanvoine/public/distem/distem-fs-jessie.tar.gz,\
           sshprivkey=/root/.ssh/id_rsa,sshpubkey=/root/.ssh/id_rsa.pub
    coord> distem --create-vnode vnode=node-2,pnode=griffon-2,\
           rootfs=file:///home/ejeanvoine/public/distem/distem-fs-jessie.tar.gz,\
           sshprivkey=/root/.ssh/id_rsa,sshpubkey=/root/.ssh/id_rsa.pub

Now we create the network interfaces on each virtual node:

    coord> distem --create-viface vnode=node-1,iface=if0,vnetwork=vnetwork
    coord> distem --create-viface vnode=node-2,iface=if0,vnetwork=vnetwork

Command output example:

[{"name" => "node-1",
  "filesystem" =>
   {"sharedpath" => nil,
    "vnode" => "node-1",
    "shared" => nil,
    "path" => "/tmp/distem/rootfs-unique/node-1",
    "image" => "file:///home/ejeanvoine/public/distem/distem-fs-jessie.tar.gz"},
  "id" => "0",
  "vifaces" =>
   [{"name" => "if0",
     "address" => "10.144.0.1/22",
     "vinput" => nil,
     "voutput" => nil,
     "vnode" => "node-1",
     "vnetwork" => "vnetwork",
     "id" => "0"}],
  "host" => "172.16.65.1",
  "gateway" => false,
  "status" => "INIT",
  "vcpu" => nil},
 {"name" => "node-2",
  "filesystem" =>
   {"sharedpath" => nil,
    "vnode" => "node-2",
    "shared" => nil,
    "path" => "/tmp/distem/rootfs-unique/node-2",
    "image" => "file:///home/ejeanvoine/public/distem/distem-fs-jessie.tar.gz"},
  "id" => "1",
  "vifaces" =>
   [{"name" => "if0",
     "address" => "10.144.0.2/22",
     "vinput" => nil,
     "voutput" => nil,
     "vnode" => "node-2",
     "vnetwork" => "vnetwork",
     "id" => "-1"}],
  "host" => "172.16.65.2",
  "gateway" => false,
  "status" => "INIT",
  "vcpu" => nil}]

In particular, note the IP address assigned to each node; in our example:

  • node-1: 10.144.0.1/22
  • node-2: 10.144.0.2/22

Finally, we start the virtual nodes:

    coord> distem --start-vnode node-1
    coord> distem --start-vnode node-2

At this point, you can connect to your virtual nodes and perform your experiment. You can either open a shell in a node or run a single command on it:

  • connect to a node to get a shell (user: root, password: root):

    coord> distem --shell node-1
    This option relies on lxc-console, so the exit key sequence (Ctrl-a + q by default) may need to be changed if you are running inside a screen session.
  • run a command on a node:

    coord> distem --execute vnode=node-1,command="hostname"
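
For instance, once both nodes are started, you can check that node-1 reaches node-2 through the virtual network (using the addresses assigned above):

    coord> distem --execute vnode=node-1,command="ping -c 1 10.144.0.2"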

2.1.2 Experiment

We assume that the IPs of the virtual nodes are:

  • node-1: 10.144.0.1
  • node-2: 10.144.0.2

To run the experiment, we connect to the first node:

    coord> distem --shell node-1

We create 100 files of 50 KB each on the first node. This set of commands is executed inside the first node:

    node-1> mkdir /tmp/src
    node-1> for i in `seq 1 100`; do dd if=/dev/zero of=/tmp/src/$i bs=1K count=50; done

Still inside the first node, here is the core experiment (transfer of the 100 files with scp and rsync from node-1 to node-2):

    node-1> ssh 10.144.0.2 "rm -rf /tmp/dst"
    node-1> time scp -rq /tmp/src 10.144.0.2:/tmp/dst
    node-1> ssh 10.144.0.2 "rm -rf /tmp/dst"
    node-1> time rsync -r /tmp/src 10.144.0.2:/tmp/dst

Here, you can compare the execution times of scp and rsync.

Now, we will modify the latency of the network links. In this experiment, we want to set a limitation on the interface of each virtual node (in the output direction of each interface) in order to emulate a latency of 20 ms on the network link. From the coordinator, let’s run:

    coord> distem --config-viface vnode=node-1,iface=if0,latency=20ms,\
           direction=OUTPUT
    coord> distem --config-viface vnode=node-2,iface=if0,latency=20ms,\
           direction=OUTPUT

Once the new latency is configured, you can run the core experiment again and observe the time required by scp and rsync. You can repeat the experiment with link latencies of 40, 60, 80 and 100 ms, as sketched below.
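
For instance, to apply a new latency value to both interfaces in one step before re-running the transfers (a sketch; LAT is simply a shell variable introduced here for convenience):

    coord> LAT=40ms
    coord> distem --config-viface vnode=node-1,iface=if0,latency=${LAT},direction=OUTPUT
    coord> distem --config-viface vnode=node-2,iface=if0,latency=${LAT},direction=OUTPUT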

2.2 Scripted experiment

All the operations performed before can be scripted in Ruby since Distem provides a Ruby API. Distem also exposes a REST interface, but this is out of the scope of this tutorial.

2.2.1 Platform setup

After having installed Distem with:

    frontend> distem-bootstrap --node-list $OAR_NODE_FILE

you can deploy the virtual platform by running a platform script on the coordinator:

    coord> ruby platform_setup.rb "10.144.0.0/22"

platform_setup.rb is a script that:

  1. takes the virtual network address assigned by the OAR scheduler as a parameter;
  2. builds the virtual network;
  3. builds the virtual nodes node-1 and node-2, and their network interfaces.

Here is the source of this script:

#!/usr/bin/ruby
# Import the Distem module
require 'distem'
# The path to the compressed filesystem image
# We can point to local file since our homedir is available from NFS
FSIMG="file:///home/ejeanvoine/public/distem/distem-fs-jessie.tar.gz"
# Put the physical machines that have been assigned to you
# You can get that by executing: cat $OAR_NODE_FILE | uniq
pnodes=["pnode1","pnode2", ... ]
raise 'This experiment requires at least two physical machines' unless pnodes.size >= 2
# The first argument of the script is the address (in CIDR format)
# of the virtual network to set-up in our platform
# This ruby hash table describes our virtual network
vnet = {
  'name' => 'testnet',
  'address' => ARGV[0]
}
nodelist = ['node-1','node-2']
# Read SSH keys
private_key = IO.readlines('/root/.ssh/id_rsa').join
public_key = IO.readlines('/root/.ssh/id_rsa.pub').join
sshkeys = {
  'private' => private_key,
  'public' => public_key
}
# Connect to the Distem server (on http://localhost:4567 by default)
Distem.client do |cl|
  puts 'Creating virtual network'
  # Start by creating the virtual network
  cl.vnetwork_create(vnet['name'], vnet['address'])
  # Creating one virtual node per physical one
  puts 'Creating virtual nodes'
  # Create the first virtual node and set it to be hosted on
  # the first physical machine
  cl.vnode_create(nodelist[0], { 'host' => pnodes[0] }, sshkeys)
  # Specify the path to the compressed filesystem image
  # of this virtual node
  cl.vfilesystem_create(nodelist[0], { 'image' => FSIMG })
  # Create a virtual network interface and connect it to vnet
  cl.viface_create(nodelist[0], 'if0', { 'vnetwork' => vnet['name'], 'default' => 'true' })
  # Create the second virtual node and set it to be hosted on
  # the second physical machine
  cl.vnode_create(nodelist[1], { 'host' => pnodes[1] }, sshkeys)
  cl.vfilesystem_create(nodelist[1], { 'image' => FSIMG })
  cl.viface_create(nodelist[1], 'if0', { 'vnetwork' => vnet['name'] })
  puts 'Starting virtual nodes'
  # Starting the virtual nodes using the synchronous method
  nodelist.each do |nodename|
    cl.vnode_start(nodename)
  end
end

As a result, two virtual nodes are created and started on our virtual platform, both connected to the same virtual network.
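
To quickly check that both virtual nodes are up before moving on, you can run a command on each of them through the Ruby API (a minimal sketch reusing the vnode_execute call that also appears in the experiment script below):

#!/usr/bin/ruby
require 'distem'

Distem.client do |cl|
  ['node-1', 'node-2'].each do |nodename|
    # Run a trivial command inside the virtual node; any failure will surface as an error from the client
    cl.vnode_execute(nodename, 'hostname')
    puts "#{nodename} is reachable"
  end
end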

2.2.2 Experiment

The experiment setup can be launched from root on the coordinator node with the following command:

    coord> ruby experiment.rb

Here is the code of experiment.rb:

#!/usr/bin/ruby
require 'distem'
# Function that computes the average
# of an array of values
def average(values)
  sum = values.inject(0){ |tmpsum,v| tmpsum + v.to_f }
  return sum / values.size
end
# Function that computes the standard deviation
# of an array of values
def stddev(values,avg = nil)
  avg = average(values) unless avg
  sum = values.inject(0){ |tmpsum,v| tmpsum + ((v.to_f-avg) ** 2) }
  return Math.sqrt(sum / values.size)
end
# Describing the resources we are working with
ifname = 'if0'
node1 = {
  'name' => 'node-1',
  'address' => nil
}
node2 = {
  'name' => 'node-2',
  'address' => nil
}
# The parameters of our experimentation
latencies = ['0ms', '20ms', '40ms', '60ms']
results = {
  'scp' => {},
  'rsync' => {}
}
iterations = 5
Distem.client do |cl|
  # Get the automatically assigned address of each virtual node's
  # virtual network interface
  node1['address'] = cl.viface_info(node1['name'],ifname)['address'].split('/')[0]
  node2['address'] = cl.viface_info(node2['name'],ifname)['address'].split('/')[0]
  # Creating the files we will use in our experimentation
  cl.vnode_execute(node1['name'],
    'mkdir -p /tmp/src ; cd /tmp/src ; \
     for i in `seq 1 100`; do \
      dd if=/dev/zero of=$i bs=1K count=50; \
     done'
  )
  # Printing the current latency
  start_time = Time.now.to_f
  cl.vnode_execute(node1['name'], 'hostname')
  puts "Latency without any limitations #{Time.now.to_f - start_time}"
  # Preparing the description structure that will be used to
  # update virtual network interfaces latency
  desc = {
    'output' => {
      'latency' => {
        'delay' => nil
      }
    }
  }
  # Starting our experiment for each specified latency
  puts 'Starting tests'
  latencies.each do |latency|
    puts "Latency #{latency}"
    results['scp'][latency] = []
    results['rsync'][latency] = []
    # Update the latency description on virtual nodes
    desc['output']['latency']['delay'] = latency
    cl.viface_update(node1['name'],ifname,desc)
    cl.viface_update(node2['name'],ifname,desc)
    iterations.times do |iter|
      puts "\tIteration ##{iter}"
      # Launch SCP test
      # Cleaning target directory on node2
      cl.vnode_execute(node2['name'], 'rm -rf /tmp/dst')
      # Starting the copy from node1 to node2
      start_time = Time.now.to_f
      cl.vnode_execute(node1['name'],
        "scp -rq /tmp/src #{node2['address']}:/tmp/dst"
      )
      results['scp'][latency] << Time.now.to_f - start_time
      # Launch RSYNC test
      # Cleaning target directory on node2
      cl.vnode_execute(node2['name'], 'rm -rf /tmp/dst')
      # Starting the copy from node1 to node2
      start_time = Time.now
      cl.vnode_execute(node1['name'],
        "rsync -r /tmp/src #{node2['address']}:/tmp/dst"
      )
      results['rsync'][latency] << Time.now - start_time
    end
  end
end
puts "Rsync results:"
results['rsync'].keys.sort {|a,b| a.to_i <=> b.to_i}.each do |latency|
  values = results['rsync'][latency]
  avg = average(values)
  puts "\t#{latency}: [average=#{avg},standard_deviation=#{stddev(values,avg)}]"
end
puts "SCP results:"
results['scp'].keys.sort {|a,b| a.to_i <=> b.to_i}.each do |latency|
  values = results['scp'][latency]
  avg = average(values)
  puts "\t#{latency}: [average=#{avg},standard_deviation=#{stddev(values,avg)}]"
end

3 HPC experiment: HPCC benchmark

The goal of this experiment is to run the HPCC benchmark on 16 virtual nodes and to evaluate the impact of using slower nodes in different configurations. This tutorial is split into 3 steps, for two different methods (shell and scripted):

  1. requirements;
  2. platform setup;
  3. experiment.

3.1 Requirements

In this experiment, we use 4 physical nodes of the same cluster. In this example, we will use the Graphene cluster, where each physical node has 4 cores. However, any cluster with at least 4 nodes will do. You can reserve the nodes for two hours as follows:

    frontend> oarsub -t deploy \
              -l slash_22=1+{"cluster='graphene'"}nodes=4,walltime=2:00:00 -I

For the tutorial, we assume that the reserved nodes are graphene-1, graphene-2, graphene-3 and graphene-4.

You can check the reserved network addresses as specified in Make a reservation. Deployment of the physical nodes works the same way as in Preparing physical machines. Finally, you should install Distem on the physical machines as specified in Distem Installation.

In the example, we assume that the coordinator is graphene-1 and the virtual network obtained is 10.144.0.0/22.

3.2 Simple shell experiment

3.2.1 Platform setup

We set up our platform using shell commands, but it could also be done with a script (as we did in Platform setup).

To set up the platform, we start by connecting to the coordinator node as the root user:

    frontend> ssh root@graphene-1

First of all, we must create a virtual network with the virtual network address obtained.

    coord> distem --create-vnetwork vnetwork=vnetwork,address=10.144.0.0/22

Then, we must create the virtual nodes. We will create 4 nodes on each physical node, called node-1 to node-16.

    coord> export FS_IMG=file:///home/ejeanvoine/public/distem/distem-fs-jessie.tar.gz
    coord> for i in `seq 1 4`; do \
            distem --create-vnode vnode=node-${i},pnode=graphene-1,rootfs=${FS_IMG},\
            sshprivkey=/root/.ssh/id_rsa,sshpubkey=/root/.ssh/id_rsa.pub; \
           done
    coord> for i in `seq 5 8`; do \
            distem --create-vnode vnode=node-${i},pnode=graphene-2,rootfs=${FS_IMG},\
            sshprivkey=/root/.ssh/id_rsa,sshpubkey=/root/.ssh/id_rsa.pub; \
           done
    coord> for i in `seq 9 12`; do \
            distem --create-vnode vnode=node-${i},pnode=graphene-3,rootfs=${FS_IMG},\
            sshprivkey=/root/.ssh/id_rsa,sshpubkey=/root/.ssh/id_rsa.pub; \
           done
    coord> for i in `seq 13 16`; do \
            distem --create-vnode vnode=node-${i},pnode=graphene-4,rootfs=${FS_IMG},\
            sshprivkey=/root/.ssh/id_rsa,sshpubkey=/root/.ssh/id_rsa.pub; \
           done

Now we create the network interfaces on each virtual node:

    coord> for i in `seq 1 16`; do \
            distem --create-viface vnode=node-${i},iface=if0,vnetwork=vnetwork; \
           done

Next, we create the virtual processors on the virtual nodes (here we define 1 core per virtual node that runs at full speed):

    coord> for i in `seq 1 16`; do \
            distem --set-vcpu vnode=node-${i},corenb=1,cpu_speed=unlimited; \
           done

To ensure that all goes well, you can get the information about the configured virtual nodes.

Finally, we start the virtual nodes:

    coord> for i in `seq 1 16`; do distem --start-vnode node-${i}; done

3.2.2 Experiment

We assume that the IP for the virtual node node-X is 10.144.0.X.

Let’s connect to node-1 (distem --shell node-1) and add the following lines to ~/.ssh/config:

Host *
  StrictHostKeyChecking no
  HashKnownHosts no

Now, you can perform the following commands:

    node-1> for i in `seq 1 16`; do echo 10.144.0.$i >> iplist; done
    node-1> time mpiexec -machinefile iplist hpcc

This launches the HPCC benchmark over the virtual nodes. You can observe the global execution time of the benchmark and compare it to runs where you choose a different virtual CPU frequency. You can also have a look at the generated hpccoutf.txt file, which contains the detailed results of each sub-benchmark.

You can now run the test with other frequencies by first updating them this way:

    coord> for i in `seq 1 16`; do \
            distem --config-vcpu vnode=node-${i},cpu_speed=0.5,unit=ratio; \
           done

Note that you do not have to restart the virtual nodes for this update to take effect: it is done on the fly.
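
For instance, assuming a ratio of 1 corresponds to the full speed of the cores, you can go back to a full-speed configuration with the same command:

    coord> for i in `seq 1 16`; do \
            distem --config-vcpu vnode=node-${i},cpu_speed=1,unit=ratio; \
           done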

Now relaunch your experiment by first connecting to node-1 (distem --shell node-1) and then running the following command:

    node-1> time mpiexec -machinefile iplist hpcc

3.3 Scripted experiment

3.3.1 Platform setup

    frontend> distem-bootstrap --node-list $OAR_NODE_FILE
    coord> ruby platform_setup.rb "10.144.0.0/22" 0.5

The first parameter of platform_setup.rb is the virtual network address allocated to your reservation, and the second parameter is the coefficient applied to the CPU frequency (for instance, a coefficient of 0.5 means that the virtual cores will run at half of their real speed).
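
For example, to emulate virtual cores running at a quarter of their real speed, you would run:

    coord> ruby platform_setup.rb "10.144.0.0/22" 0.25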

Please note that this time we will start the virtual nodes in asynchronous mode to save time, since there are a lot of virtual nodes to start and the start operation is the longest one.

Here is the source of this script:

#!/usr/bin/ruby
# Import the Distem module
require 'distem'
# The path to the compressed filesystem image
# We can point to local file since our homedir is available from NFS
FSIMG="file:///home/ejeanvoine/public/distem/distem-fs-jessie.tar.gz"
# Put the physical machines that have been assigned to you
# You can get that by executing: cat $OAR_NODE_FILE | uniq
pnodes=["pnode1","pnode2", ... ]
# The first argument of the script is the address (in CIDR format)
# of the virtual network to set-up in our platform
vnet = {
  'name' => 'testnet',
  'address' => ARGV[0]
}
# The second argument of the script is the coefficient
# applied to the CPU frequency of the physical machine
cpu_limit = ARGV[1].to_f
nodelist = []
# Connect to the Distem server (on http://localhost:4567 by default)
Distem.client do |cl|
  puts 'Creating virtual network'
  # Start by creating the virtual network
  cl.vnetwork_create(vnet['name'], vnet['address'])
  puts 'Creating virtual nodes'
  count = 0
  # Read SSH keys
  private_key = IO.readlines('/root/.ssh/id_rsa').join
  public_key = IO.readlines('/root/.ssh/id_rsa.pub').join
  sshkeys = {
    'private' => private_key,
    'public' => public_key
  }
  # Iterate over every physical node
  pnodes.each do |pnode|
    # Create 4 virtual nodes per physical machine (one per core)
    4.times do
      nodename = "node-#{count}"
      # Create a virtual node and set it to be hosted on 'pnode'
      cl.vnode_create(nodename, { 'host' => pnode }, sshkeys)
      # Specify the path to the compressed filesystem image
      # of this virtual node
      cl.vfilesystem_create(nodename, { 'image' => FSIMG })
      # Create a virtual CPU with 1 core on this virtual node
      # specifying that its frequency should be 'cpu_limit'
      cl.vcpu_create(nodename, cpu_limit, 'ratio', 1)
      # Create a virtual network interface and connect it to vnet
      cl.viface_create(nodename, 'if0', { 'vnetwork' => vnet['name'], 'default' => 'true' })
      nodelist << nodename
      count += 1
    end
  end
  puts 'Starting virtual nodes ...'
  nodelist.each do |nodename|
    cl.vnode_start(nodename)
  end
  puts 'done'
end

3.3.2 Experiment

The experiment can be launched from root on the coordinator node with the following command:

    coord> ruby experiment.rb

Here is the code of experiment.rb:

#!/usr/bin/ruby
require 'distem'
# Function that computes the average
# of an array of values
def average(values)
  sum = values.inject(0){ |tmpsum,v| tmpsum + v.to_f }
  return sum / values.size
end
# Function that computes the standard deviation
# of an array of values
def stddev(values,avg = nil)
  avg = average(values) unless avg
  sum = values.inject(0){ |tmpsum,v| tmpsum + ((v.to_f-avg) ** 2) }
  return Math.sqrt(sum / values.size)
end
# Describing the resources we are working with
ifname = 'if0'
# The virtual nodes list
nodelist = []
16.times do |count|
  nodelist << "node-#{count}"
end
iplist = []
results = []
iterations = 5
Distem.client do |cl|
  # Get the automatically assigned address of each virtual node's
  # virtual network interface
  nodelist.each do |nodename|
    iplist << cl.viface_info(nodename,ifname)['address'].split('/')[0]
  end
  # Creating a string with each ip on a single line
  ipliststr = iplist.join("\n")
  # Copying the iplist in a file on the first node
  cl.vnode_execute(nodelist[0], "echo '#{ipliststr}' >> iplist")
  puts 'Starting tests'
  iterations.times do |iter|
    puts "\tIteration #{iter}"
    start_time = Time.now
    cl.vnode_execute(nodelist[0], 'mpiexec -machinefile iplist hpcc')
    results << Time.now - start_time
  end
end
avg = average(results)
puts "Results: [average=#{avg},standard_deviation=#{stddev(results,avg)}]"

As in the non-scripted version, to perform the experiment with several CPU speeds, you can update the CPU speed using the vcpu_update method, as sketched below.
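
A minimal sketch of such an update, assuming vcpu_update takes the same value and unit arguments as vcpu_create (check the client documentation for the exact signature):

#!/usr/bin/ruby
require 'distem'

Distem.client do |cl|
  16.times do |count|
    # Assumed signature: vcpu_update(vnode_name, value, unit), mirroring vcpu_create
    # Make each virtual core run at half of its real speed
    cl.vcpu_update("node-#{count}", 0.5, 'ratio')
  end
end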

4 Large scale experiment

The goal of this experiment is to run a large scale experiment on 1000 virtual nodes. It illustrates some Distem features that help create a large virtual platform in a short time.

First, we reserve 10 physical nodes for two hours as follows:

    frontend> oarsub -t deploy \
              -l slash_18=1+{"cluster='graphene'"}nodes=10,walltime=2:00:00 -I

We are going to create a script, platform_setup.rb, to set up the platform:

#!/usr/bin/ruby
require 'distem'
require 'thread'
net,netmask = ARGV[0].split('/')
nb_vnodes = ARGV[1].to_i
img = "file:///home/ejeanvoine/public/distem/distem-fs-jessie.tar.gz"
nodes = []
iplist = []
Distem.client { |cl|
  cl.vnetwork_create('vnet', "#{net}/#{netmask}")
  (1..nb_vnodes).each { |i| nodes << "node#{i}" }
  res = cl.vnodes_create(nodes,
                        {
                          'vfilesystem' =>{'image' => img,'shared' => true},
                          'vifaces' => [{'name' => 'if0', 'vnetwork' => 'vnet', 'default' => 'true'}]
                        })
  # Not used further, but could be useful in such script
  res.each { |r| iplist << r['vifaces'][0]['address'].split('/')[0] }
  puts "Starting vnodes..."
  cl.vnodes_start(nodes)
  sleep(30)
  puts "Waiting for vnodes to be here..."
  if cl.wait_vnodes({'timeout' => 600, 'port' => 22})
    puts "Setting global /etc/hosts"
    cl.set_global_etchosts()
    puts "Setting global ARP tables"
    cl.set_global_arptable()
  else
    puts "vnodes are unreachable"
    exit 1
  end
}

You can notice that this script uses vnodes_create() and vnodes_start() instead of the vnode_create() and vnode_start() functions. These are the vectorized versions of the previous functions: they avoid a lot of HTTP requests and thus drastically speed up the platform creation when dealing with hundreds or thousands of nodes.

This script also calls two functions that may help with your experiment:

  • set_global_etchosts() fills the /etc/hosts file of every virtual node so that virtual node names can be used directly instead of their IPs. Indeed, when using several physical nodes, or a non-shared filesystem, the /etc/hosts files are not filled globally.
  • set_global_arptable() fills the ARP table of every virtual node with the MAC addresses of all the virtual nodes in the platform. This is useful for large scale experiments since it avoids a lot of ARP requests that may lead to connection failures.

Then, you can deploy the platform with:

    frontend> distem-bootstrap --node-list $OAR_NODE_FILE --max-vifaces 150
    coord> ruby platform_setup.rb "10.144.0.0/18" 1000

Note that --max-vifaces specifies the maximum number of virtual interfaces that can be created on a physical node (64 by default). As we asked to deploy 1000 virtual nodes on 10 physical nodes, 100 virtual interfaces will be created on average on each physical node, so we set the parameter to 150 in case of an unbalanced distribution of the virtual nodes.

Finally, it is up to you to run a large scale experiment :)

5 Fault injection experiment

Distem provides users with an event manager to automatically modify the virtual platform in a deterministic way. Supported modifications are:

  • modification of network interfaces capabilities (bandwidth and latency)
  • modification of the CPU frequency
  • start and stop virtual nodes
  • freeze and unfreeze virtual nodes

Events can be specified in two ways. First, it is possible to use an event trace that specifies which modification occurs at which date (relative to the start of the experiment). Second, it is possible to generate the event arrival dates automatically according to various probability distributions; currently, uniform, exponential and Weibull distributions are supported.
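
As a minimal illustration of the trace-based approach (the same event_trace_add and event_manager_start calls are used in the full experiment script below, and node5 is one of the virtual nodes created by the platform script of this section):

#!/usr/bin/ruby
require 'distem'

Distem.client do |cl|
  # Register an event: 10 seconds after the event manager is started,
  # the virtual node 'node5' is brought down
  cl.event_trace_add({ 'vnodename' => 'node5', 'type' => 'vnode' },
                     'churn',
                     { 10 => 'down' })
  # Start replaying the registered events
  cl.event_manager_start
end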

The goal of this experiment is to evaluate a fault-tolerant file broadcast tool called Kascade. In particular, we will evaluate its behavior when introducing node failures.

First, we reserve 10 physical nodes for two hours as follows:

    frontend> oarsub -t deploy \
              -l slash_18=1+{"cluster='graphene'"}nodes=10,walltime=2:00:00 -I

You can check the reserved network addresses as specified in Make a reservation.

Deployment of the physical nodes works the same way as in Preparing physical machines.

We create a script, platform_setup.rb, to set up the platform:

#!/usr/bin/ruby
require 'distem'
require 'thread'
net,netmask = ARGV[0].split('/')
nb_vnodes = ARGV[1].to_i
img = "file:///home/ejeanvoine/public/distem/distem-fs-wheezy.tar.gz"
nodes = []
iplist = []
Distem.client { |cl|
  cl.vnetwork_create('vnet', "#{net}/#{netmask}")
  (1..nb_vnodes).each { |i| nodes << "node#{i}" }
  res = cl.vnodes_create(nodes,
                        {
                          'vfilesystem' =>{'image' => img,'shared' => true},
                          'vifaces' => [{'name' => 'if0', 'vnetwork' => 'vnet', 'default' => 'true'}]
                        })
  # Not used further, but could be useful in such script
  res.each { |r| iplist << r['vifaces'][0]['address'].split('/')[0] }
  puts "Starting vnodes..."
  cl.vnodes_start(nodes)
  sleep(30)
  puts "Waiting for vnodes to be here..."
  if cl.wait_vnodes({'timeout' => 600, 'port' => 22})
    puts "Setting global /etc/hosts"
    cl.set_global_etchosts()
    puts "Setting global ARP tables"
    cl.set_global_arptable()
  else
    puts "vnodes are unreachable"
    exit 1
  end
}

Then, you can deploy a 50-virtual-node platform with:

    coord> ruby platform_setup.rb "10.144.0.0/18" 50

The experiment can be launched from the coordinator node with the following command:

    coord> ruby experiment.rb /path/to/kascade

Here is the code of experiment.rb:

#!/usr/bin/ruby
require 'pp'
require 'distem'
require 'tempfile'
require 'rubygems'
require 'net/ssh'
NBVNODES = 50
REPS = 3
PATH_TO_KASCADE = ARGV[0]
nodes = (1..NBVNODES).collect {|i| "node#{i}"}
# 3 experiments are launched here:
# - run without failure
# - run with simultaneous failures of 5% of the nodes
# - run with sequential failures of 5% of the nodes
EXP = [ { :name => 'no_failure', :trace => nil },
        { :name => 'simult_5percent', :trace => [ [10, [4,14,24,34,44]] ] },
        { :name => 'seq_5percent', :trace => [ [10, [4]], [14, [14]], [18, [24]],
                                               [22, [34]], [26, [44]] ] } ]
results = {}
# Create a node file for Kascade
f = Tempfile.new('kascade_ft')
nodes.drop(1).each { |node| f.puts(node) }
f.close
# Copy the node file into the first virtual node
system("scp #{f.path} root@node1:nodes")
# Copy Kascade into the first virtual node
system("scp #{PATH_TO_KASCADE} root@node1:kascade")
# Add execution rights to kascade
system("ssh root@node1 'chmod +x ~/kascade'")
# Generate a 500MB file
system("ssh root@node1 'dd if=/dev/zero of=/tmp/file bs=1M count=500'")
Distem.client { |cl|
  EXP.each { |experiment|
    results[experiment[:name]] = []
    # Run each experiment several times
    REPS.times.each { |iter|
      puts "### Experiment #{experiment[:name]}, iteration #{iter}"
      nodes_down = []
      trace = experiment[:trace]
      # Check if events have to be injected
      if trace
        trace.each { |dates|
          date,node_numbers = dates
          nodes_down += node_numbers.collect { |number| "node#{number}" }
          node_numbers.each { |number|
            cl.event_trace_add({ 'vnodename' => "node#{number}", 'type' => 'vnode' },
                               'churn',
                               { date => 'down' })
          }
        }
        cl.event_manager_start
      end
      # Perform a run
      Net::SSH.start('node1', 'root', :password => 'root') {|ssh|
        start = Time.now.to_f
        ssh.exec('/root/kascade -n /root/nodes -i /tmp/file -o /dev/null -D taktuk -v fatal')
        ssh.loop
        results[experiment[:name]] << Time.now.to_f - start
      }
      # Clean
      if trace
        cl.event_manager_stop
        puts "Let's restart #{nodes_down.join(',')}"
        cl.vnodes_start(nodes_down)
        ret = cl.wait_vnodes({'timeout' => 120, 'port' => 22, 'vnodes' => nodes_down})
        if not ret
          puts "Some nodes are unreachable"
          exit 1
        end
      end
    }
  }
}
pp results

In this experiment, a failure consists of stopping some virtual nodes at a given time. Other strategies could have been used; for instance, we could have simulated a network issue where the network interfaces of some virtual nodes become unresponsive (without the nodes being completely shut down). This could have been achieved by setting a high latency on those network interfaces. The following code:

cl.event_trace_add({ 'vnodename' => "node#{number}", 'type' => 'vnode' },
                     'churn',
                     { date => 'down' })

could have been replaced for instance with:

cl.event_trace_add({ 'vnodename' => "node#{number}",
                     'type' => 'viface',
                     'vifacename' => 'if0',
                     'viface_direction' => 'output' },
                     'latency',
                     { date => '200000ms' })

In this case, we add a latency of 200 s on some network interfaces (in the output direction), leading to almost unresponsive nodes.

6 Toward an SDN experiment

Software-defined networking (SDN) has received a lot of attention in the networking community in recent years. Without entering into the details of SDN, this tutorial shows how you can deploy a topology where the network is managed the SDN way.

SDN infrastructures can be deployed leveraging specific hardware or software components. Here we will focus on a software-based infrastructure. To perform the network interconnection, we will use OpenVSwitch, and to control the behavior of the network, we will use POX as an OpenFlow controller.

6.1 Prerequisites

In order to run OpenVSwitch inside the vnodes, it must also be installed on the pnodes.

To ease the installation, we will deploy the physical nodes with an environment that already contains OpenVSwitch. Its description can be found at the following place: /home/ejeanvoine/public/kadeploy/jessie-x64-nfs-ovs.env.

We also need an LXC image that contains OpenVSwitch and POX. Such a pre-built image can be found on Grid’5000 at the following place: /home/ejeanvoine/public/distem/distem-fs-jessie-ovs.tar.gz.

6.2 Bridging virtual nodes in a L2 network

6.2.1 Platform deployment

For this experiment, we will use 4 nodes:

    frontend> oarsub -t deploy -l nodes=4,walltime=2 -I

We assume that the reserved nodes are graphene-1, graphene-2, graphene-3 and graphene-4.

Those nodes will be deployed with:

    frontend> kadeploy3 -f $OAR_NODE_FILE -k -a http://public.nancy.grid5000.fr/~ejeanvoine/kadeploy/jessie-x64-nfs-ovs.env

Then, we install distem:

    frontend> distem-bootstrap --enable-admin-network --vxlan-id 0

Note that we use the --enable-admin-network option. This automatically creates an isolated network containing all the virtual nodes of the platform. Thus, whatever your network configuration, all the vnodes will be reachable from the coordinator. Furthermore, the --vxlan-id option is optional if only one Distem instance using VXLAN is launched on the same L2 network; otherwise, each instance must have a different id.

We will create a topology with 4 virtual nodes, as in the following picture:

[Figure: Star topology for the first SDN experiment]

In this topology, we will create 3 isolated networks using the “VXLAN mode”, as follows:

  • vnet1 that will contain n1 and n0
  • vnet2 that will contain n2 and n0
  • vnet3 that will contain n3 and n0

Here is the script of the platform:

#!/usr/bin/ruby
require 'distem'

img_ovs = "file:///home/ejeanvoine/public/distem/distem-fs-jessie-ovs.tar.gz"
hosts = ARGV[0].split(',')
Distem.client { |cl|
  cl.vnetwork_create('vnet1', '10.144.128.0/24', {'network_type' => 'vxlan'})
  cl.vnetwork_create('vnet2', '10.144.128.0/24', {'network_type' => 'vxlan'})
  cl.vnetwork_create('vnet3', '10.144.128.0/24', {'network_type' => 'vxlan'})
  nodes = [ 'n0', 'n1', 'n2', 'n3' ]
  cl.vnode_create('n1',
                     {
                       'host' => hosts[1],
                       'vfilesystem' =>{'image' => img_ovs,'shared' => true},
                       'vifaces' => [
                                     {'name' => 'if0', 'vnetwork' => 'vnet1', 'address' => '10.144.128.1'},
                                    ]
                     })
  cl.vnode_create('n2',
                     {
                       'host' => hosts[2],
                       'vfilesystem' =>{'image' => img_ovs,'shared' => true},
                       'vifaces' => [
                                     {'name' => 'if0', 'vnetwork' => 'vnet2', 'address' => '10.144.128.2'},
                                    ]
                     })
  cl.vnode_create('n3',
                     {
                       'host' => hosts[3],
                       'vfilesystem' =>{'image' => img_ovs,'shared' => true},
                       'vifaces' => [
                                     {'name' => 'if0', 'vnetwork' => 'vnet3', 'address' => '10.144.128.3'},
                                    ]
                     })
  cl.vnode_create('n0',
                     {
                       'host' => hosts[0],
                       'vfilesystem' =>{'image' => img_ovs,'shared' => true},
                       'vifaces' => [
                                     {'name' => 'if1', 'vnetwork' => 'vnet1', 'address' => '10.144.128.4'},
                                     {'name' => 'if2', 'vnetwork' => 'vnet2', 'address' => '10.144.128.5'},
                                     {'name' => 'if3', 'vnetwork' => 'vnet3', 'address' => '10.144.128.6'}
                                    ]
                     })
  puts "Starting vnodes..."
  cl.vnodes_start(nodes)
  puts "Waiting for vnodes to be here..."
  sleep(30)
  ret = cl.wait_vnodes({'timeout' => 1200, 'port' => 22})
  if ret
    puts "Setting global /etc/hosts"
    cl.set_global_etchosts
  else
    puts "vnodes are unreachable"
  end
}

The script can be launched as follows:

    coord> ruby platform.rb graphene-1,graphene-2,graphene-3,graphene-4

At this point, you can have a look at the /etc/hosts file on the coordinator. You will see that each vnode has a distinct entry for every vnetwork it is attached to. Furthermore, every vnode has an additional entry with a -adm suffix. This is related to the global administration network created by the previous distem-bootstrap execution. So, from the coordinator, you will be able to connect to any virtual node, like:

    coord> ssh root@n0-adm

6.2.2 Bridging vnodes together

Since the vnodes n1, n2, and n3 are in different isolated networks, you will not be able to reach one node from another one. You can try for instance:

    coord> ssh root@n1-adm
    n1> ping n3-vnet3

Thus, we will bridge n1, n2, and n3 into the same network using OpenVSwitch in n0.

    coord> ssh root@n0-adm
    # Add an OVS bridge
    n0> ovs-vsctl add-br OVSbr
    # Shutdown interfaces linked to vnet1, vnet2, and vnet3
    n0> ifconfig if1 0
    n0> ifconfig if2 0
    n0> ifconfig if3 0
    # Add interfaces into the bridge
    n0> ovs-vsctl add-port OVSbr if1
    n0> ovs-vsctl add-port OVSbr if2
    n0> ovs-vsctl add-port OVSbr if3
    # Set promiscuous mode
    n0> ifconfig if1 promisc up
    n0> ifconfig if2 promisc up
    n0> ifconfig if3 promisc up

At this point, you should be able to reach any vnode from the others: n1, n2, and n3 are in the same L2 network, even if, underneath, the traffic goes through isolated networks. You can try again:

    coord> ssh root@n1-adm
    n1> ping n3-vnet3

6.2.3 OpenFlow control

By default, OpenVSwitch includes a controller that behaves like a classical Ethernet switch. We will see here how to modify this behavior by connecting an external controller.

Here is the OpenVSwitch setup:

    coord> ssh root@n0-adm
    # Tell OpenVSwitch to be inactive when no external controller is plugged in
    n0> ovs-vsctl set-fail-mode OVSbr secure
    # Define the port on which OpenVSwitch listens for the external controller
    n0> ovs-vsctl set-controller OVSbr tcp:0.0.0.0:6633

Now we can deal with the OpenFlow controller. For the sake of simplicity, POX has been installed in the Distem image. It can be executed remotely, or directly on n0. We will choose the second option.

We will use the POX script included in the Distem image (/root/pox/tutorial.py) to control OpenVSwitch in order to act either like a hub, or like a learning switch.

First, you will see how POX can be executed.

    coord> ssh root@n0-adm
    n0> cd pox
    n0> python pox.py log.level --DEBUG tutorial

By default, this script asks OpenVSwitch to behave like a hub. You can try to ping n3 from n1. Meanwhile, you can listen on the interface of n0 connected to n2 (if2):

    coord> ssh root@n0-adm
    n0> tcpdump -XX -n -i if2

You can have a look at the code of /root/pox/tutorial.py: every packet received on any port is resent to all the ports.

When interrupting the POX script, you can observe that no packets are transmitted anymore.

Now, modify the code to execute the v2_packet_handler method instead of v1_packet_handler. This way, OpenVSwitch will behave like a learning switch. Run the POX script again: n3 should be reachable again from n1, and if2 on n0 should not see the packets anymore.

The v3_packet_handler method is another version of the learning switch. Instead of handling every packet in user space (inside POX), this method installs flows to direct the packets. This is actually the OpenFlow way to manage packets, since forwarding is then performed in kernel space. You can compare the performance of v2 and v3 in terms of latency and bandwidth:

    coord> ssh root@n0-adm
    n1> ping -c 10 n3-vnet3
    n3> iperf -s
    n1> iperf -c n3-vnet3

7 Mapping virtual nodes using Alevin

Here, we will use Alevin to map vnodes onto the physical infrastructure under bandwidth and CPU constraints.

7.1 Platform deployment

For this experiment, we will use 3 nodes; if you do not have a reservation yet, you can have a look at Make a reservation. Then, we deploy an environment using Kadeploy:


    frontend> kadeploy3 -f $OAR_NODE_FILE -k -e jessie-x64-big

7.2 Deploying Distem with support for Alevin

We have to use the latest revision of Distem, so let’s clone the repository somewhere:


     frontend> mkdir repositories
     frontend> cd repositories
     frontend> git clone https://github.com/madynes/distem.git
     frontend> cd distem

We have to download Alevin; a pre-compiled version is provided by Distem. This version has been modified to read DOT files (thanks to hardik.soni@inria.fr):


     frontend> wget https://gforge.inria.fr/frs/download.php/file/35944/alevin-ext.jar

Once you have cloned the repository and moved into it with cd, deploy Distem on the nodes using distem-bootstrap. Make sure to activate the support for Alevin by passing the parameter --alevin with the path to the Alevin jar that you have just downloaded.


     frontend> scripts/distem-bootstrap -g --debian-version jessie --ci $PWD --alevin $PATH_TO_ALEVIN -p default-jdk,graphviz

We need to generate a physical topology. For that, we have created a small Python script that uses Execo to get the physical topology of the machines in Grid’5000. This script can be retrieved from the Distem forge. You can use it like this (adapt the example to your specific machines):


     frontend> wget https://gforge.inria.fr/frs/download.php/file/35980/get_physical_topo.py
     frontend> ~/get_physical_topo.py FILE_TOPO.dot $OAR_NODEFILE

Then, transfer the generated file to the coordinator:

     frontend> scp FILE_TOPO.dot root@COORDINATOR:~/

7.3 Creating a virtual platform with CPU and bandwidth constraints

Log into the coordinator and create the following file:

require 'distem'
require 'yaml'

IMAGE_FILE ="/home/cruizsanabria/jessie-mpich-lxc.tar.gz" # TO BE MODIFIED
NETWORK = "10.144.0.0/22"

vnode_topo = YAML.load(File.read(ARGV[0]))
physical_topo = ARGV[1]

Distem.client do |cl|

  puts 'Creating virtual network'

  cl.vnetwork_create("testnet",NETWORK)

  puts 'Creating containers'

  private_key = IO.readlines('/root/.ssh/id_rsa').join
  public_key = IO.readlines('/root/.ssh/id_rsa.pub').join

  ssh_keys = {'private' => private_key,'public' => public_key}

  vnode_topo.each do |vnode|

    res = cl.vnode_create(vnode["name"],{
                                         'vfilesystem' =>{'image' => IMAGE_FILE,'shared' => true},
                                         'vifaces' => [{'name' => 'if0', 'vnetwork' => "testnet",
                                                        'output' =>{"bandwidth" =>{"rate" => vnode["bandwidth"]} }}]
                                        }, ssh_keys)

    if vnode["cpu"] > 0
      cl.vcpu_create(vnode["name"], 1, 'ratio', vnode["cpu"])
    end

  end

  puts 'Starting containers'

#  cl.vnodes_to_dot("vnodes.dot") #uncomment this line for generating a dot file with the topology of the virtual platform
  cl.load_physical_topo(physical_topo)
  cl.run_alevin()

  vnodes_list = vnode_topo.map{ |vnode| vnode["name"]}
  cl.vnodes_start(vnodes_list)

  puts 'Waiting for containers to be accessible'
  start_time = Time.now

  cl.wait_vnodes()
  puts "Initialization of containers took #{(Time.now-start_time).to_f}"

end

This script creates several vnodes with different CPU and bandwidth constraints. Two new things to note: we load the physical topology Distem is running on using the method load_physical_topo(file.dot), and we run Alevin using the method run_alevin(). Additionally, you have to create another file that specifies the virtual nodes to create. This file is written in YAML and has a simple format that looks like this:

- name: node0
  bandwidth: "3000mbps"
  cpu: 1
- name: node1
  bandwidth: "1000mbps"
  cpu: 2
- name: node2
  bandwidth: "3000mbps"
  cpu: 1
- name: node3
  bandwidth: "1000mbps"
  cpu: 3
- name: node4
  bandwidth: "4000mbps"
  cpu: 2
- name: node5
  bandwidth: "1000mbps"
  cpu: 1

This way of creating the virtual platform is not in any way specific to Alevin. You can use Alevin in the previous examples; just make sure to call the two methods load_physical_topo(file.dot) and run_alevin() before the vnodes_start() method, as sketched below.
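
A minimal sketch of that call order (the virtual network and vnode creation parts are elided; FILE_TOPO.dot is the physical topology file generated earlier):

#!/usr/bin/ruby
require 'distem'

Distem.client do |cl|
  # ... create the virtual network and the vnodes with their constraints here ...

  # Give Alevin the physical topology Distem is running on
  cl.load_physical_topo('FILE_TOPO.dot')
  # Let Alevin compute the vnode-to-pnode mapping
  cl.run_alevin()
  # Only then start the vnodes; they are deployed according to the computed mapping
  cl.vnodes_start(['node0', 'node1'])
end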

7.4 Deploying the virtual platform

Finally, you run the aforementioned script like this:

     coord> ruby test_alevin.rb vnodes_list.yml topo_grisou.dot

This runs Alevin to find a proper mapping, and then Distem deploys the virtual nodes accordingly. You should get an output that looks like this:


     root@grisou-20:~# ruby test_alevin.rb vnodes_list.yml topo_grisou.dot
     Creating virtual network
     Creating containers
     Starting containers
     Waiting for containers to be accessible
     Initialization of containers took 10.026698105

If you create many virtual nodes with different constraints that cannot be fulfilled, Distem will exit with an error that looks like this:


    root@grisou-20:~# ruby test_alevin.rb vnodes_list_big.yml topo_grisou.dot
    Creating virtual network
    Creating containers
    Starting containers
    /usr/lib/ruby/vendor_ruby/distem/netapi/client.rb:744:in `check_error': HTTP Status: 500, (Distem::Lib::ClientError)
    Description: "",
    Body: <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0//EN">
    <HTML>
      <HEAD><TITLE>Internal Server Error</TITLE></HEAD>
        <BODY>
            <H1>Internal Server Error</H1>
            Alevin could not map all the vnodes, aborting ...
            <HR>
            <ADDRESS>
                 WEBrick/1.3.1 (Ruby/2.1.5/2014-11-13) at
                 localhost:4567
                </ADDRESS>
    </BODY>
       </HTML>

        from /usr/lib/ruby/vendor_ruby/distem/netapi/client.rb:799:in `block (2 levels) in raw_request'
        from /usr/lib/ruby/vendor_ruby/restclient/request.rb:228:in `call'
        from /usr/lib/ruby/vendor_ruby/restclient/request.rb:228:in `process_result'
        from /usr/lib/ruby/vendor_ruby/restclient/request.rb:178:in `block in transmit'
        from /usr/lib/ruby/2.1.0/net/http.rb:853:in `start'
        from /usr/lib/ruby/vendor_ruby/restclient/request.rb:172:in `transmit'
        from /usr/lib/ruby/vendor_ruby/restclient/request.rb:64:in `execute'
        from /usr/lib/ruby/vendor_ruby/restclient/request.rb:33:in `execute'
        from /usr/lib/ruby/vendor_ruby/restclient/resource.rb:67:in `post'
        from /usr/lib/ruby/vendor_ruby/distem/netapi/client.rb:798:in `block in raw_request'

We have to quit Distem in order to clean its state. This can be achieved by typing:


     coord> distem -q

Then, you have to launch Distem again using distem-bootstrap (from the frontend):


     frontend> scripts/distem-bootstrap -g --debian-version jessie --ci $PWD --alevin $PATH_TO_ALEVIN -p default-jdk,graphviz

You can omit the parameter -p since the packages have already been installed.