NETinVM

A tool for teaching and learning about systems, networks and security

Authors: Carlos Perez & David Perez
Date: 2014-02-18

Introduction

NETinVM is a VMware virtual machine image that provides the user with a complete computer network. For this reason, NETinVM can be used for learning about operating systems, computer networks and system and network security.

In addition, since NETinVM is a VMware image, it can be used for demonstrations (e.g. in classrooms) that students can then reproduce in a laboratory or on their own laptops, and thus at home, at the library, and so on. For these reasons we present NETinVM as an educational tool.

Description of NETinVM

NETinVM is a VMware virtual machine image that contains, ready to run, a series of User-mode Linux (UML) virtual machines. When started, the UML virtual machines create a whole computer network; hence the name NETinVM, an acronym for NETwork in Virtual Machine. This virtual network has been called 'example.net' and has fully qualified domain names defined for the systems: 'base.example.net', 'fw.example.net', etc.

All of the virtual machines use the Linux operating system. The VMware virtual machine is called 'base' and it runs openSUSE 12.1. User-mode Linux machines use Debian 6.0 and they have different names depending on their network location, because they are grouped into three different subnets: corporate, perimeter and external. The subnetworks are named 'int' (for internal network), 'dmz' (for DMZ or demilitarized zone, usually used as a synonym for perimeter network) and 'ext' (for external network).

One of the UML machines, 'fw', interconnects the three networks ('int', 'dmz' and 'ext'), allowing for communication and packet filtering. The rest of the UML machines have only one network interface, connected to the network they are named after:

int<X>
UMLs connected to the internal network. <X> can take values from 'a' to 'f', both inclusive. These machines only offer SSH service by default.
dmz<X>
UMLs connected to the perimeter network (DMZ). They are supposed to be bastion nodes. Two preconfigured bastion nodes are provided, each one with its appropriate alias:

  • 'dmza' is aliased as 'www.example.net' and it offers HTTP and HTTPS services.
  • 'dmzb' is aliased as 'ftp.example.net' and it offers FTP.

ext<X>
UMLs connected to the external network (i.e. the Internet).
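The naming convention above means that the network of a UML machine can be derived from its hostname. A minimal sketch, assuming the six-letter range ('a' to 'f') applies to all three networks; the function name 'network_of' is hypothetical and not part of NETinVM:

```shell
# Print the network(s) a UML machine is connected to, based on its name.
# 'network_of' is an illustrative helper, not a NETinVM command.
network_of() {
    case "$1" in
        int[a-f]) echo "int" ;;
        dmz[a-f]) echo "dmz" ;;
        ext[a-f]) echo "ext" ;;
        fw)       echo "int dmz ext" ;;   # 'fw' interconnects the three networks
        *)        echo "unknown machine: $1" >&2; return 1 ;;
    esac
}
```

For example, 'network_of dmzb' prints 'dmz', while an unknown name such as 'intz' makes the function fail.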

Because a picture paints a thousand words, or so they say, the following figure shows NETinVM with all of the virtual machines running inside.

img/netinvm_general.png

General view of NETinVM in VMware. The document example-net.pdf offers a detailed view.

All of the elements referenced before are shown in the image with their IP and ethernet addresses. The following rules have been used for assigning addresses:

In addition to the computers and networks already described, the figure also shows the real computer where NETinVM runs ('REAL COMPUTER') and VMware Player's typical network interface ('vmnet8'), which optionally interconnects NETinVM's networks with the external world.

When they boot, all UML virtual machines get their network configuration from 'base', which provides DHCP and DNS services to the three NETinVM networks through its interfaces 'tap0', 'tap1' and 'tap2'.

Routing works as follows:

Thus, IP traffic exchanged among the three networks goes through 'fw', while traffic going out from NETinVM to the external world goes through 'fw' if (and only if) it comes from the internal or perimeter networks. All traffic going to the real world (outside NETinVM) exits through 'base' which, as 'fw' does, applies IP forwarding and NAT to this outgoing traffic.

Communication between 'base' and any UML machine, in both directions, is direct, without going through 'fw'. (When the communication is started from a UML machine, the IP address of the interface of 'base' in the corresponding network must be used.) This configuration permits access from 'base' to all UML machines using SSH independently of the packet filtering configuration at 'fw'.

As an additional consideration, please note that the SNAT configuration in 'fw' described above is necessary for responses to outgoing connections to the Internet originating from the internal or perimeter networks to come back through 'fw'. Otherwise they would be routed directly from 'base' to the UML machine through 'tap1' or 'tap2' without traversing 'fw'.

Working with NETinVM

Initial start-up

To start NETinVM you need to download the VMware image, uncompress it and run it with VMware Player, which can be downloaded free of charge from VMware. (Alternatively, you can use the VirtualBox image and run it with VirtualBox.)

Once the VMware virtual machine has been started, base.example.net is running, offering a standard KDE desktop for the unprivileged user user1. Its password is "You should change this passphrase". The same password is also valid for root. (The same users and passwords are also valid for UML machines.)

Although base runs connected to the host's VMware virtual network vmnet8, which allows for connectivity between the virtual machine and the world beyond the host by means of NAT, the firewall (iptables) in base is configured so that the only incoming traffic allowed from that connection (eth0) is that destined to port 22/tcp (SSH). All other incoming traffic is denied. Outgoing traffic, on the other hand, is allowed, and the network interfaces tap0, tap1 and tap2 are considered internal and no restriction is applied to their traffic.
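The policy described above could be expressed with rules along the following lines. This is only a sketch: the actual ruleset shipped in 'base' is not reproduced in this document, and the loopback and connection-tracking rules are assumptions added so that the example is self-consistent. The function name and the 'IPT' variable are hypothetical. By default the commands are only printed, not executed.

```shell
# Dry run by default: prints the commands instead of executing them.
# Set IPT=iptables (as root) to apply them for real.
IPT="${IPT:-echo iptables}"

base_firewall_sketch() {
    $IPT -A INPUT -i lo -j ACCEPT                       # assumption: loopback is unrestricted
    $IPT -A INPUT -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT   # assumption: replies to outgoing traffic
    $IPT -A INPUT -i eth0 -p tcp --dport 22 -j ACCEPT   # SSH is the only service reachable through eth0
    for ifc in tap0 tap1 tap2; do
        $IPT -A INPUT -i "$ifc" -j ACCEPT               # internal interfaces: no restriction
    done
    $IPT -P OUTPUT ACCEPT                               # outgoing traffic is allowed
    $IPT -P INPUT DROP                                  # all other incoming traffic is denied
}
```

Calling 'base_firewall_sketch' without setting 'IPT' lists the rules, which makes it easy to compare them with the output of 'iptables -S' on 'base'.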

The idea is for base to be a desktop in which to work while doing exercises and that's why it includes OpenOffice.org and other similar tools. It is also designed to be the best place to monitor the traffic in the internal networks (through tap0, tap1 and tap2) and that is why it also includes wireshark and tcpdump. Other tools can, of course, be added by the user.

Graphical interface

Starting in 2010, NETinVM includes a "Folder View" component labelled "UML Desktop Folder" with graphical links to applications:

img/UML_desktop_folder.jpeg

Links to applications that perform the most usual tasks with UML machines.

When clicked, the links perform the following actions:

"Run all"
Brings to life NETinVM (see Full startup process).
"Shutdown all"
Shuts down all UML machines.
"Backup UML machines"
Creates a backup of the whole NETinVM network. (All machines must be shut down before backing up.) The backup is stored in a "tar.gz" file whose name can be set during the process. By default, backups are stored in "~user1/uml/backups" and are named "uml_machines_yyyy-mm-dd_hh-mm.tgz", where "yyyy-mm-dd_hh-mm" stands for date (year, month, day of month) and time (hours, minutes).
"Restore UML machines"
Deletes the current state of the UML machines and restores a previous one. The backup file can be selected during the process. (All machines must be shut down before restoring a backup.)
"NETinVM Documentacion"
Launches a browser which shows a local copy of this documentation.
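The default backup name described above follows a simple date pattern that can be reproduced from the command line. A minimal sketch; the function name 'backup_name' is hypothetical and the backup tool itself may build the name differently:

```shell
# Print a file name following the pattern uml_machines_yyyy-mm-dd_hh-mm.tgz
# for the current date and time. 'backup_name' is an illustrative helper.
backup_name() {
    echo "uml_machines_$(date +%Y-%m-%d_%H-%M).tgz"
}
```

Since the name contains the time only to the minute, two backups started within the same minute would get the same default name.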

Full startup process

The command uml_run_all.sh is the magic word that brings to life almost everything in NETinVM. Specifically, it launches the following elements:

  • the virtual hubs that make up the external (ext), internal (int) and screened (dmz) networks
  • the UML virtual machines: fw, exta, inta, dmza and dmzb.

Although NETinVM is ready to run up to six UML virtual machines per network ('a' through 'f'), the five machines mentioned above are enough for a wide range of activities. In practice, the fewer UML virtual systems are running, the faster the entire system will run.

Each UML virtual machine starts up on a different KDE virtual desktop:

  • exta on desktop 2
  • fw on desktop 4
  • dmza on desktop 5
  • dmzb on desktop 6
  • inta on desktop 8

On each desktop the following elements, all shown in the figure below, can be identified:

  • An xterm window that appears at the very end of the booting process and allows the user to log into the UML virtual system. It is the equivalent of a serial terminal hardwired to the UML virtual system.
  • Two more xterm windows which also work as terminals of the UML virtual system, but start minimized.
  • A minimized xterm window that works as the console of the UML virtual system, displaying system messages right from the start.
img/netinvm_exta_60.jpeg

View of desktop 2 after booting exta.

Once all UML virtual systems have been started it is easy to locate their corresponding windows by using the list of windows from KDE, which can be accessed by clicking on the "Window list" icon in the panel or pressing 'ALT-F5'. The result should be similar to the following figure:

img/netinvm_lista_ventanas.jpeg

List of windows after all UML virtual systems have been started.

All UML virtual systems have an unprivileged user user1. The default password for both user1 and root in all UML machines is (as with "base") "You should change this passphrase".

Usage example: capturing an HTTP connection

As an example of how NETinVM can be used for teaching and learning, this section will show how the network traffic of an HTTP connection could be generated, captured and analyzed within NETinVM.

Once all virtual systems have been started within NETinVM by executing the 'uml_run_all.sh' shell script, the procedure to follow is just as simple as if all systems were real. We will establish an HTTP connection from 'exta' to 'www.example.net' and we will capture and analyze that traffic using 'wireshark' running on 'base'.

Note

Note that 'base' has access both to the external network ('ext') and the screened subnet ('dmz') through its 'tap0' and 'tap1' network interfaces, respectively. Thus, we could capture the traffic on either network (or on both at the same time using two instances of 'wireshark') since packets will travel from 'exta' to 'fw' through the external network and then from 'fw' to 'www.example.net' ('dmza') through the 'dmz' network, and back. For this exercise, we will assume that we are interested only in the traffic in the 'dmz' network.

  1. First, start 'wireshark' on 'base' and set it to listen on the 'dmz' network (interface 'tap1').

    img/netinvm_wireshark_interfaces_60.jpeg
  2. Then, log into 'exta' using one of its terminal windows on desktop 2 and request the home page of the web server running on 'www.example.net' by executing the following command:

    $ wget www.example.net
    
  3. Finally, go back to 'wireshark', stop the capture and analyze the traffic at your leisure. The session captured in 'wireshark' will look similar to the following figure, which shows the HTTP session that was established between 'exta' (10.5.0.10) and 'www.example.net' (dmza, 10.5.1.10).

    img/netinvm_wireshark_captura_60.jpeg
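The same capture can also be made from a terminal with 'tcpdump', which is installed on 'base' as well. The following is a sketch under some assumptions: the function name and the default output path are hypothetical, the filter only matches plain HTTP on port 80, and the command must be run as root on 'base'.

```shell
# Capture filter: HTTP traffic to or from dmza (www.example.net).
DMZ_HTTP_FILTER='tcp port 80 and host 10.5.1.10'

# Capture on the dmz network (tap1) into a pcap file until interrupted (Ctrl-C).
# -n disables name resolution, -w writes raw packets to the given file.
capture_dmz_http() {
    out="${1:-/tmp/dmz_http.pcap}"    # hypothetical default location
    tcpdump -i tap1 -n -w "$out" "$DMZ_HTTP_FILTER"
}
```

The resulting file can then be opened for analysis with 'wireshark -r /tmp/dmz_http.pcap'.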

A More Detailed Description

Storage in the UML systems

To save drive space, all UML machines (including 'fw') use the same image for their root file system, contained in the file '/home/user1/uml/data/uml_root_fs' of 'base'. This file system includes a complete installation of a Debian 5.0 distribution and it takes up about 788 MiB of space:

user1@base:~> ls -lsh /home/user1/uml/data/uml_root_fs
788M -rw-r--r-- 1 user1 users 1,0G 2010-10-05 15:45 /home/user1/uml/data/uml_root_fs
user1@base:~>

The maximum size of this file system is 1 GiB, which leaves enough room for the UML systems to store programs and temporary data (350 MiB, available on /dev/ubda):

exta:~# df -h
Filesystem            Size  Used Avail Use% Mounted on
/dev/ubda            1008M  608M  350M  64% /
tmpfs                  62M     0   62M   0% /lib/init/rw
tmpfs                  62M  4.0K   62M   1% /dev/shm
tmpfs                 768M     0  768M   0% /tmp
none                   19G  4.3G   14G  24% /mnt/tmp
none                   19G  4.3G   14G  24% /mnt/config
none                   19G  4.3G   14G  24% /mnt/data
exta:~#

Also, the UML systems can use up to an additional 14 GiB on 'base', accessible through the network (e.g. scp) or through the '/mnt/tmp' folder that is mounted automatically in every UML system and which corresponds to the folder '/home/user1/uml/mntdirs/tmp' on 'base'.

The changes that each UML system makes on its root file system are stored, using UML's copy-on-write mechanism, in a separate file, '/home/user1/uml/machines/<machine>/cow' on 'base', where <machine> is the hostname of each UML system. Obviously, the size of these files will depend on the amount of changes that happen on each system, but if these are limited to the normal booting process and minor configuration changes then they tend to take up very little space:

user1@base:~> ls -lsh /home/user1/uml/machines/exta/cow
2.6M -rw-r--r-- 1 user1 users 1.1G 2010-11-11 16:14 /home/user1/uml/machines/exta/cow
user1@base:~>

Note that the size really occupied in the hard drive is just 2.6 MiB and not 1.1 GiB, since it is a so-called sparse file.

Finally, each UML system uses a further 256 MiB for swap space, on a file named '/home/user1/uml/machines/<machine>/swap' on 'base'. The space occupied by this file in the hard drive is, however, much smaller, unless the systems need to start paging memory to disk:

user1@base:~> ls -lsh /home/user1/uml/machines/exta/swap
8.0K -rw-r--r-- 1 user1 users 256M 2010-11-11 16:06 /home/user1/uml/machines/exta/swap
user1@base:~>

Again, please note that the size really occupied by the file is just 8.0 KiB and not 256 MiB, since this is another sparse file.
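The difference between the apparent size of a sparse file and the space it really occupies can be reproduced with a throwaway file. This is a generic illustration that assumes GNU coreutils ('truncate', 'stat -c'); it does not touch any NETinVM file, and the variable names are arbitrary.

```shell
# Create a 256 MiB sparse file and compare its apparent size with the
# space actually allocated on disk.
demo=$(mktemp)                         # temporary file, removed at the end
truncate -s 256M "$demo"               # extends the file without writing any data
apparent=$(stat -c %s "$demo")         # size in bytes, as shown by 'ls -l'
allocated=$(du -k "$demo" | cut -f1)   # KiB really in use, like the first column of 'ls -s'
echo "apparent: $apparent bytes, allocated: $allocated KiB"
rm -f "$demo"
```

On a file system that supports sparse files the allocated figure stays close to zero, just as with the 'cow' and 'swap' files shown above.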

Configuration of the UML systems

The fact that all the UML systems share the same root file system, let us call it "reference file system" (RFS), is very convenient, for several reasons:

  1. It saves space. Using the copy-on-write mechanism it is possible to have up to 19 UML systems running and taking up as little as 0.8 GiB on 'base'.
  2. It simplifies the usage. All UML systems are similar and have the same software installed.
  3. It simplifies maintenance. Patching or adding a package to all UML systems just requires doing so to one of them. (Needs to be explained).
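The 0.8 GiB figure can be checked with the sizes listed earlier. The calculation below is a rough estimate that assumes every one of the 19 systems (fw plus six machines on each of the three networks) keeps a 'cow' file of about 2.6 MiB, as measured for 'exta', and it ignores the few KiB used by each swap file.

```shell
# Estimate the disk space used on 'base' by 19 freshly booted UML systems.
# LC_ALL=C keeps the decimal point independent of the locale.
total_gib=$(LC_ALL=C awk 'BEGIN {
    rfs_mib  = 788    # shared root file system (uml_root_fs)
    cow_mib  = 2.6    # per-machine copy-on-write file, as measured for exta
    machines = 19     # fw + 6 int + 6 dmz + 6 ext
    printf "%.2f", (rfs_mib + machines * cow_mib) / 1024
}')
echo "approx. $total_gib GiB"
```

Without copy-on-write, 19 independent copies of a 788 MiB root file system would need well over 14 GiB.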

This uniformity, however, could work against us if there were no way to tailor the configuration of each system independently from the others to perform particular tasks; fortunately, NETinVM provides a way to do so. For example, out of the box, 'fw' has three network interfaces and performs IP packet routing and filtering, 'dmza' runs a web server on port 80, and 'exta' only offers SSH on port 22.

The way each UML system gets its own configuration in NETinVM works as follows:

  • The file '/etc/fstab' of the Reference File System (RFS) contains entries so that all UML systems mount the subfolders 'config' and 'data' from under '/home/user1/uml/mntdirs' on 'base'. They are mounted on subfolders of the same name under '/mnt' on the UML systems:

    exta:~# mount | egrep 'config|data'
    none on /mnt/config type hostfs (ro,/home/user1/uml/mntdirs/config)
    none on /mnt/data type hostfs (ro,/home/user1/uml/mntdirs/data)
    exta:~#
    
  • The shell script '/etc/init.d/configure_uml_machine.sh' from the RFS checks at boot time if a script named '/mnt/config/bin/configure.sh' exists and is executable. If so, it executes it.

  • The above script 'configure.sh' performs the following actions:

    1. Checks if it is the first time that the script is being invoked on that system. If it is not, it exits.
    2. Then, it marks the UML system as having been already configured.
    3. Then, it applies the default configuration (if any is available, see below)
    4. Then, it applies the configuration corresponding to the network the system is connected to (if any is available, see below)
    5. Finally, it applies the configuration corresponding to that particular system (if any is available, see below)

    By editing those configurations, which are stored in 'base', it is possible to tailor the configuration of all systems at once, the systems of one of the networks, or an individual system.

    All these configurations (default, network, system) consist of two parts:

    1. Enabling Services: A file containing a list of boot scripts ('/etc/init.d/*') that need to be enabled by 'configure.sh'. The scripts may already exist in Debian (e.g. 'apache') or may be located in the subfolder 'init.d' of the folder where the file with the list is. Further details are given below.

      Please note that 'configure.sh' uses 'update-rc.d' to permanently change the configuration of the UML system, so that it will survive reboots. The configuration is not re-made on each reboot.

    2. Executing Additional Commands: The current configuration does not make use of this feature, but it is possible to specify a list of commands to be executed by 'configure.sh' to make any modifications that would not fit into the structure of a boot up service.

  • Thus, the configuration of the UML systems depends exclusively on a set of files stored in the '/home/user1/uml/mntdirs/config' folder of 'base' and can be prepared or modified before booting any UML system, all with the identity of the unprivileged user 'user1' in 'base'. The structure of the folders containing the configuration is as follows:

    bin

    Folder that contains 'configure.sh'. No need to modify it.

    templates

    Templates to be used when creating new files in the other folders.

    default

    Default configuration. This will be applied to all UML systems.

    network_{ext,dmz,int}

    Network dependent configuration which will be applied to UML systems on the corresponding network ('ext', 'dmz' or 'int').

    <machine_name> (e.g. 'exta' or 'fw')

    Machine-specific configuration which will be applied to the corresponding UML system.

    Inside, all these folders follow the same structure for their contents. Using 'fw' as an example:

    fw/services_to_enable.sh

    List of services to be enabled in the UML system. Each service declared here must be either a script that already exists in the RFS under '/etc/init.d' (e.g. 'apache' or 'vsftpd'), or a script stored under 'fw/init.d' (e.g. 'add_ips.sh').

    For each script that exists under 'fw/init.d/', configure.sh will create a symbolic link in '/etc/init.d' with the same name, pointing to '/mnt/config/fw/init.d/<script_name>' (e.g. for 'fw/init.d/add_ips.sh' a link named '/etc/init.d/add_ips.sh' will be created pointing to '/mnt/config/fw/init.d/add_ips.sh').

    If a file or link with that name already exists under '/etc/init.d', the old file or link is renamed to '/etc/init.d/<script_name>.replaced' before the new link is created.

    fw/init.d

    Folder containing the scripts to be added during configuration (see above).

    fw/commands_to_run.sh

    Commands to be executed during the configuration process. This script is simply executed by 'configure.sh', so there is total freedom in what the script can do. Remember, though, that this script will only be run once, and not every time the UML system boots up. In order to have its commands executed on every reboot, 'commands_to_run.sh' will have to add those commands to one of the booting scripts of the UML system, but in that case it would probably be better to just add a new service, like 'add_ips.sh', for example.
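The steps carried out by 'configure.sh' and the handling of the 'init.d' folders can be summarised in a short sketch. This is not the actual script: the function names, the marker file and the way the machine and network names are passed in are assumptions made for illustration, and the enabling of services and the execution of 'commands_to_run.sh' are left out.

```shell
CONFIG_ROOT="${CONFIG_ROOT:-/mnt/config}"   # configurations exported by 'base'
INITD_DIR="${INITD_DIR:-/etc/init.d}"       # boot scripts of the UML system
MARKER="${MARKER:-/etc/uml_configured}"     # hypothetical "already configured" flag

# Link every script under <config_dir>/init.d into INITD_DIR, keeping any
# file or link that is already there under a '.replaced' name.
install_init_scripts() {
    for script in "$1"/init.d/*; do
        [ -e "$script" ] || continue
        name=$(basename "$script")
        if [ -e "$INITD_DIR/$name" ] || [ -L "$INITD_DIR/$name" ]; then
            mv "$INITD_DIR/$name" "$INITD_DIR/$name.replaced"
        fi
        ln -s "$script" "$INITD_DIR/$name"
    done
}

# usage: configure_once <machine> <network>
configure_once() {
    [ -e "$MARKER" ] && return 0                 # step 1: not the first run, exit
    : > "$MARKER"                                # step 2: mark the system as configured
    for dir in default "network_$2" "$1"; do     # steps 3 to 5, in this order
        [ -d "$CONFIG_ROOT/$dir" ] || continue   # every configuration is optional
        echo "applying $dir"
        install_init_scripts "$CONFIG_ROOT/$dir"
        # enabling the listed services (update-rc.d) and running
        # commands_to_run.sh would follow here; omitted in this sketch
    done
}
```

Because the three configurations are applied from the most general to the most specific, a script provided for a single machine replaces one of the same name provided by the default or network configuration.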

After this somewhat long description, you may be wondering if the configuration could not have been made simpler, and you may be right, but think of the advantages that this setup offers:

  1. The configuration can be completely changed without even booting any UML system, including the configuration process itself, since the UML systems just blindly execute 'configure.sh', which resides in 'base'.
  2. The configuration process is only carried out once per system and therefore all configuration changes that need to be permanent are done via modifications on the system's file system, exactly where one would expect to find them in any real system. For example, if a new service is added to system 'fw', the directory '/etc/rc<X>.d' (<X> is the execution level) will contain the links to that service, just as one would expect in a real system.
  3. It is possible to store different configurations for totally different exercises in different directories (e.g. 'config_ex1/' and 'config_ex2/') or even tar files (e.g. 'config_ex1.tgz') and thus start each exercise with totally clean systems configured in totally different ways (e.g. moving 'config/' to 'config.old' and then moving or copying 'config_ex1/' to 'config/').
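Switching between stored configurations, as described in the last point, can be scripted. The following sketch assumes that the exercise folders (e.g. 'config_ex1') are kept next to 'config' under '/home/user1/uml/mntdirs' and that all UML machines are shut down and in a clean state; the function name 'switch_config' is hypothetical.

```shell
# usage: switch_config <mntdirs_dir> <exercise_config_name>
# Moves the current 'config' to 'config.old' and copies the chosen
# exercise configuration into place.
switch_config() {
    mntdirs="$1"; new="$2"
    [ -d "$mntdirs/$new" ] || { echo "no such configuration: $new" >&2; return 1; }
    # refuse to overwrite an earlier saved configuration
    [ -e "$mntdirs/config.old" ] && { echo "config.old already exists" >&2; return 1; }
    mv "$mntdirs/config" "$mntdirs/config.old" &&
    cp -a "$mntdirs/$new" "$mntdirs/config"
}
```

For example, 'switch_config /home/user1/uml/mntdirs config_ex1' would prepare the first exercise. Since 'configure.sh' only runs once per system, the new configuration takes effect only on machines that have not been configured yet.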

The main alternative to this setup, which we found in the Netkit [*] project, is that external commands get executed on the UML systems every time they are booted. In our modest opinion, our approach is more educational because if, on every boot, external commands are executed without following the standard Debian structure then the illusion of working on a real system is broken. For some other activities this fact will just be irrelevant, but since we think of NETinVM mainly as a teaching and learning platform, we have decided to use our setup described above.

[*] Please note that we truly see Netkit as a great project. We decided to create our own platform not because we didn't see its value but simply because we felt it didn't exactly cover our need for a specific network with IPs, MACs, hostnames, etc. that were easy to remember, both for teaching and for learning.

Sample Exercises

The following documents describe some sample exercises that can be carried out using NETinVM.

Port Scanning with Nmap
Perform a full port scan against "www.example.net" from "exta.example.net".
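As a starting point for this exercise, a scan of all TCP ports could be launched from 'exta' as sketched below. This assumes that 'nmap' is installed on the UML machine, which is not stated in this document; the function name is hypothetical.

```shell
# Scan every TCP port (1-65535) of the given host; '-p-' selects the full range.
# Run as root, nmap uses a SYN scan by default; as user1 it falls back to
# a connect scan.
full_port_scan() {
    nmap -p- "${1:-www.example.net}"
}
```

Comparing the result with a capture taken on 'base' (interface 'tap1') shows which probes are let through by 'fw'.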

Download information

NETinVM can be downloaded directly from its main page: index.html