This is the README file for Gibraltar v 1.0
Author - Arati Baliga (aratib@cs.rutgers.edu)

Table of Contents
-----------------
1. Overview and Component Interaction
2. Directory and File Layout
3. Installation
-----------------

1. Overview and Component Interaction
For the general architecture, refer to the paper. Some implementation-level information follows.

# The CIL module extracts data type definitions and the list of global variables from the kernel source code.
  Its output files are globvars.txt and typedefs.txt. Two more files are output, which are currently unused.
  These files are to be put in the <input> directory.
# Type Mapper (maptypes.c) - Does two tasks:
	(a) Flattens the definitions available in input/typedefs.txt and generates a flattened
	    definition file, input/typedefs.gen.
	(b) Generates a memory map for static memory, mapping primitive types. This is stored in
	    input/memory.map.static.
# Gibraltar requires all the input files mentioned in the above two steps to start training.
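
The flattening step in (a) can be pictured with the sketch below. It is not the code in maptypes.c. The
structs field_def and type_def, the function flatten() and the "path type" output line are all made up for
illustration, since the actual formats of typedefs.txt and typedefs.gen are not described in this README.
The sketch only shows the idea: walk a nested definition and emit one entry per primitive field.

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>

/* Hypothetical in-memory form of a type definition (an assumption, not the
 * real typedefs.txt layout). */
struct type_def;

struct field_def {
    const char *name;               /* field name */
    const char *prim;               /* primitive type name, NULL if nested */
    const struct type_def *nested;  /* nested struct, NULL if primitive */
};

struct type_def {
    const char *name;
    size_t nfields;
    const struct field_def *fields;
};

/* Recursively write one "path type" line per primitive field to out.
 * Nested field names are joined with '.'. Returns the number of lines
 * written. Paths longer than the local buffer are truncated. */
static size_t flatten(const struct type_def *t, const char *prefix, FILE *out)
{
    size_t count = 0;

    assert(t != NULL && prefix != NULL && out != NULL);
    for (size_t i = 0; i < t->nfields; i++) {
        const struct field_def *f = &t->fields[i];
        char path[256];

        snprintf(path, sizeof path, "%s%s%s",
                 prefix, prefix[0] ? "." : "", f->name);
        if (f->nested != NULL) {
            count += flatten(f->nested, path, out);   /* descend */
        } else {
            fprintf(out, "%s %s\n", path, f->prim);   /* leaf field */
            count++;
        }
    }
    return count;
}
```

For a struct with an int field "pid" and a nested struct "tasks" holding two pointers "next" and "prev",
calling flatten() with an empty prefix writes three lines, for pid, tasks.next and tasks.prev.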

# Gibraltar is first run in training mode. The training load has to be started on the target system.
	=> It collects snapshots and dumps them in directory <snapshots>
	=> Snapshots are in an intermediate format, represented by files *.ir.*. 
	=> Two types of files are created for each composite data type encountered in the traversal (*.ir.decls, *.ir.dtrace)
	=> These intermediate files have to be converted to a Daikon format before running Daikon on them for invariant inference.

# Snapshot conversion is done with the code in directory <daikon-iface>.
	=> This code converts intermediate trace files generated during training to final Daikon trace files
	=> At this point, one can choose either memory addresses or pathnames to identify a single object 
	=> This code also splits objects that are very large across multiple program points, because Daikon cannot
		handle too many variables at a single program point.
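
The splitting of large objects described above comes down to simple index arithmetic, sketched here. This
is not the code in daikonTrace.c. The function names and the max_per_point parameter are assumptions, and
the README does not state the limit that is actually used.

```c
#include <assert.h>
#include <stddef.h>

/* Number of program points needed for nfields variables when one program
 * point may carry at most max_per_point of them (hypothetical limit). */
static size_t num_parts(size_t nfields, size_t max_per_point)
{
    assert(max_per_point > 0);
    return (nfields + max_per_point - 1) / max_per_point;  /* ceiling */
}

/* Half-open field index range [*first, *last) that goes to part number
 * `part`. Parts past the end of the object get an empty range. */
static void part_range(size_t part, size_t nfields, size_t max_per_point,
                       size_t *first, size_t *last)
{
    assert(max_per_point > 0);
    *first = part * max_per_point;
    if (*first > nfields)
        *first = nfields;
    *last = *first + max_per_point;
    if (*last > nfields)
        *last = nfields;
}
```

With 250 fields and an assumed limit of 100 per program point, this gives three parts covering fields
0-99, 100-199 and 200-249.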
	
# Daikon uses the Daikon trace files to generate an invariant list. 
	=> The scripts to run Daikon are available in scripts/rundaikon.pl and scripts/rundaikononll.pl
	=> In the generated invariant list, all fields that were split have to be merged back. To do this, run script scripts/merge.pl
	=> Put the invariant lists in directory <invariant-list>
 
# Gibraltar can be run in detection mode once the invariant list is available

2. Directory and File Layout

Current directory
-------------------
gibraltar - the monitor executable.
runcmd - contains a sample command which shows the arguments that should be passed to gibraltar
monitor.h - main include file for the monitor
global.h - global definitions to be included in other .c files as well
gm.h - Header file required for the Myrinet page fetching code (taken directly from firmware code)
gm_new.h - Header file with some definitions created by me, required for the page fetching code
log2.c - Log functions required for the page fetching code
bytes.c - Functions for manipulation of bytes
defs.c - Functions to deal with kernel data definitions
daikon.c - Creates Daikon related declarations and Daikon data types corresponding to C primitive types
inv.c - All functions for loading and checking class/object/list invariants
libgm.a - The GM library file. Needed for page fetching functionality
monitor.c - The main monitor C file. Contains static memory scan and dynamic memory traversal functions.
maptypes.c - Generates the flattened typedefs.gen file from the typedefs.txt file and the static memory map file
preprocess.c - Some helper functions for the monitor
utils.c - Some general purpose functions used throughout

sha1.c, sha1.h - Contains code for calculating the secure hash. Currently unused.
 
Subdirs
--------
<daikon-iface>
	daikonTrace.c - Generates the actual Daikon traces (*.decls, *.dtrace) from the intermediate files (*.ir.decls, *.ir.dtrace)
					generated by Gibraltar.
	newgenDaikonTrace.c - This was a slightly older version of daikonTrace.c. Kept only for reference.

<generator>
	gen_typedef_headers_uniq.c - Code to generate unique definitions from the typedefs.txt file output by CIL module
	typedefs.txt.h - Output of the above code that contains unique definitions and can be used as a .h file
	gen_calc_offsets.c - Creates the generator program, which should be run on the target machine to calculate offsets.
	print_offsets.c - The output of the above code.
	newoffsets.cpp - Used for some sort of temporary fix for the offset issue. Might not be required now. 
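
Field offsets depend on the compiler, architecture and kernel configuration, which is why the offset
program has to be run on the target. The sketch below shows the kind of code such a program could
contain. It is not the actual print_offsets.c. The sample structs, the PRINT_OFFSET macro and the
"type.field offset" output line are assumptions made for illustration. Only offsetof() from <stddef.h>
is standard C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Stand-in types for illustration only. A real offset program would use
 * the kernel definitions (for example from typedefs.txt.h) instead. */
struct sample_list_head {
    struct sample_list_head *next;
    struct sample_list_head *prev;
};

struct sample_obj {
    int id;
    char tag;
    struct sample_list_head link;
};

/* Print one "type.field offset" line using the compiler's own layout. */
#define PRINT_OFFSET(out, type, field) \
    fprintf((out), "%s.%s %zu\n", #type, #field, offsetof(type, field))

/* Print the offsets of all fields of struct sample_obj.
 * Returns the number of lines written. */
static int print_sample_offsets(FILE *out)
{
    assert(out != NULL);
    PRINT_OFFSET(out, struct sample_obj, id);
    PRINT_OFFSET(out, struct sample_obj, tag);
    PRINT_OFFSET(out, struct sample_obj, link);
    return 3;
}
```

The printed numbers differ between machines because of padding and pointer size, so the output is only
valid for the machine on which it was produced.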

<input>
	globvars.txt - Consists of all global and local static variables and their types - output from the CIL module
	typedefs.txt - Consists of all type definitions used in the kernel - output from the CIL module
	typedefs.gen - Flattened type definition file. All nested structs are flattened out
	offsets.txt - Offsets of each field for all type definitions. This corresponds to definitions in the typedefs.txt file
	offsets.gen - Offsets of each field for all type definitions. This corresponds to the definitions in the typedefs.gen file
	System.map - System.map file of the kernel running on the target system
	System.map.types - Types assigned to data structures in the System.map file
	memory.map.static - Static memory map of the target kernel
	static.invariants - List of constant invariants in the static memory of the kernel.

<loads>
	runtestload.sh - Script that runs the test load on the target
	runtrainingload.sh - Script that runs the training load on the target.

<scripts>
	genfpfile.pl - Script takes the *.alert file generated by the monitor and generates a list of unique false positives.
	merge.pl - Script merges separate records generated in the Daikon output. Daikon cannot handle a large number of fields at a
				single program point. Therefore, fields have to be split across program points for very large structs. The output
				generated by Daikon has to be merged before the detection code in the monitor can process it.
	rename.pl - Script used for renaming files
	rundaikon.pl - Script to run Daikon on memory snapshots. Outputs an invariant list.
	rundaikononll.pl - Script to run Daikon on memory snapshots of linked lists. These are processed in a slightly different fashion.	

<sensitivity>
	commonobjs.c - Code that was built to check for common invariants across reboots.

<verify>
	check_static_map.c - Code to check for holes in the static memory region.
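
A hole check over a static memory map can be sketched as follows. This is not the code in
check_static_map.c. The struct map_entry, the function count_holes() and the assumption that entries are
(start, size) pairs sorted by start address are all made up for illustration, since the format of
memory.map.static is not described in this README.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical map entry: a typed region of static memory. */
struct map_entry {
    unsigned long start;  /* start address of the region */
    unsigned long size;   /* size of the region in bytes */
};

/* Count address ranges between the first and last entry that no entry
 * covers. Entries must be sorted by start address and may overlap.
 * If out is not NULL, each hole is also printed to it. */
static size_t count_holes(const struct map_entry *map, size_t n, FILE *out)
{
    size_t holes = 0;
    unsigned long covered_end;

    if (n == 0)
        return 0;
    assert(map != NULL);
    covered_end = map[0].start + map[0].size;
    for (size_t i = 1; i < n; i++) {
        unsigned long end = map[i].start + map[i].size;

        assert(map[i].start >= map[i - 1].start);  /* sorted input */
        if (map[i].start > covered_end) {
            if (out != NULL)
                fprintf(out, "hole: 0x%lx-0x%lx\n", covered_end, map[i].start);
            holes++;
        }
        if (end > covered_end)
            covered_end = end;  /* furthest address covered so far */
    }
    return holes;
}
```

Tracking the furthest covered address, rather than only the end of the previous entry, keeps a large
entry that spans several smaller ones from being reported as a hole.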

<cil-modules>
	CIL modules to extract data type definitions and global variables from the kernel source code
	Generates the two files "input/typedefs.txt" and "input/globvars.txt" used by the monitor and maptypes.c

<snapshots>
	Stores the snapshots created during the training period.

<invariant-list> 
	Stores the invariant files to be used by the detection engine

--------


3. Installation
# Target must run the modified version of the GM firmware.
# System.map file of the target should be available to the monitor program in directory <input>
# The generator program to calculate offsets should be run on the target.


# Observer runs the standard version of GM firmware.
# Gibraltar runs like a standard application on the observer 
