Modified:
docs/publications/thesis/msc-thesis-2008/lemona-thesis-conclusion.tex
docs/publications/thesis/msc-thesis-2008/lemona-thesis-discussion.tex
docs/publications/thesis/msc-thesis-2008/lemona-thesis-introduction.tex
docs/publications/thesis/msc-thesis-2008/lemona-thesis-methodology.tex
docs/publications/thesis/msc-thesis-2008/template-thesis.tex
Log:
- added hyperref module
- added some changes by Laurent (intro, discussion, conclusion)
- completed methodology part
Modified:
docs/publications/thesis/msc-thesis-2008/lemona-thesis-conclusion.tex
==============================================================================
--- docs/publications/thesis/msc-thesis-2008/lemona-thesis-conclusion.tex (original)
+++ docs/publications/thesis/msc-thesis-2008/lemona-thesis-conclusion.tex Sun Nov 16 21:30:16 2008
@@ -6,7 +6,7 @@
% 2- Future Work
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
-Our paper posed the general environment of computer forensics
+Our thesis presented the general environment of computer forensics
analysis, and introduced \textbf{Lemona}, our solution for a
monitoring architecture relying on open standards and implementations,
and aiming towards the post-mortem investigation of compromised
@@ -18,11 +18,21 @@
\section{Outcomes}
As we demonstrated with the presentation of \textbf{Lemona}'s
-performance and settings, its design allows it to theoretically trace
-and record the complete activity at the lowest architectural level of
-an operating system, permitting a global review of the system's life,
-while managing to impact the system with an acceptable overhead, thus
-remaining satisfyingly usable and available.
+performance and settings, its design makes it theoretically possible
+to trace and record the complete activity at the lowest architectural
+level of the operating system. This allows a global review of the
+system's life while keeping the overhead acceptable, so that the
+system remains usable and available.
+
+In its current state, \textbf{Lemona} uses a fairly verbose
+development approach, forcing programmers to hook themselves into each
+\emph{system call} by means of \emph{kernel} patches. However, this
+is an efficient approach from a performance standpoint, as shown by
+our proof of concept. Feature-wise, \textbf{Lemona} is currently
+incomplete and performs raw monitoring of the activity, without
+providing user-friendly tools for information processing and
+data-mining. This is the obvious next step of our research.
In the long run, \textbf{Lemona} will not only allow a forensics
investigator to determine how and when an attack occurred, but also
@@ -31,7 +41,6 @@
records it generates, replaying the compromised system's lifecycle
step by step.
-
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Future Work
\section{Future Work}
@@ -40,7 +49,14 @@
which are listed below. It could benefit from numerous other variants,
which are currently being studied.
-\subsection{IDS/IPS Integration}
+\subsection{Software Integration}
+
+We think \textbf{Lemona} could benefit from other software suites, and
+vice-versa. There are many possible bridges between our solutions and
+others, which we could use to augment a system's surveillance.
+
+
+\subsubsection{IDS/IPS Integration}
Conceptually, \textbf{Lemona} might as well be used as an
\emph{IPS}/\emph{IDS} or interoperate with one. This would of course
@@ -54,7 +70,7 @@
require \textbf{Lemona} to be packaged and/or communicate with a
database of exploits' workflows.
-\subsection{Improvements to the Static Statistical Analyzer}
+\subsubsection{Improvements to the Static Statistical Analyzer}
There are many methods to analyze traces of a monitored system. The
flow of \emph{system calls} can for instance be compared to a database
@@ -72,7 +88,10 @@
then be compared to a database of preset forms matching the exploits'
database.
-\subsection{Design Decisions}
+Not only might this solution detect attacks more effectively, it
+could also prove faster.
+
+\subsection{Software Design}
\textbf{Lemona} is a brand new project and its design is still
morphing, both at the low and high levels of the
@@ -84,3 +103,28 @@
generic solutions, that would alleviate both the burden of the
developers to integrate new patches and the amount of complexity
required to set up the system.
+
+\subsubsection{Automated Patch Generation}
+
+Such approaches include \emph{Aspect Oriented Programming} techniques
+to automatically generate the \emph{kernel} patches, thus minimizing
+the amount of coding required to trace all system calls.
+
+This would also make \textbf{Lemona} able to support brand new
+system-calls without touching the source code, as long as they conform
+to the Linux \emph{kernel} naming conventions.
+
+Though this design might have a performance impact, it is an
+interesting approach to consider.
+
+
+\subsubsection{Libraries}
+
+Our approach to logging, monitoring and reporting is quite modular and
+abstract, and it occurred to us during the development phase that this
+part of \textbf{Lemona} could live as a project of its own.
+
+By defining \emph{APIs} for this purpose, we would also allow other
+projects to come up with their own implementation, and come one step
+closer towards the drafting of an open monitoring architecture.
+
Modified:
docs/publications/thesis/msc-thesis-2008/lemona-thesis-discussion.tex
==============================================================================
--- docs/publications/thesis/msc-thesis-2008/lemona-thesis-discussion.tex (original)
+++ docs/publications/thesis/msc-thesis-2008/lemona-thesis-discussion.tex Sun Nov 16 21:30:16 2008
@@ -6,11 +6,42 @@
% 2- Limitations
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
+We analyze here the results we provided at the end of our
+experimentation, and discuss their relevance and applicability within
+the scope of our research.
+
+We also list \textbf{Lemona}'s limitations and drawbacks with regard to
+our objectives.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Findings
\section{Findings}
+\subsection{Applicability}
+
+Our technical proof of concept showed that it is possible to
+extensively monitor a system's activity using a solution like
+\textbf{Lemona}. We managed to create records of each executed
+\emph{system call} within the scope of our limited implementation.
+
+\textbf{Lemona} hooks itself at the entry and exit points of
+\emph{system calls} and stores their parameters for future review,
+thus allowing us to follow the system's activity step by step.
+
+This means we could develop future versions of \textbf{Lemona}
+monitoring a complete set of system calls, and implement forensics
+analysis tools capable of reconstructing an attack by querying the
+datastore.
+
+\subsection{Performance}
+
+Our experimentation demonstrates that a fully monitored system can
+still remain usable and responsive enough for normal use, both for
+end-users and production environments in enterprise. However,
+intensive monitoring is probably not recommended for high-level
+computational servers, which means \textbf{Lemona} might not be an
+adequate solution for CPU servers and processing clusters.
+
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Limitations
@@ -21,6 +52,8 @@
foolproof. These are the various ways that we could think of so far to
circumvent its surveillance, or render it inefficient.
+Also, Lemona is not yet complete, and needs some improvements.
+
\subsection{Break the Pipe or Break the Storage Point}
If the connection between \textbf{Lemona}'s storage point and the
@@ -60,8 +93,8 @@
be delayed, thus allowing an attacker to exploit a vulnerability and
crash the system without it showing in the logs.
-A correct adjustement to the load-balancing taking into consideration
-the web-server's network load theoretically overcomes this problem.
+A correct adjustment to the load-balancing, taking into consideration
+the web-server's network load, theoretically overcomes this problem.
\subsection{Break Lemona}
Modified:
docs/publications/thesis/msc-thesis-2008/lemona-thesis-introduction.tex
==============================================================================
--- docs/publications/thesis/msc-thesis-2008/lemona-thesis-introduction.tex (original)
+++ docs/publications/thesis/msc-thesis-2008/lemona-thesis-introduction.tex Sun Nov 16 21:30:16 2008
@@ -18,8 +18,8 @@
projects referenced in this document are based mostly on Linux- and
UNIX-based systems, although most of the concepts are applicable to
other operating systems. Thus, references made to the "\emph{kernel}"
-or "system" should be interpreted as being Linux when not otherwise
-specified.
+or "\emph{system}" should be interpreted as being Linux when not
+otherwise specified.
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
@@ -163,7 +163,7 @@
\begin{figure}
\begin{centering}
\centering
-\includegraphics[scale=0.40]{images/security/lemona-security-concepts-cia.png}
+\includegraphics[scale=0.40]{images/security/lemona-security-concepts-cia.eps}
\caption{The CIA Elements of Security}
\label{fig:fig1}
\end{centering}
@@ -232,7 +232,7 @@
\begin{figure}
\begin{centering}
\centering
-\includegraphics[scale=0.40]{images/security/lemona-security-concepts-threats.png}
+\includegraphics[scale=0.40]{images/security/lemona-security-concepts-threats.eps}
\caption{Common Threats}
\label{fig:fig2}
\end{centering}
@@ -324,7 +324,7 @@
\begin{figure}
\begin{centering}
\centering
-\includegraphics[scale=0.60]{images/architecture/lemona-architecture-2.png}
+\includegraphics[scale=0.60]{images/architecture/lemona-architecture-2.eps}
\caption{Lemona's Standard Use Case}
\label{fig:fig3}
\end{centering}
@@ -335,17 +335,22 @@
% Outline
\section{Outline}
-Our goal in the first part of this article is to present the current
-state of the forensics software market regarding post-mortem analysis
-tools. We will briefly review in the "Related Work" section the latest
-literature and project research that we have been basing our research
-on to confirm or infirm our results and design decisions.
-
-We will then expose this design by detailing in the "Lemona" section
-our software solution and its architecture, as well as the current
-state of its implementation. We will then expose and explain the
-results we have obtained so far.
-
-Finally, we would like to discuss the possible improvements that could
-be brought to \textbf{Lemona} and other alternatives directions we are
-in the process of experimenting with.
+Our goal in the first part of this thesis is to present the current
+state of the forensics software market regarding forensics analysis
+tools, and more specifically post-mortem analysis solutions. We will
+briefly review in the "Related Work" chapter the latest literature and
+project experiments that we have been basing our research on to
+produce our solution (\textbf{Lemona}).
+
+We will then present our design and approach to the realisation of
+this project in the "Methodology" chapter, as well as provide detailed
+insight about our implementation and its current state.
+
+In the "Experimentation" chapter, we describe how we tested
+\textbf{Lemona} and publish our results.
+
+Finally, we discuss the relevance and impacts of our results, and open
+the discussion towards the presentation of the problems and
+limitations of our solution. This leads us to listing possible
+improvements we could bring to \textbf{Lemona} and other alternative
+directions we are in the process of experimenting with.
Modified:
docs/publications/thesis/msc-thesis-2008/lemona-thesis-methodology.tex
==============================================================================
--- docs/publications/thesis/msc-thesis-2008/lemona-thesis-methodology.tex (original)
+++ docs/publications/thesis/msc-thesis-2008/lemona-thesis-methodology.tex Sun Nov 16 21:30:16 2008
@@ -130,7 +130,7 @@
facilitate debugging and initial testing, or for end-users
convenience. However, administrators of production systems should
build the module statically for security reasons. This will for
-instance avoid early logs to be lost.
+instance prevent some early logs from being lost.
So far, the modules can only transmit the generated logs via two
different methods:
@@ -195,8 +195,8 @@
modification. Once a file is full, another one is created. The second
application will read already filled files, decrypt them and insert
the data in the database. This architecture avoids heavy slowdowns
-upon data reception by delegating the load of the decryption (if any)
-on the receiving application.
+upon data reception by delegating log processing and decryption (if
+any) to the receiving application.
\subsection{Forensics Component}
@@ -218,15 +218,78 @@
\section{Algorithms}
-% TODO
+\subsection{Vocabulary}
+
+\begin{itemize}
+  \item \textbf{zest}: Name given to a single log entry generated by
+    \textbf{Lemona}
+  \item \textbf{blade}: Name given to the function in charge of
+    processing a given \emph{syscall} argument type (e.g. integer,
+    string, iovec, ...)
+  \item \textbf{mixer}: Name given to the data structure describing
+    which \emph{blades} should be used for a given \emph{syscall}
+    argument and how many of those are expected upon entrance and exit
+  \item \textbf{mixers}: Name of the array containing the
+    \emph{mixers} of all existing \emph{syscalls}
+\end{itemize}
+
+\subsection{Init}
+
+On init, \textbf{Lemona} simply calls the initialization method of
+every enabled backend. If none of them manages to initialize properly,
+the loading of \textbf{Lemona} is interrupted. Otherwise, a variable
+(\verb=lemona_activated=) is atomically set to signal that
+\textbf{Lemona} can now be invoked for logging purposes.
+
+\subsection{Hooks}
+
+From the hook's point of view, little is done. A check is made on
+\verb=lemona_activated= to ensure that the logging facility is
+active and, if \textbf{Lemona} has been built as an external module,
+the address of the logging method is retrieved using the
+\textbf{kallsyms} kernel facility.
+
+\subsection{Logging}
+
+\textbf{Lemona}'s logging works in a simple but nevertheless effective
+way. Upon each monitored \emph{syscall} entry or exit, a counter
+variable (\verb=lemona_clients=) indicating how many concurrent
+logging operations are under way is first atomically incremented; then
+\verb=lemona_log_{in,out}= is called with the system call number and
+its arguments. \textbf{Lemona} then fetches the corresponding entry
+for this \emph{syscall} in the \emph{mixers} array. The creation of a
+\emph{zest} is done in two phases:
+
+\begin{enumerate}
+  \item The \emph{blades} are invoked to determine how much space is
+    needed for a given argument
+  \item The \emph{blades} are invoked to copy the argument data into
+    the \emph{zest}
+\end{enumerate}
+
+Once the \emph{zest} has been created, it is passed to every logging
+backend supported by \textbf{Lemona}. It is important to note that the
+success of these backends is not tested, since little can be done upon
+error; the backends will, however, report the problems by printing a
+succinct message in the logs.
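
The two-phase construction described above can be modelled in user space. The following Python sketch is illustrative only: the blade functions, the pairing of a sizing pass with a copy pass, and the byte layout are assumptions made for the example, and they do not reproduce Lemona's kernel code.

```python
import struct

# Hypothetical blades: each one offers a sizing pass and a copy pass.
def int_size(value):
    return 4

def int_copy(buf, off, value):
    struct.pack_into("<i", buf, off, value)

def str_size(value):
    return len(value.encode()) + 1  # include the terminating NUL

def str_copy(buf, off, value):
    data = value.encode() + b"\x00"
    buf[off:off + len(data)] = data

INT_BLADE = (int_size, int_copy)
STR_BLADE = (str_size, str_copy)

def build_zest(blades, args):
    """Build a record payload in two passes: measure, then copy."""
    # Phase 1: ask each blade how much space its argument needs.
    sizes = [size(arg) for (size, _), arg in zip(blades, args)]
    buf = bytearray(sum(sizes))
    # Phase 2: ask each blade to copy its argument at the right offset.
    off = 0
    for (_, copy), arg, sz in zip(blades, args, sizes):
        copy(buf, off, arg)
        off += sz
    return sizes, bytes(buf)
```

Measuring first lets the whole record be allocated in one piece before any data is copied, which is presumably why the blades are invoked twice.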
+
+\subsection{Unloading}
+
+To avoid a crash during the unloading process, the
+\verb=lemona_activated= variable is first unset; \textbf{Lemona} then
+waits until all logging activity currently in progress has finished
+its work by checking the \verb=lemona_clients= variable. The actual
+cleanup is performed once this value reaches zero.
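
The ordering of the unloading steps can be illustrated with a small single-threaded model. This is only a sketch: the class and method names are hypothetical, and the kernel code relies on atomic operations that this model does not attempt to reproduce.

```python
class LoggerGate:
    """Model of an 'activated' flag plus an in-flight client counter."""

    def __init__(self):
        self.activated = True
        self.clients = 0

    def enter(self):
        """Called by a hook; returns False once logging is disabled."""
        if not self.activated:
            return False
        self.clients += 1
        return True

    def leave(self):
        """Called when a logging operation has finished."""
        self.clients -= 1

    def begin_unload(self):
        """Step 1 of unloading: refuse new logging requests."""
        self.activated = False

    def can_cleanup(self):
        """Step 2: cleanup is safe only when no logging is in flight."""
        return not self.activated and self.clients == 0
```

Clearing the flag before polling the counter means the counter can only decrease while the unloader waits, so the wait is expected to terminate.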
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Implementation
\section{Implementation}
-\lstset{language=C}
-\begin{lstlisting}
+\lstset{language=C,basicstyle=\footnotesize,tabsize=2}
+
+A set of macros (figure \ref{listing:Macros}) is used when patching a
+\emph{syscall} in order to call \textbf{Lemona}'s logging
+facilities. Their principal advantage is to ease the reading of the
+patched code. All actions related to logging are enclosed between the
+macros \verb=lemona_block_start= and \verb=lemona_block_end=.
+
+\begin{figure}
+ \begin{centering}
+ \begin{lstlisting}
extern atomic_t lemona_activated;
static lemonalogfn _lemona_log = NULL;
@@ -251,12 +314,30 @@
_lemona_log = (lemonalogfn)kallsyms_lookup_name("lemona_log"); \
_lemona_log(sysnr, in, argnr, extnr, ## __VA_ARGS__); \
}
-\end{lstlisting}
-
-\begin{lstlisting}
+ \end{lstlisting}
+ \caption{Lemona's Logging Macros}
+ \label{listing:Macros}
+ \end{centering}
+\end{figure}
+
+As for the definition of a \emph{zest}, it is quite simple (figure
+\ref{listing:Zest}). Common values are defined directly within the
+structure, whereas the others are expected to be found right after the
+end of the structure. To facilitate access to the data following the
+structure, the \verb=argsz=, \verb=args=, \verb=extsz= and \verb=exts=
+pointers are used during logging, although they add a little
+unnecessary weight to the structure; they might be removed later.
+
+\begin{figure}
+ \begin{centering}
+ \begin{lstlisting}
struct lemona_zest {
- char magic[4];/* magic number */
+  /*
+    must always be the first member. This facilitates parsing since a
+    zest can be cut across two log files
+  */
int size; /* size taken by this zest and args sz/value */
+ char magic[4];/* magic number */
int in; /* input or output ? */
struct timespec time; /* call start/end time (getnstimeofday) */
@@ -277,18 +358,40 @@
int *extsz; /* size of each extension */
void *exts; /* extra values. located after the last arg */
} __attribute__((packed));
-\end{lstlisting}
-
-\begin{lstlisting}
+ \end{lstlisting}
+ \caption{Lemona's Zest Structure}
+ \label{listing:Zest}
+ \end{centering}
+\end{figure}
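
Placing the size first lets a reader reassemble records that straddle two log files. The sketch below shows one way this could be done in Python; it assumes a 4-byte little-endian size field that counts the whole record, size field included, which is an assumption of this example rather than a documented property of the format. The function name is hypothetical.

```python
import struct

def split_zests(chunks):
    """Yield whole records from an iterable of byte chunks (e.g. log files).

    Assumes each record starts with a 4-byte little-endian size that
    counts the whole record, size field included.
    """
    pending = b""
    for chunk in chunks:
        pending += chunk
        while len(pending) >= 4:
            (size,) = struct.unpack_from("<i", pending, 0)
            if size < 4:
                raise ValueError("corrupt size field: %d" % size)
            if len(pending) < size:
                break  # the record continues in the next chunk
            yield pending[:size]
            pending = pending[size:]
```

Because only the size is needed to find the next record boundary, the reader does not have to understand the rest of the structure to split a stream correctly.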
+
+
+To create such a \emph{zest}, \textbf{Lemona} relies on what we call
+a \emph{mixer} (figure \ref{listing:Mixer}). It is a structure
+composed of three data members:
+
+\begin{enumerate}
+ \item \verb=sysnr=: \emph{syscall} number affiliated to this \emph{mixer}
+ \item \verb=in=: values to be used when logging entry to a \emph{syscall}
+  \item \verb=out=: values to be used when logging exit from a
+    \emph{syscall}
+\end{enumerate}
+
+The \verb=in= and \verb=out= data fields are themselves data
+structures containing information about the number of arguments needed
+by the logging method for this given \emph{syscall} and which
+\emph{blade} to use to process each of them.
+
+\begin{figure}
+ \begin{centering}
+ \begin{lstlisting}
struct lemona_mixer {
int sysnr; /* system call number */
struct __lemona_mixer in; /* call entrance mixer */
struct __lemona_mixer out; /* call exit mixer */
-}
+};
struct __lemona_mixer {
int argnr; /* number of syscall parameters */
- int extnr; /* number of extra parameters */
+ int extnr; /* number of extra parameters */
struct __lemona_mixer_handler handlers[6]; /* pre-defined handlers */
};
@@ -303,9 +406,37 @@
int off, /* mem. offset */
void *fruit1, /* 1st data arg */
void *fruit2); /* 2nd data arg */
-\end{lstlisting}
-
-\begin{lstlisting}
+ \end{lstlisting}
+ \caption{Lemona's Mixer Structure}
+ \label{listing:Mixer}
+ \end{centering}
+\end{figure}
+
+To facilitate the retrieval of the mixer needed for a given
+\emph{syscall}, we packed them in a single read-only array,
+\emph{lemona\_mixers}. The entry for the \verb=open= \emph{syscall} is
+given in figure \ref{listing:OpenMixer}. For this \emph{syscall}, three
+arguments are logged upon entry:
+
+\begin{enumerate}
+  \item The path to the file being opened (a null-terminated user-space string)
+ \item The access rights requested (an integer)
+ \item The mode of creation if the file is to be created (an integer)
+\end{enumerate}
+
+Upon exit, two things need to be recorded:
+
+\begin{enumerate}
+ \item The return value of the syscall (an integer)
+  \item The resolved path (a string obtained from the file descriptor
+    returned by the \emph{syscall} when successful)
+\end{enumerate}
+
+The latter is qualified as an ``external'' argument since it is not
+one of the parameters of the \verb=open= \emph{syscall}.
+
+\begin{figure}
+ \begin{centering}
+ \begin{lstlisting}
const struct lemona_mixer lemona_mixers[]= {
/* ... */
{
@@ -330,22 +461,63 @@
},
/* ... */
};
-\end{lstlisting}
+ \end{lstlisting}
+ \caption{Lemona's Mixer Array}
+ \label{listing:OpenMixer}
+ \end{centering}
+\end{figure}
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% Controls
-\section{Controls}
+\section{Using Lemona}
+
+\subsection{Installation}
+
+Building \textbf{Lemona} is as simple as building any patch for a
+\textbf{Linux} kernel. All that needs to be done is to:
+
+\begin{itemize}
+ \item switch to your kernel directory tree
+ \item download the full patch
+ \item apply the patch
+  \item activate the Lemona feature in the config file % TODO see next listing
+ \item compile and install the new kernel
+\end{itemize}
\begin{verbatim}
$> cd $(PATH_TO_KERNEL_SRC)
$> wget http://lemona.googlecode.com/svn/trunk/patchs/patch-2.6.26.3
$> patch -p1 < patch-2.6.26.3
$> make menuconfig
-$> make && makes modules_install && make install
+$> make && make modules_install && make install
\end{verbatim}
+\begin{verbatim}
+ General Setup ->
+ [*] Kernel->user space relay support (formerly relayfs)
+ [ ] Configure standard kernel features (for small systems) ->
+ -*- Load all symbols for debugging/ksymoops
+    [*] Include all symbols in kallsyms
+ Kernel Hacking ->
+ [*] Debug Filesystem
+ [*] Kernel debugging
+ Lemona ->
+ <*> Enable Lemona
+ [*] Enable relaying of log to user-land
+ (lemona) Name of the debugfs directory
+ [*] Transmit log via network
+ (10.0.42.1) Server Address
+ (4242) Server Port
+\end{verbatim}
+
+\subsection{Loading}
+
+If \textbf{Lemona} has been built as a module, it can be loaded using
+the \verb=modprobe= or \verb=insmod= commands. Whichever build method
+has been chosen, the following informative messages will be displayed
+upon loading and unloading.
\begin{verbatim}
$> cd $(PATH_TO_MODULES)
@@ -359,5 +531,27 @@
-==Lemona==- Done.
\end{verbatim}
+\subsection{Fetching \& Analyzing the Logs}
+
+When using \textbf{Lemona} with the \emph{net} module, a server is
+needed in order to retrieve and store the \emph{zests} sent over the
+network. A tool such as \textbf{netcat} can be used. Alternatively, we
+have developed a little server named \emph{basket}. This server simply
+collects the logs and writes them to the local hard drive without
+further processing.
+
+For the analysis of the logs, a simple script written in Python
+(\emph{picker.py}) can be found in the repository. For the moment, it
+only displays the content of each \emph{zest} on its standard
+output.
+
+\begin{verbatim}
+$> ./picker.py ../basket/
+>>>Examining File ../basket/00000
+ >>Parsing new entry at ../basket/00000:0
+{'fsuid': 0, 'argnr': 2, 'egid': 0, 'usec': 471962171, 'uid': 0, 'tgid': 4153, 'args': ['\x01\x00\x00\x00', 't'], 'pid': 4153, 'argsz': [4, 1], 'fsgid': 0, 'exts': [], 'sysnr': 3, 'gid': 0, 'sec': 1226734216, 'euid': 0, 'extnr': 0, 'magic': 'ZeSt', 'extsz': [], 'parsed': 93, 'inout': 0, 'size': 93}
+ <<Done
+<<<Done
+\end{verbatim}
% TODO
Modified: docs/publications/thesis/msc-thesis-2008/template-thesis.tex
==============================================================================
--- docs/publications/thesis/msc-thesis-2008/template-thesis.tex (original)
+++ docs/publications/thesis/msc-thesis-2008/template-thesis.tex Sun Nov 16 21:30:16 2008
@@ -4,6 +4,18 @@
\usepackage[sectionbib]{natbib}
\usepackage{chapterbib}
\usepackage{listings}
+\usepackage[pdftex,pdfstartview=FitH,bookmarks=true]{hyperref}
+
+% hyperref options
+\hypersetup{
+ bookmarksopen=false,
+ pdffitwindow=true,
+ pdfborder={0 0 0},
+ colorlinks={false, true},
+ linkcolor=blue,
+ citecolor=green,
+ filecolor=blue,
+ urlcolor=blue}
% Set equal margins on book style
\setlength{\oddsidemargin}{33pt} % was: 53pt