Hagai Bar-El

Information Security Architect


HBAREL.COM  
 
 
 

How and what to log to enable forensic analysis



Message # 1

Date: Fri, 24 Mar 2006
From: Stephan Neuhaus
To: PracticalSecurity at hbarel.com
Subject: [PracticalSecurity] How and what to log to enable forensic analysis

Hi list,
when you analyze a break-in, you often rely on log files to tell you what happened. Of course, the cause-effect chain that emerges from log file analysis will often have holes, because the logs won't tell you everything you need. Also, you don't necessarily have the programs' source code and you don't necessarily know the workflow that the program is implementing.
Here is what a forensic examiner would want from logging:
* Log all network requests, verbatim, in full.
* Log all network responses, verbatim, in full.
* Log all authentication attempts, and their results.
* Log all programs that were executed, with full argument list, path names, and environment.
* Use NTP to keep log files timestamped properly.
* Use UTC for timestamps (or at least use full timezone designations).
* Document the workflow that the program implements and allude to that workflow in the logs.
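
A minimal sketch of the timestamp and workflow items in the list, assuming Python's standard logging module. The names make_forensic_logger and log_auth_attempt, the "step" field, and the record layout are hypothetical choices for illustration, not an established format. Every record gets a UTC timestamp and names the workflow step it belongs to; the password itself is deliberately not logged.

```python
import logging
import time


def make_forensic_logger(name, stream=None):
    """Return a logger whose records carry UTC timestamps and a workflow step.

    Call once per logger name; each call attaches another handler.
    """
    formatter = logging.Formatter(
        fmt="%(asctime)sZ %(name)s step=%(step)s %(message)s",
        datefmt="%Y-%m-%dT%H:%M:%S",
    )
    # Render timestamps in UTC instead of local time.
    formatter.converter = time.gmtime

    handler = logging.StreamHandler(stream)  # defaults to stderr when stream is None
    handler.setFormatter(formatter)

    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.addHandler(handler)
    logger.propagate = False
    return logger


def log_auth_attempt(logger, user, source_ip, success):
    """Log one authentication attempt and its result (never the password)."""
    logger.info(
        "auth user=%r src=%s result=%s",
        user,
        source_ip,
        "success" if success else "failure",
        extra={"step": "login"},  # hypothetical workflow step name
    )
```

Keeping the host clock itself accurate (for example with NTP) is a system setting and is outside this sketch.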
Some of these suggestions might clash with privacy concerns, so the logs might have to be protected from casually prying eyes. Others might expose sensitive data, such as passwords, or might not be practical for legacy applications. Still others might not be practical because they increase the volume of logged data beyond any reasonable size.
So how do you balance these different needs? I have a hunch that most logging is done in an ad-hoc manner, without any regard for forensic analysis, except perhaps where logs are used for accounting purposes.
Fun,
Stephan


Message # 2

Date: Sat, 25 Mar 2006
To: Stephan Neuhaus
From: Hagai Bar-El
Subject: Re: [PracticalSecurity] How and what to log to enable forensic analysis
Cc: PracticalSecurity at hbarel.com

Hi Stephan,
I will add my US$0.02.
At 24/03/06 09:23, Stephan Neuhaus wrote:
> when you analyze a break-in, you often rely on log files to tell you
> what happened. Of course, the cause-effect chain that emerges from
> log file analysis will often have holes, because the logs won't tell
> you everything you need. Also, you don't necessarily have the
> programs' source code and you don't necessarily know the workflow
> that the program is implementing.
> Here is what a forensic examiner would want from logging:
> * Log all network requests, verbatim, in full
(snip)
Indeed, as you imply, today's logging mechanisms were not made for forensic analysis. System logs, such as the ones maintained by the OS, exist mainly so that you know when you were attacked and, if you read them periodically, so that you can tell whether you are subject to attack attempts or pre-attack data collection; they also carry some minimal information that allows you to investigate the source. Logs kept by most applications serve primarily two purposes: accountability where it comes cheap (mapping DB modifications to user accounts), and reversal, that is, letting one revert any malicious or erroneous operation, similar to keeping incremental backups updated at the highest frequency possible.
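
To make the accountability and reversal purposes concrete, here is a minimal sketch using Python's built-in sqlite3 module. The accounts and audit_log tables and the function names are hypothetical and only for illustration. Each modification is recorded together with the user who made it and the previous value, so a malicious or erroneous change can be attributed and reverted.

```python
import sqlite3


def open_db():
    """Create an in-memory database with a data table and an audit log."""
    db = sqlite3.connect(":memory:")
    db.executescript("""
        CREATE TABLE accounts (id INTEGER PRIMARY KEY, balance INTEGER NOT NULL);
        CREATE TABLE audit_log (
            seq         INTEGER PRIMARY KEY AUTOINCREMENT,
            -- SQLite's 'now' is UTC
            ts_utc      TEXT NOT NULL DEFAULT (strftime('%Y-%m-%dT%H:%M:%SZ', 'now')),
            username    TEXT NOT NULL,
            account_id  INTEGER NOT NULL,
            old_balance INTEGER NOT NULL,
            new_balance INTEGER NOT NULL
        );
    """)
    return db


def set_balance(db, username, account_id, new_balance):
    """Change a balance and log who changed it, in a single transaction."""
    with db:  # the change and its log record commit (or roll back) together
        (old,) = db.execute(
            "SELECT balance FROM accounts WHERE id = ?", (account_id,)
        ).fetchone()
        db.execute(
            "UPDATE accounts SET balance = ? WHERE id = ?", (new_balance, account_id)
        )
        db.execute(
            "INSERT INTO audit_log (username, account_id, old_balance, new_balance)"
            " VALUES (?, ?, ?, ?)",
            (username, account_id, old, new_balance),
        )


def revert(db, seq, reverting_user):
    """Undo the change recorded as log entry `seq`; the undo is logged too."""
    account_id, old = db.execute(
        "SELECT account_id, old_balance FROM audit_log WHERE seq = ?", (seq,)
    ).fetchone()
    set_balance(db, reverting_user, account_id, old)
```

Note that such a log records only changes made through the application. Anyone who writes to the database file directly leaves no entry in it.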
A complete forensic analysis today thus requires not only information from these logs, but also recovering and analyzing traces left on the disk, etc.
If you want to build a logging mechanism that is of greater benefit to forensic analysis, you will need what you wrote, plus perhaps a log of each and every write (and, at lower priority, read) of blocks of secondary storage. You need this in case the adversary bypasses the application that does the logging and makes modifications directly, either by accessing the disk or by exploiting the database back-end.
It is worth distinguishing between the logging you expect from an application and the logging you expect from the platform (OS). An OS cannot easily log the internal processes of a program, only their effect on the resources the OS manages. For example, it cannot log recorded transactions, only write operations to disk. On the other hand, program logs are easier to circumvent (e.g. by writing transactions directly to disk) and are application-specific. I guess a good system needs both: application-level logging for verbatim log records, and OS-level logging to detect attacks that bypass or circumvent the application. The OS-level logs should be consulted when the application logs cannot explain something, or when dealing with a system-level compromise rather than an application-level violation.
Also note that in most cases logging the entire context of application execution, as you noted, does not necessarily provide all the information needed to determine the impact of that execution: the environment and parameters are the initial inputs of the program, but most programs will get other inputs at runtime. If you do not do the logging within the application, then the only way to get all this information is by logging system calls, but this is probably way too much. You might want it activated only when an unrecognized application is executed.
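
One way to decide whether an application is "unrecognized" is to compare a hash of its executable against a list of known programs. The sketch below assumes that approach, which the discussion itself does not prescribe; the function names and the known_hashes set are hypothetical. It only makes the decision. Actually capturing system calls would be done by an OS facility and is outside the sketch.

```python
import hashlib


def sha256_of_file(path, chunk_size=65536):
    """Return the hex SHA-256 digest of the file at `path`."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        # Read in chunks so large executables are not loaded into memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def needs_syscall_trace(path, known_hashes):
    """Return True when the executable is not on the list of known programs.

    `known_hashes` is a set of hex SHA-256 digests of recognized executables
    (an assumption of this sketch; how the list is maintained is left open).
    """
    return sha256_of_file(path) not in known_hashes
```

A program that has been modified on disk no longer matches its recorded hash, so it would be treated as unrecognized and traced as well.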
With regard to NTP: Consulting an NTP server for each entry may overload the system. Actually, given that each network access is logged and this log requires a network query to the NTP server, an endless loop can occur :). What I would suggest instead is to make sure the time is updated once a day or so, and to log every manual time change made through the OS API, recording (within the log entry) the existing time and the changed-to time. We can assume that a natural daily drift is something we can tolerate.
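
A minimal sketch of such a time-change log entry, assuming the hook that intercepts the time change hands over both the existing time and the new time. The function name, the JSON layout, and the field names are hypothetical. Recording both values lets an examiner later correct the timestamps of entries written before or after the change.

```python
import json
from datetime import timezone


def time_change_record(old_time, new_time, actor):
    """Build a log entry for a clock change, holding both old and new time.

    Both arguments must be timezone-aware datetime objects; naive values are
    rejected because their meaning would depend on the local timezone.
    """
    for value in (old_time, new_time):
        if value.tzinfo is None:
            raise ValueError("timestamps must be timezone-aware")
    return json.dumps(
        {
            "event": "clock_change",
            "old_time_utc": old_time.astimezone(timezone.utc).isoformat(),
            "new_time_utc": new_time.astimezone(timezone.utc).isoformat(),
            # Negative when the clock was set backwards.
            "delta_seconds": (new_time - old_time).total_seconds(),
            "actor": actor,
        },
        sort_keys=True,
    )
```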
All this gets to be quite a lot of logging, which I believe crosses the feasibility barrier. I think that if you want to implement a logging mechanism that is suitable for forensic analysis and yet feasible in terms of the amount of data it logs, it must be mission-specific: tailored to one specific system.
Hagai.