  • Writing a Custom Apache Log File Using a PHP Script in Real Time

    Published December 14th, 2008

    Problem Statement

    When you are developing Web applications, Apache logs (the access log, the error log, or both) can be a really useful tool. However, by default Apache logs a lot of requests that get in the way of debugging when you are focused on one specific problem with your Web app. In this article, we show how to write a very simple PHP script that controls what is logged and what is not.

    Creating a PHP Script for Processing Apache Log in Real Time

    Instead of creating a PHP-based Apache log parser, which would not work in real time, we will create a simple PHP script that processes Apache log entries in real time, as the Apache log module creates them. The script is shown in Listing 1.

    Listing 1: simple_apache_logger.php

    #!/usr/bin/env php
    <?php
    $logDir  = '/var/data/logs';
    $logFile = $logDir . '/access.log';
    $fp      = fopen($logFile, "a+");
    $stdin   = fopen("php://stdin", "r");
    
    // Use unbuffered output
    ob_implicit_flush(true);
    
    // Compare with false so that a line such as "0" does not end the loop
    while (($line = fgets($stdin)) !== false)
    {
       fwrite($fp, $line);
    }
    
    fclose($fp);
    fclose($stdin);
    ?>
    

    To run this script, modify your CustomLog entry to be:

    CustomLog "|/path/to/simple_apache_logger.php" common
    

    Make the script executable by running chmod 750 simple_apache_logger.php.

    The script does the following:

    1. Opens the log file $logFile, located in $logDir, in append mode and keeps the file pointer in $fp
    2. Opens STDIN (standard input) as a stream called $stdin
    3. Turns on implicit flushing, so PHP flushes its output buffer after every output call
    4. In a while loop, reads one line at a time from $stdin into a variable called $line
    5. Appends $line to the log file
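
    The steps above can also be tried without Apache. The following is a minimal sketch, not part of the original script: copy_log_lines() is a hypothetical helper that wraps the read-and-append loop, and the two log lines are made-up samples in the common log format. It uses in-memory streams in place of Apache's pipe and the real log file.

```php
<?php
// Hypothetical helper (not part of the original script): copy every line
// from $in to $out and return the number of lines written.
function copy_log_lines($in, $out)
{
    $count = 0;
    // Compare against false explicitly so that a line containing only "0"
    // is not mistaken for the end of input.
    while (($line = fgets($in)) !== false) {
        fwrite($out, $line);
        $count++;
    }
    return $count;
}

// Made-up sample entries; a real run would read from php://stdin instead.
$in = fopen('php://memory', 'r+');
fwrite($in, "127.0.0.1 - - [14/Dec/2008:10:00:00 +0000] \"GET /index.php HTTP/1.1\" 200 512\n");
fwrite($in, "127.0.0.1 - - [14/Dec/2008:10:00:01 +0000] \"GET /about.php HTTP/1.1\" 200 256\n");
rewind($in);

// Stand-in for the log file opened in append mode.
$out = fopen('php://memory', 'r+');
$written = copy_log_lines($in, $out);

// Read back what was "logged".
rewind($out);
$logged = stream_get_contents($out);
echo $written, " lines copied\n";
echo $logged;

fclose($in);
fclose($out);
```

    Keeping the loop in a function that takes its streams as arguments is one possible design choice; it lets you check the logic from the command line before pointing CustomLog at the script.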

    When run, this script is loaded once and keeps running for as long as Apache runs, so there is no startup cost per log entry. It simply appends the log data that Apache hands it to a file. This version does nothing interesting yet: the file it writes is identical to the original Apache log file. To make it interesting, let's update Listing 1 as shown in Listing 2.

    Listing 2: modified while() loop for simple_apache_logger.php

    while (($line = fgets($stdin)) !== false)
    {
       // Ignore all log requests for common images, cascading style sheets, JavaScript files, and Flash files
       if (preg_match("/(\w+)\.(gif|png|jpg|css|js|swf)/", $line))
       {
           continue;
       }
    
       // Everything else is written to the log file
       fwrite($fp, $line);
    }
    

    If you replace the original while() loop, which simply wrote every Apache log entry to a file, with the loop in Listing 2, requests for common images, cascading style sheets, JavaScript files, and Flash files are no longer logged. You end up with a clean log of page requests instead of a record of every external component (images, JavaScript, CSS) that makes up an HTML page. With fewer log entries, debugging GET parameters or other SEO-related matters becomes much easier.
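
    The pattern in Listing 2 is matched against the whole log line, so it also skips entries such as /feed.json or /download.php?file=app.js. If that matters for your application, one possible refinement is to test only the request path. The sketch below is an illustration under assumptions: is_static_asset_request() is a hypothetical helper, the log lines are made-up samples, and the request is assumed to appear in the quoted field of the common log format.

```php
<?php
// Hypothetical helper: return true when the request path of a common log
// format line ends in one of the static-asset extensions.
function is_static_asset_request($line)
{
    // Pull the request target out of the quoted request field,
    // e.g. "GET /img/logo.png?v=2 HTTP/1.1".
    if (!preg_match('/"[A-Z]+ (\S+)/', $line, $m)) {
        return false; // Not a recognizable request, so keep the entry.
    }

    // Drop the query string so "?file=app.js" does not count as a match.
    $parts = explode('?', $m[1], 2);
    $path  = $parts[0];

    // The extension must be the very end of the path.
    return preg_match('/\.(gif|png|jpg|css|js|swf|ico)$/i', $path) === 1;
}

// Made-up sample entries.
$image  = '127.0.0.1 - - [14/Dec/2008:10:00:00 +0000] "GET /img/logo.png?v=2 HTTP/1.1" 200 512';
$script = '127.0.0.1 - - [14/Dec/2008:10:00:01 +0000] "GET /download.php?file=app.js HTTP/1.1" 200 256';
$json   = '127.0.0.1 - - [14/Dec/2008:10:00:02 +0000] "GET /feed.json HTTP/1.1" 200 128';

var_dump(is_static_asset_request($image));  // bool(true)
var_dump(is_static_asset_request($script)); // bool(false)
var_dump(is_static_asset_request($json));   // bool(false)
```

    Inside the while() loop, the preg_match() test could then be replaced with a call to this helper, with the continue statement left as it is.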

    Creating a PHP Script for Rotating Apache Log

    Unfortunately, you cannot pipe multiple programs with CustomLog to do something like:

    CustomLog "|/usr/local/sbin/cronolog /logs/%Y/%m/%d/access.log|/path/to/simple_apache_logger.php" common
    

    So when you use a PHP logging script, you cannot use CronoLog as well. In that case, you need to come up with your own log rotation scheme. For example, Listing 3 shows an updated version of simple_apache_logger.php that does exactly that.

    Listing 3: simple_apache_logger.php

    #!/usr/bin/env php
    <?php
    
    $logDir  = '/var/data/logs/ekblogs';
    $logFile = $logDir . '/access.'. date("d-m-Y") . '.log';
    $fp      = fopen($logFile,"a+");
    $stdin   = fopen("php://stdin", "r");
    
    // Use unbuffered output
    ob_implicit_flush (true);
    
    $lastLogDate = date('Ymd');
    
    while (($line = fgets($stdin)) !== false)
    {
       // Ignore images, javascripts, css, ico and flash file requests
       if (preg_match("/(\w+)\.(gif|png|jpg|css|js|swf|ico)/", $line))
       {
           continue;
       }
    
       // Following section is for rotating log when day change is detected
       // This will *only rotate* if requests are coming in daily (which is expected)
    
       $today = date('Ymd');
    
       // If today is different than last log date, time to write a new log file
       if ($today > $lastLogDate )
       {
           $lastLogDate = $today;
           fclose($fp);
           $logFile = $logDir . '/access.'. date("d-m-Y") . '.log';
           $fp      = fopen($logFile, "a+");
       }
    
       fwrite($fp, $line);
    }
    
    fclose($fp);
    fclose($stdin);
    ?>
    

    Every time Apache writes a log entry to the STDIN of this script, the script checks whether the current date is the same as the last date it recorded. If the date has changed, it closes the old log file, opens a new one, and sets the last date to the current date. This lets it rotate the logs by day.

    However, this works only when you can reasonably expect your site to get hit every day. The script only gets data to process when there is a new request, so rotation happens only when a log entry arrives. On a day without any logged requests, no log file is created for that day, and the previous day's file stays open until the next entry comes in. This is not too bad, as we only recommend this kind of PHP script based logging in development environments.
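
    The rotation check can also be pulled out of the loop so that it can be exercised without waiting for midnight. This is a minimal sketch under assumptions, not the listing's own code: log_file_for() and rotate_if_needed() are hypothetical helpers, the temporary directory stands in for $logDir, the timezone is set to UTC only to make the example repeatable, and the Y-m-d file name is a substitution for the d-m-Y format used above so that file names sort chronologically.

```php
<?php
// Hypothetical helper: build the log file path for a given Unix timestamp.
function log_file_for($logDir, $timestamp)
{
    // Assumption: Y-m-d instead of the listing's d-m-Y, so names sort by date.
    return $logDir . '/access.' . date('Y-m-d', $timestamp) . '.log';
}

// Hypothetical helper: reopen the log when the target path changes.
// $current is an array with the keys 'path' and 'fp'.
function rotate_if_needed(array $current, $logDir, $timestamp)
{
    $path = log_file_for($logDir, $timestamp);
    if ($path === $current['path']) {
        return $current; // Same day: keep using the open handle.
    }
    if (is_resource($current['fp'])) {
        fclose($current['fp']); // Day changed: close the old file first.
    }
    return array('path' => $path, 'fp' => fopen($path, 'a'));
}

// Assumption: UTC is used here only so the example behaves the same everywhere.
date_default_timezone_set('UTC');
$dir   = sys_get_temp_dir();
$state = array('path' => null, 'fp' => null);

// Two timestamps on either side of midnight simulate a day change.
$beforeMidnight = mktime(23, 59, 0, 12, 14, 2008);
$afterMidnight  = mktime(0, 1, 0, 12, 15, 2008);

$state     = rotate_if_needed($state, $dir, $beforeMidnight);
$firstPath = $state['path'];
fwrite($state['fp'], "entry written on the first day\n");

$state      = rotate_if_needed($state, $dir, $afterMidnight);
$secondPath = $state['path'];
fwrite($state['fp'], "entry written on the second day\n");
fclose($state['fp']);

echo basename($firstPath), "\n", basename($secondPath), "\n";

// Remove the example files again.
unlink($firstPath);
unlink($secondPath);
```

    In the logging script itself, the call would pass time() as the timestamp for every line read from STDIN.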
