
Rotate the logs in Red Hat Linux

FAQ-COM120 : HOW TO ROTATE MY LOGS WITHOUT LOSING DATA
PROBLEM:
I want to archive/rotate my logs using my server system's options (for example logrotate) or third-party software (rotatelogs, cronolog), but I don't want to lose any visit information during the rotate process.
SOLUTION:
If your config file is set up with a LogFile parameter that points to your currently running log file (required if you want to use the AllowToUpdateStatsFromBrowser option for "real-time" statistics), then, to avoid losing too many records during the rotate process, you must run the AWStats update JUST BEFORE the rotate is done.
The best way to do that on a Linux-like OS is to use the built-in logrotate feature. Edit the logrotate config file used for your web server log files (usually stored in the /etc/logrotate.d directory) and add the AWStats update as a prerotate command, as in this example (the prerotate ... endscript lines are the ones to add):
/usr/local/apache/logs/*log
{
    notifempty
    daily
    rotate 7
    compress
    sharedscripts
    prerotate
        /usr/local/awstats/wwwroot/cgi-bin/awstats.pl -update -config=mydomainconfig
    endscript
    postrotate
        /usr/bin/killall -HUP httpd
    endscript
}
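
Before relying on this in production, you can check that logrotate picks up the new prerotate block by running it in debug mode, which only prints what would be done without rotating anything (the config file path below is an assumption; adjust it to wherever your distribution stores the web server logrotate config):

/usr/sbin/logrotate -d /etc/logrotate.d/httpd

A forced run (logrotate -f on the same file) then exercises the whole prerotate/postrotate sequence for real.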

If you use such a solution, here is the sequence of steps that happens (step letter, step name, example time, then description):

A - Start of logrotate (04:02:00): logrotate is started (by cron).
B - Start of awstats (04:02:01): awstats -update is launched by logrotate.
C - (04:02:02): awstats starts to read the log file file.log.
D - (04:05:00): awstats has reached the end of the log file, so it starts to save its database to disk.
E - End of awstats (04:06:00): awstats has finished saving its new database, so it stops.
F - Log move (04:06:01): logrotate moves the old log file file.log to a new name, file.log.sav. Apache still logs into file.log.sav, since the log file handle has not changed (only the file name has).
G - Apache restart (04:06:02): logrotate sends the -HUP or -USR1 signal to Apache. With -HUP, Apache immediately kills all its child processes/threads, closes file.log.sav, and reopens file.log, so ALL hits are now written to the new file. With -USR1, Apache asks its child processes/threads to stop only once their current HTTP request has been completely served; it still closes file.log.sav and reopens file.log immediately, so only NEW hits are written to the new log file, while requests still in progress keep writing to the old one.
H - Start compress (04:06:03): logrotate starts compressing the old log file file.log.sav into file.log.gz.
I - If some Apache threads/processes are still running (because the signal sent was -USR1, so child processes wait for the end of their request before stopping), those threads/processes are still writing to file.log.sav. If kill -HUP was used, all processes have already restarted and everything is written to the new file.log.
J - End of compress, end of logrotate (04:07:03): logrotate has finished compressing the log file into file.log.gz. The file file.log.sav is deleted.
K - If the signal was -USR1, some old children can still be running (when serving a very long request, for example). Their log writes, still done on the same file handle, go to a file that has been deleted, so they are lost (this only happens if -USR1 was used and the request was very long).


The advantage of this solution is that it is a very common way of working, used by a lot of products, and easy to set up. Note that you can still "lose" some hits:
If you use the -HUP signal, you lose only the hits written during steps D and E. Note that you will also break all requests still running at step G. In the example this is a 1-minute loss (for small or medium web sites it will be a few seconds at most), which gives an error lower than 0.07% (even less for small web sites). This is not significant, above all for a "statistics" program.
If you use the -USR1 signal, you will not kill any requests, but you will lose the hits written during steps D and E (as with -HUP) and also the hits still in progress after step H (very long requests that need several minutes to be served). If such a hit ends during step I, it is written to a log file that has already been analyzed; if it ends at step K, it is written nowhere. In the example this is also a 0.07% error, plus the error for the hits finished during I or K, but the number of such hits should be very low, since only hits that started before G and were still not finished at H are concerned. In most cases a hit needs only a few milliseconds to be served, so these lost hits can be ignored (a postrotate variant using -USR1 is shown below).
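
If you prefer the graceful -USR1 behaviour, the postrotate block from the logrotate example above can send that signal instead of -HUP (a minimal variant, assuming the same httpd process name):

postrotate
    /usr/bin/killall -USR1 httpd
endscript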

Note also that if you have x logrotate config files, each of them with a postrotate that does a kill -HUP, you send a kill x times to your server process. So try to include several log files in the same logrotate config file: you can put several awstats update commands in the same prerotate section, and the -HUP will be sent only once, after all updates are finished. However, doing this, the gap between steps D and F (where some hits are lost) will be longer. A sketch of such a combined config follows.
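
For example, a single logrotate entry could cover two virtual hosts and update both AWStats configs before sending one -HUP (a minimal sketch; the site1/site2 config names and log paths are assumptions):

/usr/local/apache/logs/site1_access_log /usr/local/apache/logs/site2_access_log
{
    notifempty
    daily
    rotate 7
    compress
    sharedscripts
    prerotate
        /usr/local/awstats/wwwroot/cgi-bin/awstats.pl -update -config=site1
        /usr/local/awstats/wwwroot/cgi-bin/awstats.pl -update -config=site2
    endscript
    postrotate
        /usr/bin/killall -HUP httpd
    endscript
}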

Another common way of working is to run the AWStats update process only once the log file has been archived.
This is required, for example, if you use the cronolog or rotatelogs tools to rotate your log files. Apache users can set up their httpd config file to write the log through a pipe to cronolog or rotatelogs using the CustomLog directive:
CustomLog "|/usr/sbin/cronolog [cronolog_options] /var/logs/access.%Y%m%d.log" combined
If you use such a feature, you can't trigger the AWStats update to run just BEFORE the rotate is done, so you must run it AFTER the rotate, on the archived log file.
To set up AWStats so that it always points to the last archived log file, you can use the date 'tags' available for the LogFile parameter (see the example below).
The drawback is that your data are refreshed only after a rotate has been done. However, you will miss absolutely nothing (no hits) and your server processes are never killed.
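
For instance, a LogFile line with date tags could point to the previous day's archive produced by the CustomLog above (a minimal sketch, assuming AWStats' %YYYY-24/%MM-24/%DD-24 tags, which are replaced with the date as it was 24 hours ago):

LogFile="/var/logs/access.%YYYY-24%MM-24%DD-24.log"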

So, if you really do not want to lose any hit at all and want updates more frequent than the rotate frequency, the best way is still a hybrid solution (I am not sure it is worth the pain, and remember that statistics are only statistics):
You run the awstats update process from your crontab frequently, every hour for example, and once more half an hour before the rotate is done. See the next FAQ to learn how to set up a scheduled job.
Then, once the rotate has been done (by logrotate or by a piped cronolog), and before the next scheduled awstats update starts, you run another update on the archived log file using the -logfile option, to force the update onto the archived file rather than the current log file defined in the awstats config file. This lets you catch up the missing half hour, up to the log rotate (AWStats will find the new lines). However, don't forget that this particular update MUST finish before the next cron-scheduled update. A crontab sketch of this hybrid scheme follows.
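
Such a crontab could look like this (a minimal sketch; the times, config name and archived-file path are assumptions, and the rotate itself is assumed to happen around 04:02):

# Regular updates on the current log file, every hour at half past
30 * * * * /usr/local/awstats/wwwroot/cgi-bin/awstats.pl -update -config=mydomainconfig
# Catch-up run on the archived log file shortly after the rotate, finished before the 04:30 update
15 4 * * * /usr/local/awstats/wwwroot/cgi-bin/awstats.pl -update -config=mydomainconfig -logfile=/path/to/archived/access.log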
