
Wednesday, November 18, 2015

Process and filter log files by date using AWK

Unix AWK has no strptime function, which makes life a little messy.
The example below calculates the Epoch time instead. It is a little ugly too,
but it works for me as a quick way to get things done.

awk 'BEGIN { mnts="JANFEBMARAPRMAYJUNJULAUGSEPOCTNOVDEC"; ST=mktime("2015 11 18 11 32 00"); EN=mktime("2015 12 18 11 33 00");}{ MON=$1; DAY=$2 ; split($3,TM,":"); M=(index(mnts,toupper(MON))+2)/3; EPOCH=mktime("2015 "M" "DAY" "TM[1]" "TM[2]" "TM[3]); if ((ST < EPOCH)&&(EN > EPOCH)) { print $0; }}' /var/log/yum.log


Saturday, August 01, 2015

Notes about Filesystems

Well, filesystems may sound like a very simple concept, especially in Unix environments, but there are some interesting limitations and tricks which a Unix admin must keep in mind when working with them.
I'm taking some notes about filesystems and very simple definitions, specifically for common filesystems in Linux. I'll also include my references here. Some are interesting.



A bit of reading about Inodes and Unix filesystem features

Unix Inodes
 
Many Unix filesystems (Berkeley Fast Filesystem, Linux ext2fs, Sun ufs, ...) take an approach that combines some of the ideas above.







  • each file is indexed by an inode
  • inodes are special disk blocks set aside just for this purpose (see df -i to see how many of these exist on your favorite Unix filesystem)
  • they are created when the filesystem is created
  • the number of inodes limits the total number of files/directories that can be stored in the filesystem
  • the inode itself consists of:
      • administrative information (permissions, timestamps, etc.)
      • a number of direct blocks (typically 12) that contain pointers to the first 12 blocks of the file
      • a single indirect pointer that points to a disk block which in turn is used as an index block, if the file is too big to be indexed entirely by the direct blocks
      • a double indirect pointer that points to a disk block which is a collection of pointers to disk blocks which are index blocks, used if the file is too big to be indexed by the direct and single indirect blocks
      • a triple indirect pointer that points to an index block of index blocks of index blocks...
  • interesting reading on your favorite FreeBSD system: /sys/ufs/ufs/dinode.h
  • small files need only the direct blocks, so there is little waste in space or extra disk reads in those cases
  • medium sized files may use indirect blocks
  • only large files make use of (and incur the overhead of) the double or triple indirect blocks, and that is reasonable since those files are large anyway
  • since the disk is now broken into two different types of blocks - inodes and data blocks - there must be some way to determine where the inodes are, and to keep track of free inodes and disk blocks. This is done by a superblock, located at a fixed position in the filesystem. The superblock is usually replicated on the disk to avoid catastrophic failure in case of corruption of the main superblock
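The inode limit is easy to observe in practice: df -i (mentioned above) reports per-filesystem inode totals and usage, and stat shows an individual file's inode number and link count. A quick sketch (the file path is just an example):

```shell
# A filesystem can run out of inodes while plenty of bytes remain free;
# df -i shows the inode totals chosen at mkfs time, per mounted filesystem.
df -i

# Inode number and hard-link count of a single file:
stat -c 'inode=%i links=%h' /etc/passwd
```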

  • Disk Allocation Considerations
      • limitations on file size, total partition size
      • internal, external fragmentation
      • overhead to store and access index blocks
      • layout of files, inodes, directories, etc., as they affect performance - disk head movement, rotational latency; many Unix filesystems keep clusters of inodes at a variety of locations throughout the filesystem, to allow inodes and the disk blocks they reference to be close together
      • may want to reorganize files occasionally to improve layout
  • Free Space Management
    With any of these methods of allocation, we need some way to keep track of free disk blocks.
    Two main options:



  • bit vector - keep a vector, one bit per disk block
      • 0 means the corresponding block is free, 1 means it is in use
      • searching for a free block requires a search for the first 0 bit, which can be efficient given hardware support
      • the vector is too big to keep in main memory, so it must be on disk, which makes traversal slow
      • with block size 2^12 (4 KB) and disk size 2^33 (8 GB), we need 2^21 bits (256 KB) for the bit vector
      • easy to allocate contiguous space for files
  • free list - keep a linked list of free blocks
      • with linked allocation, can just use existing links to form a free list
      • with FAT, use FAT entries for unallocated blocks to store the free list
      • no wasted space
      • can be difficult to allocate contiguous blocks
      • allocate from the head of the list, deallocated blocks added to the tail; both are O(1) operations
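As a toy illustration of the bit-vector idea (not tied to any real filesystem), finding the first free block is just finding the first 0 bit. Here the map is a string of 0/1 characters and GNU expr does the search:

```shell
# 1 = block in use, 0 = block free; one character per disk block.
bitmap="1110110111101"

# GNU 'expr index' returns the 1-based position of the first '0' (0 if none).
pos=$(expr index "$bitmap" 0)
echo "first free block: $((pos - 1))"   # 0-based block number
# prints: first free block: 3
```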
  • Performance Optimization
    Caching is an important optimization for disk accesses.
    A disk cache may be located:



  • main memory
  • disk controller
  • internal to disk drive



  • Safety and Recovery

When a disk cache is used, there could be data in memory that has been "written" by programs but which has not yet been physically written to the disk. This can cause problems in the event of a system crash or power failure.
    If the system detects this situation, typically on bootup after such a failure, a consistency checker is run. In Unix, this is usually the fsck program, and in Windows, scandisk or some variant. This checks for and repairs, if possible, inconsistencies in the filesystem.

    Journaling Filesystems
    One way to avoid data loss when a filesystem is left in an inconsistent state is to move to a log-structured or journaling filesystem.



  • record updates to the filesystem as transactions
  • transactions are written immediately to a log, though the actual filesystem may not yet be updated
  • transactions in the log are asynchronously applied to the actual filesystem, at which time the transaction is removed from the log
  • if the system crashes, any pending transactions can be applied to the filesystem - main benefits are less chance of significant inconsistencies, and that those inconsistencies can be corrected from the unfinished transactions, avoiding the long consistency check
  • Examples:
      • ReiserFS, a Linux journaling filesystem
      • ext3fs, also for Linux
      • jfs, IBM's journaling filesystem, available for AIX and Linux
      • a related idea in FreeBSD's filesystem: Soft Updates
      • journaling extensions to Macintosh HFS disks, called Elvis, supposedly coming in OS X 10.2.2
      • NTFS does some journaling, but some claim it is not "fully journaled"
  • the term "journaling" may also refer to systems that maintain the transaction log for a longer time, giving the ability to "undo" changes and retrieve a previous state of a filesystem


  • From: Ext4 filesystem layout

    Overview


        An ext4 file system is split into a series of block groups. To reduce performance difficulties due to fragmentation, the block allocator tries very hard to keep each file's blocks within the same group, thereby reducing seek times. The size of a block group, in blocks, is 8 * block_size_in_bytes. With the default block size of 4KiB, each group will contain 32,768 blocks, for a length of 128MiB. ( It's good to group things. )

    Blocks

        ext4 allocates storage space in units of "blocks". A block is a group of sectors between 1KiB and 64KiB, and the number of sectors must be an integral power of 2. Blocks are in turn grouped into larger units called block groups. Block size is specified at mkfs time and typically is 4KiB. You may experience mounting problems if block size is greater than page size (i.e. 64KiB blocks on a i386 which only has 4KiB memory pages). By default a filesystem can contain 2^32 blocks; if the '64bit' feature is enabled, then a filesystem can have 2^64 blocks. 
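You can check these numbers yourself on a scratch image, no root needed; a sketch assuming e2fsprogs (mkfs.ext4, dumpe2fs) is installed. Whatever block size mke2fs picks for a small image, "Blocks per group" should come out as 8 * block_size:

```shell
# Build a small ext4 image in a temp file and read back its geometry.
img=$(mktemp)
dd if=/dev/zero of="$img" bs=1M count=64 status=none
mkfs.ext4 -F -q "$img"      # -F: operate on a regular file, not a block device
dumpe2fs -h "$img" 2>/dev/null | grep -E 'Block size|Blocks per group'
rm -f "$img"
```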


                          32-bit mode                        64-bit mode
    Item                  1KiB   2KiB   4KiB    64KiB        1KiB    2KiB    4KiB    64KiB
    Blocks                2^32   2^32   2^32    2^32         2^64    2^64    2^64    2^64
    Inodes                2^32   2^32   2^32    2^32         2^32    2^32    2^32    2^32
    File System Size      4TiB   8TiB   16TiB   256PiB       16ZiB   32ZiB   64ZiB   1YiB

     ( Nice, so it's actually the system's architecture dictating the maximum filesystem size: 2 to the power of 32 or 64 blocks. You have some room to play with your block size, but you must stick to blocks no bigger than your memory page size. But can't I mount a filesystem created on a bigger machine on a smaller one? It seems we may have difficulties, so don't be sure unless you've tried it. )


    Layout

    The layout of a standard block group is approximately as follows (each of these fields is discussed in a separate section below):
    Group 0 Padding | ext4 Super Block | Group Descriptors | Reserved GDT Blocks | Data Block Bitmap | inode Bitmap | inode Table | Data Blocks
    1024 bytes      | 1 block          | many blocks       | many blocks         | 1 block           | 1 block      | many blocks | many more blocks
    For the special case of block group 0, the first 1024 bytes are unused, to allow for the installation of x86 boot sectors and other oddities. The superblock will start at offset 1024 bytes, whichever block that happens to be (usually 0). However, if for some reason the block size = 1024, then block 0 is marked in use and the superblock goes in block 1. For all other block groups, there is no padding.




    Still to continue.....






    Monday, April 06, 2015

    Finding Hardlinks

    Well, this may be something very primitive, but it may be useful to someone.

    How do you know how many hard links you have to a file? There is a system call: fstat for open file descriptors and stat for paths. The structure returned by these calls is as follows:

               struct stat {
                   dev_t     st_dev;     /* ID of device containing file */
                   ino_t     st_ino;     /* inode number */
                   mode_t    st_mode;    /* protection */
                   nlink_t   st_nlink;   /* number of hard links */
                   uid_t     st_uid;     /* user ID of owner */
                   gid_t     st_gid;     /* group ID of owner */
                   dev_t     st_rdev;    /* device ID (if special file) */
                   off_t     st_size;    /* total size, in bytes */
                   blksize_t st_blksize; /* blocksize for file system I/O */
                   blkcnt_t  st_blocks;  /* number of 512B blocks allocated */
                   time_t    st_atime;   /* time of last access */
                   time_t    st_mtime;   /* time of last modification */
                   time_t    st_ctime;   /* time of last status change */
               };

    And it is very interesting that one of the fields in the returned structure is the number of hard links (st_nlink). Just refer to "man 2 fstat" for more details.

    The utility in shell to read these values is simply the stat command:

    [user@localhost ~]$ stat ./scr.sh
      File: ‘./scr.sh’
      Size: 72            Blocks: 8          IO Block: 4096   regular file
    Device: fd02h/64770d    Inode: 137155      Links: 2
    Access: (0755/-rwxr-xr-x)  Uid: ( 1000/    user)   Gid: ( 1000/    user)
    Context: unconfined_u:object_r:user_home_t:s0
    Access: 2015-04-06 19:32:44.410014623 +1000
    Modify: 2015-04-06 19:32:42.703024143 +1000
    Change: 2015-04-06 20:12:21.379767352 +1000
     Birth: -
    [user@localhost ~]$

    and ls -ial shows you the inode number. Two files with the same inode number on the same filesystem are the same file, referring to the same inode.

    To find all the hard links on a mount point, you simply find the files with the same inode number. find has made it simple. Since hard links are only possible within the same mount point, to make life easier we can use the -xdev option with the find command:

    find ./ -xdev -samefile ./scr.sh


    [user@localhost ~]$ find ./ -xdev -samefile ./scr.sh
    ./scr-1.sh
    ./scr.sh

    or use the inode number:
    [user@localhost ~]$ find ./ -xdev -inum 137155
    ./scr-1.sh
    ./scr.sh
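The whole example can be reproduced in a scratch directory (the file names here are made up to mirror the ones above):

```shell
# Create a file, add a hard link, then verify both names share one inode.
tmp=$(mktemp -d)
echo 'echo hello' > "$tmp/scr.sh"
ln "$tmp/scr.sh" "$tmp/scr-1.sh"            # hard link: same inode, Links: 2

stat -c 'links=%h inode=%i' "$tmp/scr.sh"
find "$tmp" -xdev -samefile "$tmp/scr.sh"   # prints both paths
rm -rf "$tmp"
```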



    That's all folks :)

    Friday, June 28, 2013

    RHEL4 or an old Linux but you want to roll back a LVM snapshot

    LVM has supported snapshots for a long time, but it is only recently (2-3 years) that merge support was introduced in RHEL6, which means you can roll back to snapshot points. Before this, I think snapshots were only a solution when you needed to freeze a filesystem to back it up.

    Well ... what about an old kernel which only knows how to snapshot but not how to merge!?

    Easy: boot the system with a kernel which does know how, for example from a RHEL6 DVD in rescue mode. LVM is LVM anyway.

    Or, you can dd the snapshot device file to an equally sized disk to create an equivalent filesystem to the snapshot.

    OK ... to show you how, follow the following guidelines:

    1. Create a snapshot:
    There should be enough free space in the VG for the snapshot. It would be nice if you could use other VGs to host the snapshot LV, but this feature is not available. Add a new disk to the VG to make some room if you need to.


    #lvcreate -L<size> -s -n LogVol00-backup /dev/<vg>/LogVol00

     2.Do changes
    Cool, now you have the snapshot. Do your changes

    3.Clean up
    If it went well, remove the snapshot LV and release the disk:

    #lvremove <vg>/LogVol00-backup

    3-1. If you are on RHEL6 or a new version or a high end kernel:

    #lvconvert --merge <vg>/LogVol00-backup

    If your LVs are mounted, nothing will happen until you reboot, or unmount, deactivate and reactivate the LV:

    #umount /mountpoint
    #lvchange -an <vg>/LogVol00
    #lvchange -ay <vg>/LogVol00
    #mount /mountpoint

     Obviously you can't do this if your LV is the root partition. Just reboot if you are stuck.

    3-2. What if your OS does not support merge? Boot from a RHEL6 DVD, go into rescue mode, and feel free to do step 3-1.

    Or, alternatively, add a disk or create an LV equal in size to the original LV and do the following:

    dd if=/dev/MV/snapshot_lv of=/dev/DV/DestinationLV

    It will give you an LV with the contents of the snapshot LV. Don't be scared of dd. You know what you are doing, don't you?






     

    Thursday, May 16, 2013

    OpenVPN and XOR obfuscation

    UPDATED: 13/09/2016

    I patched the current version 2.3.10 and pushed it to my GitHub:
    https://github.com/shenavaa/openvpn


    UPDATED 15/07/2014:

    I managed to patch and compile the latest version of Openvpn-2.3.4 for Windows. Please download compiled windows Autoinstaller binaries from here and the sources from here.

    ----


    I went somewhere for a while, and during my visit I had a chance to play around with OpenVPN. During a lazy afternoon I came up with a silly idea: adding a layer of XOR obfuscation on top of whatever already exists in OpenVPN. I even managed to compile the Windows client of OpenVPN and run it on Windows.

    The good thing about XOR obfuscation is that it adds no overhead to the packets, and it is fast and easy.

    The bigger an organization is, the harder it is for LI/security layers to detect the algorithm or protocol of the packets on the network. I have seen AI engines that learn protocols, used to block unwanted and previously undetected packets! Their solution is sillier than what I just did. Trust me. ;)

    I've done it by adding one simple function, obviously, plus a couple of hacks in other source files.

    ## in xor.h
    #ifndef _XOR_H
    #define _XOR_H

    void encbuffer(unsigned char *buf, int size, unsigned char key);
    #endif /* _XOR_H */

    ## in xor.c
    #include "xor.h"

    /* XOR every byte in buf with key. XOR is its own inverse, so the same
       call both obfuscates and de-obfuscates a buffer. */
    void encbuffer(unsigned char *buf, int size, unsigned char key) {
        int i;
        for (i = 0; i < size; i++) {
            buf[i] ^= key;
        }
    }
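Since XOR with a fixed key is an involution, the byte-level round trip can be checked straight from the shell (key 52, matching the xorkey in the configs below):

```shell
# (b XOR k) XOR k == b: applying the key twice restores the original byte.
b=65; k=52
enc=$(( b ^ k ))
dec=$(( enc ^ k ))
echo "orig=$b enc=$enc dec=$dec"
# prints: orig=65 enc=117 dec=65
```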


    So my OpenVPN configuration file simply turns into something as follows:

    ## On the server
    local X.X.X.X
    dev tap
    verb 4
    #mute 10
    port 36
    tun-mtu-extra 32
    tun-mtu 1500
    up-delay
    ifconfig 172.16.4.1 255.255.255.0
    ping 10
    comp-lzo yes
    fragment 1100
    xorkey 52


    ## On the client
    remote X.X.X.X
    dev tap
    verb 4
    #mute 10
    port 36
    tun-mtu-extra 32
    tun-mtu 1500
    up-delay
    ifconfig 172.16.4.2 255.255.255.0
    ping 10
    comp-lzo yes
    fragment 1100
    xorkey 52


    My sources are here for whoever is interested. It's OpenVPN 2.3.1. I've cleaned it up, and all you need to compile the source, after unpacking, is "./configure; make; make install".


    This is the beauty of open source software. Feel free to distribute the love.

    Thursday, March 14, 2013

    I/O Scheduler algorithms in Linux

     I just remembered that someone once asked me about the available I/O scheduler algorithms in Linux during a job interview. I have still not come across a situation where I had to change any system's I/O scheduler in my 14 years of Linux experience, and personally I don't believe it needs changing in any normal server environment. But maybe this is a good review by Red Hat.
     Really, what was that guy thinking!?
     http://www.redhat.com/magazine/008jun05/features/schedulers/

    Choosing an I/O Scheduler for Red Hat® Enterprise Linux® 4 and the 2.6 Kernel

    The Linux kernel, the core of the operating system, is responsible for controlling disk access by using kernel I/O scheduling. Red Hat Enterprise Linux 3 with a 2.4 kernel base uses a single, robust, general purpose I/O elevator. The 2.4 I/O scheduler has a reasonable number of tuning options by controlling the amount of time a request remains in an I/O queue before being serviced using the elvtune command. While Red Hat Enterprise Linux 3 offers most workloads excellent performance, it does not always provide the best I/O characteristics for the wide range of applications in use by Linux users these days. The I/O schedulers provided in Red Hat Enterprise Linux 4, embedded in the 2.6 kernel, have advanced the I/O capabilities of Linux significantly. With Red Hat Enterprise Linux 4, applications can now optimize the kernel I/O at boot time, by selecting one of four different I/O schedulers to accommodate different I/O usage patterns:
    • Completely Fair Queuing—elevator=cfq (default)
    • Deadline—elevator=deadline
    • NOOP—elevator=noop
    • Anticipatory—elevator=as
    Add the elevator options from Table 1 to your kernel command in the GRUB boot loader configuration file (/boot/grub/grub.conf) or the eLILO command line. Red Hat Enterprise Linux 4 has all four elevators built-in; no need to rebuild your kernel.
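On many 2.6 and later kernels the elevator can also be inspected, and often changed, per device through sysfs without rebooting; a quick sketch (whether runtime switching works depends on the kernel build, and 'sda' below is a placeholder device name):

```shell
# The active elevator for each block device is shown in square brackets.
for q in /sys/block/*/queue/scheduler; do
    if [ -r "$q" ]; then
        echo "$q: $(cat "$q")"
    fi
done

# On kernels that support runtime switching (as root):
#   echo deadline > /sys/block/sda/queue/scheduler
```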
    The 2.6 kernel incorporates the best I/O algorithms that developers and researchers have shared with the open-source community as of mid-2004. These schedulers have been available in Fedora Core 3 and will continue to be used in Fedora Core 4. There have been several good characterization papers evaluating Linux 2.6 I/O schedulers. A few are referenced at the end of this article. This article details our own study based on running Oracle 10G in both OLTP and DSS workloads with EXT3 file systems.

    Red Hat Enterprise Linux 4 I/O schedulers

    Included in Red Hat Enterprise Linux 4 are four custom configured schedulers from which to choose. They each offer a different combination of optimizations.
    The Completely Fair Queuing (CFQ) scheduler is the default algorithm in Red Hat Enterprise Linux 4. As the name implies, CFQ maintains a scalable per-process I/O queue and attempts to distribute the available I/O bandwidth equally among all I/O requests. CFQ is well suited for mid-to-large multi-processor systems and for systems which require balanced I/O performance over multiple LUNs and I/O controllers.
    The Deadline elevator uses a deadline algorithm to minimize I/O latency for a given I/O request. The scheduler provides near real-time behavior and uses a round robin policy to attempt to be fair among multiple I/O requests and to avoid process starvation. Using five I/O queues, this scheduler will aggressively re-order requests to improve I/O performance.
    The NOOP scheduler is a simple FIFO queue and uses the minimal amount of CPU/instructions per I/O to accomplish the basic merging and sorting functionality to complete the I/O. It assumes performance of the I/O has been or will be optimized at the block device (memory-disk) or with an intelligent HBA or externally attached controller.
    The Anticipatory elevator introduces a controlled delay before dispatching the I/O to attempt to aggregate and/or re-order requests improving locality and reducing disk seek operations. This algorithm is intended to optimize systems with small or slow disk subsystems. One artifact of using the AS scheduler can be higher I/O latency.

    Choosing an I/O elevator

    The definitions above may give enough information to make a choice for your I/O scheduler. The other extreme is to actually test and tune your workload on each I/O scheduler by simply rebooting your system and measuring your exact environment. We have done just that for Red Hat Enterprise Linux 3 and all four Red Hat Enterprise Linux 4 I/O schedulers using an Oracle 10G I/O workloads.
    Figure 1 shows the results of running an Oracle 10G OLTP workload running on a 2-CPU/2-HT Xeon with 4 GB of memory across 8 LUNs on an LSI Logic MegaRAID controller. The OLTP load ran mostly 4k random I/O with a 50% read/write ratio. The DSS workload consists of 100% sequential read queries using large 32k-256k byte transfer sizes.

    Figure 1. Red Hat Enterprise Linux 4 IO schedulers vs. Red Hat Enterprise Linux 3 for database Oracle 10G oltp/dss (relative performance)

    The CFQ scheduler was chosen as the default since it offers the highest performance for the widest range of applications and I/O system designs. We have seen CFQ excel in both throughput and latency on multi-processor systems with up to 16-CPUs and for systems with 2 to 64 LUNs for both UltraSCSI and Fiber Channel disk farms. In addition, CFQ is easy to tune by adjusting the nr_requests parameter in /proc/sys/scsi subsystem to match the capabilities of any given I/O subsystem.
    The Deadline scheduler excelled at attempting to reduce the latency of any given single I/O for real-time like environments. A problem which depends on an even balance of transactions across multiple HBA, drives or multiple file systems may not always do best with the Deadline scheduler. The Oracle 10G OLTP load using 10 simultaneous users spread over eight LUNs showed improvement using Deadline relative to Red Hat Enterprise Linux 3's I/O elevator, but was still 12.5% lower than CFQ.
    The NOOP scheduler indeed freed up CPU cycles but performed 23% fewer transactions per minute when using the same number of clients driving the Oracle 10G database. The reduction in CPU cycles was proportional to the drop in performance, so perhaps this scheduler may work well for systems which drive their databases into CPU saturation. But CFQ or Deadline yield better throughput for the same client load than the NOOP scheduler.
    The AS scheduler excels on small systems which have limited I/O configurations and have only one or two LUNs. By design, the AS scheduler is a nice choice for client and workstation machines where interactive response time is a higher priority than I/O latency.

    Summary: Have it your way!

    The short summary of our study indicates that there is no SINGLE answer to which I/O scheduler is best. The good news is that with Red Hat Enterprise Linux 4 an end-user can customize their scheduler with a simple boot option. Our data suggests the default Red Hat Enterprise Linux 4 I/O scheduler, CFQ, provides the most scalable algorithm for the widest range of systems, configurations, and commercial database users. However, we have also measured other workloads whereby the Deadline scheduler out-performed CFQ for large sequential read-mostly DSS queries. Other studies referenced in the section "References" explored using the AS scheduler to help interactive response times. In addition, noop has proven to free up CPU cycles and provide adequate I/O performance for systems with intelligent I/O controller which provide their own I/O ordering capabilities.
    In conclusion, we recommend baselining an application with the default CFQ. Use this article and its references to match your application to one of the studies. Then adjust the I/O scheduler via the simple command line re-boot option if seeking additional performance. Make only one change at a time, and use performance tools to validate the results.

     

    /proc/cpuinfo, what cpu flags mean?

    I always keep forgetting what the CPU flags in Linux mean and what CPU architecture I have.

    http://www.gentoo-wiki.info/Gentoo:/proc/cpuinfo

    Intel flags (This table is currently identical to /usr/include/asm/cpufeature.h. Hopefully some hardware god will share his wisdom and expand this table.)

    Flag       Description
    fpu        Onboard (x87) Floating Point Unit
    vme        Virtual Mode Extension
    de         Debugging Extensions
    pse        Page Size Extensions
    tsc        Time Stamp Counter: support for RDTSC and WRTSC instructions
    msr        Model-Specific Registers
    pae        Physical Address Extensions: ability to access 64GB of memory, though only 4GB can be accessed at a time
    mce        Machine Check Exception
    cx8        CMPXCHG8 instruction
    apic       Onboard Advanced Programmable Interrupt Controller
    sep        SYSENTER/SYSEXIT instructions; SYSENTER is used for jumps to kernel memory during system calls, and SYSEXIT for jumps back to user code
    mtrr       Memory Type Range Registers
    pge        Page Global Enable
    mca        Machine Check Architecture
    cmov       CMOV instruction
    pat        Page Attribute Table
    pse36      36-bit Page Size Extensions: allows mapping 4 MB pages into the first 64GB of RAM; used with PSE
    pn         Processor Serial Number; only available on Pentium 3
    clflush    CLFLUSH instruction
    dtes       Debug Trace Store
    acpi       ACPI via MSR
    mmx        MultiMedia Extension
    fxsr       FXSAVE and FXSTOR instructions
    sse        Streaming SIMD Extensions: single instruction, multiple data; lets you do a bunch of the same operation on different pieces of input in a single clock tick
    sse2       Streaming SIMD Extensions 2: more of the same
    selfsnoop  CPU self snoop
    acc        Automatic Clock Control
    ia64       IA-64 processor (Itanium)
    ht         HyperThreading: introduces an imaginary second processor that doesn't do much, but lets you run threads in the same process a bit quicker
    nx         No-eXecute bit: prevents arbitrary code running via buffer overflows
    pni        Prescott New Instructions, aka SSE3
    vmx        Intel "Vanderpool" hardware virtualization technology
    svm        AMD "Pacifica" hardware virtualization technology
    lm         "Long Mode": the chip supports the AMD64 instruction set
    tm         "Thermal Monitor": thermal throttling with IDLE instructions, usually hardware controlled in response to CPU temperature
    tm2        "Thermal Monitor 2": decreases speed by reducing the multiplier and vcore
    est        "Enhanced SpeedStep"


    Tuesday, January 15, 2013

    Attach multiple attachments to an email in shell

    This is a sample I have written. It may give you an idea.

    #!/bin/sh
    BOUNDARY="=== This is the boundary between parts of the message. ==="
    DATE=`date +%Y%m%d`

    (
    #echo "To:  ashenavandeh@??????.com"
    echo "Subject: last 24 hours high priority Syslog messages"
    echo "MIME-Version: 1.0"
    echo "Content-Type: MULTIPART/MIXED; "
    echo "    BOUNDARY="\"$BOUNDARY\"
    echo
    echo "--${BOUNDARY}"
    echo "Content-Type: TEXT/html;"
    echo
    echo "<html><body>"
    echo "<H3>Last 24 hours high priority Syslog messages -"
    date
    echo "</H3>"
    psql -U rsyslog syslog --html -c "select devicereportedtime as Date,Priority,fromhost as Source,syslogtag as Proc_Info,message as Message from systemevents where priority < 3 and devicereportedtime > now() - interval '1 day';"
    echo "</body></html>"
    echo
    echo "--${BOUNDARY}"
    echo "Content-Type: application/vnd.ms-excel; charset=US-ASCII"
    echo "Content-disposition: attachment; filename=syslog-$DATE.csv"
    echo
    psql -U rsyslog syslog -A -F ',' -c "select devicereportedtime as Date,Priority,fromhost as Source,syslogtag as Proc_Info,message as Message from systemevents where priority < 3 and devicereportedtime > now() - interval '1 day';"
    echo
    echo "--${BOUNDARY}--"
    ) | /usr/sbin/sendmail -t

    Outputting to CSV from Postgresql

    If I am going to do it in shell, I will do it like this:

    psql -U user db -A -F ',' -c "select 1+1 as A,2+2 as B;"

    Monday, January 14, 2013

    How to make rsyslog write syslog messages to a database (PostgreSQL):


    I am using rsyslog as it is more common in RHEL environments these days, but I am sure you can find the equivalent packages in other OSes and distributions:

    Install postgreSQL module for rsyslog:
     # yum install rsyslog-pgsql


    In /etc/rsyslog.conf add following lines:

    $ModLoad imuxsock # provides support for local system logging (e.g. via logger command)
    $ModLoad imklog   # provides kernel logging support (previously done by rklogd)

    # Provides UDP syslog reception
    $ModLoad imudp
    $UDPServerRun 514

    # Provides TCP syslog reception
    $ModLoad imtcp
    $InputTCPServerRun 514


    # Include all config files in /etc/rsyslog.d/
    $IncludeConfig /etc/rsyslog.d/*.conf


    Make the /etc/rsyslog.d/psql.conf file with the following contents:

    $ModLoad ompgsql.so

    $WorkDirectory /var/tmp/rsyslog/work

    # This would queue _ALL_ rsyslog messages, i.e. slow them down to rate of DB ingest.
    # Don't do that...
    # $MainMsgQueueFileName mainq  # set file name, also enables disk mode

    # We only want to queue for database writes.
    $ActionQueueType LinkedList # use asynchronous processing
    $ActionQueueFileName dbq    # set file name, also enables disk mode
    $ActionResumeRetryCount -1   # infinite retries on insert failure

    *.*             :ompgsql:127.0.0.1,syslog,rsyslog,secret;


    The format is:

    *.*           :ompgsql:<DB HOST>,<DB USERNAME>,<DB NAME>,<PASSWORD>;

    Now, to configure PostgreSQL, make the following changes in the PostgreSQL config file:
    In /var/lib/pgsql/data/postgresql.conf :

    listen_addresses = 'localhost'
    port = 5432
    max_connections = 100

    And following changes to /var/lib/pgsql/data/pg_hba.conf to grant the local accesses:

    # "local" is for Unix domain socket connections only
    #local   all         all                               ident sameuser
    local    all         all                               trust
    # IPv4 local connections:
    #host    all         all         127.0.0.1/32          ident sameuser
    host    all         all         127.0.0.1/32          trust
    # IPv6 local connections:
    #host    all         all         ::1/128               ident sameuser
    host    all         all         ::1/128               trust


    Now restart the postgreSQL server:

    # service postgresql restart

    Create the database:

    #su - postgres
    -bash-4.1$ createuser rsyslog;
    Shall the new role be a superuser? (y/n) y
    -bash-4.1$ createdb -T template0 -E SQL_ASCII syslog;


    -bash-4.1$ psql -l
                                     List of databases
       Name    |  Owner   | Encoding  |  Collation  |    Ctype    | Access privileges
    -----------+----------+-----------+-------------+-------------+-------------------
     postgres  | postgres | UTF8      | en_US.UTF-8 | en_US.UTF-8 |
     syslog    | postgres | SQL_ASCII | en_US.UTF-8 | en_US.UTF-8 |
     template0 | postgres | UTF8      | en_US.UTF-8 | en_US.UTF-8 | =c/postgres
    (3 rows)

    -bash-4.1$


     
     Now we should create the database schema. The package has a file located at /usr/share/doc/rsyslog-pgsql-5.8.10/createDB.sql which has the required schema, but I had to comment out the first line to make it work:

     -- CREATE DATABASE Syslog WITH ENCODING 'SQL_ASCII';
    \c syslog;
    CREATE TABLE SystemEvents
    (
            ID serial not null primary key,
            CustomerID bigint,
            ReceivedAt timestamp without time zone NULL,
            DeviceReportedTime timestamp without time zone NULL,
            Facility smallint NULL,
            Priority smallint NULL,
            FromHost varchar(60) NULL,
            Message text,
            NTSeverity int NULL,
            Importance int NULL,
            EventSource varchar(60),
            EventUser varchar(60) NULL,
            EventCategory int NULL,
            EventID int NULL,
            EventBinaryData text NULL,
            MaxAvailable int NULL,
            CurrUsage int NULL,
            MinUsage int NULL,
            MaxUsage int NULL,
            InfoUnitID int NULL ,
            SysLogTag varchar(60),
            EventLogType varchar(60),
            GenericFileName VarChar(60),
            SystemID int NULL
    );

    CREATE TABLE SystemEventsProperties
    (
            ID serial not null primary key,
            SystemEventID int NULL ,
            ParamName varchar(255) NULL ,
            ParamValue text NULL
    );


    Use the following command to apply the table schema, assuming you are already in the right path:


    #psql -U rsyslog syslog -f ./createDB.sql
     
    Reload the rsyslog service and check if there is any error in /var/log/messages:

    # service rsyslog reload


    Did I mention how to set a password for the rsyslog user in PostgreSQL?

    # su - postgres
    -bash-4.1$ psql
    psql (8.4.13)
    Type "help" for help.

    postgres=# Alter user rsyslog with password 'secret';
    ALTER ROLE
    postgres=# \q
    -bash-4.1$



    This should work. You can see the logs in the systemevents table:

    # psql -W -Ursyslog syslog
    Password for user rsyslog:
    psql (8.4.13)
    Type "help" for help.

    syslog=# select count(*) from systemevents;
     count
    -------
      6596
    (1 row)

    syslog=#