Bash


Bash: Get script file name and location

The following code will populate the variables SCRIPT_NAME and SCRIPT_DIR with the name of the script that is currently being executed and the directory this script is in:

SCRIPT_NAME=$(basename "$(test -L "$0" && readlink "$0" || echo "$0")");
SCRIPT_DIR=$(cd "$(dirname "$0")" && pwd);

Notes for SCRIPT_NAME:

  • $0 expands to the name of the shell or shell script
  • test -L "$0" checks whether $0 is an existing file that is a symbolic link
  • && readlink "$0" is executed if the above test succeeds and prints the resolved symbolic link (see the example after this list)
  • || echo "$0" is executed if the test for a symbolic link fails
  • finally, basename strips the directory part from whatever is returned by the statements above
  • note that the command substitutions are quoted so that file names containing spaces do not get word-split
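
To see the symlink handling in action, consider the following hypothetical setup:

# Hypothetical example: the real script lives at /tmp/real.sh and
# /tmp/link.sh is a symbolic link pointing to it.
ln -s /tmp/real.sh /tmp/link.sh
# When /tmp/link.sh is executed, $0 expands to /tmp/link.sh;
# test -L "$0" succeeds and readlink "$0" prints /tmp/real.sh,
# so basename reduces that to real.sh and SCRIPT_NAME becomes real.sh.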

Notes for SCRIPT_DIR:

  • Will not resolve the correct folder if the last component of the path is a symbolic link (symlink); it will return the location of the symlink instead of the location of the file the symlink points to (see the sketch after this list)
  • cd returns 0 if it successfully navigates to a directory and a non-zero value when it fails
  • cd "$(dirname "$0")" uses dirname to strip the last component from the expanded name and tries to navigate to that location
  • if the cd succeeds, && pwd prints the name of the resulting working directory; if the cd fails, pwd is never executed and SCRIPT_DIR ends up empty
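
If you need SCRIPT_DIR to point to the directory of the actual file rather than that of the symlink, a possible sketch uses GNU readlink -f, which resolves symlinks recursively (note: this flag is not available on all platforms, e.g. some BSD/macOS versions):

# Resolve all symlinks in the path first, then take the directory part.
SCRIPT_DIR=$(dirname "$(readlink -f "$0")");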

In case you have a problem with $0 (for example, it gets overwritten, or the above snippet is used by a script that is sourced or called from a child script in another folder), you can replace $0 with ${BASH_SOURCE[0]}.

SCRIPT_NAME=$(basename "$(test -L "${BASH_SOURCE[0]}" && readlink "${BASH_SOURCE[0]}" || echo "${BASH_SOURCE[0]}")");
SCRIPT_DIR=$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd);
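
Once set, the two variables can be used like any other shell variables; a minimal (hypothetical) usage example:

# Prints something like: Running backup.sh from /opt/scripts
# (backup.sh and /opt/scripts are hypothetical values).
echo "Running ${SCRIPT_NAME} from ${SCRIPT_DIR}";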

Bash: Extract data from files, filtering by filename and path and doing internal processing

The following code will find all files that match the pattern 2016_*_*.log (all the log files for the year 2016).

To avoid finding log files from services other than the Web API service, we keep only the files whose path contains a folder named webapi. Specifically, we used "/ServerLogs/*/webapi/*" with the -ipath option of the command shown below to match all files that are under the folder /ServerLogs/ and have another folder named webapi somewhere deeper in their path; we do that to match only files like /ServerLogs/Production/01/webapi/*. The way the pattern is written, it will not match files in a webapi folder that is directly under /ServerLogs/ (e.g. /ServerLogs/webapi/*).

For each result, we execute an awk script that splits the lines on the comma character (FS=",";) and then checks whether the line contains exactly 4 tokens (if (NF == 4) {). If it does, we take the 4th token and check whether it contains the substring "MASTER=" (if (match($4,"MASTER=")) {). If it does, we split the token on the space character and assign the result to the variable named tokens. From tokens, we take the first element and use substr to remove its first character. Finally, we use the resulting string as a key of the array instances; merely referencing an element creates it, so the array acts as a hashmap that keeps a record of all the unique strings. In the END block, we print all the keys of that hashmap.

Finally, we sort all the results from all the awk executions and remove duplicates using sort --unique.


find /ServerLogs/ \
    -iname "2016_*_*.log" \
    -ipath "/ServerLogs/*/webapi/*" \
    -exec awk '
        BEGIN {
            # Split input lines on the comma character.
            FS=",";
        }
        {
            # Process only lines with exactly 4 comma-separated tokens.
            if (NF == 4) {
                # ...and only when the 4th token contains the substring MASTER=.
                if (match($4,"MASTER=")) {
                    split($4, tokens, " ");
                    # Referencing the element creates the key;
                    # the array acts as a set of unique strings.
                    instances[substr(tokens[1], 2)];
                }
            }
        }
        END {
            # Print every unique key collected above.
            for (element in instances) {
                print element;
            }
        }
    ' \
    '{}' \; | sort --unique;

Following is the same code in one line.

find /ServerLogs/ -iname "2016_*_*.log" -ipath "/ServerLogs/*/webapi/*" -exec awk 'BEGIN {FS=",";} {if (NF == 4) {if (match($4,"MASTER=")){split($4, tokens, " "); instances[substr(tokens[1], 2)];}}} END {for (element in instances) {print element;}}' '{}' \; | sort --unique
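
To illustrate what the awk script does, assuming a hypothetical log line like the one below (the real log format is not shown here), the script records the first space-separated token of the 4th field with its first character removed:

echo 'a,b,c, (MASTER=alpha more data' \
    | awk 'BEGIN {FS=",";} {if (NF == 4) {if (match($4,"MASTER=")){split($4, tokens, " "); instances[substr(tokens[1], 2)];}}} END {for (element in instances) {print element;}}';
# Prints: MASTER=alpha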

Another way

Another way to achieve similar functionality is the following:


find /ServerLogs/ \
    -iname "2016_*_*.log" \
    -ipath "/ServerLogs/*/webapi/*" \
    -exec sh -c '
        grep "MASTER=" -s "$0" | awk "BEGIN {FS=\",\";} NF==4" | cut -d "," -f4 | cut -c 3- | cut -d " " -f1 | sort --unique
    ' \
    '{}' \; | sort --unique;

What we changed is the -exec part. Instead of calling an awk script, we create a new sub-shell using sh -c, define the code to be executed inside the single quotes and pass the filename that matched as the first parameter of the shell (available inside as $0).

Inside the shell, we find all lines that contain the string MASTER= using the grep command. Then we use awk to filter out all lines that do not have exactly four columns when tokenized on the comma character. Next, we get the 4th column using cut with the comma as the delimiter. We remove the first two characters of that column using cut -c 3- and keep only its first token by reusing cut with the space character as the delimiter. With those results we perform a sort that eliminates duplicates, and the output is passed to the parent process for further processing.

Following is the same code in one line.


find /ServerLogs/ -iname "2016_*_*.log" -ipath "/ServerLogs/*/webapi/*" -exec sh -c 'grep "MASTER=" -s "$0" | awk "BEGIN {FS=\",\";} NF==4" | cut -d "," -f4 | cut -c 3- | cut -d " " -f1 | sort --unique' '{}' \; | sort --unique;
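
Using the same hypothetical input line as before, this pipeline produces the same result:

echo 'a,b,c, (MASTER=alpha more data' \
    | grep "MASTER=" \
    | awk 'BEGIN {FS=",";} NF==4' \
    | cut -d "," -f4 \
    | cut -c 3- \
    | cut -d " " -f1;
# Prints: MASTER=alpha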


Ignore SSL certificates for GIT

The background

So, recently a new firewall was installed; this firewall performs SSL/TLS decryption on all encrypted traffic…

In order for machines to continue operating normally, a custom certificate was issued and installed on each one. On certain machines though, the certificate was not installed and this caused verification problems.

The story

While trying to clone a git project from GitHub, we got the following output:


$ git clone https://github.com/ioi/translation.git
Cloning into 'translation'...
fatal: unable to access 'https://github.com/ioi/translation.git/': server certificate verification failed. CAfile: /etc/ssl/certs/ca-certificates.crt CRLfile: none

The horrible solution

To mitigate the problem (not solve it), we directed git to skip SSL certificate verification by setting the following environment variable right before the clone command.


export GIT_SSL_NO_VERIFY=true

As expected, the execution went smoothly after this change:


$ git clone https://github.com/ioi/translation.git
Cloning into 'translation'...
remote: Counting objects: 297, done.
remote: Total 297 (delta 0), reused 0 (delta 0), pack-reused 297
Receiving objects: 100% (297/297), 4.40 MiB | 1.50 MiB/s, done.
Resolving deltas: 100% (39/39), done.
Checking connectivity... done.
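
Note that export makes the variable affect every git command executed later in the same shell session. To limit the bypass to a single invocation, you can use the standard shell feature of prefixing just that one command with the assignment:

GIT_SSL_NO_VERIFY=true git clone https://github.com/ioi/translation.git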


bash: Simple way to get n-th column

Using cut, you can select any column and define a custom delimiter to support multiple input formats; you can select one or more columns with minimal code.

cut -d',' -f2 myFile.csv

The above command will read the file myFile.csv (which is a CSV file), break it down into columns using the ',' character and then print the second column.

The option -f specifies which field (column) you want to extract, and the option -d specifies the field delimiter that is used in the input file.

The -f parameter allows you to select multiple columns at the same time. You can achieve that by listing the columns separated with the ',' character and by defining ranges using the '-' character.

Examples

  • -f1 selects the first column
  • -f1,3,4 selects columns 1, 3 and 4
  • -f1-4 selects all columns in the range 1-4
  • -f1,3,5-7,9 selects columns 1, 3, 9 and all the columns in the range 5-7 (demonstrated below)
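
As a quick demonstration of the last selection, with a (hypothetical) one-line input, cut prints the selected fields in input order, joined by the delimiter:

printf 'a,b,c,d,e,f,g,h,i\n' | cut -d',' -f1,3,5-7,9;
# Prints: a,c,e,f,g,i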