GNU/Linux


How to add automatically all empty folders in git repository 4

Since you are searching for this issue, you must have realised that git does not support storing empty folders/directories.

Currently the design of the Git index (staging area) only permits files to be listed, and nobody competent enough to make the change to allow empty directories has cared enough about this situation to remedy it.

Directories are added automatically when adding files inside them. That is, directories never have to be added to the repository, and are not tracked on their own.
— From https://git.wiki.kernel.org/index.php/Git_FAQ#Can_I_add_empty_directories.3F

All the content is stored as tree and blob objects, with trees corresponding to UNIX directory entries and blobs corresponding more or less to inodes or file contents. A single tree object contains one or more tree entries, each of which contains a SHA-1 pointer to a blob or subtree with its associated mode, type, and filename.
— From https://git-scm.com/book/en/v2/Git-Internals-Git-Objects

Below we propose two solutions, depending on how you want to use those empty folders.

Solution A – The folders will always be empty

There are scenarios where the empty folders should always remain empty on git no matter what the local copy has inside them.
Such a scenario would be, wanting to add on git folders where you will build your objects and/or put temporary/cached data.
In such scenarios it is important to have the structure available but never to add those files in git.

To achieve that, we add a .gitignore file in every empty folder containing the following:

# git does not allow empty directories.
# Yet, we need to add this empty directory on git.
# To achieve that, we created this .gitignore file, so that the directory will not be empty thus enabling us to commit it.
# Since we want all generated files/folders in this directory to be ignored by git, we add a rule for this.
*
# And then add an exception for this specifc file (so that we can commit it).
!.gitignore

[download id=”2322″]

The above .gitignore file, instructs git to add this file on the repository, and thus add the folder itself while ignoring ALL other files.
You can always update your .gitignore file to allow additional files to be added on the repository at any time.
This way you will never get prompted for temporary files that they were modified/created as part of the status of your repository.

Automation

A way to achieve this automatically, and place a copy of the .gitignore file in every empty folder would be to use the following command to copy an existing .gitignore file in all empty folders.

find $PATH_TO_REPOSITORY -type d ! -path "*.git*" -empty -exec cp .gitignore '{}'/ \;

The above command assumes that there is a .gitignore file in the folder that we are executing from, which it will copy in every empty directory inside the folder that the variable $PATH_TO_REPOSITORY is pointing to.

[download id=”2322″]

Solution B – Files will be added eventually to the folders

There are scenarios where the empty folders will be filled at a later stage and we want allow those files on git.
Such a scenario would be, wanting to add on git folders where right now are empty but in some time we will add new source code or resources there.

To achieve that, we add an empty .gitkeep file in every empty folder.

The above .gitkeep file is nothing more than a placeholder.
It is not documented, because it’s not a feature of Git.
It’s a dummy file, so git will not process the empty directory, since git tracks only files.
Once you add other files to the folder, you can safely delete it.
This way you will always get prompted for files that they were modified/created as part of the status of your repository.

Automation

A way to achieve this automatically, and place a copy of the .gitkeep file in every empty folder would be to use the following command to create an empty .gitkeep file in all empty folders.

find $PATH_TO_REPOSITORY -type d ! -path "*.git*" -empty -exec touch '{}'/.gitkeep \;

The above command will create in every empty directory inside the folder that the variable $PATH_TO_REPOSITORY is pointing to a new .gitkeep file.

Finally, push the changes to the git repository

After you create/copy the files, navigate to the repository, add all the new files to the commit, commit them and push them to the repository.

cd $PATH_TO_REPOSITORY;
# Create a new branch
git checkout -b empty_folders;
# Add all modified files to the next commit.
git add .;
git commit -m "Minor change: Adding all empty folders to the repository.";
git push -u origin empty_folders;

Delete all empty directories and all directories containing empty directories

Assuming you have a complex filesystem from which you want to delete all empty directories and all directories containing empty directories recursively, while leaving intact the rest.
You can use the following command.

find . -type d -empty -delete;

The configuration we used is the following:

  • -type d restricts the results to directories only
  • -empty restricts to empty directories only
  • -delete removes the directories that matched

The above code will delete all directories that are either empty or contain empty directories in the current directory.
It will successfully delete all empty directories even if they contain a large number of empty directories in any structure inside them.


Status of an executing dd 1

Recently, we were cloning a large hard disk on another using dd.
This operation took a really long time, at some point we got curious on what the status of the execution was.
Due to the minimal output dd offers, there was no indication for us whether the system was still copying and if it had a long way to go or not.

Fortunately, the developers of dd added a feature where sending a USR1 signal to a running dd process makes it print I/O statistics to standard error and then resume copying.

To achieve that we used a second terminal and followed these steps:

  1. We used pgrep to look up the running process based on its name and get the dd running process ID (PID): pgrep ^dd$ .
  2. We passed that PID to kill -USR1 which triggered the printing of the statistics on the terminal where dd was executing: kill -USR1 $(pgrep ^dd$).

Solution

kill -USR1 $(pgrep ^dd$);

Bonus

Additionally, we wanted to have dd statistics printed automatically every minute.
To achieve that, we used watchwatch executes a program periodically, showing it’s output in full-screen.
We defined the interval in seconds using the parameter -n. (Please note that, the command will not allow less than 0.1 second interval.)

In the end, our command became as follows:

watch -n 60 kill -USR1 $(pgrep ^dd$)

The above command was sending a USR1 signal to dd via the kill application every minute (60 seconds) forcing it to print on standard output the I/O statistics.

Example

On terminal 1, we executed the command dd if=/dev/sda of=/dev/sdb;, which will copy disk sda over sdb.

On terminal 2, we executed the command kill -USR1 $(pgrep ^dd$);, which forced dd to print I/O statistics back on terminal 1.

0+49728 records in
7218+0 records out
3695616 bytes (3.7 MB) copied, 2.85812 s, 1.3 MB/s
0+78673 records in
11443+0 records out
5858816 bytes (5.9 MB) copied, 4.49477 s, 1.3 MB/s
0+99003 records in
14386+0 records out
7365632 bytes (7.4 MB) copied, 5.75575 s, 1.3 MB/s
^C0+172104 records in
24918+0 records out
12758016 bytes (13 MB) copied, 10.197 s, 1.3 MB/s

Using .tgz files 1

Create

To create a .tgz file, we used tar with the following parameters -czf:

  • -c or --create will create a new archive.
  • -z– or --gzip or --gunzip or --ungzip will filter the archive through gzip and compress it.
  • -f or --file=ARCHIVE will use archive file or device ARCHIVE. If this option is not given, tar will first examine the environment variable TAPE. If it is set, its value will be used as the archive name. Otherwise, tar will assume the compiled-in default.

Example:


tar -czf $ARCHIVE_FILE_NAME.tgz $PATH_TO_COMPRESS;

Please note that the order of the parameters will not change the result.

Extract

To extract a .tgz or .tar.gz file using tar we used the following parameters -xzf:

  • -x or --extract --get will extract the files from the archive. Arguments are optional. When given, they specify names of the archive members to be extracted.
  • -z– or --gzip or --gunzip or --ungzip will filter the archive through gzip and decompress it.
  • -f or --file=ARCHIVE will use archive file or device ARCHIVE. If this option is not given, tar will first examine the environment variable TAPE. If it is set, its value will be used as the archive name. Otherwise, tar will assume the compiled-in default.

Example:


tar -xzf $ARCHIVE_FILE_NAME.tgz;