Daily Archives: 27 April 2016

Bash/FFMPEG: Batch resize .mp4 videos to fixed resolution

We needed to shrink a batch of .mp4 videos so that they would match the screen resolution of an Android device.
We did this both to save space on the device's internal memory and to let the device run as efficiently as possible, since it would not have to downscale the video on the fly.

The command we used was the following:

find . -type f -name "*.mp4" -exec bash -c 'FILE="$1"; ffmpeg -i "${FILE}" -s 1280x720 -acodec copy -y "${FILE%.mp4}.shrink.mp4";' _ '{}' \;

What this command does is the following:

  • Find all files in the current folder (and its sub-folders) that have the extension .mp4
  • For each file, start a new bash instance that calls ffmpeg, passing the matched filename as its first parameter ($1)
  • -i "${FILE}" – ffmpeg will take the matched file as input
  • -s 1280x720 – Then change the video frame size to 1280x720
  • -acodec copy – It will copy the audio stream as is, without re-encoding it
  • -y "${FILE%.mp4}.shrink.mp4" – Finally, create a new file in the same folder (overwriting any existing one without asking) whose name ends in .shrink.mp4 instead of .mp4
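Before converting a large folder, it can help to preview which commands would run. The sketch below is only an illustration: preview_shrink is a hypothetical helper name, and the extra ! -name "*.shrink.mp4" test is an assumption added here so that a second run does not pick up files that were already converted. It prints the ffmpeg commands instead of executing them.

```shell
# Hypothetical helper (not part of the original command): print the ffmpeg
# command that would run for every .mp4 file under the directory given as $1.
preview_shrink() {
    # Assumption: skip files that already end in .shrink.mp4, otherwise a
    # second run would produce *.shrink.shrink.mp4 copies.
    find "$1" -type f -name "*.mp4" ! -name "*.shrink.mp4" -exec sh -c '
        for file do
            # ${file%.mp4} removes the trailing ".mp4" before the new
            # ".shrink.mp4" suffix is appended.
            printf "ffmpeg -i \"%s\" -s 1280x720 -acodec copy -y \"%s\"\n" \
                "$file" "${file%.mp4}.shrink.mp4"
        done' _ {} +
}
```

Calling preview_shrink . lists one command per matching file in the current folder and its sub-folders. Nothing is converted until the printed commands are actually run.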

PHP: Convert JavaScript-escaped Unicode characters to HTML hex references

There are cases where PHP receives escaped Unicode characters from client-side JavaScript. According to the JSON RFC, an encoder is allowed to convert characters to that format, and as a result any character may reach PHP in the escaped form \uXXXX:

Any character may be escaped.
If the character is in the Basic Multilingual Plane (U+0000 through U+FFFF),
then it may be represented as a six-character sequence:
a reverse solidus, followed by the lowercase letter u, followed by four hexadecimal digits that encode the character's code point.
The hexadecimal letters A though F can be upper or lowercase.

A sample input you might receive could look like this: George\u2019s treasure box instead of George’s treasure box.

This kind of input should not be stored as is, because the \uXXXX sequences have no meaning in HTML. Instead, we can convert them using preg_replace:

$decoded = preg_replace('/\\\\u([a-fA-F0-9]{4})/', '&#x\\1;', $input);

The above statement looks for all instances of \uXXXX in $input and replaces each one with the HTML hexadecimal character reference &#xXXXX;, reusing the XXXX value that it matched.

What the pattern '/\\\\u([a-fA-F0-9]{4})/' does is the following:

  • \\\\ – Match the character \ in the string. We need four \ instead of one because the backslash has a special meaning in regular expressions and has to be escaped, which gives \\. Each of those two must then be escaped again because of the special meaning the backslash has in PHP strings, so we end up with four of them.
  • u – The backslash must be followed by a u character.
  • ([a-fA-F0-9]{4}) – After the previous step has matched, we need to match exactly 4 characters. Each of them must be a hexadecimal digit: a-f, A-F or 0-9. The parentheses capture these 4 characters so that the replacement can reuse them.

The replacement '&#x\\1;' does the following:

  • &#x – A constant string that outputs the characters &#x. These characters tell HTML that a hexadecimal character reference follows.
  • \\1 – A reference to the 1st parenthesized pattern. In this case only the XXXX part of \uXXXX is wrapped in parentheses, so \\1 will be replaced with the XXXX value.
  • ; – A constant character that terminates the character reference.
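The pattern and replacement can also be sanity-checked from the command line before touching any PHP code. The sketch below swaps in sed for preg_replace, which is an assumption and not part of the original post, and js_escapes_to_html is a hypothetical helper name. It also illustrates the escaping rule described above: inside shell single quotes only the regular-expression level of escaping remains, so the backslash is written twice rather than four times.

```shell
# Hypothetical helper: apply the same pattern with sed instead of preg_replace.
# Reads text on stdin and writes the converted text to stdout.
js_escapes_to_html() {
    # \\u            a literal backslash followed by u
    # ([a-fA-F0-9]{4}) four hexadecimal digits, captured as group 1
    # \&             a literal & (a bare & would mean "the whole match" in sed)
    # \1             the captured XXXX value
    sed -E 's/\\u([a-fA-F0-9]{4})/\&#x\1;/g'
}

printf '%s\n' 'George\u2019s treasure box' | js_escapes_to_html
# prints: George&#x2019;s treasure box
```

Text that contains no \uXXXX sequence passes through unchanged.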