Monthly Archives: November 2021


Playing with shadow(s)

In a Linux system, passwords are stored as salted hashes (a digital fingerprint produced from the password plus a random salt) through the crypt function. You can find relevant information about the implementation in Ubuntu at http://manpages.ubuntu.com/manpages/bionic/man3/crypt.3.html (how to call this function, what the arguments are, etc.). Assume that you have managed to access the secret shadow file of an Ubuntu Linux system (see the shadow file contents below). The system has two users, bob and alice.

alice:$6$.s6xaWmE$m9KjrSJ1dgZ20M5IhEyXORNV.KZwBk5hp1XZ0mpOyTe.dGET.EdMCFgPimkeM7nWEW4wejMoVV.40Cg6w9XJ..:17470:0:99999:7:::
bob:$6$aACNZdTj$GYrSPRP.ieCiUfmFFRwKwEByU2rdSdfP4gCij1asUgT.dpmmu3NIDLAAde5cfvNtacI9JUGQUgrBciUWAUWNY1:17470:0:99999:7:::

Tasks

Based on the information you will obtain from the above link, which hash function did the system use to generate the salted hashed passwords?

From the documentation, we get the following information:

The glibc2 version of this function supports additional encryption algorithms.

If salt is a character string starting with the characters "$id$"  followed  by  a  string
optionally terminated by "$", then the result has the form:

      $id$salt$encrypted

id  identifies  the encryption method used instead of DES and this then determines how the
rest of the password string is interpreted.  The following values of id are supported:

      ID  | Method
      ─────────────────────────────────────────────────────────
      1   | MD5
      2a  | Blowfish (not in mainline glibc; added in some
          | Linux distributions)
      5   | SHA-256 (since glibc 2.7)
      6   | SHA-512 (since glibc 2.7)

We can see in both lines of our shadow file that, between the first and second colon (:), the hashed value follows the format described above. This means that the id of the function used sits between the first and second dollar sign ($), and in both cases that value is 6. From this, we know that SHA-512 was used.

Which salt was used for bob and which for alice?

The salt that was used for alice is .s6xaWmE and the salt for bob is aACNZdTj. We get this data on each line, between the second dollar sign ($) and the third.
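
A quick way to verify both the id and the salts by hand is to split the hash field on its dollar signs. Below is a minimal Python sketch, with the two hash strings copied from the shadow file above:

entries = {
    "alice": "$6$.s6xaWmE$m9KjrSJ1dgZ20M5IhEyXORNV.KZwBk5hp1XZ0mpOyTe.dGET.EdMCFgPimkeM7nWEW4wejMoVV.40Cg6w9XJ..",
    "bob": "$6$aACNZdTj$GYrSPRP.ieCiUfmFFRwKwEByU2rdSdfP4gCij1asUgT.dpmmu3NIDLAAde5cfvNtacI9JUGQUgrBciUWAUWNY1",
}
for user, field in entries.items():
    # The format is $id$salt$encrypted, so splitting on "$" yields the parts.
    _, hash_id, salt, encrypted = field.split("$")
    print(f"{user}: id={hash_id} salt={salt}")
# alice: id=6 salt=.s6xaWmE
# bob: id=6 salt=aACNZdTj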

From the shadow file, can you see for sure if alice and bob have chosen different passwords? Explain your answer.

Since the system used a different salt while hashing each password, identical passwords would still produce different hash values, so we cannot infer any information regarding the similarity of the passwords of bob and alice.
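
To illustrate why, here is a small example using Python's crypt module (available on Linux): even with an identical password, the two salts from our shadow file produce completely different hashes.

import crypt

# Hashing the same password with the two different salts from the shadow file.
print(crypt.crypt("samepassword", "$6$.s6xaWmE$"))
print(crypt.crypt("samepassword", "$6$aACNZdTj$"))
# The two outputs share nothing beyond the $6$salt$ prefix, so equal
# passwords cannot be detected by comparing the stored hash values.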

Your goal is to discover the passwords of these two users. To do this, you have some information about the personal life of bob and alice:

For the user bob, you know that he is just starting to learn how to use computers. He is unaware of security risks and has probably chosen an easy password (one that many users generally select).

For alice, you know some of her personal information: her phone number is 6955345671, her license plate is ZKA4221, and she likes the Rolling Stones. Alice has only a limited knowledge of security. She knows that a password should combine letters with numbers and have several characters, but she does not fully understand how to choose a secure one.

Please describe the actions you will take to guess their passwords, doing your tests on the shadow file. For bob, use lists of common passwords, while for alice, use the above personal information.

To get the password of bob, we took a simple approach using John the Ripper (lovely tool, by the way). As seen in the following commands, we installed John the Ripper and, using the default settings, asked John to crack any passwords in our shadow file.

sudo snap install john-the-ripper;
john shadow;

In seconds, John the Ripper produced the password for the user bob, which was 1234567890. John the Ripper was so quick that it felt like magic! Unfortunately (or fortunately, depending on the context), John the Ripper cannot do magic… The reason it was so fast in this case is that its default settings instruct the tool to try many simple passwords first using wordlist files. We already knew that bob was using one of those passwords, so the choice of options was trivial.

At this stage, it appeared that the password for alice was harder to guess, and John the Ripper did not produce a result after we let it execute for a few minutes. Something worth mentioning about the case of bob is the following: we could have asked John the Ripper to crack the password by trying all alphanumeric characters only ([a-z][A-Z][0-9]), for example with john --incremental=Alnum shadow (the exact mode names depend on the [Incremental:...] sections of your john.conf). It would have taken longer, but it could work, since we knew from Human Intelligence (HumInt) that bob uses fairly simple passwords.

Below is the raw output of the console for the case of cracking the password of bob. Please note that we used Ctrl+C to kill the execution, as we did not expect John the Ripper to get the password for alice as well.

$ john shadow 
Created directory: /home/bob/snap/john-the-ripper/459/.john
Warning: detected hash type "sha512crypt", but the string is also recognized as "HMAC-SHA256"
Use the "--format=HMAC-SHA256" option to force loading these as that type instead
Warning: detected hash type "sha512crypt", but the string is also recognized as "sha512crypt-opencl"
Use the "--format=sha512crypt-opencl" option to force loading these as that type instead
Using default input encoding: UTF-8
Loaded 2 password hashes with 2 different salts (sha512crypt, crypt(3) $6$ [SHA512 256/256 AVX2 4x])
Cost 1 (iteration count) is 5000 for all loaded hashes
Will run 12 OpenMP threads
Proceeding with single, rules:Single
Press 'q' or Ctrl-C to abort, almost any other key for status
Warning: Only 46 candidates buffered for the current salt, minimum 48 needed for performance.
Warning: Only 45 candidates buffered for the current salt, minimum 48 needed for performance.
Almost done: Processing the remaining buffered candidate passwords, if any.
Warning: Only 43 candidates buffered for the current salt, minimum 48 needed for performance.
Warning: Only 33 candidates buffered for the current salt, minimum 48 needed for performance.
Proceeding with wordlist:/snap/john-the-ripper/current/run/password.lst, rules:Wordlist
1234567890       (bob)
Proceeding with incremental:ASCII

To tackle the case of alice, we had to improvise a bit. First of all, we made a text file that contained, one entry per line, the personal information that Human Intelligence gathered for us about alice. To give John the Ripper more material to work with, we also added some variations of the data, like splitting the license plate into two tokens. The file (keywords.lst) looked like so:

6955345671
ZKΑ4221
Rolling Stones
alice
zka4221
zka
4221
Stones
Rolling

Then, we used hashcat's combinator attack (-a 1) to create a new list that combines pairs of the previous tokens into more complex candidate passwords. We had this hint from the Human Intelligence analysis, so it was an easy choice to make:

sudo apt-get install hashcat;
# Type in the contents of the list above
nano keywords.lst;
# Not sure why it requires sudo
sudo hashcat -a 1 --stdout keywords.lst keywords.lst > advanced.lst;

The new list (advanced.lst) contained the following data:
69553456716955345671
6955345671ZKΑ4221
6955345671Rolling Stones
6955345671alice
6955345671zka4221
6955345671zka
69553456714221
6955345671Stones
6955345671Rolling
ZKΑ42216955345671
ZKΑ4221ZKΑ4221
ZKΑ4221Rolling Stones
ZKΑ4221alice
ZKΑ4221zka4221
ZKΑ4221zka
ZKΑ42214221
ZKΑ4221Stones
ZKΑ4221Rolling
Rolling Stones6955345671
Rolling StonesZKΑ4221
Rolling StonesRolling Stones
Rolling Stonesalice
Rolling Stoneszka4221
Rolling Stoneszka
Rolling Stones4221
Rolling StonesStones
Rolling StonesRolling
alice6955345671
aliceZKΑ4221
aliceRolling Stones
alicealice
alicezka4221
alicezka
alice4221
aliceStones
aliceRolling
zka42216955345671
zka4221ZKΑ4221
zka4221Rolling Stones
zka4221alice
zka4221zka4221
zka4221zka
zka42214221
zka4221Stones
zka4221Rolling
zka6955345671
zkaZKΑ4221
zkaRolling Stones
zkaalice
zkazka4221
zkazka
zka4221
zkaStones
zkaRolling
42216955345671
4221ZKΑ4221
4221Rolling Stones
4221alice
4221zka4221
4221zka
42214221
4221Stones
4221Rolling
Stones6955345671
StonesZKΑ4221
StonesRolling Stones
Stonesalice
Stoneszka4221
Stoneszka
Stones4221
StonesStones
StonesRolling
Rolling6955345671
RollingZKΑ4221
RollingRolling Stones
Rollingalice
Rollingzka4221
Rollingzka
Rolling4221
RollingStones
RollingRolling
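
As a side note, the same combinator step can be reproduced without hashcat. Below is a minimal Python sketch of the idea; it pairs every keyword with every keyword, just like hashcat's -a 1 mode:

from itertools import product

# Read the keywords, skipping empty lines.
with open("keywords.lst", encoding="utf-8") as f:
    keywords = [line.strip() for line in f if line.strip()]

# Write every two-token combination (the hashcat -a 1 equivalent).
with open("advanced.lst", "w", encoding="utf-8") as out:
    for left, right in product(keywords, repeat=2):
        out.write(left + right + "\n")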

To use the above custom wordlist on the shadow file, we issued the following command to John the Ripper:

john --wordlist=advanced.lst --rules shadow;

A few moments later, John the Ripper produced the following output indicating that the password for alice was rollingstones4221.

$ john --wordlist=advanced.lst --rules shadow
Warning: detected hash type "sha512crypt", but the string is also recognized as "HMAC-SHA256"
Use the "--format=HMAC-SHA256" option to force loading these as that type instead
Warning: detected hash type "sha512crypt", but the string is also recognized as "sha512crypt-opencl"
Use the "--format=sha512crypt-opencl" option to force loading these as that type instead
Using default input encoding: UTF-8
Loaded 2 password hashes with 2 different salts (sha512crypt, crypt(3) $6$ [SHA512 256/256 AVX2 4x])
Remaining 1 password hash
Cost 1 (iteration count) is 5000 for all loaded hashes
Will run 12 OpenMP threads
Press 'q' or Ctrl-C to abort, almost any other key for status
rollingstones4221 (alice)
1g 0:00:00:00 DONE (2021-11-26 17:09) 3.846g/s 3765p/s 3765c/s 3765C/s 69553456716955345671..Rollingstonesing
Use the "--show" option to display all of the cracked passwords reliably
Session completed

Bonus Material

Finding previously cracked passwords

A tip for people who are new to John the Ripper: in case you forgot to write down the passwords that were produced, you can issue the following command, which will show all the passwords that John the Ripper knows for the specific input file:

john --show shadow;
$ john --show shadow
alice:rollingstones4221:17470:0:99999:7:::
bob:1234567890:17470:0:99999:7:::

2 password hashes cracked, 0 left

Trying to crack a shadow file using Python

In case you would like to use programming to manually crack the shadow file, there are ways.
Using the Python language and the crypt module, we can write a simple program. The program accepts the salt and the plaintext input and produces the hashed output, which is the same result a Linux machine would produce while creating its shadow file.

import crypt
from hmac import compare_digest as compare_hash

crypt.crypt("1234567890", "$6$aACNZdTj$")
# It produces the following, which is the salted hash for the password of bob:
# '$6$aACNZdTj$GYrSPRP.ieCiUfmFFRwKwEByU2rdSdfP4gCij1asUgT.dpmmu3NIDLAAde5cfvNtacI9JUGQUgrBciUWAUWNY1'
# compare_hash() can then check a candidate against the stored value in
# constant time, avoiding timing side channels.
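
Extending this idea, we can loop over a wordlist and test every candidate against a shadow entry. Below is a minimal sketch of such a dictionary attack; the wordlist filename is our own example:

import crypt
from hmac import compare_digest

# Bob's salt and salted hash, copied from the shadow file above.
salt = "$6$aACNZdTj$"
shadow_hash = "$6$aACNZdTj$GYrSPRP.ieCiUfmFFRwKwEByU2rdSdfP4gCij1asUgT.dpmmu3NIDLAAde5cfvNtacI9JUGQUgrBciUWAUWNY1"

# Hash every candidate with the known salt and compare in constant time.
with open("common-passwords.lst", encoding="utf-8", errors="ignore") as wordlist:
    for line in wordlist:
        candidate = line.strip()
        if compare_digest(crypt.crypt(candidate, salt), shadow_hash):
            print("Password found:", candidate)
            break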

Notes on how to get HarpyTM running on an Ubuntu 20.04LTS GNU/Linux

Easy Setup – Slow Execution by using the CPUs only

Getting CUDA support in OpenCV for Python can be tricky, as you will need to make several changes to your PC, and on many occasions the procedure can fail if you are not comfortable troubleshooting it. For this reason, we present an alternative solution that is easy to implement and should work on most machines. The drawback of this solution is that it ignores your GPU and utilizes only your CPUs, which means that execution will most probably be slower than a GPU-enabled one.

Conda / Anaconda

First of all, we installed and activated Anaconda on an Ubuntu 20.04LTS desktop. To do so, we first installed the following dependencies from the repositories:

sudo apt-get install libgl1-mesa-glx libegl1-mesa libxrandr2 libxss1 libxcursor1 libxcomposite1 libasound2 libxi6 libxtst6;

Then, we downloaded the 64-Bit (x86) Installer from (https://www.anaconda.com/products/individual#linux).

Using a terminal, we followed the instructions here (https://docs.anaconda.com/anaconda/install/linux/) and performed the installation.

Python environment and OpenCV for Python

Following the previous step, we used the commands below to create a virtual environment for our code. We needed Python version 3.9 (as highlighted here https://www.anaconda.com/products/individual#linux) and OpenCV for Python.

source ~/anaconda3/bin/activate;
conda create --yes --name HarpyTM python=3.9;
conda activate HarpyTM;
pip install numpy scipy scikit-learn opencv-python==4.5.1.48;

Please note that we needed to pin the opencv-python package to 4.5.1.48 because we were getting the following error on newer versions:

python3 Monitoring.py 
./data/DJI_0406_cut.MP4
[INFO] setting preferable backend and target to CUDA...
Traceback (most recent call last):
  File "/home/bob/Traffic/HarpyTM/Monitoring.py", line 51, in <module>
    detectNet = detector(weights, config, conf_thresh=conf_thresh, netsize=cfg_size, nms_thresh=nms_thresh, gpu=use_gpu, classes_file=classes_file)
  File "/home/bob/Traffic/HarpyTM/src/detector.py", line 24, in __init__
    self.layers = [ln[i[0] - 1] for i in self.net.getUnconnectedOutLayers()]
  File "/home/bob/Traffic/HarpyTM/src/detector.py", line 24, in <listcomp>
    self.layers = [ln[i[0] - 1] for i in self.net.getUnconnectedOutLayers()]
IndexError: invalid index to scalar variable.
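
As a side note, instead of pinning the package, one could patch src/detector.py to handle both return types: in OpenCV 4.5.4 and later, getUnconnectedOutLayers() returns a flat array of indices instead of an array of single-element arrays, which is what breaks the ln[i[0] - 1] indexing. Below is a hedged sketch of a version-agnostic helper (our own naming, not part of HarpyTM):

import numpy as np

def output_layer_names(net):
    """Return the names of a cv2.dnn network's unconnected output layers,
    coping with both the old ([[i], ...]) and the new ([i, ...]) return
    types of getUnconnectedOutLayers()."""
    ln = net.getLayerNames()
    # Flattening makes a scalar index and a one-element array look the same.
    return [ln[int(np.asarray(i).flatten()[0]) - 1]
            for i in net.getUnconnectedOutLayers()]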

Usage

git clone https://github.com/rafcy/HarpyTM;
cd HarpyTM/;
# We manually created the following folders and set their privileges, as we were getting errors when they were autogenerated by the code of HarpyTM.
mkdir -p results/Detections/videos;
chmod 777 results/Detections/videos;

After successfully cloning the project, we modified config.ini to meet our needs. Specifically, we changed all paths to be inside the HarpyTM folder and used the full resolution for the test video:

[Video]
video_filename 		= ./data/DJI_0406_cut.MP4
resize 			= True
image_width 		= 1920
image_height 		= 1080
display_video		= False
display_width 		= 1920
display_height 		= 1080


[Export]
save_video_results	= True
video_export_path 	= ./results/Detections/videos/
csv_path 		= ./results/
export			= True
display_track		= True

[Detector]
#darknet
darknet_config 		= ./data/Configs/vehicles_ty3.cfg
darknet_weights 	= ./data/Configs/vehicles_ty3.weights
classes_file 		= ./data/Configs/vehicles.names
cfg_size_width		= 608
cfg_size_height		= 608
iou_thresh 		= 0.3
conf_thresh 		= 0.3
nms_thresh 		= 0.4
use_gpu 		= True 

[Tracker]
flight_height 		= 150
sensor_height 		= 0.455
sensor_width		= 0.617
focal_length		= 0.567
reset_boxes_frames	= 40
calc_velocity_n		= 25
draw_tracks		= True
export_data		= True

Finally, we were able to execute the code as follows:

python3 Monitoring.py;

After the execution was done, we found in the folder ./results/Detections/videos/ the video showing the bounding boxes and the tracking annotations; we then uploaded the resulting video.


How to Reset Password on an Ubuntu with LVM

A few days ago, a client tasked us with recovering the password of an Ubuntu server 20.04LTS machine. The machine's owner knew only the username and had no idea about the complexity of the password. We asked the client if it was OK for us to reset the password instead of recovering it (meaning that we would not even try to crack the mystery of what the previous password was and would just set a new one), and thankfully, the client accepted our request.

The client had set up the server using Ubuntu server edition 20.04LTS, and the disk partitions were using LVM (Logical Volume Manager). To our good luck, they were not using encrypted partitions. The procedure we followed to reset the password of that server was as follows:

First of all, we shut down the server and booted it with a Live USB of an Ubuntu desktop 20.04LTS. Then we started a terminal and executed the following to get root access on the live system:

sudo su;

Then, we executed pvscan to list all physical volumes and gain some intelligence on which disk we needed to work on:

pvscan;
root@ubuntu:/home/ubuntu# pvscan
  /dev/sdc: open failed: No medium found
  PV /dev/sda3   VG ubuntu-vg       lvm2 [<3.64 TiB / 3.44 TiB free]
  Total: 1 [<3.64 TiB] / in use: 1 [<3.64 TiB] / in no VG: 0 [0   ]

Following that, we used vgscan to search for all volume groups:

vgscan;
root@ubuntu:/home/ubuntu# vgscan
  /dev/sdc: open failed: No medium found
  Found volume group "ubuntu-vg" using metadata type lvm2

From these two commands, it was clear that the partition /dev/sda3 was an LVM physical volume belonging to the volume group ubuntu-vg. That volume group held the server's filesystem, and it was the place we needed to access to change the user's password.

So, we used vgchange to change the attributes of the volume group and activate it like so:

vgchange -a y;
root@ubuntu:/home/ubuntu# vgchange -a y
  /dev/sdc: open failed: No medium found
  /dev/sdc: open failed: No medium found
  1 logical volume(s) in volume group "ubuntu-vg" now active

Using lvscan, we were able to list all logical volumes in all volume groups and verify that we activated the volume group of interest successfully.

lvscan;
root@ubuntu:/home/ubuntu# lvscan
  /dev/sdc: open failed: No medium found
  ACTIVE            '/dev/ubuntu-vg/ubuntu-lv' [200.00 GiB] inherit

After these steps, we were finally ready to reset the user's password. We proceeded to mount the logical volume like any other disk on the /mnt folder:

mount /dev/ubuntu-vg/ubuntu-lv /mnt/;

Then, we used chroot to change the apparent root directory for the currently running process (and its children). This command allowed our terminal to work inside the logical volume as if we had booted the server OS itself.

chroot /mnt/;

Finally, using the passwd command, we changed the user's password like so:

passwd bob;

To clean up, we exited the chroot environment:

exit;

Then, we unmounted the logical volume:

umount /mnt;

And finally, we set the active flag of the volume group back to no:

vgchange -a n;

After the above steps, we had safely applied all changes, so we rebooted the machine using its hard drive.


Playing the QMIX Two-step game on Ray

We are trying to expand the code of the Two-step game (which is an example from the QMIX paper) using the Ray framework. The changes we want to apply should extract the best checkpoint from some trial of a tune.run(), restore it on a new QMixTrainer, and then use it on a new environment to compute the subsequent actions.

The code we tried to use is the following:

"""The two-step game from QMIX: https://arxiv.org/pdf/1803.11485.pdf

Configurations you can try:
    - normal policy gradients (PG)
    - contrib/MADDPG
    - QMIX

See also: centralized_critic.py for centralized critic PPO on this game.
"""

import argparse
from gym.spaces import Tuple, MultiDiscrete, Dict, Discrete
import os

import ray
from ray import tune
from ray.rllib.agents.qmix import QMixTrainer
from ray.tune import register_env, grid_search
from ray.rllib.env.multi_agent_env import ENV_STATE
from ray.rllib.examples.env.two_step_game import TwoStepGame
from ray.rllib.utils.test_utils import check_learning_achieved

import numpy as np

parser = argparse.ArgumentParser()
parser.add_argument("--run", type=str, default="QMIX")
parser.add_argument("--num-cpus", type=int, default=0)
parser.add_argument("--as-test", action="store_true")
parser.add_argument("--torch", action="store_true")
parser.add_argument("--stop-reward", type=float, default=7.0)
parser.add_argument("--stop-timesteps", type=int, default=50000)

if __name__ == "__main__":
    args = parser.parse_args()

    grouping = {
        "group_1": [0, 1],
    }
    obs_space = Tuple([
        Dict({
            "obs": MultiDiscrete([2, 2, 2, 3]),
            ENV_STATE: MultiDiscrete([2, 2, 2])
        }),
        Dict({
            "obs": MultiDiscrete([2, 2, 2, 3]),
            ENV_STATE: MultiDiscrete([2, 2, 2])
        }),
    ])
    act_space = Tuple([
        TwoStepGame.action_space,
        TwoStepGame.action_space,
    ])
    register_env(
        "grouped_twostep",
        lambda config: TwoStepGame(config).with_agent_groups(
            grouping, obs_space=obs_space, act_space=act_space))

    if args.run == "contrib/MADDPG":
        obs_space_dict = {
            "agent_1": Discrete(6),
            "agent_2": Discrete(6),
        }
        act_space_dict = {
            "agent_1": TwoStepGame.action_space,
            "agent_2": TwoStepGame.action_space,
        }
        config = {
            "learning_starts": 100,
            "env_config": {
                "actions_are_logits": True,
            },
            "multiagent": {
                "policies": {
                    "pol1": (None, Discrete(6), TwoStepGame.action_space, {
                        "agent_id": 0,
                    }),
                    "pol2": (None, Discrete(6), TwoStepGame.action_space, {
                        "agent_id": 1,
                    }),
                },
                "policy_mapping_fn": lambda x: "pol1" if x == 0 else "pol2",
            },
            "framework": "torch" if args.torch else "tf",
            # Use GPUs iff `RLLIB_NUM_GPUS` env var set to > 0.
            "num_gpus": int(os.environ.get("RLLIB_NUM_GPUS", "0")),
        }
        group = False
    elif args.run == "QMIX":
        config = {
            "rollout_fragment_length": 4,
            "train_batch_size": 32,
            "exploration_config": {
                "epsilon_timesteps": 5000,
                "final_epsilon": 0.05,
            },
            "num_workers": 0,
            "mixer": grid_search([None, "qmix", "vdn"]),
            "env_config": {
                "separate_state_space": True,
                "one_hot_state_encoding": True
            },
            # Use GPUs iff `RLLIB_NUM_GPUS` env var set to > 0.
            "num_gpus": int(os.environ.get("RLLIB_NUM_GPUS", "0")),
            "framework": "torch" if args.torch else "tf",
        }
        group = True
    else:
        config = {
            # Use GPUs iff `RLLIB_NUM_GPUS` env var set to > 0.
            "num_gpus": int(os.environ.get("RLLIB_NUM_GPUS", "0")),
            "framework": "torch" if args.torch else "tf",
        }
        group = False

    ray.init(num_cpus=args.num_cpus or None)

    stop = {
        "episode_reward_mean": args.stop_reward,
        "timesteps_total": args.stop_timesteps,
    }

    config = dict(config, **{
        "env": "grouped_twostep" if group else TwoStepGame,
    })

    results = tune.run(args.run, stop=stop, config=config, verbose=1, checkpoint_freq=1, checkpoint_at_end=True)

    if args.as_test:
        check_learning_achieved(results, args.stop_reward)

    best_checkpoint = results.get_best_checkpoint(results.trials[0], mode="max")
    print(f".. best checkpoint was: {best_checkpoint}")

    env = TwoStepGame(config).with_agent_groups(grouping, obs_space=obs_space, act_space=act_space)
    obs = env.reset()

    rllib_config = config.copy()
    rllib_config["mixer"] = "qmix"
    new_trainer = QMixTrainer(config=rllib_config)
    new_trainer.restore(best_checkpoint)

    a1 = new_trainer.compute_action(observation=obs['group_1'])
    a2 = new_trainer.compute_action(observation=np.concatenate([obs['group_1'], [1]]))

    ray.shutdown()

To make it easier to see the changes from the original, here is the patch:

Index: main.py

<+>UTF-8
===================================================================
diff --git a/main.py b/main.py
--- a/main.py	(revision 80b3473ef3eede5f94e4805797556940bee91bc8)
+++ b/main.py	(date 1637485442837)
@@ -14,13 +14,16 @@
 
 import ray
 from ray import tune
+from ray.rllib.agents.qmix import QMixTrainer
 from ray.tune import register_env, grid_search
 from ray.rllib.env.multi_agent_env import ENV_STATE
 from ray.rllib.examples.env.two_step_game import TwoStepGame
 from ray.rllib.utils.test_utils import check_learning_achieved
 
+import numpy as np
+
 parser = argparse.ArgumentParser()
-parser.add_argument("--run", type=str, default="PG")
+parser.add_argument("--run", type=str, default="QMIX")
 parser.add_argument("--num-cpus", type=int, default=0)
 parser.add_argument("--as-test", action="store_true")
 parser.add_argument("--torch", action="store_true")
@@ -120,9 +123,23 @@
         "env": "grouped_twostep" if group else TwoStepGame,
     })
 
-    results = tune.run(args.run, stop=stop, config=config, verbose=1)
+    results = tune.run(args.run, stop=stop, config=config, verbose=1, checkpoint_freq=1, checkpoint_at_end=True)
 
     if args.as_test:
         check_learning_achieved(results, args.stop_reward)
 
+    best_checkpoint = results.get_best_checkpoint(results.trials[0], mode="max")
+    print(f".. best checkpoint was: {best_checkpoint}")
+
+    env = TwoStepGame(config).with_agent_groups(grouping, obs_space=obs_space, act_space=act_space)
+    obs = env.reset()
+
+    rllib_config = config.copy()
+    rllib_config["mixer"] = "qmix"
+    new_trainer = QMixTrainer(config=rllib_config)
+    new_trainer.restore(best_checkpoint)
+
+    a1 = new_trainer.compute_action(observation=obs['group_1'])
+    a2 = new_trainer.compute_action(observation=np.concatenate([obs['group_1'], [1]]))
+
     ray.shutdown()

When we execute, we get the following errors:

a1 = new_trainer.compute_action(observation=obs['group_1'])

Produces:

ValueError: ('Observation ({}) outside given space ({})!', [0, 3], Tuple(Dict(obs:MultiDiscrete([2 2 2 3]), state:MultiDiscrete([2 2 2])), Dict(obs:MultiDiscrete([2 2 2 3]), state:MultiDiscrete([2 2 2]))))

a2 = new_trainer.compute_action(observation=np.concatenate([obs['group_1'], [1]]))

Produces:

ValueError: ('Observation ({}) outside given space ({})!', array([0, 3, 1]), Tuple(Dict(obs:MultiDiscrete([2 2 2 3]), state:MultiDiscrete([2 2 2])), Dict(obs:MultiDiscrete([2 2 2 3]), state:MultiDiscrete([2 2 2]))))

We are currently trying to figure out how we should change the observation so that it gets accepted by the check_shape() function of the preprocessor.

def check_shape(self, observation: Any) -> None:
    """Checks the shape of the given observation."""
    if self._i % VALIDATION_INTERVAL == 0:
        if type(observation) is list and isinstance(
                self._obs_space, gym.spaces.Box):
            observation = np.array(observation)
        try:
            if not self._obs_space.contains(observation):
                raise ValueError(
                    "Observation ({}) outside given space ({})!",
                    observation, self._obs_space)
        except AttributeError:
            raise ValueError(
                "Observation for a Box/MultiBinary/MultiDiscrete space "
                "should be an np.array, not a Python list.", observation)
    self._i += 1

When calling the check_shape() function, these are the values that are processed:

observation:
value = [0, 3]
type = <class 'list'>

self._obs_space:
value = Tuple(Dict(obs:MultiDiscrete([2 2 2 3]), state:MultiDiscrete([2 2 2])), Dict(obs:MultiDiscrete([2 2 2 3]), state:MultiDiscrete([2 2 2])))
type = <class 'gym.spaces.tuple.Tuple'>

and this line fails:

if not self._obs_space.contains(observation)
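
While we keep digging, one trick that helps is to ask the space itself what a valid observation looks like. Below is a minimal sketch that recreates the grouped observation space from the script above and samples it; note that contains() expects a tuple of two dicts, not a flat list like [0, 3]:

from gym.spaces import Dict, MultiDiscrete, Tuple

# Recreate the grouped observation space from the script above.
obs_space = Tuple([
    Dict({"obs": MultiDiscrete([2, 2, 2, 3]), "state": MultiDiscrete([2, 2, 2])}),
    Dict({"obs": MultiDiscrete([2, 2, 2, 3]), "state": MultiDiscrete([2, 2, 2])}),
])

sample = obs_space.sample()
print(sample)                      # a 2-tuple of dicts with "obs" and "state" arrays
print(obs_space.contains(sample))  # True
print(obs_space.contains([0, 3]))  # False, which is exactly our error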

Any positive feedback is welcome!