# Tree Recogniser 7000

This repository contains all files for the Image recognition course of HU Electrical Engineering year 3.

---

### Directories and files:
```
.
├── example/                                        (training assignments course)
├── out/
│   ├── img/                                        (images exported using CVSuite)
│   │   └── (tag)_(preprocessor)_(date/time).png
│   ├── log/                                        (preprocessor export from CVSuite)
│   │   └── result_(date/time).csv
│   └── models/                                     (exported OpenCV ML models for usage in CVSuite tests)
│       └── model_(name).yaml
├── res/
│   ├── dataset/                                    (dataset for CVSuite, see README)
│   │   ├── testing
│   │   ├── training
│   │   ├── validation
│   │   └── *.png
│   ├── essay/                                      (exported photos and graphs for report)
│   ├── trees/                                      (initial dataset)
│   └── *.png                                       (photos required by assignments in ./example/)
├── src/
│   ├── config/
│   │   └── config.json                             (CVSuite config, alter config.template.json to desired settings)
│   ├── experiments/                                (standalone python scripts for experimentation)
│   ├── helpers/
│   │   ├── gui/
│   │   │   └── main.ui                             (pygubu ui configuration)
│   │   ├── test/                                   (ML test classes for CVSuite)
│   │   └── *.py                                    (other CVSuite helper classes)
│   └── suite.py                                    (CVSuite main script)
├── README.md                                       (this file)
└── requirements.txt                                (pip install file)
```

---

## How to:
### Use the virtual environment
1. Make sure you have the Python extension in VSCode
2. Create a virtual environment using VSCode by entering the Command Palette, selecting "Python: Create Environment..." and choosing venv.
3. VSCode automatically activates the venv in the integrated terminal. To use it in another terminal, run the appropriate activation script in the `.venv` folder:
```sh
$ ./.venv/Scripts/activate(.bat/.ps1)
```
4. Install required packages using pip
```sh
$ pip install -r ./requirements.txt
```
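
To confirm that a script really runs inside the venv, a small check like the one below can help. It is a minimal sketch using only the standard library; the function name `in_virtualenv` is made up for this example and is not part of the repository.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("venv active" if in_virtualenv() else "venv NOT active")
```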

### Fix relative imports
1. Install the local package as editable using `pip`:
```sh
$ pip install -e .
```

### Create a dataset
1. Rename all images to include a tag and unique id, separated by an underscore '_'
    - e.g. `accasia_1210262`
2. Put all images into `./res/dataset`
3. Run the dataset tool:
```bash
$ python ./src/experiments/dataset.py
```
4. (optional) run the template extraction tool
5. (optional) run the dataset splitter tool
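
The naming convention from step 1 could be handled as sketched below. The helper names `tagged_name` and `split_name` are hypothetical, and the sketch assumes that the tag itself contains no underscore and that images use the `.png` extension; the actual dataset tool may apply different rules.

```python
from pathlib import Path
from typing import Tuple


def tagged_name(tag: str, unique_id: int, suffix: str = ".png") -> str:
    """Build a file name in the '<tag>_<id>' format, e.g. 'accasia_1210262.png'."""
    if "_" in tag:
        # Assumption: the underscore is reserved as the separator.
        raise ValueError("tag must not contain '_'")
    return f"{tag}_{unique_id}{suffix}"


def split_name(path: str) -> Tuple[str, str]:
    """Split a '<tag>_<id>' file name back into its tag and id."""
    tag, _, unique_id = Path(path).stem.partition("_")
    if not tag or not unique_id:
        raise ValueError(f"expected '<tag>_<id>', got {path!r}")
    return tag, unique_id
```

For example, `split_name("res/dataset/accasia_1210262.png")` yields the tag `accasia` and the id `1210262`.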

### Run CVSuite (for the first time)
1. Create `config.json` in the `./src/config/` folder and copy the contents of the template
2. Edit `config.json` to fit your system, using full paths:
    - `path` should point to the dataset directory
    - `models` should point to trained ML models in YAML format
    - `out` should point to the respective folders in the `./out` folder
    - `size` determines the display size in the suite
3. Run CVSuite:
```sh
$ python ./src/suite.py
``` 
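
A quick sanity check of `config.json` before starting the suite could look like the sketch below. It assumes that `path`, `models`, `out` and `size` are top-level keys, which may not match the real template (they could be nested); `missing_keys` and `load_config` are hypothetical helpers, not part of CVSuite.

```python
import json
from pathlib import Path

# Setting names taken from the list above; their nesting is an assumption.
REQUIRED_KEYS = ("path", "models", "out", "size")


def missing_keys(config: dict) -> list:
    """Return the required keys that are absent from the loaded config."""
    return [key for key in REQUIRED_KEYS if key not in config]


def load_config(config_file: str) -> dict:
    """Load the JSON config and fail early when a required key is missing."""
    config = json.loads(Path(config_file).read_text(encoding="utf-8"))
    missing = missing_keys(config)
    if missing:
        raise KeyError(f"config is missing: {', '.join(missing)}")
    return config
```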

### Train and export a KNN model
1. Open CVSuite and select the desired training set
2. Press 'Run analysis for entire dataset(!)'. This exports a CSV file with all preprocessed data to the `./out` directory
    - Depending on your system configuration, this might take a while
3. Run the CVSuiteTestKNN CLI tool. The following arguments are required:
    - `-i` Input CSV file
    - `-o` Output folder, likely `./out/models`
```sh
$ python ./src/helpers/test/knn.py -i ./out/result-(date/time).csv -o ./out/models/ 
```
4. The script generates two files: a fitted scaler to use with other models (`.pkl` file) and the model itself (`.yaml` file)
5. Edit your `config.json` to include the newly created model
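
The reason for exporting the scaler is that later models must see features scaled with the same statistics as the training data. The sketch below illustrates the idea with plain standardisation and a pickle round trip. It is not the repository's scaler: the contents of the real `.pkl` file are not specified here, and `fit_scaler` and `apply_scaler` are hypothetical names.

```python
import pickle
import statistics


def fit_scaler(rows):
    """Compute per-column mean and standard deviation from the training rows."""
    columns = list(zip(*rows))
    return {
        "mean": [statistics.fmean(col) for col in columns],
        # Replace a zero deviation by 1.0 so constant columns do not divide by zero.
        "std": [statistics.pstdev(col) or 1.0 for col in columns],
    }


def apply_scaler(scaler, rows):
    """Standardise rows with statistics that were fitted earlier."""
    return [
        [(v - m) / s for v, m, s in zip(row, scaler["mean"], scaler["std"])]
        for row in rows
    ]


training = [[1.0, 200.0], [2.0, 400.0], [3.0, 600.0]]
scaler = fit_scaler(training)

# Round-trip through pickle, the way a later script would reload a .pkl file.
restored = pickle.loads(pickle.dumps(scaler))
scaled = apply_scaler(restored, [[2.0, 400.0]])
```

Because the restored scaler carries the training statistics, a sample equal to the training mean maps to zero in every column.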

### Train and export a Decision tree model
> :memo: **Please note:**<br>
> The KNN training script also generates the scaler required to train the decision tree model.

1. Run the CVSuiteTestTree CLI Tool using the following arguments:
    - `-i` Input CSV file
    - `-o` Output folder, likely `./out/models`
    - `-m` Model to train: `dectree`, `randforest` or `extratree`
    - `-s` Scaler file to use (`.pkl` file)
```sh
$ python ./src/helpers/test/decision_tree.py -i ./out/result-(date/time).csv -o ./out/models/ -m 'dectree' -s ./out/models/scale_(date/time).pkl
```
2. The script generates one `.pkl` file based on the chosen model
3. Edit your `config.json` to include the newly created model
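
For readers who want to script around the tool, the four flags above could be parsed with `argparse` roughly as follows. This is a hypothetical sketch, not the code in `decision_tree.py`: it assumes all flags are required, and the file names in the example call are placeholders.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the four flags listed above."""
    parser = argparse.ArgumentParser(description="Train a tree-based model (sketch)")
    parser.add_argument("-i", dest="input", required=True, help="input CSV file")
    parser.add_argument("-o", dest="output", required=True, help="output folder")
    parser.add_argument(
        "-m",
        dest="model",
        required=True,
        choices=("dectree", "randforest", "extratree"),
        help="model to train",
    )
    parser.add_argument("-s", dest="scaler", required=True, help="scaler .pkl file")
    return parser


# Placeholder arguments, only to show how the flags map to attributes.
args = build_parser().parse_args(
    ["-i", "result.csv", "-o", "out/models", "-m", "dectree", "-s", "scale.pkl"]
)
```

Restricting `-m` with `choices` makes a misspelled model name fail immediately instead of after the CSV has been loaded.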

### Template extraction
> :warning: **Please note:** <br>
> This tool uses the legacy format for datasets.<br>
> Images are sorted into folders instead of by name.

1. Images should have four standard ArUco markers clearly visible
2. Run the template extraction tool with an input directory as argument:
```sh
$ python ./src/experiments/template_extraction/script.py ./dataset
```
3. The script generates new folders, ending with `_out`
4. The paths to any failed images are saved in `skipped.txt`

### Dataset splitting
1. Ensure that the dataset is in `./res/dataset`
2. Run the dataset splitter tool:
```sh
$ python ./src/experiments/dataset.py
```
3. Three new folders will be created, containing the following percentages of images:
    - `./res/dataset/training`, 70%
    - `./res/dataset/validation`, 20%
    - `./res/dataset/testing`, 10%
4. Images are split pseudorandomly, so the same split is reproduced on different machines.
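
A reproducible 70/20/10 split can be sketched as below. The seed value, the sorting step and the function name `split_dataset` are assumptions made for illustration; the splitter in this repository may work differently, and identical results are only expected when the same Python version is used.

```python
import random


def split_dataset(names, seed=0):
    """Split names 70/20/10 into training, validation and testing lists."""
    ordered = sorted(names)  # sort first so the input order cannot change the result
    random.Random(seed).shuffle(ordered)  # fixed seed gives the same shuffle every run
    n_train = len(ordered) * 7 // 10  # integer arithmetic avoids float rounding
    n_val = len(ordered) * 2 // 10
    return {
        "training": ordered[:n_train],
        "validation": ordered[n_train:n_train + n_val],
        "testing": ordered[n_train + n_val:],  # remainder, roughly 10%
    }


images = [f"accasia_{i}.png" for i in range(10)]
split = split_dataset(images)
```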
---

Arne van Iterson<br>
Tom Selier