# Tree Recogniser 7000

This repository contains all files for the Image recognition course of HU Electrical Engineering year 3.

---

### Directories and files:

```
.
├── example/ (training assignments course)
├── out/
│   ├── img/ (images exported using CVSuite)
│   │   └── (tag)_(preprocessor)_(date/time).png
│   ├── log/ (preprocessor export from CVSuite)
│   │   └── result_(date/time).csv
│   └── models/ (exported OpenCV ML models for usage in CVSuite tests)
│       └── model_(name).yaml
├── res/
│   ├── dataset/ (dataset for CVSuite, see README)
│   │   ├── testing
│   │   ├── training
│   │   ├── validation
│   │   └── *.png
│   ├── essay/ (exported photos and graphs for report)
│   ├── trees/ (initial dataset)
│   └── *.png (photos required by assignments in ./example/)
├── src/
│   ├── config/
│   │   └── config.json (CVSuite config, alter config.template.json to desired settings)
│   ├── experiments/ (standalone python scripts for experimentation)
│   ├── helpers/
│   │   ├── gui/
│   │   │   └── main.ui (pygubu ui configuration)
│   │   ├── test/ (ML test classes for CVSuite)
│   │   └── *.py (other CVSuite helper classes)
│   └── suite.py (CVSuite main script)
├── README.md (this file)
└── requirements.txt (pip install file)
```

---

## How to:

### Use the virtual environment

1. Make sure you have the Python extension installed in VSCode
2. Create a virtual environment using VSCode by opening the Command Palette, selecting "Python: Create Environment..." and choosing venv.
3. VSCode automatically activates the venv in the integrated terminal. To use it in another terminal, run the appropriate activation script in the `.venv` folder:
   ```sh
   $ ./.venv/Scripts/activate(.bat/.ps1)
   ```
4. Install the required packages using pip
   ```sh
   $ pip install -r ./requirements.txt
   ```

### Fix relative imports

1. Install the local package as editable using `pip`:
   ```sh
   $ pip install -e .
   ```

### Create a dataset

1. Rename all images to include a tag and a unique id, separated by an underscore `_`
   - e.g. `accasia_1210262`
2. Put all images into `./res/dataset`
3. Run the dataset tool:
   ```bash
   $ python ./src/experiments/dataset.py
   ```
4. (Optional) Run the template extraction tool
5. (Optional) Run the dataset splitter tool

### Run CVSuite (for the first time)

1. Create `config.json` in the `./src/config/` folder and copy the contents of the template into it
2. Edit `config.json` to fit your system, using full paths
   - `path` should point to the dataset directory
   - `models` should point to trained ML models in YAML format
   - `out` should point to the respective folders in the `./out` folder
   - `size` determines the display size in the suite
3. Run CVSuite:
   ```sh
   $ python ./src/suite.py
   ```

### Train and export a KNN model

1. Open CVSuite and select the desired training set
2. Press 'Run analysis for entire dataset(!)'. This exports a CSV file with all preprocessed data to the `./out` directory
   - Depending on your system configuration, this might take a while
3. Run the CVSuiteTestKNN CLI tool. The following arguments are required:
   - `-i` Input CSV file
   - `-o` Output folder, likely `./out/models`
   ```sh
   $ python ./src/helpers/test/knn.py -i ./out/result-(date/time).csv -o ./out/models/
   ```
4. The script generates two files: a fitted scaler to use with other models (`.pkl` file) and the model itself (`.yaml` file)
5. Edit your `config.json` to include the newly created model

### Train and export a Decision tree model

> :memo: **Please note:**
> The KNN training script also generates the scaler required to make the decision tree model

1. Run the CVSuiteTestTree CLI tool using the following arguments:
   - `-i` Input CSV file
   - `-o` Output folder, likely `./out/models`
   - `-m` Model to train: `dectree`, `randforest` or `extratree`
   - `-s` Scaler file to use (`.pkl` file)
   ```sh
   $ python ./src/helpers/test/decision_tree.py -i ./out/result-(date/time).csv -o ./out/models/ -m 'dectree' -s ./out/models/scale_(date/time).pkl
   ```
2. The script generates one `.pkl` file based on the chosen model
3. Edit your `config.json` to include the newly created model

### Template extraction

> :warning: **Please note:**
> This tool uses the legacy format for datasets.
> Images are sorted using folders, instead of by name.

1. Images should have four standard ArUco markers clearly visible
2. Run the template extraction tool with an input directory as argument
   ```sh
   $ python ./src/experiments/template_extraction/script.py ./dataset
   ```
3. The script generates new folders, ending with `_out`
4. The paths to any failed images are saved in `skipped.txt`

### Dataset splitting

1. Ensure that the dataset is in `./res/dataset`
2. Run the dataset splitter tool:
   ```sh
   $ python ./src/experiments/dataset.py
   ```
3. Three new folders will be created, containing the following percentage of images:
   - `./res/dataset/training`, 70%
   - `./res/dataset/validation`, 20%
   - `./res/dataset/testing`, 10%
4. Images are split pseudorandomly, so the same datasets are created on different machines.

---

Arne van Iterson
Tom Selier
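
---

As an illustration of the dataset splitting step described above, the following is a minimal sketch of how a reproducible 70/20/10 split can be made. It is not the code in `dataset.py`: the function name `split_dataset`, the seed value, and the example file names are assumptions made for this example. Sorting before shuffling with a fixed seed is one common way to get the same split on different machines, assuming the same Python version.

```python
import random


def split_dataset(files, seed=42, percentages=(70, 20, 10)):
    """Split file names into training/validation/testing lists reproducibly.

    Hypothetical helper for illustration; not part of CVSuite.
    """
    # Sort first so the result does not depend on directory listing order.
    ordered = sorted(files)
    # A dedicated Random instance with a fixed seed yields the same shuffle each run.
    random.Random(seed).shuffle(ordered)
    # Integer arithmetic avoids floating-point rounding surprises.
    n_train = len(ordered) * percentages[0] // 100
    n_val = len(ordered) * percentages[1] // 100
    return {
        "training": ordered[:n_train],
        "validation": ordered[n_train:n_train + n_val],
        # Whatever remains goes to the testing set.
        "testing": ordered[n_train + n_val:],
    }


if __name__ == "__main__":
    # Hypothetical file names following the (tag)_(id) naming convention.
    names = [f"oak_{i}.png" for i in range(100)]
    parts = split_dataset(names)
    for folder, items in parts.items():
        print(folder, len(items))
```

Because the input is sorted and the seed is fixed, calling the function with the same set of file names in any order returns the same three lists.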