This repository contains all files for the Image recognition course of HU Electrical Engineering year 3.

Tree Recogniser 7000


Directories and files:

.
├── example/                                        (course training assignments)
├── out/
│   ├── img/                                        (images exported using CVSuite)
│   │   └── (tag)_(preprocessor)_(date/time).png
│   ├── log/                                        (preprocessor export from CVSuite)
│   │   └── result_(date/time).csv
│   └── models/                                     (exported OpenCV ML models for usage in CVSuite tests)
│       └── model_(name).yaml
├── res/
│   ├── dataset/                                    (dataset for CVSuite, see README)
│   │   ├── testing/
│   │   ├── training/
│   │   ├── validation/
│   │   └── *.png
│   ├── essay/                                      (exported photos and graphs for report)
│   ├── trees/                                      (initial dataset)
│   └── *.png                                       (photos required by assignments in ./example/)
├── src/
│   ├── config/
│   │   └── config.json                             (CVSuite config, alter config.template.json to desired settings)
│   ├── experiments/                                (standalone python scripts for experimentation)
│   ├── helpers/
│   │   ├── gui/
│   │   │   └── main.ui                             (pygubu ui configuration)
│   │   ├── test/                                   (ML test classes for CVSuite)
│   │   └── *.py                                    (other CVSuite helper classes)
│   └── suite.py                                    (CVSuite main script)
├── README.md                                       (this file)
└── requirements.txt                                (pip install file)

How to:

Use the virtual environment

  1. Make sure you have the Python extension in VSCode
  2. Create a virtual environment using VSCode by entering the Command Palette, selecting "Python: Create Environment..." and choosing venv.
  3. VSCode automatically activates the venv in its integrated terminal. To use it in another terminal, run the appropriate activation script in the .venv folder:
$ ./.venv/Scripts/activate(.bat/.ps1)
  4. Install the required packages using pip:
$ pip install -r ./requirements.txt

Fix relative imports

  1. Install the local package as editable using pip:
$ pip install -e .

Create a dataset

  1. Rename all images to include a tag and a unique id, separated by an underscore '_'
    • e.g. accasia_1210262
  2. Put all images into ./res/dataset
  3. Run the dataset tool:
$ python ./src/experiments/dataset.py
  4. (optional) Run the template extraction tool
  5. (optional) Run the dataset splitter tool
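
The naming convention from step 1 can be checked before running any tool. The sketch below is a hypothetical helper, not part of the repository. It assumes the id is everything after the last underscore and that files carry an extension such as .png.

```python
from __future__ import annotations

from pathlib import Path


def split_tag_and_id(filename: str) -> tuple[str, str]:
    """Split '<tag>_<id>.<ext>' into (tag, id).

    Hypothetical helper: assumes the id follows the last underscore.
    """
    stem = Path(filename).stem            # drop the extension, e.g. '.png'
    tag, sep, uid = stem.rpartition("_")  # split on the last underscore
    if not sep or not tag or not uid:
        raise ValueError(f"expected '<tag>_<id>', got {filename!r}")
    return tag, uid


print(split_tag_and_id("accasia_1210262.png"))  # ('accasia', '1210262')
```

Files that raise ValueError here would need renaming before they are placed in ./res/dataset.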

Run CVSuite (for the first time)

  1. Create config.json in the ./src/config/ folder and copy the contents of config.template.json into it
  2. Edit config.json to fit your system, using full paths:
    • path should point to the dataset directory
    • models should point to trained ML models in YAML format
    • out should point to the respective folders in the ./out folder
    • size determines the display size in the suite
  3. Run CVSuite:
$ python ./src/suite.py
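
A quick sanity check on config.json can catch a missing key or a relative path before CVSuite starts. This is only a sketch: the four key names come from the list above, but treating them as top-level keys and `path` as a plain string is an assumption, and `check_config` is a hypothetical helper.

```python
import json
from pathlib import Path

# Key names taken from the steps above; their position in the file is an assumption.
REQUIRED_KEYS = ("path", "models", "out", "size")


def check_config(text: str) -> list:
    """Return a list of problems found in a config.json string."""
    config = json.loads(text)
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in config]
    dataset = config.get("path")
    # Only the dataset path is checked here; it should be a full (absolute) path.
    if isinstance(dataset, str) and not Path(dataset).is_absolute():
        problems.append(f"'path' is not a full path: {dataset}")
    return problems


example = '{"path": "res/dataset", "models": {}, "out": {}}'
print(check_config(example))
```

An empty list means the checks passed; the real config may contain more settings than this sketch looks at.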

Train and export a KNN model

  1. Open CVSuite and select the desired training set
  2. Press 'Run analysis for entire dataset(!)'. This exports a CSV file with all preprocessed data to the ./out directory
    • Depending on your system configuration, this might take a while
  3. Run the CVSuiteTestKNN CLI tool with the following required arguments:
    • -i Input CSV file
    • -o Output folder, likely ./out/models
$ python ./src/helpers/test/knn.py -i ./out/result-(date/time).csv -o ./out/models/ 
  4. The script generates two files: a fitted scaler to use with other models (.pkl file) and the model itself (.yaml file)
  5. Edit your config.json to include the newly created model
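
The scaler is saved because every later model must see features scaled with the same statistics that were fitted on the training data. A minimal pure-Python illustration of that idea follows. It assumes standard (zero-mean, unit-variance) scaling, which may differ from what knn.py actually uses, and `fit_scaler` and `apply_scaler` are hypothetical names.

```python
from statistics import mean, pstdev


def fit_scaler(rows):
    """Compute per-column (mean, std) from the training rows."""
    columns = list(zip(*rows))
    # A constant column has std 0; fall back to 1.0 to avoid dividing by zero.
    return [(mean(col), pstdev(col) or 1.0) for col in columns]


def apply_scaler(scaler, row):
    """Scale one feature row with previously fitted statistics."""
    return [(value - mu) / sigma for value, (mu, sigma) in zip(row, scaler)]


training = [[1.0, 10.0], [3.0, 30.0]]
scaler = fit_scaler(training)              # fitted once, on training data only
print(apply_scaler(scaler, [2.0, 40.0]))   # [0.0, 2.0]
```

Refitting the scaler on new data would shift these statistics, which is why the same .pkl file is passed on to the other training scripts.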

Train and export a Decision tree model

📝 Please note:
The KNN training script also generates the scaler required to make the decision tree model.

  1. Run the CVSuiteTestTree CLI Tool using the following arguments:
    • -i Input CSV file
    • -o Output folder, likely ./out/models
    • -m Model to train; dectree, randforest or extratree
    • -s Scaler file to use (.pkl file)
$ python ./src/helpers/test/decision_tree.py -i ./out/result-(date/time).csv -o ./out/models/ -m 'dectree' -s ./out/models/scale_(date/time).pkl
  2. The script generates one .pkl file based on the chosen model
  3. Edit your config.json to include the newly created model
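
The four flags above map naturally onto `argparse`. The sketch below shows one way such a CLI could be declared; it is not the actual contents of decision_tree.py. Only the flag letters and the three model names come from the list above. The `dest` names and marking every flag as required are assumptions.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Declare the -i/-o/-m/-s flags (illustrative sketch only)."""
    parser = argparse.ArgumentParser(description="Train a tree-based model (sketch)")
    parser.add_argument("-i", dest="input", required=True, help="input CSV file")
    parser.add_argument("-o", dest="output", required=True, help="output folder")
    parser.add_argument("-m", dest="model", required=True,
                        choices=("dectree", "randforest", "extratree"),
                        help="model to train")
    parser.add_argument("-s", dest="scaler", required=True, help="scaler .pkl file")
    return parser


args = build_parser().parse_args(
    ["-i", "result.csv", "-o", "out/models", "-m", "dectree", "-s", "scale.pkl"]
)
print(args.model, args.scaler)  # dectree scale.pkl
```

Restricting -m with `choices` makes a mistyped model name fail immediately with a usage message instead of partway through training.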

Template extraction

⚠️ Please note:
This tool uses the legacy format for datasets.
Images are sorted using folders, instead of by name.

  1. Images should have four standard ArUco markers clearly visible
  2. Run the template extraction tool with an input directory as argument
$ python ./src/experiments/template_extraction/script.py ./dataset
  3. The script generates new folders, ending with _out
  4. The paths to any failed images are saved in skipped.txt

Dataset splitting

  1. Ensure that the dataset is in ./res/dataset
  2. Run the dataset splitter tool:
$ python ./src/experiments/dataset.py
  3. Three new folders will be created, containing the following percentage of images:
    • ./res/dataset/training, 70%
    • ./res/dataset/validation, 20%
    • ./res/dataset/testing, 10%
  4. Images are split pseudorandomly, so the split is deterministic and produces the same datasets on different machines.
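
A reproducible 70/20/10 split can be sketched with a seeded pseudorandom generator. This illustrates the idea and is not the code in dataset.py: the function name, the seed value and the sorting step are assumptions.

```python
import random


def split_dataset(names, seed=0):
    """Split names 70/20/10 into training, validation and testing lists.

    The seed value and function name are assumptions for illustration.
    """
    ordered = sorted(names)               # order must not depend on the filesystem
    random.Random(seed).shuffle(ordered)  # same seed gives the same shuffle everywhere
    n_train = len(ordered) * 70 // 100    # integer arithmetic avoids float rounding
    n_val = len(ordered) * 20 // 100
    return {
        "training": ordered[:n_train],
        "validation": ordered[n_train:n_train + n_val],
        "testing": ordered[n_train + n_val:],   # remainder, roughly 10%
    }


names = [f"accasia_{i}.png" for i in range(10)]
parts = split_dataset(names)
print({key: len(value) for key, value in parts.items()})
# {'training': 7, 'validation': 2, 'testing': 1}
```

Sorting before shuffling matters because directory listings can come back in a different order on different systems, which would otherwise change the result even with the same seed.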

Arne van Iterson
Tom Selier